Hartmut, I like to describe you as the Senior Dev for Kafka Streams. You are often the one who goes into stuck projects and helps the team meet its deadline after all. If you had to paint a picture of your role: what actually makes a real Senior?
Hartmut: For me, it is the ability to never lose sight of the big picture while at the same time standing deep in the engine room. It is not enough to just draw nice architecture diagrams.
In everyday project life, I am perhaps 30% architect, sometimes more, sometimes less depending on the project phase. I design topologies and discuss the data model. The rest of the time I am hands-on.
I write code, I debug live in the environment, and, even if it is not always fun, I take care of the deployment pipelines and the monitoring. It is extremely important to me to know the exact state of the environments, especially after a deployment.
For all my love of automated tests, I like to verify and evaluate things manually myself. This not only keeps my handling of the systems and debugging tools sharp, it also gives me the final certainty that everything is really running as expected.
As a Senior, you have to carry this end-to-end responsibility, from the blank sheet of paper to the running application in production. You must understand how your architecture affects the infrastructure.
That sounds like the classic firefighter role. You go where it burns. How do you bring along a team that might already be frustrated?
Hartmut: Communication is almost more important here than code. I am very present in the project, spend a lot of time on Slack, pull people into calls. I often see teams that are actually competent but freeze up in the face of the complexity of modern infrastructure.
Usually the teams have chosen a good tech stack and architecture and are not far off. What is missing is some experience, a few corrections, and a mental model of how data flows through distributed systems and partitioning.
Often there is also a reluctance to touch the infrastructure. This applies to Kafka, Kubernetes/OpenShift, and (Persistent) Volumes. The developers do not dare to really touch the environments, for fear of breaking something.
And how do you resolve this blockage?
Hartmut: By doing it together – across all environments. It is not just about the development environment. In everyday project life there is always a need for larger rollouts, migrations, or sometimes a complete reset of an environment.
My approach here is not to charge ahead blindly but to build a deep understanding first: What is necessary? What effects will our plan have? Together we create a plan that covers all the details and strikes a balance between effort and downtime that fits the project.
When such a plan works out, it is extremely fulfilling, and for me it is the best way to build expertise. Besides the purely technical knowledge, it also trains professionalism, a sense of responsibility for the company, and team spirit.
But often there is no room for that in day-to-day work, is there? How do you manage to pass your expertise on to the teams when the deadline is pressing?
Hartmut: Yes, I hear that argument often, but some time is actually always available – it is just often spent on the wrong things.
To pass on expertise, I rely on a mix of methods: pair programming and pair designing come first. On top of that, regular knowledge-sharing sessions and group reviews spread the knowledge across the team.
And when it comes to really understanding new concepts, I find hackathons, PoCs or MVPs a good solution.
My advice to team leads: in consultation with the teams, actively plan hackathons into the sprint. Give the team two to three days. Let them build a small application that implements one complete vertical slice. The learning effect of doing it yourself in a short time is far greater than that of any theoretical reading.
And do not forget a critical evaluation or retro afterwards. The team becomes more confident and learns to gauge where the journey is going and how much effort it will take.
Sometimes it also comes down to the composition of the team. Do you keep an eye on that too?
Hartmut: Absolutely. Sometimes you just have to reshuffle the team a bit.
There needs to be at least one person in every Kafka project who has this inner drive for solutions – someone who really wants to dig into the technology. If this role is missing, the project stalls.
As a Senior, I look at the dynamics: Who has the drive? Who might we need to assign differently? Sometimes it is enough to shift roles slightly to untie the knot.
About Hartmut Armbruster: Hartmut transforms complex data streams into robust, scalable solutions. As an expert for Apache Kafka Streams and initiator of the open KSTD standard, he helps teams unleash the full power of real-time data processing. His mission: To build the bridge from vision to running real-time application.
But the reason Kafka Streams projects are not always easy is not purely organizational. Kafka Streams is also technically not exactly easy to master. Can we, as the Kafka community, do better there?
Hartmut: The Kafka Streams DSL and the basic technical understanding form the foundation, but the real sticking point is the mindset for distributed systems. You always have to keep partitioning in mind. Without this understanding, you run into dead ends.
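A concrete instance of this partitioning mindset: a stream-stream join only produces matches when both inputs are co-partitioned on the join key. The following sketch, with illustrative topic names, value formats, and a hypothetical parsing helper, re-keys a payments stream by order ID and makes the required repartitioning explicit:

```java
import java.time.Duration;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Repartitioned;
import org.apache.kafka.streams.kstream.StreamJoined;

public class CoPartitionedJoinSketch {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Orders are already keyed by order ID.
        KStream<String, String> orders =
                builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()));

        // Payments arrive keyed by payment ID, so they must be re-keyed by order ID.
        // selectKey marks the stream for repartitioning; the explicit repartition()
        // names the internal topic instead of relying on an auto-generated one.
        KStream<String, String> payments =
                builder.stream("payments", Consumed.with(Serdes.String(), Serdes.String()))
                        .selectKey((paymentId, payment) -> extractOrderId(payment))
                        .repartition(Repartitioned.<String, String>with(Serdes.String(), Serdes.String())
                                .withName("payments-by-order-id"));

        // The join only sees matches when records with the same key land on the
        // same partition of both inputs: this is the co-partitioning rule.
        orders.join(payments,
                        (order, payment) -> order + "|" + payment,
                        JoinWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(5)),
                        StreamJoined.with(Serdes.String(), Serdes.String(), Serdes.String()))
                .to("orders-with-payments", Produced.with(Serdes.String(), Serdes.String()));
    }

    private static String extractOrderId(String payment) {
        return payment.split(",")[0]; // illustrative parsing only
    }
}
```

Forgetting this step does not fail loudly; records with the same business key simply end up on different partitions and the join quietly misses matches, which is exactly the kind of dead end meant here.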
Another aspect that is just as broad and demanding is the deployment model and operations. What can we do better as a community?
I believe that especially in the areas of deployment, configuration, highly available operations, monitoring and analysis, and ultimately troubleshooting, there could be more guides and best practices. Of course, not everything can be solved with blanket recipes; the domain is too complex for that – but there is certainly room for improvement.
Do you feel the technology is perhaps even too complex for some use cases?
Hartmut: Yes, that is a point that concerns me greatly at the moment. We have a gap in the ecosystem.
On one side, we have simple producer/consumer applications. On the other, heavyweights like Kafka Streams or Flink. But for many customers with medium-sized data volumes, Kafka Streams with its state management, rebalancing, and operations requirements is often overkill.
I hope for a future solution that lies somewhere in between: simpler than Kafka Streams, but more powerful than a pure consumer. Maybe we will see movement in the market over the next few years. Until then, however, we have to learn to master the complexity of Kafka Streams.
Let’s finish with the practical side. Do you have concrete tool recommendations for our readers?
Hartmut: (laughs) Of course KSTD (Kafka Streams Topology Design).
Your own tool. Tell us briefly why it’s needed.
Hartmut: Because code often doesn’t tell the whole truth – or at least not at a glance. KSTD is a framework and design standard for graphically designing and documenting Kafka Streams topologies.
It helps enormously when you talk with the team or with architects about complex joins and data flows and can look at a picture instead of Java code. It closes the gap between the code and the architecture board.
I usually don’t write a line of code until the design has convinced me that the topology works. Since this part happens in your head or in Excalidraw, it saves an immense amount of time. The actual implementation then often goes surprisingly quickly and is great fun.
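For a sense of the "whole truth" such a diagram distills, Kafka Streams can also print its own wiring: Topology#describe() returns a textual description of every source, processor, and sink. A minimal sketch, with illustrative topic names:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class PrintTopology {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()))
                .filter((key, value) -> value != null)
                .mapValues(value -> value.toUpperCase())
                .to("orders-normalized", Produced.with(Serdes.String(), Serdes.String()));

        Topology topology = builder.build();

        // Prints every source, processor, and sink with its upstream and
        // downstream connections: the structure a KSTD diagram shows as a picture.
        System.out.println(topology.describe());
    }
}
```

Comparing this textual dump with a diagram of the same topology makes plain why a picture is the better medium for a design discussion.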
And another tool that is not yours? (laughs)
Hartmut: DuckDB.
DuckDB? The in-process SQL OLAP database? For Kafka?
Hartmut: Exactly. That is my absolute insider tip for debugging. You often have situations where you just want to know: Where is the message with ID XY? Or: How are the keys distributed in this topic? Kafka itself does not offer a SELECT * WHERE ID = ...
What I do: I take the console consumer, have it emit the records as JSON, and pipe the output directly into DuckDB. DuckDB then lets me unleash SQL on the incoming stream. I can filter, aggregate, and evaluate JSON fields – all ad hoc in the console.
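The workflow described here lives entirely in the shell, piping console-consumer output into the duckdb CLI. The same query pattern can be sketched in Java via DuckDB's JDBC driver – assuming the org.duckdb:duckdb_jdbc dependency and a hypothetical messages.json dump of the topic:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TopicKeyDistribution {

    public static void main(String[] args) throws Exception {
        // "jdbc:duckdb:" opens an in-memory DuckDB instance: no server, no setup.
        try (Connection conn = DriverManager.getConnection("jdbc:duckdb:");
             Statement stmt = conn.createStatement()) {

            // messages.json is assumed to hold one JSON record per line, dumped
            // beforehand from the topic. read_json_auto infers the schema.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT key, count(*) AS cnt "
                            + "FROM read_json_auto('messages.json') "
                            + "GROUP BY key ORDER BY cnt DESC LIMIT 10")) {
                while (rs.next()) {
                    System.out.printf("%s -> %d%n", rs.getString("key"), rs.getLong("cnt"));
                }
            }
        }
    }
}
```

Whether over the CLI or JDBC, the effect is the same: full SQL over topic data without standing up any infrastructure.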
That is ingenious. SQL debugging on the command line, so to speak, without any infrastructure overhead.
Hartmut: Exactly. It is the fastest way to bring light into the dark of the data without first setting up a complex infrastructure.