Kafka in 2023: Event Streaming in the Enterprise
Updated: 2026-05-03
Apache Kafka[1] has moved from being “the messaging system for big enterprises” to being the event backbone of modern architectures. It has crossed several important thresholds — including the complete ZooKeeper exit via KRaft[2] — that simplify operations and unlock new usage patterns.
Key takeaways
- KRaft (Kafka Raft) integrates consensus inside Kafka and removes the ZooKeeper dependency: one system to operate, faster startup, and support for millions of partitions per cluster.
- The three most consolidated patterns are CDC with Debezium, event sourcing, and stream processing with Kafka Streams, ksqlDB, or Flink.
- Redpanda (C++, no JVM or ZooKeeper) and Pulsar (separated storage/compute) are real alternatives with different use cases.
- Schema evolution, exactly-once cross-topic, and consumer rebalancing remain the most problematic areas.
- For new projects the question isn’t “Kafka or not?” but “Kafka, Pulsar, Redpanda, or managed cloud?”.
KRaft: goodbye ZooKeeper
For over a decade, Kafka depended on ZooKeeper[3] for cluster coordination, metadata management, and leader election. This meant two clusters to operate, two failure modes to understand, and two backup systems.
KRaft (Kafka Raft) integrates consensus inside Kafka. The advantages:
- One system to operate. A Kafka cluster is, finally, a Kafka cluster.
- Faster startup. No ZooKeeper synchronisation at boot.
- Scalable metadata. Kafka in KRaft mode supports millions of partitions per cluster, vs ~200k with ZooKeeper.
- Smaller footprint. One fewer role means less memory, fewer nodes, and less configuration.
In Kafka 3.5 (June 2023), KRaft is GA for new clusters. For existing ZooKeeper clusters, in-place migration is in beta — not yet recommended for critical production, but the direction is clear.
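To make the "one system to operate" point concrete, here is a minimal sketch of what a single-node KRaft configuration looks like — a combined broker+controller for local testing, with hypothetical ports and paths (production deployments separate the roles and tune far more than this):

```properties
# Illustrative single-node KRaft server.properties (combined mode).
# node.id, ports, and log.dirs are placeholder values.
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
advertised.listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kraft-logs
```

Unlike ZooKeeper mode, a KRaft cluster's storage must be formatted once before first start (`kafka-storage.sh random-uuid`, then `kafka-storage.sh format`) — after that, there is exactly one process to run and back up.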
Mature usage patterns
Kafka is used well in several distinct patterns. The most consolidated are:
Change Data Capture (CDC)
Capture database changes and publish them as events. Debezium[4] is the de facto standard, with connectors for PostgreSQL, MySQL, MongoDB, Oracle, and SQL Server.
A typical pattern: stream changes from monolith tables into new services, enabling incremental migration without modifying the monolith — and without the consistency hazards of application-level dual writes. The database remains the source of truth; Kafka distributes its changes.
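As a sketch of what registering such a pipeline involves, this is roughly the JSON one would POST to the Kafka Connect REST API for a Debezium PostgreSQL connector — the connector name, hostnames, credentials, and table list are hypothetical placeholders:

```json
{
  "name": "shop-cdc-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "db.internal",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "changeme",
    "database.dbname": "shop",
    "topic.prefix": "shop",
    "table.include.list": "public.orders,public.customers",
    "plugin.name": "pgoutput"
  }
}
```

Each captured table then appears as a topic (here `shop.public.orders`, `shop.public.customers`) that new services consume without the monolith knowing they exist.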
Event sourcing
Store the complete change history as immutable events. Any consumer can replicate state by replaying events. Powerful pattern but demanding: requires discipline in event design and schema evolution handling.
Works well in domains with mandatory traceability — financial audit, compliance. Less suited for general CRUD where the cognitive cost isn’t recovered.
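The core mechanic — state is never stored directly, only derived by folding the immutable event log — can be sketched in a few lines of plain Python. Event and field names here are hypothetical, not tied to any library:

```python
def apply(balance: int, event: dict) -> int:
    """Apply one event to the current state (an account balance)."""
    if event["type"] == "Deposited":
        return balance + event["amount"]
    if event["type"] == "Withdrawn":
        return balance - event["amount"]
    return balance  # unknown event types are ignored (forward compatibility)

def replay(events: list[dict]) -> int:
    """Rebuild state from scratch by replaying the full history."""
    balance = 0
    for e in events:
        balance = apply(balance, e)
    return balance

log = [
    {"type": "Deposited", "amount": 100},
    {"type": "Withdrawn", "amount": 30},
    {"type": "Deposited", "amount": 5},
]
print(replay(log))  # → 75
```

The discipline the pattern demands lives in `apply`: every event type ever emitted must remain interpretable forever, which is exactly where schema evolution becomes hard.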
Stream processing
Continuous processing over flows with Kafka Streams[5], ksqlDB[6], or Apache Flink[7]. Typical use cases: real-time fraud detection, enriching events with reference data, continuous aggregations for dashboards.
Flink wins with complex state or sophisticated time windows. Kafka Streams fits better when the pipeline lives “inside” Kafka and you don’t want an additional cluster.
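To illustrate the semantics (not the Kafka Streams or Flink API), here is a plain-Python sketch of a tumbling-window count — the kind of continuous aggregation a fraud-detection or dashboard pipeline maintains. Window size, event shape, and key names are invented for the example:

```python
from collections import defaultdict

WINDOW_MS = 60_000  # hypothetical 1-minute tumbling windows

def window_start(ts_ms: int) -> int:
    """Align a timestamp to the start of its tumbling window."""
    return ts_ms - (ts_ms % WINDOW_MS)

def count_per_window(events):
    """events: iterable of (timestamp_ms, key) -> counts per (window, key)."""
    counts = defaultdict(int)
    for ts, key in events:
        counts[(window_start(ts), key)] += 1
    return dict(counts)

stream = [(5_000, "card-1"), (10_000, "card-1"),
          (61_000, "card-1"), (62_000, "card-2")]
print(count_per_window(stream))
# {(0, 'card-1'): 2, (60000, 'card-1'): 1, (60000, 'card-2'): 1}
```

A real streaming engine adds what this sketch omits — fault-tolerant state, event-time vs processing-time handling, and late-arrival policies — which is precisely where Flink pulls ahead.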

Competing alternatives
Three projects worth knowing:
- Apache Pulsar[8]: separated storage/compute architecture, native multi-tenancy, geo-replication. Wins on large-scale operations, but smaller ecosystem.
- Redpanda[9]: C++ rewrite of the Kafka protocol, no JVM or ZooKeeper. Claims ~10x lower latency with ~1/6 the hardware. Compatible with existing Kafka clients.
- Confluent Cloud[10] / AWS MSK[11]: managed Kafka for those willing to pay for not operating.
The choice depends on context: for greenfield projects with tight latency requirements, Redpanda is attractive. For the most mature ecosystem and broadest tooling, Apache Kafka still wins. For teams without an operational culture around infrastructure, managed cloud.
What’s still hard
Four areas Kafka still doesn’t handle cleanly:
- Schema evolution. With Avro + Schema Registry it works, but incompatible changes still need careful human coordination.
- Exactly-once semantics cross-topic. Transactional producers work, but performance drops significantly. Many teams accept at-least-once + consumer idempotency.
- Consumer rebalancing. In topics with many partitions and dynamic consumers, rebalances can take seconds or tens of seconds.
- Fine-grained retention. Retaining data per-tenant or per-event is complex with Kafka’s retention policies.
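The at-least-once + consumer-idempotency compromise mentioned above can be sketched simply: track processed event IDs and skip redeliveries. The event shape is hypothetical, and the in-memory set stands in for what would, in production, be a durable dedup store updated in the same transaction as the side effect:

```python
processed_ids: set[str] = set()  # in-memory stand-in for a durable dedup store
ledger: list[int] = []           # the side effect we want applied exactly once

def handle(event: dict) -> bool:
    """Process an event; return False if it was a duplicate redelivery."""
    if event["id"] in processed_ids:
        return False  # already applied: at-least-once delivery repeated it
    ledger.append(event["amount"])
    processed_ids.add(event["id"])
    return True

# A crash or rebalance before the offset commit makes e2 arrive twice:
deliveries = [{"id": "e1", "amount": 10},
              {"id": "e2", "amount": 7},
              {"id": "e2", "amount": 7}]
results = [handle(e) for e in deliveries]
print(ledger)  # [10, 7] — the duplicate was skipped
```

This shifts the cost from the broker (transactional overhead) to the consumer (a keyed lookup per event), which many teams find the cheaper trade.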
Also see the RabbitMQ vs Kafka analysis — deciding which broker fits each case remains one of the most important architectural decisions. For the observability layer over Kafka, OpenTelemetry and the Grafana stack are the natural complements.
Conclusion
Kafka is mature infrastructure for enterprise-scale event streaming. With KRaft it’s operationally simpler; with the stream-processing ecosystem consolidated, usage patterns are well documented. For new projects, the question is no longer “Kafka or not?” but “Kafka, Pulsar, Redpanda, or managed cloud?”. Each has its moment.