Agentic AI Archives - Kai Waehner
https://www.kai-waehner.de/blog/category/agentic-ai/
Technology Evangelist - Big Data Analytics - Middleware - Apache Kafka

How Penske Logistics Transforms Fleet Intelligence with Data Streaming and AI
https://www.kai-waehner.de/blog/2025/06/02/how-penske-logistics-transforms-fleet-intelligence-with-data-streaming-and-ai/
Mon, 02 Jun 2025

Real-time visibility has become essential in logistics. As supply chains grow more complex, providers must shift from delayed, batch-based systems to event-driven architectures. Data streaming technologies like Apache Kafka and Apache Flink enable this shift by allowing continuous processing of data from telematics, inventory systems, and customer interactions. Penske Logistics is leading the way—using Confluent’s platform to stream and process 190 million IoT messages daily. This powers predictive maintenance, faster roadside assistance, and higher fleet uptime. The result: smarter operations, improved service, and a scalable foundation for the future of logistics.

Real-time visibility is no longer a competitive advantage in logistics—it’s a business necessity. As global supply chains become more complex and customer expectations rise, logistics providers must respond with agility and precision. That means shifting away from static, delayed data pipelines toward event-driven architectures built around real-time data.

Technologies like Apache Kafka and Apache Flink are at the heart of this transformation. They allow logistics companies to capture, process, and act on streaming data as it’s generated—from vehicle sensors and telematics systems to inventory platforms and customer applications. This enables new use cases in predictive maintenance, live fleet tracking, customer service automation, and much more.

A growing number of companies across the supply chain are embracing this model. Whether it’s real-time shipment tracking, automated compliance reporting, or AI-driven optimization, the ability to stream, process, and route data instantly is proving vital.

One standout example is Penske Logistics—a transportation leader using Confluent’s data streaming platform (DSP) to transform how it operates and delivers value to customers.

How Penske Logistics Transforms Fleet Intelligence with Kafka and AI

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases.

Why Real-Time Data Matters in Logistics and Transportation

Transportation and logistics operate on tighter margins and stricter timelines than almost any other sector. Delays ripple through supply chains, disrupting manufacturing schedules, customer deliveries, and retail inventories. Traditional data integration methods—batch ETL, manual syncing, and siloed systems—simply can’t meet the demands of today’s global logistics networks.

Data streaming enables organizations in the logistics and transportation industry to ingest and process information in real time, while the data is still valuable and actionable. Vehicle diagnostics, route updates, inventory changes, and customer interactions can all be captured and acted upon immediately. This leads to faster decisions, more responsive services, and smarter operations.

Real-time data also lays the foundation for advanced use cases in automation and AI, where outcomes depend on immediate context and up-to-date information. And for logistics providers, it unlocks a powerful competitive edge.

Apache Kafka serves as the backbone for real-time messaging—connecting thousands of data producers and consumers across enterprise systems. Apache Flink adds stateful stream processing to the mix, enabling continuous pattern recognition, enrichment, and complex business logic in real time.

Event-driven Architecture with Data Streaming in Logistics and Transportation using Apache Kafka and Flink

In the logistics industry, this event-driven architecture supports use cases such as:

  • Continuous monitoring of vehicle health and sensor data
  • Proactive maintenance scheduling
  • Real-time fleet tracking and route optimization
  • Integration of telematics, ERP, WMS, and customer systems
  • Instant alerts for service delays or disruptions
  • Predictive analytics for capacity and demand forecasting
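
As a minimal sketch of the first two use cases above, the snippet below evaluates a stream of vehicle sensor events against simple threshold rules and emits maintenance alerts. The event fields, metric names, and thresholds are hypothetical, and a plain Python list stands in for a Kafka topic:

```python
# Hypothetical sensor events; a plain list stands in for a Kafka topic.
sensor_events = [
    {"vehicle_id": "T-100", "metric": "coolant_temp_c", "value": 88},
    {"vehicle_id": "T-100", "metric": "coolant_temp_c", "value": 104},
    {"vehicle_id": "T-205", "metric": "brake_pad_mm", "value": 2.5},
]

# Illustrative threshold rules; real diagnostics engines use far richer models.
THRESHOLDS = {
    "coolant_temp_c": lambda v: v > 100,  # engine overheating
    "brake_pad_mm": lambda v: v < 3.0,    # brake pads worn below minimum
}

def maintenance_alerts(events):
    """Yield a maintenance alert for every event that violates a threshold."""
    for event in events:
        check = THRESHOLDS.get(event["metric"])
        if check and check(event["value"]):
            yield {"vehicle_id": event["vehicle_id"], "reason": event["metric"]}

alerts = list(maintenance_alerts(sensor_events))
```

In a real deployment, the same logic would run continuously against the live event stream rather than a finite list.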

This isn’t just theory. Leading logistics organizations are deploying these capabilities at scale.

Data Streaming Success Stories Across the Logistics and Transportation Industry

Many transportation and logistics firms are already using Kafka-based architectures to modernize their operations. A few examples:

  • LKW Walter relies on data streaming to optimize its full truckload (FTL) freight exchanges and enable digital freight matching.
  • Uber Freight leverages real-time telematics, pricing models, and dynamic load assignment across its digital logistics platform.
  • Instacart uses event-driven systems to coordinate live order delivery, matching customer demand with available delivery slots.
  • Maersk incorporates streaming data from containers and ports to enhance shipping visibility and supply chain planning.

These examples show the diversity of value that real-time data brings—across first mile, middle mile, and last mile operations.

An increasing number of companies are using data streaming as the event-driven control tower for their supply chains. It’s not only about real-time insights—it’s also about ensuring consistent data across real-time messaging, HTTP APIs, and batch systems. Learn more in this article: A Real-Time Supply Chain Control Tower powered by Kafka.

Supply Chain Control Tower powered by Data Streaming with Apache Kafka

Penske Logistics: A Leader in Transportation, Fleet Services, and Supply Chain Innovation

Penske Transportation Solutions is one of North America’s most recognizable logistics brands. It provides commercial truck leasing, rental, and fleet maintenance services, operating a fleet of over 400,000 vehicles. Its logistics arm offers freight management, supply chain optimization, and warehousing for enterprise customers.

Penske Logistics
Source: Penske Logistics

But Penske is more than a fleet and logistics company. It’s a data-driven operation where technology plays a central role in service delivery. From vehicle telematics to customer support, Penske is leveraging data streaming and AI to meet growing demands for reliability, transparency, and speed.

Penske’s Data Streaming Success Story

Penske explored its data streaming journey at the Confluent Data in Motion Tour. Sarvant Singh, Vice President of Data and Emerging Solutions at Penske, explains the company’s motivation clearly: “We’re an information-intense business. A lot of information is getting exchanged between our customers, associates, and partners. In our business, vehicle uptime and supply chain visibility are critical.”

This focus on uptime is what drove Penske to adopt a real-time data streaming platform, powered by Confluent. Today, Penske ingests and processes around 190 million IoT messages every day from its vehicles.

Each truck contains hundreds of sensors (and thousands of sub-sensors) that monitor everything from engine performance to braking systems. With this volume of data, traditional architectures fell short. Penske turned to Confluent Cloud to run Apache Kafka at scale as a fully managed, elastic SaaS, eliminating the operational burden and unlocking true real-time capabilities.

By streaming sensor data through Confluent and into a proactive diagnostics engine, Penske can now predict when a vehicle may fail—before the problem arises. Maintenance can be scheduled in advance, roadside breakdowns avoided, and customer deliveries kept on track.
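
A toy version of such a proactive check might keep a small rolling window of readings per vehicle and flag the vehicle when the average trends past a limit. The window size, temperature limit, and readings below are invented for illustration; a production diagnostics engine would run far richer models on the Kafka stream:

```python
from collections import defaultdict, deque

WINDOW = 3  # readings per vehicle to average (hypothetical)
LIMIT = 95  # hypothetical engine-temperature limit in Celsius

windows = defaultdict(lambda: deque(maxlen=WINDOW))

def observe(vehicle_id, temp_c):
    """Record a reading; flag the vehicle once the rolling average crosses the limit."""
    window = windows[vehicle_id]
    window.append(temp_c)
    return len(window) == WINDOW and sum(window) / WINDOW > LIMIT

readings = [("T-100", 90), ("T-100", 96), ("T-100", 101), ("T-200", 85)]
flagged = [vid for vid, temp in readings if observe(vid, temp)]
```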

This approach has already prevented over 90,000 potential roadside incidents. The business impact is enormous, saving time, money, and reputation.

Other real-time use cases include:

  • Diagnosing issues instantly to dispatch roadside assistance faster
  • Triggering preventive maintenance alerts to avoid unscheduled downtime
  • Automating compliance for IFTA reporting using telematics data
  • Streamlining repair workflows through integration with electronic DVIRs (Driver Vehicle Inspection Reports)

Why Confluent for Apache Kafka?

Managing Kafka in-house was never the goal for Penske. After initially working with a different provider, they transitioned to Confluent Cloud to avoid the complexity and cost of maintaining open-source Kafka themselves.

“We’re not going to put mission-critical applications on an open source tech,” Singh noted. “Enterprise-grade applications require enterprise level support—and Confluent’s business value has been clear.”

Key reasons for choosing Confluent include:

  • The ability to scale rapidly without manual rebalancing
  • Enterprise tooling, including stream governance and connectors
  • Seamless integration with AI and analytics engines
  • Reduced time to market and improved uptime

Data Streaming and AI in Action at Penske

Penske’s investment in AI began in 2015, long before it became a mainstream trend. Early use cases included Erica, a virtual assistant that helps customers manage vehicle reservations. Today, AI is being used to reduce repair times, predict failures, and improve customer service experiences.

By combining real-time data with machine learning, Penske can offer more reliable services and automate decisions that previously required human intervention. AI-enabled diagnostics, proactive maintenance, and conversational assistants are already delivering measurable benefits.

The company is also exploring the role of generative AI. Singh highlighted the potential of technologies like ChatGPT for enterprise applications—but also stressed the importance of controls: “Configuration for risk tolerance is going to be the key. Traceability, explainability, and anomaly detection must be built in.”

Fleet Intelligence in Action: Measurable Business Value Through Data Streaming

For a company operating hundreds of thousands of vehicles, the stakes are high. Penske’s real-time architecture has improved uptime, accelerated response times, and empowered technicians and drivers with better tools.

The business outcomes are clear:

  • Fewer breakdowns and delays
  • Faster resolution of vehicle issues
  • Streamlined operations and reporting
  • Better customer and driver experience
  • Scalable infrastructure for new services, including electric vehicle fleets

With 165,000 vehicles already connected to Confluent and more being added as EV adoption grows, Penske is just getting started.

The Road Ahead: Agentic AI and the Next Evolution of Event-Driven Architecture Powered By Apache Kafka

The future of logistics will be defined by intelligent, real-time systems that coordinate not just vehicles, but entire networks. As Penske scales its edge computing and expands its use of remote sensing and autonomous technologies, the role of data streaming will only increase.

Agentic AI—systems that act autonomously based on real-time context—will require seamless integration of telematics, edge analytics, and cloud intelligence. This demands a resilient, flexible event-driven foundation. I explored the general idea in a dedicated article: How Apache Kafka and Flink Power Event-Driven Agentic AI in Real Time.

Agentic AI with Apache Kafka as Event Broker Combined with MCP and A2A Protocol

Penske’s journey shows that real-time data streaming is not only possible—it’s practical, scalable, and deeply transformative. The combination of a data streaming platform, sensor analytics, and AI allows the company to turn every vehicle into a smart, connected node in a global supply chain.

For logistics providers seeking to modernize, the path is clear. It starts with streaming data—and the possibilities grow from there. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases.

Data Streaming Meets the SAP Ecosystem and Databricks – Insights from SAP Sapphire Madrid
https://www.kai-waehner.de/blog/2025/05/28/data-streaming-meets-the-sap-ecosystem-and-databricks-insights-from-sap-sapphire-madrid/
Wed, 28 May 2025

SAP Sapphire 2025 in Madrid brought together global SAP users, partners, and technology leaders to showcase the future of enterprise data strategy. Key themes included SAP’s Business Data Cloud (BDC) vision, Joule for Agentic AI, and the deepening SAP-Databricks partnership. A major topic throughout the event was the increasing need for real-time integration across SAP and non-SAP systems—highlighting the critical role of event-driven architectures and data streaming platforms like Confluent. This blog shares insights on how data streaming enhances SAP ecosystems, supports AI initiatives, and enables industry-specific use cases across transactional and analytical domains.

I had the opportunity to attend SAP Sapphire 2025 in Madrid—an impressive gathering of SAP customers, partners, and technology leaders from around the world. It was a massive event, bringing the global SAP community together to explore the company’s future direction, innovations, and growing ecosystem.

A key highlight was SAP’s deepening integration of Databricks as an OEM partner for AI and analytics within the SAP Business Data Cloud—showing how the ecosystem is evolving toward more open, composable architectures.

At the same time, conversations around Confluent and data streaming highlighted the critical role real-time integration plays in connecting SAP systems (including ERP, MES, DataSphere, Databricks, etc.) with the rest of the enterprise. As always, it was a great place to learn, connect, and discuss where enterprise data architecture is heading—and how technologies like data streaming are enabling that transformation.

Data Streaming with Confluent Meets SAP and Databricks for Agentic AI at Sapphire in Madrid

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, focusing on industry scenarios, success stories, and business value.

SAP’s Vision: Business Data Cloud, Joule, and Strategic Ecosystem Moves

SAP presented a broad and ambitious strategy centered around the SAP Business Data Cloud (BDC), SAP Joule (including its Agentic AI initiative), and strategic collaborations like SAP Databricks, SAP DataSphere, and integrations across multiple cloud platforms. The vision is clear: SAP wants to connect business processes with modern analytics, AI, and automation.

SAP ERP with Business Technology Platform BTP and Joule for Agentic AI in the Cloud
Source: SAP

For those of us working in data streaming and integration, these developments present a major opportunity. Most customers I meet globally use SAP ERP or other SAP products like MES, SuccessFactors, or Ariba. The relevance of real-time data streaming in this space is undeniable—and it’s growing.

Building the Bridge: Event-Driven Architecture + SAP

One of the most exciting things about SAP Sapphire is seeing how event-driven architecture is becoming more relevant—even if the conversations don’t start with “Apache Kafka” or “Data Streaming.” In the SAP ecosystem, discussions often focus on business outcomes first, then architecture second. And that’s exactly how it should be.

Many SAP customers are moving toward hybrid cloud environments, where data lives in SAP systems, Salesforce, Workday, ServiceNow, and more. There’s no longer a belief in a single, unified data model. Master Data Management (MDM) as a one-size-fits-all solution has lost its appeal, simply because the real world is more complex.

This is where data streaming with Apache Kafka, Apache Flink, etc. fits in perfectly. Event streaming enables organizations to connect their SAP solutions with the rest of the enterprise—for real-time integration across operational systems, analytics platforms, AI engines, and more. It supports transactional and analytical use cases equally well and can be tailored to each industry’s needs.

Data Streaming with Confluent as Integration Middleware for SAP ERP DataSphere Joule Databricks with Apache Kafka

In the SAP ecosystem, customers typically don’t look for open source frameworks to assemble their own solutions—they look for a reliable, enterprise-grade platform that just works. That’s why Confluent’s data streaming platform is an excellent fit: it combines the power of Kafka and Flink with the scalability, security, governance, and cloud-native capabilities enterprises expect.

SAP, Databricks, and Confluent – A Triangular Partnership

At the event, I had some great conversations—often literally sitting between leaders from SAP and Databricks. Watching how these two players are evolving—and where Confluent fits into the picture—was eye-opening.

SAP and Databricks are working closely together, especially with the SAP Databricks OEM offering that integrates Databricks into the SAP Business Data Cloud as an embedded AI and analytics engine. SAP DataSphere also plays a central role here, serving as a gateway into SAP’s structured data.

Meanwhile, Databricks is expanding into the operational domain, not just the analytical lakehouse. After acquiring Neon (a Postgres-compatible cloud-native database), Databricks is expected to announce its own transactional OLTP solution soon. This shows how rapidly the company is moving beyond batch analytics into the world of operational workloads—areas where Kafka and event streaming have traditionally provided the backbone.

Enterprise Architecture with Confluent and SAP and Databricks for Analytics and AI

This trend opens up a significant opportunity for data streaming platforms like Confluent to play a central role in modern SAP data architectures. As platforms like Databricks expand their capabilities, the demand for real-time, multi-system integration and cross-platform data sharing continues to grow.

Confluent is uniquely positioned to meet this need—offering not just data movement, but also the ability to process, govern, and enrich data in motion using tools like Apache Flink, and a broad ecosystem of connectors, including those for transactional systems like SAP ERP, but also Oracle databases, IBM mainframe, and other cloud services like Snowflake, ServiceNow or Salesforce.

Data Products, Not Just Pipelines

The term “data product” was mentioned in nearly every conversation—whether from the SAP angle (business semantics and ownership), Databricks (analytics-first), or Confluent (independent, system-agnostic, streaming-native). The key message? Everyone wants real-time, reusable, discoverable data products.

Data Product - The Domain Driven Microservice for Data

This is where an event-driven architecture powered by a data streaming platform shines: Data Streaming connects everything and distributes data to both operational and analytical systems, with governance, durability, and flexibility at the core.

Confluent’s data streaming platform enables the creation of data products from a wide range of enterprise systems, complementing the SAP data products being developed within the SAP Business Data Cloud. The strength of the partnership lies in the ability to combine these assets—bringing together SAP-native data products with real-time, event-driven data products built from non-SAP systems connected through Confluent. This integration creates a unified, scalable foundation for both operational and analytical use cases across the enterprise.

Industry-Specific Use Cases to Explore the Business Value of SAP and Data Streaming

One major takeaway: in the SAP ecosystem, generic messaging around cutting-edge technologies such as Apache Kafka does not work. Success comes from being well prepared—knowing which SAP systems are involved (ECC, S/4HANA, on-prem, or cloud) and what role they play in the customer’s architecture. The conversations must be use case-driven, often tailored to industries like manufacturing, retail, logistics, or the public sector.

This level of specificity is new to many people working in the technical world of Kafka, Flink, and data streaming. Developers and architects often approach integration from a tool- or framework-centric perspective. However, SAP customers expect business-aligned solutions that address concrete pain points in their domain—whether it’s real-time order tracking in logistics, production analytics in manufacturing, or spend transparency in the public sector.

Understanding the context of SAP’s role in the business process, along with industry regulations, workflows, and legacy system constraints, is key to having meaningful conversations. For the data streaming community, this is a shift in mindset—from building pipelines to solving business problems—and it represents a major opportunity to bring strategic value to enterprise customers.

You are lucky: I just published a free ebook about data streaming use cases focusing on industry scenarios and business value: “The Ultimate Data Streaming Guide”.

Looking Forward: SAP, Data Streaming, AI, and Open Table Formats

Another theme to watch: data lake and format standardization. All cloud providers and data vendors like Databricks, Confluent or Snowflake are investing heavily in supporting open table formats like Apache Iceberg (alongside Delta Lake at Databricks) to standardize analytical integrations and reduce storage costs significantly.

SAP’s investment in Agentic AI through SAP Joule reflects a broader trend across the enterprise software landscape, with vendors like Salesforce, ServiceNow, and others embedding intelligent agents into their platforms. This creates a significant opportunity for Confluent to serve as the streaming backbone—enabling real-time coordination, integration, and decision-making across these diverse, distributed systems.

An event-driven architecture powered by data streaming is crucial for the success of Agentic AI with SAP Joule, Databricks AI agents, and other operational systems that need to be integrated into the business processes. The strategic partnership between Confluent and Databricks makes it even easier to implement end-to-end AI pipelines across the operational and analytical estates.

SAP Sapphire Madrid was a valuable reminder that data streaming is no longer a niche technology—it’s a foundation for digital transformation. Whether it’s SAP ERP, Databricks AI, or new cloud-native operational systems, a Data Streaming Platform connects them all in real time to enable new business models, better customer experiences, and operational agility.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, focusing on industry scenarios, success stories, and business value.

Agentic AI with the Agent2Agent Protocol (A2A) and MCP using Apache Kafka as Event Broker
https://www.kai-waehner.de/blog/2025/05/26/agentic-ai-with-the-agent2agent-protocol-a2a-and-mcp-using-apache-kafka-as-event-broker/
Mon, 26 May 2025

Agentic AI is emerging as a powerful pattern for building autonomous, intelligent, and collaborative systems. To move beyond isolated models and task-based automation, enterprises need a scalable integration architecture that supports real-time interaction, coordination, and decision-making across agents and services. This blog explores how the combination of Apache Kafka, Model Context Protocol (MCP), and Google’s Agent2Agent (A2A) protocol forms the foundation for Agentic AI in production. By replacing point-to-point APIs with event-driven communication as the integration layer, enterprises can achieve decoupling, flexibility, and observability—unlocking the full potential of AI agents in modern enterprise environments.

Agentic AI is gaining traction as a design pattern for building more intelligent, autonomous, and collaborative systems. Unlike traditional task-based automation, agentic AI involves intelligent agents that operate independently, make contextual decisions, and collaborate with other agents or systems—across domains, departments, and even enterprises.

In the enterprise world, agentic AI is more than just a technical concept. It represents a shift in how systems interact, learn, and evolve. But unlocking its full potential requires more than AI models and point-to-point APIs—it demands the right integration backbone.

That’s where Apache Kafka as an event broker for true decoupling comes into play, together with two emerging AI standards: Google’s Agent2Agent (A2A) protocol and Anthropic’s Model Context Protocol (MCP), in an enterprise architecture for Agentic AI.

Agentic AI with Apache Kafka as Event Broker Combined with MCP and A2A Protocol

Inspired by my colleague Sean Falconer’s blog post, Why Google’s Agent2Agent Protocol Needs Apache Kafka, this blog post explores Agentic AI adoption in enterprises and how an event-driven architecture with Apache Kafka fits into the AI architecture.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including various AI examples across industries.

Business Value of Agentic AI in the Enterprise

For enterprises, the promise of agentic AI is compelling:

  • Smarter automation through self-directed, context-aware agents
  • Improved customer experience with faster and more personalized responses
  • Operational efficiency by connecting internal and external systems more intelligently
  • Scalable B2B interactions that span suppliers, partners, and digital ecosystems

But none of this works if systems are coupled by brittle point-to-point APIs, slow batch jobs, or disconnected data pipelines. Autonomous agents need continuous, real-time access to events, shared state, and a common communication fabric that scales across use cases.

Model Context Protocol (MCP) + Agent2Agent (A2A): New Standards for Agentic AI

The Model Context Protocol (MCP), introduced by Anthropic, offers a standardized, model-agnostic interface for context exchange between AI agents and external systems. Whether the interaction is streaming, batch, or API-based, MCP abstracts how agents retrieve inputs, send outputs, and trigger actions across services. This enables real-time coordination between models and tools—improving autonomy, reusability, and interoperability in distributed AI systems.

Model Context Protocol MCP by Anthropic
Source: Anthropic

Google’s Agent2Agent (A2A) protocol complements this by defining how autonomous software agents can interact with one another in a standard way. A2A enables scalable agent-to-agent collaboration—where agents discover each other, share state, and delegate tasks without predefined integrations. It’s foundational for building open, multi-agent ecosystems that work across departments, companies, and platforms.

Agent2Agent A2A Protocol by Google and MCP
Source: Google

Why Apache Kafka Is a Better Fit Than an API (HTTP/REST) for A2A and MCP

Most enterprises today use HTTP-based APIs to connect services—ideal for simple, synchronous request-response interactions.

In contrast, Apache Kafka is a distributed event streaming platform designed for asynchronous, high-throughput, and loosely coupled communication—making it a much better fit for multi-agent (A2A) and agentic AI architectures.

API-Based Integration         | Kafka-Based Integration
------------------------------|-----------------------------------------------
Synchronous, blocking         | Asynchronous, event-driven
Point-to-point coupling       | Loose coupling with pub/sub topics
Hard to scale to many agents  | Supports multiple consumers natively
No shared memory              | Kafka retains and replays event history
Limited observability         | Full traceability with schema registry & DLQs

Kafka serves as the decoupling layer. It becomes the place where agents publish their state, subscribe to updates, and communicate changes—independently and asynchronously. This enables multi-agent coordination, resilience, and extensibility.

MCP + Kafka = Open, Flexible Communication

As the adoption of Agentic AI accelerates, there’s a growing need for scalable communication between AI agents, services, and operational systems. The Model Context Protocol (MCP) is emerging as a standard to structure these interactions—defining how agents access tools, send inputs, and receive results. But a protocol alone doesn’t solve the challenges of integration, scaling, or observability.

This is where Apache Kafka comes in.

By combining MCP with Kafka, agents can interact through a Kafka topic—fully decoupled, asynchronous, and in real time. Instead of direct, synchronous calls between agents and services, all communication happens through Kafka topics, using structured events based on the MCP format.

This model supports a wide range of implementations and tech stacks. For instance:

  • A Python-based AI agent deployed in a SaaS environment
  • A Spring Boot Java microservice running inside a transactional core system
  • A Flink application deployed at the edge performing low-latency stream processing
  • An API gateway translating HTTP requests into MCP-compliant Kafka events

Regardless of where or how an agent is implemented, it can participate in the same event-driven system. Kafka ensures durability, replayability, and scalability. MCP provides the semantic structure for requests and responses.
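
To make this concrete: MCP builds on JSON-RPC 2.0, so an MCP-style request published to a Kafka topic could look roughly like the sketch below. The tool name, its arguments, and the in-memory list standing in for a Kafka topic are all illustrative assumptions, not a definitive implementation:

```python
import json
import uuid

topic = []  # in-memory stand-in for a Kafka topic

def publish_tool_call(tool, arguments):
    """Wrap a tool invocation in a simplified JSON-RPC 2.0 envelope
    (the wire format MCP builds on) and 'produce' it to the topic."""
    event = {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    topic.append(json.dumps(event))
    return event["id"]

# An agent publishes a request; any consumer can pick it up asynchronously.
request_id = publish_tool_call("schedule_maintenance", {"vehicle_id": "T-100"})
consumed = json.loads(topic[0])
```

Because the request sits durably on a topic rather than in an open HTTP connection, the responding agent can process it whenever it is ready, and other consumers can observe it for auditing.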

Agentic AI with Apache Kafka as Event Broker

The result is a highly flexible, loosely coupled architecture for Agentic AI—one that supports real-time processing, cross-system coordination, and long-term observability. This combination is already being explored in early enterprise projects and will be a key building block for agent-based systems moving into production.

Stream Processing as the Agent’s Companion

Stream processing technologies like Apache Flink or Kafka Streams allow agents to:

  • Filter, join, and enrich events in motion
  • Maintain stateful context for decisions (e.g., real-time credit risk)
  • Trigger new downstream actions based on complex event patterns
  • Apply AI directly within the stream processing logic, enabling real-time inference and contextual decision-making with embedded models or external calls to a model server, vector database, or any other AI platform

Agents don’t need to manage all logic themselves. The data streaming platform can pre-process information, enforce policies, and even trigger fallback or compensating workflows—making agents simpler and more focused.
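
The sketch below imitates this pattern in plain Python: it filters test traffic, enriches each event with reference data, and maintains a small piece of state — the kind of logic a Flink job or Kafka Streams topology would run continuously. Field names and reference data are hypothetical:

```python
from collections import Counter

# Static reference data used to enrich events in motion (hypothetical).
customers = {"C1": "ACME Corp", "C2": "Globex"}

state = Counter()  # stateful context: orders seen per customer so far

def process(event):
    """Filter out test traffic, enrich with the customer name, update state."""
    if event.get("test"):
        return None  # filtered out of the stream
    state[event["customer_id"]] += 1
    return {
        **event,
        "customer_name": customers.get(event["customer_id"], "unknown"),
        "order_count": state[event["customer_id"]],
    }

stream = [
    {"order_id": 1, "customer_id": "C1"},
    {"order_id": 2, "customer_id": "C1", "test": True},
    {"order_id": 3, "customer_id": "C1"},
]
out = [e for e in (process(ev) for ev in stream) if e]
```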

Technology Flexibility for Agentic AI Design with Data Contracts

One of the biggest advantages of a Kafka-based, event-driven, and decoupled backend for agentic systems is that agents can be implemented in any stack:

  • Languages: Python, Java, Go, etc.
  • Environments: Containers, serverless, JVM apps, SaaS tools
  • Communication styles: Event streaming, REST APIs, scheduled jobs

The Kafka topic is the stable data contract for quality and policy enforcement. Agents can evolve independently, be deployed incrementally, and interoperate without tight dependencies.
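A simplified sketch of contract enforcement might look as follows. In practice this role is played by Schema Registry with Avro, Protobuf, or JSON Schema enforced on produce; the dictionary-based contract below is a hypothetical stand-in for illustration only.

```python
# Hypothetical data contract for an "orders" topic; in production this would
# be a schema registered in a schema registry and enforced by the serializer.
ORDER_CONTRACT = {
    "order_id": str,
    "amount": float,
    "currency": str,
}

def validate(event: dict, contract: dict) -> bool:
    """Reject events that break the topic's data contract before they are
    published, so every downstream agent can trust the stream."""
    return all(
        field in event and isinstance(event[field], ftype)
        for field, ftype in contract.items()
    )

good = {"order_id": "o-1", "amount": 99.5, "currency": "EUR"}
bad = {"order_id": "o-2", "amount": "99.5"}   # wrong type, missing field
print(validate(good, ORDER_CONTRACT), validate(bad, ORDER_CONTRACT))
```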

Microservices, Data Products, and Reusability – Agentic AI Is Just One Piece of the Puzzle

To be effective, Agentic AI needs to connect seamlessly with existing operational systems and business workflows.

Kafka topics enable the creation of reusable data products that serve multiple consumers—AI agents, dashboards, services, or external partners. This aligns perfectly with data mesh and microservice principles, where ownership, scalability, and interoperability are key.

Agent2Agent Protocol (A2A) and MCP via Apache Kafka as Event Broker for Truly Decoupled Agentic AI

A single stream of enriched order events might be consumed via a single data product by:

  • A fraud detection agent
  • A real-time alerting system
  • An agent triggering SAP workflow updates
  • A lakehouse for reporting and batch analytics

This one-to-many model is the opposite of traditional REST designs and crucial for enabling agentic orchestration at scale.
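The one-to-many model rests on a Kafka property worth spelling out: each consumer group tracks its own offset into the same append-only log, so every consumer sees the full stream independently. A toy in-memory stand-in (not a Kafka client) sketches the idea:

```python
class Topic:
    """Minimal in-memory stand-in for a Kafka topic: an append-only log that
    every consumer group reads independently via its own offset."""

    def __init__(self):
        self.log = []
        self.offsets = {}                   # consumer group -> next offset

    def produce(self, event):
        self.log.append(event)

    def consume(self, group):
        start = self.offsets.get(group, 0)
        self.offsets[group] = len(self.log)
        return self.log[start:]             # each group sees the full stream

orders = Topic()
orders.produce({"order_id": "o-1", "amount": 120.0})
orders.produce({"order_id": "o-2", "amount": 89.0})

fraud_events = orders.consume("fraud-detection-agent")
alert_events = orders.consume("real-time-alerting")
print(len(fraud_events), len(alert_events))
```

Contrast this with a REST endpoint, where a response is delivered to exactly one caller and adding a new consumer means changing the producer.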

Agentic AI Needs Integration with Core Enterprise Systems

Agentic AI is not a standalone trend—it’s becoming an integral part of broader enterprise AI strategies. While this post focuses on architectural foundations like Kafka, MCP, and A2A, it’s important to recognize how this infrastructure complements the evolution of major AI platforms.

Leading vendors such as Databricks, Snowflake, and others are building scalable foundations for machine learning, analytics, and generative AI. These platforms often handle model training and serving. But to bring agentic capabilities into production—especially for real-time, autonomous workflows—they must connect with operational, transactional systems and other agents at runtime. (See also: Confluent + Databricks blog series | Apache Kafka + Snowflake blog series)

This is where Kafka as the event broker becomes essential: it links these analytical backends with AI agents, transactional systems, and streaming pipelines across the enterprise.

At the same time, enterprise application vendors are embedding AI assistants and agents directly into their platforms:

  • SAP Joule / Business AI – Embedded AI for finance, supply chain, and operations
  • Salesforce Einstein / Copilot Studio – Generative AI for CRM and sales automation
  • ServiceNow Now Assist – Predictive automation across IT and employee services
  • Oracle Fusion AI / OCI – ML for ERP, HCM, and procurement
  • Microsoft Copilot – Integrated AI across Dynamics and Power Platform
  • IBM watsonx, Adobe Sensei, Infor Coleman AI – Governed, domain-specific AI agents

Each of these solutions benefits from the same architectural foundation: real-time data access, decoupled integration, and standardized agent communication.

Whether deployed internally or sourced from vendors, agents need reliable event-driven infrastructure to coordinate with each other and with backend systems. Apache Kafka provides this core integration layer—supporting a consistent, scalable, and open foundation for agentic AI across the enterprise.

Agentic AI Requires Decoupling – Apache Kafka Supports A2A and MCP as an Event Broker

To deliver on the promise of agentic AI, enterprises must move beyond point-to-point APIs and batch integrations. They need a shared, event-driven foundation that enables agents (and other enterprise software) to work independently and together—with shared context, consistent data, and scalable interactions.

Apache Kafka provides exactly that. Combined with MCP and A2A for standardized Agentic AI communication, Kafka unlocks the flexibility, resilience, and openness needed for next-generation enterprise AI.

It’s not about picking one agent platform—it’s about giving every agent the same, reliable interface to the rest of the world. Kafka is that interface.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including various AI examples across industries.

The post Agentic AI with the Agent2Agent Protocol (A2A) and MCP using Apache Kafka as Event Broker appeared first on Kai Waehner.

Databricks and Confluent Leading Data and AI Architectures – What About Snowflake, BigQuery, and Friends? https://www.kai-waehner.de/blog/2025/05/15/databricks-and-confluent-leading-data-and-ai-architectures-what-about-snowflake-bigquery-and-friends/ Thu, 15 May 2025 09:57:25 +0000 https://www.kai-waehner.de/?p=7829 Confluent, Databricks, and Snowflake are trusted by thousands of enterprises to power critical workloads—each with a distinct focus: real-time streaming, large-scale analytics, and governed data sharing. Many customers use them in combination to build flexible, intelligent data architectures. This blog highlights how Erste Bank uses Confluent and Databricks to enable generative AI in customer service, while Siemens combines Confluent and Snowflake to optimize manufacturing and healthcare with a shift-left approach. Together, these examples show how a streaming-first foundation drives speed, scalability, and innovation across industries.

The post Databricks and Confluent Leading Data and AI Architectures – What About Snowflake, BigQuery, and Friends? appeared first on Kai Waehner.

The modern data landscape is shaped by platforms that excel in different but increasingly overlapping domains. Confluent leads in data streaming with enterprise-grade infrastructure for real-time data movement and processing. Databricks and Snowflake dominate the lakehouse and analytics space—each with unique strengths. Databricks is known for scalable AI and machine learning pipelines, while Snowflake stands out for its simplicity, governed data sharing, and performance in cloud-native analytics.

This final blog in the series brings together everything covered so far and highlights how these technologies power real customer innovation. At Erste Bank, Confluent and Databricks are combined to build an event-driven architecture for Generative AI use cases in customer service. At Siemens, Confluent and Snowflake support a shift-left architecture to drive real-time manufacturing insights and medical AI—using streaming data not just for analytics, but also to trigger operational workflows across systems.

Together, these examples show why so many enterprises adopt a multi-platform strategy—with Confluent as the event-driven backbone, and Databricks or Snowflake (or both) as the downstream platforms for analytics, governance, and AI.

Data Streaming Lake Warehouse and Lakehouse with Confluent Databricks Snowflake using Iceberg and Tableflow Delta Lake

About the Confluent and Databricks Blog Series

This article is part of a blog series exploring the growing roles of Confluent and Databricks in modern data and AI architectures:

Learn how these platforms will affect data use in businesses in future articles. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to analytical platforms like Databricks and Snowflake.

The Broader Data Streaming and Lakehouse Landscape

The data streaming and lakehouse space continues to expand, with a variety of platforms offering different capabilities for real-time processing, analytics, and storage.

Data Streaming Market

On the data streaming side, Confluent is the leader. Other cloud-native services like Amazon MSK, Azure Event Hubs, and Google Cloud Managed Kafka provide Kafka-compatible offerings, though they vary in protocol support, ecosystem maturity, and operational simplicity. StreamNative, based on Apache Pulsar, competes with the Kafka offerings, while Decodable and DeltaStream take a complementary approach, leveraging Apache Flink for real-time stream processing. Startups such as AutoMQ and BufStream pitch reimagined Kafka infrastructure for improved scalability and cost-efficiency in cloud-native architectures.

The data streaming landscape is growing year by year. Here is the latest overview of the data streaming market:

The Data Streaming Landscape 2025 with Kafka Flink Confluent Amazon MSK Cloudera Event Hubs and Other Platforms

Lakehouse Market

In the lakehouse and analytics platform category, Databricks leads with its cloud-native model combining compute and storage, enabling modern lakehouse architectures. Snowflake is another leading cloud data platform, praised for its ease of use, strong ecosystem, and ability to unify diverse analytical workloads. Microsoft Fabric aims to unify data engineering, real-time analytics, and AI on Azure under one platform. Google BigQuery offers a serverless, scalable solution for large-scale analytics, while platforms like Amazon Redshift, ClickHouse, and Athena serve both traditional and high-performance OLAP use cases.

The Forrester Wave for Lakehouses analyzes the vendor options, positioning Databricks, Snowflake, and Google as the leaders. Unfortunately, reposting the Forrester Wave is not permitted, so you need to download it from one of the vendors.

Confluent + Databricks

This blog series highlights Databricks and Confluent because they represent a powerful combination at the intersection of data streaming and the lakehouse paradigm. Together, they enable real-time, AI-driven architectures that unify operational and analytical workloads across modern enterprise environments.

Each platform in the data streaming and Lakehouse space has distinct advantages, but none offer the same combination of real-time capabilities, open architecture, and end-to-end integration as Confluent and Databricks.

It’s also worth noting that open source remains a big – if not the biggest – competitor to all of these vendors. Many enterprises still rely on open-source data lakes built on Elastic, legacy Hadoop, or open table formats such as Apache Hudi—favoring flexibility and cost control over fully managed services.

Confluent: The Leading Data Streaming Platform (DSP)

Confluent is the enterprise-standard platform for data streaming, built on Apache Kafka and extended for cloud-native, real-time operations at global scale. The data streaming platform (DSP) offers multiple deployment options to meet diverse needs and budgets:

  • Confluent Cloud – Fully managed, serverless Kafka and Flink service across AWS, Azure, and Google Cloud
  • Confluent Platform – Self-managed software for on-premises, private cloud, or hybrid environments
  • WarpStream – Kafka-compatible, cloud-native infrastructure optimized for BYOC (Bring Your Own Cloud) using low-cost object storage like S3

Together, these options offer cost efficiency and flexibility across a wide range of streaming workloads:

  • Small-volume, mission-critical use cases such as payments or fraud detection, where zero data loss, strict SLAs, and low latency are non-negotiable
  • High-volume, analytics-driven use cases like clickstream processing for real-time personalization and recommendation engines, where throughput and scalability are key

Confluent supports these use cases with:

  • Cluster Linking for real-time, multi-region and hybrid cloud data movement
  • 100+ prebuilt connectors for seamless integration with enterprise systems and cloud services
  • Apache Flink for rich stream processing at scale
  • Governance and observability with Schema Registry, Stream Catalog, role-based access control, and SLAs
  • Tableflow for native integration with Delta Lake, Apache Iceberg, and modern lakehouse architectures

While other providers offer fragments—such as Amazon MSK for basic Kafka infrastructure or Azure Event Hubs for ingestion—only Confluent delivers a unified, cloud-native data streaming platform with consistent operations, tooling, and security across environments.

Confluent is trusted by over 6,000 enterprises and backed by deep experience in large-scale streaming deployments, hybrid architectures, and Kafka migrations. It combines industry-leading technology with enterprise-grade support, expertise, and consulting services to help organizations turn real-time data into real business outcomes—securely, efficiently, and at any scale.

Databricks: The Leading Lakehouse for AI and Analytics

Databricks is the leading platform for unified analytics, data engineering, and AI—purpose-built to help enterprises turn massive volumes of data into intelligent, real-time decision-making. Positioned as the Data Intelligence Platform, Databricks combines a powerful lakehouse foundation with full-spectrum AI capabilities, making it the platform of choice for modern data teams.

Its core strengths include:

  • Delta Lake + Unity Catalog – A robust foundation for structured, governed, and versioned data at scale
  • Apache Spark – Distributed compute engine for ETL, data preparation, and batch/stream processing
  • MosaicML – End-to-end tooling for efficient model training, fine-tuning, and deployment of custom AI models
  • AI/ML tools for data scientists, ML engineers, and analysts—integrated across the platform
  • Native connectors to BI tools (like Power BI, Tableau) and MLOps platforms for model lifecycle management

Databricks directly competes with Snowflake, especially in the enterprise AI and analytics space. While Snowflake shines with simplicity and governed warehousing, Databricks differentiates by offering a more flexible and performant platform for large-scale model training and advanced AI pipelines.

The platform supports:

  • Batch and (sort of) streaming analytics
  • ML model training and inference on shared data
  • GenAI use cases, including RAG (Retrieval-Augmented Generation) with unstructured and structured sources
  • Data sharing and collaboration across teams and organizations with open formats and native interoperability

Databricks is trusted by thousands of organizations for AI workloads, offering not only powerful infrastructure but also integrated governance, observability, and scalability—whether deployed on a single cloud or across multi-cloud environments.

Combined with Confluent’s real-time data streaming capabilities, Databricks completes the AI-driven enterprise architecture by enabling organizations to analyze, model, and act on high-quality, real-time data at scale.

Stronger Together: A Strategic Alliance for Data and AI with Tableflow and Delta Lake

Confluent and Databricks are not trying to replace each other. Their partnership is strategic and product-driven.

Recent innovation: Tableflow + Delta Lake – this feature enables bi-directional data exchange between Kafka and Delta Lake.

  • Direction 1: Confluent streams → Tableflow → Delta Lake (via Unity Catalog)
  • Direction 2: Databricks insights → Tableflow → Kafka → Flink or other operational systems

This simplifies architecture, reduces cost and latency, and removes the need for Spark jobs to manage streaming data.

Confluent Tableflow for Open Table Format Integration with Databricks Snowflake BigQuery via Apache Iceberg Delta Lake
Source: Confluent

Confluent becomes the operational data backbone for AI and analytics. Databricks becomes the analytics and AI engine fed with data from Confluent.

Where needed, operational or analytical real-time AI predictions can be done within Confluent’s data streaming platform: with embedded or remote model inference, native integration for search with vector databases, and built-in models for common predictive use cases such as forecasting.
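To illustrate embedded inference in a stream, here is a deliberately naive sketch: the "model" is a one-line scoring function standing in for an embedded model (e.g. ONNX) or a remote call to a model server; the threshold and field names are assumptions for the example.

```python
def score(event):
    """Toy embedded model: in production this could be a model loaded into
    the stream processor, or a remote call to a model-serving endpoint."""
    return min(1.0, event["amount"] / 10_000.0)   # naive risk score

def inference_stream(events, threshold=0.8):
    """Apply the model to each event in motion and attach a decision."""
    for event in events:
        risk = score(event)
        action = "hold_for_review" if risk >= threshold else "approve"
        yield {**event, "risk": risk, "action": action}

results = list(inference_stream([
    {"order_id": "o-1", "amount": 9_500.0},
    {"order_id": "o-2", "amount": 120.0},
]))
print([r["action"] for r in results])
```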

Erste Bank: Building a Foundation for GenAI with Confluent and Databricks

Erste Group Bank AG, one of the largest financial services providers in Central and Eastern Europe, is leveraging Confluent and Databricks to transform its customer service operations with Generative AI. Recognizing that successful GenAI initiatives require more than just advanced models, Erste Bank first focused on establishing a solid foundation of real-time, consistent, and high-quality data, leveraging data streaming and an event-driven architecture.

Using Confluent, Erste Bank connects real-time streams, batch workloads, and request-response APIs across its legacy and cloud-native systems in a decoupled way, while ensuring data consistency through Kafka. This architecture ensures that operational and analytical data — whether from core banking platforms, digital channels, mobile apps, or CRM systems — flows reliably and consistently across the enterprise. By integrating event streams, historical data, and API calls into a unified data pipeline, Confluent enables Erste Bank to create a live, trusted digital representation of customer interactions.

With this real-time foundation in place, Erste Bank leverages Databricks as its AI and analytics platform to build and scale GenAI applications. At the Data in Motion Tour 2024 in Frankfurt, Erste Bank presented a pilot project where customer service chatbots consume contextual data flowing through Confluent into Databricks, enabling intelligent, personalized responses. Once a customer request is processed, the chatbot triggers a transaction back through Kafka into the Salesforce CRM, ensuring seamless, automated follow-up actions.

GenAI Chatbot with Confluent and Databricks AI in FinServ at Erste Bank
Source: Erste Group Bank AG

By combining Confluent’s real-time data streaming capabilities with Databricks’ powerful AI infrastructure, Erste Bank is able to:

  • Deliver highly contextual, real-time customer service interactions
  • Automate CRM workflows through real-time event-driven architectures
  • Build a scalable, resilient platform for future AI-driven applications

This architecture positions Erste Bank to continue expanding GenAI use cases across financial services, from customer engagement to operational efficiency, powered by consistent, trusted, and real-time data.

Confluent: The Neutral Streaming Backbone for Any Data Stack

Confluent is not tied to a monolithic compute engine within a cloud provider. This neutrality is a strength:

  • Bridges operational systems (mainframes, SAP) with modern data platforms (AI, lakehouses, etc.)
  • An event-driven architecture built with a data streaming platform feeds multiple lakehouses at once
  • Works across all major cloud providers, including AWS, Azure, and GCP
  • Operates at the edge, on-prem, in the cloud and in hybrid scenarios
  • One size doesn’t fit all – follow best practices from microservices architectures and data mesh to tailor your architecture with purpose-built solutions.

This flexibility makes Confluent the best platform for data distribution—enabling decoupled teams to use the tools and platforms best suited to their needs.

Confluent’s Tableflow also supports Apache Iceberg to enable seamless integration from Kafka into lakehouses beyond Delta Lake and Databricks—such as Snowflake, BigQuery, Amazon Athena, and many other data platforms and analytics engines.

Example: A global enterprise uses Confluent as its central nervous system for data streaming. Customer interaction events flow in real time from web and mobile apps into Confluent. These events are then:

  • Streamed into Databricks once for multiple GenAI and analytics use cases.
  • Written to an operational PostgreSQL database to update order status and customer profiles
  • Pushed into a customer-facing analytics engine like StarTree (powered by Apache Pinot) for live dashboards and real-time customer behavior analytics
  • Shared with Snowflake through a lift-and-shift M&A use case to unify analytics from an acquired company

This setup shows the power of Confluent’s neutrality and flexibility: enabling real-time, multi-directional data sharing across heterogeneous platforms, without coupling compute and storage.

Snowflake: A Cloud-Native Companion to Confluent – Natively Integrated with Apache Iceberg and Polaris Catalog

Snowflake pairs naturally with Confluent to power modern data architectures. As a cloud-native SaaS from the start, Snowflake has earned broad adoption across industries thanks to its scalability, simplicity, and fully managed experience.

Together, Confluent and Snowflake unlock high-impact use cases:

  • Near real-time ingestion and enrichment: Stream operational data into Snowflake for immediate analytics and action.
  • Unified operational and analytical workloads: Combine Confluent’s Tableflow with Snowflake’s Apache Iceberg support through its open source Polaris catalog to bridge operational and analytical data layers.
  • Shift-left data quality: Improve reliability and reduce costs by validating and shaping data upstream, before it hits storage.

With Confluent as the streaming backbone and Snowflake as the analytics engine, enterprises get a cloud-native stack that’s fast, flexible, and built to scale. Many enterprises use Confluent as the data ingestion platform for Databricks, Snowflake, and other analytical and operational downstream applications.

Shift Left at Siemens: Real-Time Innovation with Confluent and Snowflake

Siemens is a global technology leader operating across industry, infrastructure, mobility, and healthcare. Its portfolio includes industrial automation, digital twins, smart building systems, and advanced medical technologies—delivered through units like Siemens Digital Industries and Siemens Healthineers.

To accelerate innovation and reduce operational costs, Siemens is embracing a shift-left architecture to enrich data early in the pipeline before it reaches Snowflake. This enables reusable, real-time data products in the data streaming platform, leveraging an event-driven architecture to share data with analytical and operational systems beyond Snowflake.

Siemens Digital Industries applies this model to optimize manufacturing and intralogistics, using streaming ETL to feed real-time dashboards and trigger actions like automated inventory replenishment—while continuing to use Snowflake for historical analysis, reporting, and long-term data modeling.

Siemens Shift Left Architecture and Data Products with Data Streaming using Apache Kafka and Flink
Source: Siemens Digital Industries

Siemens Healthineers embeds AI directly in the stream processor to detect anomalies in medical equipment telemetry, improving response time and avoiding costly equipment failures—while leveraging Snowflake to power centralized analytics, compliance reporting, and cross-device trend analysis.

Machine Monitoring and Streaming Analytics with MQTT Confluent Kafka and TensorFlow AI ML at Siemens Healthineers
Source: Siemens Healthineers

These success stories are part of The Data Streaming Use Case Show, my new industry webinar series. Learn more about Siemens’ usage of Confluent and Snowflake and watch the video recording about “shift left”.

Open Outlook: Agentic AI with the Model Context Protocol (MCP) and Agent2Agent Protocol (A2A)

While data and AI platforms like Databricks and Snowflake play a key role, some Agentic AI projects will likely rely on emerging, independent SaaS platforms and specialized tools. Flexibility and open standards are key for future success.

What better way to close a blog series on Confluent and Databricks (and Snowflake) than by looking ahead to one of the most exciting frontiers in enterprise AI: Agentic AI.

As enterprise AI matures, there is growing interest in bi-directional interfaces between operational systems and AI agents. Google’s Agent2Agent (A2A) protocol reinforces this shift, highlighting how intelligent agents can autonomously communicate, coordinate, and act across distributed systems.

Agent2Agent Protocol (A2A) and MCP via Apache Kafka as Event Broker for Truly Decoupled Agentic AI

Confluent + Databricks is an ideal combination to support these emerging Agentic AI patterns, where event-driven agents continuously learn from and act upon streaming data. Models can be embedded directly in Flink for low-latency applications or hosted and orchestrated in Databricks for more complex inference workflows.

The Model Context Protocol (MCP) is gaining traction as a design blueprint for standardized interaction between services, models, and event streams. In this context, Confluent and Databricks are well positioned to lead:

  • Confluent: Event-driven delivery of context, inputs, and actions
  • Databricks: Model hosting, training, inference, and orchestration
  • Jointly: Closed feedback loops between business systems and AI agents

Together with protocols like A2A and MCP, this architecture will shape the next generation of intelligent, real-time enterprise applications.

Confluent + Databricks: The Future-Proof Data Stack for AI and Analytics

Databricks and Confluent are not just partners. They are leaders in their respective domains. Together, they enable real-time, intelligent data architectures that support operational excellence and AI innovation.

Other AI and data platforms are part of the landscape, and many bring valuable capabilities. As explored in this blog series, the true decoupling of an event-driven architecture with Apache Kafka allows combining any vendors and cloud services. I see many enterprises using Databricks and Snowflake integrated with Confluent. However, the alignment between Confluent and Databricks stands out due to its combination of strengths:

  • Confluent’s category leadership in data streaming, powering thousands of production deployments across industries
  • Databricks’ strong position in the lakehouse and AI space, with broad enterprise adoption for analytics and machine learning
  • Shared product vision and growing engineering and go-to-market alignment across technical and field organizations

For enterprises shaping a long-term data and AI strategy, this combination offers a proven, scalable foundation—bridging real-time operations with analytical depth, without forcing trade-offs between speed, flexibility, or future-readiness.

Stay tuned for deep dives into how these platforms are shaping the future of data-driven enterprises. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to analytical platforms like Databricks and Snowflake.

The post Databricks and Confluent Leading Data and AI Architectures – What About Snowflake, BigQuery, and Friends? appeared first on Kai Waehner.

Databricks and Confluent in the World of Enterprise Software (with SAP as Example) https://www.kai-waehner.de/blog/2025/05/12/databricks-and-confluent-in-the-world-of-enterprise-software-with-sap-as-example/ Mon, 12 May 2025 11:26:54 +0000 https://www.kai-waehner.de/?p=7824 Enterprise data lives in complex ecosystems—SAP, Oracle, Salesforce, ServiceNow, IBM Mainframes, and more. This article explores how Confluent and Databricks integrate with SAP to bridge operational and analytical workloads in real time. It outlines architectural patterns, trade-offs, and use cases like supply chain optimization, predictive maintenance, and financial reporting, showing how modern data streaming unlocks agility, reuse, and AI-readiness across even the most SAP-centric environments.

The post Databricks and Confluent in the World of Enterprise Software (with SAP as Example) appeared first on Kai Waehner.

Modern enterprises rely heavily on operational systems like SAP ERP, Oracle, Salesforce, ServiceNow and mainframes to power critical business processes. But unlocking real-time insights and enabling AI at scale requires bridging these systems with modern analytics platforms like Databricks. This blog explores how Confluent’s data streaming platform enables seamless integration between SAP, Databricks, and other systems to support real-time decision-making, AI-driven automation, and agentic AI use cases. It explores how Confluent delivers the real-time backbone needed to build event-driven, future-proof enterprise architectures—supporting everything from inventory optimization and supply chain intelligence to embedded copilots and autonomous agents.

Enterprise Application Integration with Confluent and Databricks for Oracle SAP Salesforce Servicenow et al

About the Confluent and Databricks Blog Series

This article is part of a blog series exploring the growing roles of Confluent and Databricks in modern data and AI architectures:

Learn how these platforms will affect data use in businesses in future articles. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to other operational and analytical platforms like SAP and Databricks.

Most Enterprise Data Is Operational

Enterprise software systems generate a constant stream of operational data across a wide range of domains. This includes orders and inventory from SAP ERP systems, often extended with real-time production data from SAP MES. Oracle databases capture transactional data critical to core business operations, while MongoDB contributes operational data—frequently used as a CDC source or, in some cases, as a sink for analytical queries. Customer interactions are tracked in platforms like Salesforce CRM, and financial or account-related events often originate from IBM mainframes. 

Together, these systems form the backbone of enterprise data, requiring seamless integration for real-time intelligence and business agility. This data is often not immediately available for analytics or AI unless it’s integrated into downstream systems.

Confluent is built to ingest and process this kind of operational data in real time. Databricks can then consume it for AI and machine learning, dashboards, or reports. Together, SAP, Confluent and Databricks create a real-time architecture for enterprise decision-making.

SAP Product Landscape for Operational and Analytical Workloads

SAP plays a foundational role in the enterprise data landscape—not just as a source of business data, but as the system of record for core operational processes across finance, supply chain, HR, and manufacturing.

At a high level, the SAP product portfolio has three categories these days: SAP Business AI, SAP Business Data Cloud (BDC), and SAP Business Applications powered by SAP Business Technology Platform (BTP).

SAP Product Portfolio Categories
Source: SAP

To support both operational and analytical needs, SAP offers a portfolio of platforms and tools, while also partnering with best-in-class technologies like Databricks and Confluent.

Operational Workloads (Transactional Systems):

  • SAP S/4HANA – Modern ERP for core business operations
  • SAP ECC – Legacy ERP platform still widely deployed
  • SAP CRM / SCM / SRM – Domain-specific business systems
  • SAP Business One / Business ByDesign – ERP solutions for mid-market and subsidiaries

Analytical Workloads (Data & Analytics Platforms):

  • SAP Datasphere – Unified data fabric to integrate, catalog, and govern SAP and non-SAP data
  • SAP Analytics Cloud (SAC) – Visualization, reporting, and predictive analytics
  • SAP BW/4HANA – Data warehousing and modeling for SAP-centric analytics

SAP Business Data Cloud (BDC)

SAP Business Data Cloud (BDC) is a strategic initiative within SAP Business Technology Platform (BTP) that brings together SAP’s data and analytics capabilities into a unified cloud-native experience. It includes:

  • SAP Datasphere as the data fabric layer, enabling seamless integration of SAP and third-party data
  • SAP Analytics Cloud (SAC) for consuming governed data via dashboards and reports
  • SAP’s partnership with Databricks to allow SAP data to be analyzed alongside non-SAP sources in a lakehouse architecture
  • Real-time integration scenarios enabled through Confluent and Apache Kafka, bringing operational data in motion directly into SAP and Databricks environments

Together, this ecosystem supports real-time, AI-powered, and governed analytics across operational and analytical workloads—making SAP data more accessible, trustworthy, and actionable within modern cloud data architectures.

SAP Databricks OEM: Limited Scope, Full Control by SAP

SAP recently announced an OEM partnership with Databricks, embedding parts of Databricks’ serverless infrastructure into the SAP ecosystem. While this move enables tighter integration and simplified access to AI workloads within SAP, it comes with significant trade-offs. The OEM model is narrowly scoped, optimized primarily for ML and GenAI scenarios on SAP data, and lacks the openness and flexibility of native Databricks.

This integration is not intended for full-scale data engineering. Core capabilities such as workflows, streaming, Delta Live Tables, and external data connections (e.g., Snowflake, S3, MS SQL) are missing. The architecture is based on data at rest and does not embrace event-driven patterns. Compute options are limited to serverless only, with no infrastructure control. Pricing is complex and opaque, with customers often needing to license Databricks separately to unlock full capabilities.

Critically, SAP controls the entire data integration layer through its BDC Data Products, reinforcing a vendor lock-in model. While this may benefit SAP-centric organizations focused on embedded AI, it restricts broader interoperability and long-term architectural flexibility. In contrast, native Databricks, i.e., outside of SAP, offers a fully open, scalable platform with rich data engineering features across diverse environments.

Whichever Databricks option you prefer, this is where Confluent adds value—offering a truly event-driven, decoupled architecture that complements both SAP Datasphere and Databricks, whether used within or outside the SAP OEM framework.

Confluent and SAP Integration

Confluent provides native and third-party connectors to integrate with SAP systems to enable continuous, low-latency data flow across business applications.

SAP ERP Confluent Data Streaming Integration Access Patterns
Source: Confluent

This powers modern, event-driven use cases that go beyond traditional batch-based integrations:

  • Low-latency access to SAP transactional data
  • Integration with other operational source systems like Salesforce, Oracle, IBM Mainframe, MongoDB, or IoT platforms
  • Synchronization between SAP Datasphere and other data warehouse and analytics platforms such as Snowflake, Google BigQuery, or Databricks
  • Decoupling of applications for modular architecture
  • Data consistency across real-time, batch and request-response APIs
  • Hybrid integration across any edge, on-premise or multi-cloud environments
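The decoupling at the heart of these patterns can be illustrated with a tiny Python sketch. It uses an in-memory append-only log as a stand-in for a Kafka topic; the event shapes, topic name, and consumer groups are illustrative assumptions, not a real SAP or Confluent API:

```python
from collections import defaultdict

# Minimal in-memory stand-in for a Kafka topic: an append-only log that
# several independent consumer groups read at their own pace via offsets.
class Topic:
    def __init__(self):
        self.log = []                     # append-only event log
        self.offsets = defaultdict(int)   # consumer group -> next offset

    def produce(self, event):
        self.log.append(event)

    def consume(self, group):
        # Deliver every event the group has not yet seen.
        start = self.offsets[group]
        events = self.log[start:]
        self.offsets[group] = len(self.log)
        return events

orders = Topic()

# An SAP-side producer emits order events once; it knows nothing about consumers.
orders.produce({"order_id": "4711", "material": "M-100", "qty": 5})
orders.produce({"order_id": "4712", "material": "M-200", "qty": 2})

# Independent consumers read the same stream without coordinating with each other.
analytics_view = orders.consume("analytics")       # e.g., feeds a lakehouse
sync_view = orders.consume("datasphere-sync")      # e.g., syncs another platform
```

Both consumers see both events, and new consumers can be added later and replay the full log, which is exactly what makes the decoupled, hybrid integration patterns above possible.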

SAP Datasphere and Confluent

To expand its role in the modern data stack, SAP introduced SAP Datasphere—a cloud-native data management solution designed to extend SAP’s reach into analytics and data integration. Datasphere aims to simplify access to SAP and non-SAP data across hybrid environments.

SAP Datasphere simplifies data access within the SAP ecosystem, but it has key drawbacks when compared to open platforms like Databricks, Snowflake, or Google BigQuery:

  • Closed Ecosystem: Optimized for SAP, but lacks flexibility for non-SAP integrations.
  • No Event Streaming: Focused on data at rest, with limited support for real-time processing or streaming architectures.
  • No Native Stream Processing: Relies on batch methods, adding latency and complexity for hybrid or real-time use cases.

Confluent alleviates these drawbacks and supports this strategy through bi-directional integration with SAP Datasphere. This enables real-time streaming of SAP data into Datasphere and back out to operational or analytical consumers via Apache Kafka. It allows organizations to enrich SAP data, apply real-time processing, and ensure it reaches the right systems in the right format—without waiting for overnight batch jobs or rigid ETL pipelines.

Confluent for Agentic AI with SAP Joule and Databricks

SAP is laying the foundation for agentic AI architectures with a vision centered around Joule—its generative AI copilot—and a tightly integrated data stack that includes SAP Databricks (via OEM), SAP Business Data Cloud (BDC), and a unified knowledge graph. On top of this foundation, SAP is building specialized AI agents for use cases such as customer 360, creditworthiness analysis, supply chain intelligence, and more.

SAP ERP with Business Technology Platform BTP and Joule for Agentic AI in the Cloud
Source: SAP

The architecture combines:

  • SAP Joule as the interface layer for generative insights and decision support
  • SAP’s foundational models and domain-specific knowledge graph
  • SAP BDC and SAP Databricks as the data and ML/AI backbone
  • Data from both SAP systems (ERP, CRM, HR, logistics) and non-SAP systems (e.g. clickstream, IoT, partner data, social media) from its partnership with Confluent

But here’s the catch: What happens when agents need to communicate with one another to deliver a workflow? Such agentic systems require continuous, contextual, and event-driven data exchange—not just point-to-point API calls and nightly batch jobs.

This is where Confluent’s data streaming platform comes in as critical infrastructure.

Agentic AI with Apache Kafka as Event Broker

Confluent provides the real-time data streaming platform that connects the operational world of SAP with the analytical and AI-driven world of Databricks, enabling the continuous movement, enrichment, and sharing of data across all layers of the stack.

Agentic AI with Confluent as Event Broker for Databricks SAP and Oracle

The above is a conceptual view of the architecture. The AI agents on the left side could be built with SAP Joule, Databricks, or any “outside” GenAI framework.

The data streaming platform helps connect the AI agents with the rest of the enterprise architecture, both within SAP and Databricks and beyond:

  • Real-time data integration from non-SAP systems (e.g., mobile apps, IoT devices, mainframes, web logs) into SAP and Databricks
  • True decoupling of services and agents via an event-driven architecture (EDA), replacing brittle RPC or point-to-point API calls
  • Event replay and auditability—critical for traceable AI systems operating in regulated environments
  • Streaming pipelines for feature engineering and inference: stream-based model triggering with low-latency SLAs
  • Support for bi-directional flows: e.g., operational triggers in SAP can be enriched by AI agents running in Databricks and pushed back into SAP via Kafka events
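To make the contrast with brittle point-to-point calls concrete, here is a minimal Python sketch of two agents communicating through event logs rather than direct RPC. The `EventLog` class is an illustrative in-memory stand-in for a Kafka topic, and the event types and fields are assumptions for the example:

```python
class EventLog:
    """Append-only event log standing in for a Kafka topic (illustrative only)."""
    def __init__(self):
        self.events = []

    def publish(self, event):
        self.events.append(event)

    def replay(self):
        # The full history stays available for audit or reprocessing.
        return list(self.events)

commands = EventLog()
results = EventLog()

# Agent 1: reacts to an SAP-side trigger and requests an enrichment.
commands.publish({"type": "CheckCredit", "customer": "C-42"})

# Agent 2: consumes commands, does its work, and emits a result event.
# It never calls Agent 1 directly, so either side can be replaced or scaled.
for cmd in commands.replay():
    if cmd["type"] == "CheckCredit":
        results.publish({"type": "CreditChecked",
                         "customer": cmd["customer"],
                         "score": 0.87})

# Every exchanged event remains replayable for traceability.
audit_trail = commands.replay() + results.replay()
```

The same pattern extends to bi-directional flows: the result event could just as well be consumed back by an SAP integration to trigger a follow-up transaction.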

Without Confluent, SAP’s agentic architecture risks becoming a patchwork of stateless services bound by fragile REST endpoints—lacking the real-time responsiveness, observability, and scalability required to truly support next-generation AI orchestration.

Confluent turns the SAP + Databricks vision into a living, breathing ecosystem—where context flows continuously, agents act autonomously, and enterprises can build future-proof AI systems that scale.

Data Streaming Use Cases Across SAP Product Suites

With Confluent, organizations can support a wide range of use cases across SAP product suites, including:

  1. Real-Time Inventory Visibility: Live updates of stock levels across warehouses and stores by streaming material movements from SAP ERP and SAP EWM, enabling faster order fulfillment and reduced stockouts.
  2. Dynamic Pricing and Promotions: Stream sales orders and product availability in real time to trigger pricing adjustments or dynamic discounting via integration with SAP ERP and external commerce platforms.
  3. AI-Powered Supply Chain Optimization: Combine data from SAP ERP, SAP Ariba, and external logistics platforms to power ML models that predict delays, optimize routes, and automate replenishment.
  4. Shop Floor Event Processing: Stream sensor and machine data alongside order data from SAP MES, enabling real-time production monitoring, alerting, and throughput optimization.
  5. Employee Lifecycle Automation: Stream employee events (e.g., onboarding, role changes) from SAP SuccessFactors to downstream IT systems (e.g., Active Directory, badge systems), improving HR operations and compliance.
  6. Order-to-Cash Acceleration: Connect order intake (via web portals or Salesforce) to SAP ERP in real time, enabling faster order validation, invoicing, and cash flow.
  7. Procure-to-Pay Automation: Integrate procurement events from SAP Ariba and supplier portals with ERP and financial systems to streamline approvals and monitor supplier performance continuously.
  8. Customer 360 and CRM Synchronization: Synchronize customer master data and transactions between SAP ERP, SAP CX, and third-party CRMs like Salesforce to enable unified customer views.
  9. Real-Time Financial Reporting: Stream financial transactions from SAP S/4HANA into cloud-based lakehouses or BI tools for near-instant reporting and compliance dashboards.
  10. Cross-System Data Consistency: Ensure consistent master data and business events across SAP and non-SAP environments by treating SAP as a real-time event source—not just a system of record.
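As a concrete illustration of the first use case, the sketch below computes a live stock view from a stream of material-movement events. The event shape is a simplified assumption; real SAP ERP/EWM movement documents carry many more fields:

```python
from collections import defaultdict

# Live stock view derived from a stream of material-movement events.
stock = defaultdict(int)

def apply_movement(event):
    # Positive qty = goods receipt, negative qty = goods issue.
    stock[(event["warehouse"], event["material"])] += event["qty"]

movements = [
    {"warehouse": "WH1", "material": "M-100", "qty": 50},
    {"warehouse": "WH1", "material": "M-100", "qty": -8},
    {"warehouse": "WH2", "material": "M-100", "qty": 20},
]
for event in movements:
    apply_movement(event)   # in production, this would be a Kafka consumer loop

print(stock[("WH1", "M-100")])  # 42
```

Because the view is updated per event, stock levels are current the moment a movement is posted, instead of after the next batch run.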

Example Use Case and Architecture with SAP, Databricks and Confluent

Consider a manufacturing company using SAP ERP for inventory management and Databricks for predictive maintenance. The combination of SAP Datasphere and Confluent enables seamless data integration from SAP systems, while the addition of Databricks supports advanced AI/ML applications—turning operational data into real-time, predictive insights.

With Confluent as the real-time backbone:

  • Machine telemetry (via MQTT or OPC-UA) and ERP events (e.g., stock levels, work orders) are streamed in real time.
  • Apache Flink enriches and filters the event streams—adding context like equipment metadata or location.
  • Tableflow publishes clean, structured data to Databricks as Delta tables for analytics and ML processing.
  • A predictive model hosted in Databricks detects potential equipment failure before it happens; a Flink application calls the remote model with low latency.
  • The resulting prediction is streamed back to Kafka, triggering an automated work order in SAP via event integration.
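The steps above can be condensed into a small Python sketch. The enrichment join, the model call, and the resulting work order are all stand-ins: `predict_failure` replaces the remote Databricks model with a simple threshold, and the field names are assumptions for illustration:

```python
# Illustrative end-to-end flow: telemetry -> enrichment -> prediction -> SAP work order.
EQUIPMENT_METADATA = {"press-7": {"line": "L2", "site": "Plant-A"}}

def enrich(telemetry):
    # Flink-style enrichment: join the event with equipment master data.
    return {**telemetry, **EQUIPMENT_METADATA.get(telemetry["machine"], {})}

def predict_failure(event):
    # Stand-in for the remote Databricks model call: a simple threshold.
    return event["vibration_mm_s"] > 7.0

work_orders = []

def handle(telemetry):
    event = enrich(telemetry)
    if predict_failure(event):
        # The prediction is streamed back and triggers a work order in SAP.
        work_orders.append({"machine": event["machine"],
                            "site": event["site"],
                            "action": "inspect-bearing"})

handle({"machine": "press-7", "vibration_mm_s": 9.3})  # triggers a work order
handle({"machine": "press-7", "vibration_mm_s": 2.1})  # normal reading, no action
```

In the real architecture, each function boundary is an event stream, so every stage can scale and fail independently.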

Enterprise Architecture with Confluent and SAP and Databricks for Analytics and AI

This bi-directional, event-driven pattern illustrates how Confluent enables seamless, real-time collaboration across SAP, Databricks, and IoT systems—supporting both operational and analytical use cases with a shared architecture.

Going Beyond SAP with Data Streaming

This pattern applies to other enterprise systems:

  • Salesforce: Stream customer interactions for real-time personalization through Salesforce Data Cloud
  • Oracle: Capture transactions via CDC (Change Data Capture)
  • ServiceNow: Monitor incidents and automate operational responses
  • Mainframe: Offload events from legacy applications without rewriting code
  • MongoDB: Sync operational data in real time to support responsive apps
  • Snowflake: Stream enriched operational data into Snowflake for near real-time analytics, dashboards, and data sharing across teams and partners
  • OpenAI (or other GenAI platforms): Feed real-time context into LLMs for AI-assisted recommendations or automation
  • “You name it”: Confluent’s prebuilt connectors and open APIs enable event-driven integration with virtually any enterprise system

Confluent provides the backbone for streaming data across all of these platforms—securely, reliably, and in real time.

Strategic Value for the Enterprise of Event-based Real-Time Integration with Data Streaming

Enterprise software platforms are essential. But they are often closed, slow to change, and not designed for analytics or AI.

Confluent provides real-time access to operational data from platforms like SAP. SAP Datasphere and Databricks enable analytics and AI on that data. Together, they support modern, event-driven architectures.

  • Use Confluent for real-time data streaming from SAP and other core systems
  • Use SAP Datasphere and Databricks to build analytics, reports, and AI on that data
  • Use Tableflow to connect the two platforms seamlessly

This modern approach to data integration delivers tangible business value, especially in complex enterprise environments. It enables real-time decision-making by allowing business logic to operate on live data instead of outdated reports. Data products become reusable assets, as a single stream can serve multiple teams and tools simultaneously. By reducing the need for batch layers and redundant processing, the total cost of ownership (TCO) is significantly lowered. The architecture is also future-proof, making it easy to integrate new systems, onboard additional consumers, and scale workflows as business needs evolve.

Beyond SAP: Enabling Agentic AI Across the Enterprise

The same architectural discussion applies across the enterprise software landscape. As vendors embed AI more deeply into their platforms, the effectiveness of these systems increasingly depends on real-time data access, continuous context propagation, and seamless interoperability.

Without an event-driven foundation, AI agents remain limited—trapped in siloed workflows and brittle API chains. Confluent provides the scalable, reliable backbone needed to enable true agentic AI in complex enterprise environments.

Examples of AI solutions driving this evolution include:

  • SAP Joule / Business AI – Context-aware agents and embedded AI across ERP, finance, and supply chain
  • Salesforce Einstein / Copilot Studio – Generative AI for CRM, service, and marketing automation built on top of Salesforce Data Cloud
  • ServiceNow Now Assist – Intelligent workflows and predictive automation in ITSM and Ops
  • Oracle Fusion AI / OCI AI Services – Embedded machine learning in ERP, HCM, and SCM
  • Microsoft Copilot (Dynamics / Power Platform) – AI copilots across business and low-code apps
  • Workday AI – Smart recommendations for finance, workforce, and HR planning
  • Adobe Sensei GenAI – GenAI for content creation and digital experience optimization
  • IBM watsonx – Governed AI foundation for enterprise use cases and data products
  • Infor Coleman AI – Industry-specific AI for supply chain and manufacturing systems
  • All the “traditional” cloud providers and data platforms such as Snowflake with Cortex, Microsoft Azure Fabric, AWS SageMaker, AWS Bedrock, and GCP Vertex AI

Each of these platforms benefits from a streaming-first architecture that enables real-time decisions, reusable data, and smarter automation across the business.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter, and follow me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to other operational and analytical platforms like SAP and Databricks.

The post Databricks and Confluent in the World of Enterprise Software (with SAP as Example) appeared first on Kai Waehner.

How Apache Kafka and Flink Power Event-Driven Agentic AI in Real Time https://www.kai-waehner.de/blog/2025/04/14/how-apache-kafka-and-flink-power-event-driven-agentic-ai-in-real-time/ Mon, 14 Apr 2025 09:09:10 +0000 https://www.kai-waehner.de/?p=7265 Agentic AI marks a major evolution in artificial intelligence—shifting from passive analytics to autonomous, goal-driven systems capable of planning and executing complex tasks in real time. To function effectively, these intelligent agents require immediate access to consistent, trustworthy data. Traditional batch processing architectures fall short of this need, introducing delays, data staleness, and rigid workflows. This blog post explores why event-driven architecture (EDA)—powered by Apache Kafka and Apache Flink—is essential for building scalable, reliable, and adaptive AI systems. It introduces key concepts such as Model Context Protocol (MCP) and Google’s Agent-to-Agent (A2A) protocol, which are redefining interoperability and context management in multi-agent environments. Real-world use cases from finance, healthcare, manufacturing, and more illustrate how Kafka and Flink provide the real-time backbone needed for production-grade Agentic AI. The post also highlights why popular frameworks like LangChain and LlamaIndex must be complemented by robust streaming infrastructure to support stateful, event-driven AI at scale.

The post How Apache Kafka and Flink Power Event-Driven Agentic AI in Real Time appeared first on Kai Waehner.

Artificial Intelligence is evolving beyond passive analytics and reactive automation. Agentic AI represents a new wave of autonomous, goal-driven AI systems that can think, plan, and execute complex workflows without human intervention. However, for these AI agents to be effective, they must operate on real-time, consistent, and trustworthy data—a challenge that traditional batch processing architectures simply cannot meet.

This is where data streaming with Apache Kafka and Apache Flink, coupled with an event-driven architecture (EDA), forms the backbone of Agentic AI. By enabling real-time and continuous decision-making, EDA ensures that AI systems can act instantly and reliably in dynamic, high-speed environments.

Emerging standards like the Model Context Protocol (MCP) and Google’s Agent-to-Agent (A2A) protocol are now complementing this foundation, providing structured, interoperable layers for managing context and coordination across intelligent agents—making AI not just event-driven, but also context-aware and collaborative.

Event-Driven Agentic AI with Data Streaming using Apache Kafka and Flink

In this post, I will explore:

  • How Agentic AI works and why it needs real-time data
  • Why event-driven architectures are the best choice for AI automation
  • Key use cases across industries
  • How Kafka and Flink provide the necessary data consistency and real-time intelligence for AI-driven decision-making
  • The role of MCP, A2A, and frameworks like LangChain and LlamaIndex in enabling scalable, context-aware, and collaborative AI systems


What is Agentic AI?

Agentic AI refers to AI systems that exhibit autonomous, goal-driven decision-making and execution. Unlike traditional automation tools that follow rigid workflows, Agentic AI can:

  • Understand and interpret natural language instructions
  • Set objectives, create strategies, and prioritize actions
  • Adapt to changing conditions and make real-time decisions
  • Execute multi-step tasks with minimal human supervision
  • Integrate with multiple operational and analytical systems and data sources to complete workflows
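Conceptually, the core of such an agent is a loop that observes events, plans the next action toward a goal, and acts. The following Python sketch shows that loop with a toy restocking goal; the planner logic is deliberately trivial and purely illustrative:

```python
# Minimal goal-driven agent loop (conceptual sketch, not a framework API):
# the agent observes events, plans the next action toward its goal, and acts.

def plan(goal, observation):
    # Toy planner: restock whenever observed stock drops below the goal level.
    if observation["stock"] < goal["min_stock"]:
        return {"action": "reorder", "qty": goal["min_stock"] - observation["stock"]}
    return {"action": "wait"}

goal = {"min_stock": 100}
actions = []

# Each observation would arrive as an event from a stream in a real system.
for observation in [{"stock": 120}, {"stock": 80}, {"stock": 95}]:
    step = plan(goal, observation)
    if step["action"] != "wait":
        actions.append(step)   # execute: in practice, emit an event or call a tool
```

The crucial point is the input: the quality of every planned action depends entirely on how fresh the observations are, which is why the data layer matters so much.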

Here is an example AI Agent dependency graph from Sean Falconer’s article “Event-Driven AI: Building a Research Assistant with Kafka and Flink”:

Example AI Agent Dependency Graph
Source: Sean Falconer

Instead of merely analyzing data, Agentic AI acts on data, making it invaluable for operational and transactional use cases—far beyond traditional analytics.

However, without real-time, high-integrity data, these systems cannot function effectively. If AI is working with stale, incomplete, or inconsistent information, its decisions become unreliable and even counterproductive. This is where Kafka, Flink, and event-driven architectures become indispensable.

Why Batch Processing Fails for Agentic AI

Traditional AI and analytics systems have relied heavily on batch processing, where data is collected, stored, and processed in predefined intervals. This approach may work for generating historical reports or training machine learning models offline, but it completely breaks down when applied to operational and transactional AI use cases—which are at the core of Agentic AI.

Why Batch Processing Fails for Agentic AI

I recently explored the Top 20 Problems with Batch Processing (and How to Fix Them with Data Streaming). And here’s why batch processing is fundamentally incompatible with Agentic AI and the real-world challenges it creates:

1. Delayed Decision-Making Slows AI Reactions

Agentic AI systems are designed to autonomously respond to real-time changes in the environment, whether it’s optimizing a telecommunications network, detecting fraud in banking, or dynamically adjusting supply chains.

In a batch-driven system, data is processed hours or even days later, making AI responses obsolete before they even reach the decision-making phase. For example:

  • Fraud detection: If a bank processes transactions in nightly batches, fraudulent activities may go unnoticed for hours, leading to financial losses.
  • E-commerce recommendations: If a retailer updates product recommendations only once per day, it fails to capture real-time shifts in customer behavior.
  • Network optimization: If a telecom company analyzes network traffic in batch mode, it cannot prevent congestion or outages before it affects users.

Agentic AI requires instantaneous decision-making based on streaming data, not delayed insights from batch reports.

2. Data Staleness Creates Inaccurate AI Decisions

AI agents must act on fresh, real-world data, but batch processing inherently means working with outdated information. If an AI agent is making decisions based on yesterday’s or last hour’s data, those decisions are no longer reliable.

Consider a self-healing IT infrastructure that uses AI to detect and mitigate outages. If logs and system metrics are processed in batch mode, the AI agent will be acting on old incident reports, missing live system failures that need immediate attention.

In contrast, an event-driven system powered by Kafka and Flink ensures that AI agents receive live system logs as they occur, allowing for proactive self-healing before customers are impacted.

3. High Latency Kills Operational AI

In industries like finance, healthcare, and manufacturing, even a few seconds of delay can lead to severe consequences. Batch processing introduces significant latency, making real-time automation impossible.

For example:

  • Healthcare monitoring: A real-time AI system should detect abnormal heart rates from a patient’s wearable device and alert doctors immediately. If health data is only processed in hourly batches, a critical deterioration could be missed, leading to life-threatening situations.
  • Automated trading in finance: AI-driven trading systems must respond to market fluctuations within milliseconds. Batch-based analysis would mean losing high-value trading opportunities to faster competitors.

Agentic AI must operate on a live data stream, where every event is processed instantly, allowing decisions to be made in real-time, not retrospectively.
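The healthcare example boils down to evaluating every reading the moment it arrives instead of waiting for a batch window. A minimal Python sketch makes the point; the thresholds and event shapes are illustrative assumptions:

```python
# Event-at-a-time processing vs. batch: each reading is evaluated on arrival,
# so an abnormal value triggers an alert immediately, not hours later.
alerts = []

def on_reading(reading, low=50, high=120):
    if not low <= reading["bpm"] <= high:
        alerts.append({"patient": reading["patient"], "bpm": reading["bpm"]})

stream = [
    {"patient": "P1", "bpm": 72},
    {"patient": "P1", "bpm": 145},   # abnormal -> alert fires on arrival
    {"patient": "P2", "bpm": 64},
]
for reading in stream:
    on_reading(reading)
```

In a batch design, the same logic would run once per hour over accumulated readings, and the abnormal value at position two would sit unprocessed until then.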

4. Rigid Workflows Increase Complexity and Costs

Batch processing forces businesses to predefine rigid workflows that do not adapt well to changing conditions. In a batch-driven world:

  • Data must be manually scheduled for ingestion.
  • Systems must wait for the entire dataset to be processed before making decisions.
  • Business logic is hard-coded, requiring expensive engineering effort to update workflows.

Agentic AI, on the other hand, is designed for continuous, adaptive decision-making. By leveraging an event-driven architecture, AI agents listen to streams of real-time data, dynamically adjusting workflows on the fly instead of relying on predefined batch jobs.

This flexibility is especially critical in industries with rapidly changing conditions, such as supply chain logistics, cybersecurity, and IoT-based smart cities.

5. Batch Processing Cannot Support Continuous Learning

A key advantage of Agentic AI is its ability to learn from past experiences and self-improve over time. However, this is only possible if AI models are continuously updated with real-time feedback loops.

Batch-driven architectures limit AI’s ability to learn because:

  • Models are retrained infrequently, leading to outdated insights.
  • Feedback loops are slow, preventing AI from adjusting strategies in real time.
  • Drift in data patterns is not immediately detected, causing AI performance degradation.

For instance, in customer service chatbots, an AI-powered agent should adapt to customer sentiment in real time. If a chatbot is trained on stale customer interactions from last month, it won’t understand emerging trends or newly common issues.

By contrast, a real-time data streaming architecture ensures that AI agents continuously receive live customer interactions, retrain in real time, and evolve dynamically.
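The difference between periodic retraining and a continuous feedback loop can be shown with the simplest possible online learner: a running estimate that updates with every new observation. This is a conceptual sketch, not a production training loop:

```python
# Continuous feedback loop sketch: the estimate updates incrementally with
# every labeled event instead of waiting for a periodic batch retrain.
class OnlineMean:
    """Running estimate updated one observation at a time."""
    def __init__(self):
        self.n = 0
        self.value = 0.0

    def update(self, x):
        self.n += 1
        self.value += (x - self.value) / self.n  # incremental mean update

sentiment = OnlineMean()
for score in [0.9, 0.7, 0.2]:   # live feedback, e.g., per-conversation ratings
    sentiment.update(score)

print(round(sentiment.value, 2))  # 0.6
```

A batch pipeline would only see these three data points at the next scheduled retrain; the streaming version has already folded them into the model.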

Agentic AI Requires an Event-Driven Architecture

Agentic AI must act in real time and integrate operational and analytical information. Whether it’s an AI-driven fraud detection system, an autonomous network optimization agent, or a customer service chatbot, acting on outdated information is not an option.

The Event-Driven Approach

An Event-Driven Architecture (EDA) enables continuous processing of real-time data streams, ensuring that AI agents always have the latest information available. By decoupling applications and processing events asynchronously, EDA allows AI to respond dynamically to changes in the environment without being constrained by rigid workflows.

Event-driven Architecture for Data Streaming with Apache Kafka and Flink

AI can also be seamlessly integrated into existing business processes by leveraging an EDA, bridging modern and legacy technologies without requiring a complete system overhaul. Not every data source may be real-time, but EDA ensures data consistency across all consumers—if an application processes data, it sees exactly what every other application sees. This guarantees synchronized decision-making, even in hybrid environments combining historical data with real-time event streams.

Why Apache Kafka is Essential for Agentic AI

For AI to be truly autonomous and effective, it must operate in real time, adapt to changing conditions, and ensure consistency across all applications. An Event-Driven Architecture (EDA) built with Apache Kafka provides the foundation for this by enabling:

  • Immediate Responsiveness → AI agents receive and act on events as they occur.
  • High Scalability → Components are decoupled and can scale independently.
  • Fault Tolerance → AI processes continue running even if some services fail.
  • Improved Data Consistency → Ensures AI agents are working with accurate, real-time data.

To build truly autonomous AI systems, organizations need a real-time data infrastructure that can process, analyze, and act on events as they happen.

Building Event-Driven Multi-Agents with Data Streaming using Apache Kafka and Flink
Source: Sean Falconer

Apache Kafka: The Real-Time Data Streaming Backbone

Apache Kafka provides a scalable, event-driven messaging infrastructure that ensures AI agents receive a constant, real-time stream of events. By acting as a central nervous system, Kafka enables:

  • Decoupled AI components that communicate through event streams.
  • Efficient data ingestion from multiple sources (IoT devices, applications, databases).
  • Guaranteed event delivery with fault tolerance and durability.
  • High-throughput processing to support real-time AI workloads.

Apache Flink complements Kafka by providing stateful stream processing for AI-driven workflows. With Flink, AI agents can:

  • Analyze real-time data streams for anomaly detection, predictions, and decision-making.
  • Perform complex event processing to detect patterns and trigger automated responses.
  • Continuously learn and adapt based on evolving real-time data.
  • Orchestrate multi-agent workflows dynamically.
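The core idea behind Flink's stateful stream processing (keyed state plus windows over an unbounded stream) can be sketched in plain Python. The example keeps a small sliding window of recent values per key and flags values far above the window average; the window size and threshold are arbitrary illustrative choices, not Flink APIs:

```python
from collections import defaultdict, deque

# Sketch of keyed, stateful stream processing: a per-key sliding window over
# the last N events, flagging values well above the window average.
WINDOW = 3
state = defaultdict(lambda: deque(maxlen=WINDOW))  # key -> recent values
anomalies = []

def process(key, value, threshold=2.0):
    window = state[key]
    if len(window) == WINDOW:
        avg = sum(window) / WINDOW
        if value > threshold * avg:
            anomalies.append((key, value))
    window.append(value)

for key, value in [("sensor-1", 10), ("sensor-1", 11), ("sensor-1", 9),
                   ("sensor-1", 40),   # 40 > 2 * avg(10, 11, 9) -> anomaly
                   ("sensor-2", 5)]:
    process(key, value)
```

Flink does the same thing at scale, with fault-tolerant state, event-time semantics, and exactly-once guarantees that a hand-rolled loop cannot provide.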

Across industries, Agentic AI is redefining how businesses and governments operate. By leveraging event-driven architectures and real-time data streaming, organizations can unlock the full potential of AI-driven automation, improving efficiency, reducing costs, and delivering better experiences.

Here are key use cases across different industries:

Financial Services: Real-Time Fraud Detection and Risk Management

Traditional fraud detection systems rely on batch processing, leading to delayed responses and financial losses.

Agentic AI enables real-time transaction monitoring, detecting anomalies as they occur and blocking fraudulent activities instantly.

AI agents continuously learn from evolving fraud patterns, reducing false positives and improving security. In risk management, AI analyzes market trends, adjusts investment strategies, and automates compliance processes to ensure financial institutions stay ahead of threats and regulatory requirements.
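A minimal version of such a real-time fraud check is a per-account velocity rule evaluated on every transaction as it arrives. The sketch below is purely illustrative; production systems combine many such signals with ML-based scoring:

```python
from collections import defaultdict

# Per-account velocity rule evaluated on every transaction as it streams in.
WINDOW_SECONDS = 60
MAX_TX_PER_WINDOW = 3

recent = defaultdict(list)   # account -> timestamps of recent transactions
flagged = []

def on_transaction(account, ts):
    # Keep only transactions inside the sliding time window.
    window = [t for t in recent[account] if ts - t <= WINDOW_SECONDS]
    window.append(ts)
    recent[account] = window
    if len(window) > MAX_TX_PER_WINDOW:
        flagged.append((account, ts))   # block or escalate in real time

for account, ts in [("A", 0), ("A", 10), ("A", 20), ("A", 30), ("B", 15)]:
    on_transaction(account, ts)
```

Run against a nightly batch instead, the fourth transaction on account "A" would clear long before the rule ever fired.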

Telecommunications: Autonomous Network Optimization

Telecom networks require constant tuning to maintain service quality, but traditional network management is reactive and expensive.

Agentic AI can proactively monitor network traffic, predict congestion, and automatically reconfigure network resources in real time. AI-powered agents optimize bandwidth allocation, detect outages before they impact customers, and enable self-healing networks, reducing operational costs and improving service reliability.

Retail: AI-Powered Personalization and Dynamic Pricing

Retailers struggle with static recommendation engines that fail to capture real-time customer intent.

Agentic AI analyzes customer interactions, adjusts recommendations dynamically, and personalizes promotions based on live purchasing behavior. AI-driven pricing strategies adapt to supply chain fluctuations, competitor pricing, and demand changes in real time, maximizing revenue while maintaining customer satisfaction.

AI agents also enhance logistics by optimizing inventory management and reducing stock shortages.

Healthcare: Real-Time Patient Monitoring and Predictive Care

Hospitals and healthcare providers require real-time insights to deliver proactive care, but batch processing delays critical decisions.

Agentic AI continuously streams patient vitals from medical devices to detect early signs of deterioration and trigger instant alerts to medical staff. AI-driven predictive analytics optimize hospital resource allocation, improve diagnosis accuracy, and enable remote patient monitoring, reducing emergency incidents and improving patient outcomes.

Gaming: Dynamic Content Generation and Adaptive AI Opponents

Modern games need to provide immersive, evolving experiences, but static game mechanics limit engagement.

Agentic AI enables real-time adaptation of gameplay, generating dynamic environments and personalizing challenges based on a player’s behavior. AI-driven opponents can learn and adapt to individual playstyles, keeping games engaging over time. AI agents also manage server performance, detect cheating, and optimize in-game economies for a better gaming experience.

Manufacturing & Automotive: Smart Factories and Autonomous Systems

Manufacturing relies on precision and efficiency, yet traditional production lines struggle with downtime and defects.

Agentic AI monitors production processes in real time to detect quality issues early and adjust machine parameters autonomously. This directly improves Overall Equipment Effectiveness (OEE) by reducing downtime, minimizing defects, and optimizing machine performance, ensuring higher productivity and operational efficiency.
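For reference, OEE is the product of three factors — availability, performance, and quality — so improvements in each compound multiplicatively. A quick sketch with illustrative figures:

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Overall Equipment Effectiveness = Availability x Performance x Quality."""
    return availability * performance * quality

# Illustrative figures: 90% uptime, 95% of ideal cycle speed, 98% good parts.
score = oee(0.90, 0.95, 0.98)   # roughly 0.84, i.e. ~84% OEE
```

This is why reducing downtime (availability) and minimizing defects (quality) move the metric directly: each factor scales the whole score.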

In automotive, AI-driven agents analyze real-time sensor data from self-driving cars to make instant navigation decisions, predict maintenance needs, and optimize fleet operations for logistics companies.

Public Sector: AI-Powered Smart Cities and Citizen Services

Governments face challenges in managing infrastructure, public safety, and citizen services efficiently.

Agentic AI can optimize traffic flow by analyzing real-time data from sensors and adjusting signals dynamically. AI-powered public safety systems detect threats from surveillance data and dispatch emergency services instantly. AI-driven chatbots handle citizen inquiries, automate document processing, and improve response times for government services.

The Business Value of Real-Time AI using Autonomous Agents

By leveraging Kafka and Flink in an event-driven AI architecture, organizations can achieve:

  • Better Decision-Making → AI operates on fresh, accurate data.
  • Faster Time-to-Action → AI agents respond to events immediately.
  • Reduced Costs → Less reliance on expensive batch processing and manual intervention by humans.
  • Greater Scalability → AI systems can handle massive workloads in real time.
  • Vendor Independence → Kafka and Flink support open standards and hybrid/multi-cloud deployments, preventing vendor lock-in.
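The event-driven pattern behind these benefits can be sketched in a few lines: events arrive on a stream, a stateful processor maintains running context per key (the role a Flink job plays), and an action is emitted the moment a condition is met. An in-memory list stands in for a Kafka topic here; the keys, values, and threshold are illustrative.

```python
from collections import defaultdict

def process(events, threshold=100.0):
    """Keyed, stateful processing: track a running sum per key and emit an
    action event as soon as the threshold is crossed -- the same pattern a
    Flink job would apply to events consumed from a Kafka topic."""
    state = defaultdict(float)   # per-key state, as in Flink keyed state
    actions = []
    for key, value in events:    # in practice, events arrive continuously
        state[key] += value
        if state[key] > threshold:
            actions.append((key, "scale_up"))   # the agent's autonomous action
            state[key] = 0.0                    # reset after acting
    return actions

actions = process([("svc-a", 60), ("svc-b", 30), ("svc-a", 55), ("svc-b", 80)])
```

Because decisions are driven by state updated on every event, the system acts immediately rather than waiting for a batch job — which is exactly the faster time-to-action listed above.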

Why LangChain, LlamaIndex, and Similar Frameworks Are Not Enough for Agentic AI in Production

Frameworks like LangChain, LlamaIndex, and others have gained popularity for making it easy to prototype AI agents by chaining prompts, tools, and external APIs. They provide useful abstractions for reasoning steps, retrieval-augmented generation (RAG), and basic tool use—ideal for experimentation and lightweight applications.

However, when building agentic AI for operational, business-critical environments, these frameworks fall short on several fronts:

  • Many frameworks, such as LangChain, are inherently synchronous and follow a request-response model, which limits their ability to handle real-time, event-driven inputs at scale. In contrast, LlamaIndex takes an event-driven approach, using a message broker—including support for Apache Kafka—for inter-agent communication.
  • Debugging, observability, and reproducibility are weak—there’s often no persistent, structured record of agent decisions or tool interactions.
  • State is ephemeral and in-memory, making long-running tasks, retries, or rollback logic difficult to implement reliably.
  • Most Agentic AI frameworks lack support for distributed, fault-tolerant execution and scalable orchestration, which are essential for production systems.

That said, frameworks like LangChain and LlamaIndex can still play a valuable, complementary role when integrated into an event-driven architecture. For example, an agent might use LangChain for planning or decision logic within a single task, while Apache Kafka and Apache Flink handle the real-time flow of events, coordination between agents, persistence, and system-level guarantees.

LangChain and similar toolkits help define how an agent thinks. But to run that thinking at scale, in real time, and with full traceability, you need a robust data streaming foundation. That’s where Kafka and Flink come in.
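That division of labor can be sketched as follows: the outer loop is event-driven (Kafka/Flink territory), while the per-event reasoning step is delegated to a planner. The `plan` function below is a stub standing in for where a LangChain chain or LLM call would sit; all event fields and decision labels are illustrative.

```python
def plan(task: dict) -> str:
    """Stand-in for the agent's reasoning step -- in a real system this
    would invoke a LangChain chain or an LLM call."""
    return "reroute" if task["delay_minutes"] > 30 else "monitor"

def run_agent(event_log):
    """Event-driven outer loop: each event is handled as it arrives, and
    every decision is appended to a durable log for traceability -- the
    persistent record most agent frameworks lack on their own."""
    decisions = []
    for event in event_log:   # would be a Kafka consumer loop in practice
        decisions.append({"event": event["id"], "decision": plan(event)})
    return decisions

decisions = run_agent([
    {"id": "shipment-1", "delay_minutes": 45},
    {"id": "shipment-2", "delay_minutes": 5},
])
```

The decision log is the key detail: because every input event and every agent decision flows through durable, replayable streams, debugging and reproducibility come from the architecture rather than from the framework.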

Model Context Protocol (MCP) and Agent-to-Agent (A2A) for Scalable, Composable Agentic AI Architectures

Model Context Protocol (MCP) is one of the hottest topics in AI right now. Coined by Anthropic, with early support emerging from OpenAI, Google, and other leading AI infrastructure providers, MCP is rapidly becoming a foundational layer for managing context in agentic systems. MCP enables systems to define, manage, and exchange structured context windows—making AI interactions consistent, portable, and state-aware across tools, sessions, and environments.
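To make the idea of a structured, portable context window concrete, here is a small sketch. This is inspired by the concept behind MCP, not the actual protocol's schema or wire format — the class, fields, and tool names are purely illustrative.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ContextWindow:
    """Illustrative structured context -- a stand-in for the idea MCP
    formalizes, not the real protocol definition."""
    session_id: str
    tools: list = field(default_factory=list)      # tools the agent may call
    messages: list = field(default_factory=list)   # conversation/event history

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

ctx = ContextWindow(session_id="s-42", tools=["search", "order_lookup"])
ctx.add("user", "Where is my order?")
portable = asdict(ctx)   # serializable, so context can move across tools and sessions
```

The payoff of structuring context this way is portability: once context is explicit data rather than implicit prompt text, it can be exchanged between tools, persisted to a stream, and replayed — which is what makes interactions consistent and state-aware across environments.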

Google’s recently announced Agent-to-Agent (A2A) protocol adds further momentum to this movement, setting the groundwork for standardized interaction across autonomous agents. These advancements signal a new era of AI interoperability and composability.

Together with Kafka and Flink, MCP and protocols like A2A help bridge the gap between stateless LLM calls and stateful, event-driven agent architectures. Naturally, event-driven architecture is the perfect foundation for all this. The key now is to build enough product functionality and keep pushing the boundaries of innovation.

A dedicated blog post is coming soon to explore how MCP and A2A connect data streaming and request-response APIs in modern AI systems.

Agentic AI is poised to revolutionize industries by enabling fully autonomous, goal-driven AI systems that perceive, decide, and act continuously. But to function reliably in dynamic, production-grade environments, these agents require real-time, event-driven architectures—not outdated, batch-oriented pipelines.

Apache Kafka and Apache Flink form the foundation of this shift. Kafka ensures agents receive reliable, ordered event streams, while Flink provides stateful, low-latency stream processing for real-time reactions and long-lived context management. This architecture enables AI agents to process structured events as they happen, react to changes in the environment, and coordinate with other services or agents through durable, replayable data flows.

If your organization is serious about AI, the path forward is clear:

Move from batch to real-time, from passive analytics to autonomous action, and from isolated prompts to event-driven, context-aware agents—enabled by Kafka and Flink.

As a next step, learn more about “Online Model Training and Model Drift in Machine Learning with Apache Kafka and Flink”.

Let’s connect on LinkedIn and discuss how to implement these ideas in your organization. Stay informed about new developments by subscribing to my newsletter. And make sure to download my free book about data streaming use cases.

The post How Apache Kafka and Flink Power Event-Driven Agentic AI in Real Time appeared first on Kai Waehner.

CIO Summit: The State of AI and Why Data Streaming is Key for Success
https://www.kai-waehner.de/blog/2025/03/13/cio-summit-the-state-of-ai-and-why-data-streaming-is-key-for-success/ (Thu, 13 Mar 2025)

The CIO Summit in Amsterdam provided a valuable perspective on the state of AI adoption across industries. While enthusiasm for AI remains high, organizations are grappling with the challenge of turning potential into tangible business outcomes. Key discussions centered on distinguishing hype from real value, the importance of high-quality and real-time data, and the role of automation in preparing businesses for AI integration. A recurring theme was that AI is not a standalone solution—it must be supported by a strong data foundation, clear ROI objectives, and a strategic approach. As AI continues to evolve toward more autonomous, agentic systems, data streaming will play a critical role in ensuring AI models remain relevant, context-aware, and actionable in real time.

The post CIO Summit: The State of AI and Why Data Streaming is Key for Success appeared first on Kai Waehner.

This week, I had the privilege of engaging in insightful conversations at the CIO Summit organized by GDS Group in Amsterdam, Netherlands. The event brought together technology leaders from across Europe and industries such as financial services, manufacturing, energy, gaming, telco, and more. The focus? AI – but with a much-needed reality check. While the potential of AI is undeniable, the hype often outpaces real-world value. Discussions at the summit revolved around how enterprises can move beyond experimentation and truly integrate AI to drive business success.

Learnings from the CIO Summit in Amsterdam by GDS Group

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter, and follow me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, industry success stories, and business value.

Key Learnings on the State of AI

The CIO Summit in Amsterdam provided a reality check on AI adoption across industries. While excitement around AI is high, success depends on moving beyond the hype and focusing on real business value. Conversations with technology leaders revealed critical insights about AI’s maturity, challenges, and the key factors driving meaningful impact. Here are the most important takeaways.

AI is Still in Its Early Stages – Beware of the Buzz vs. Value

The AI landscape is evolving rapidly, but many organizations are still in the exploratory phase. Executives recognize the enormous promise of AI but also see challenges in implementation, scaling, and achieving meaningful ROI.

The key takeaway? AI is not a silver bullet. Companies that treat it as just another trendy technology risk wasting resources on hype-driven projects that fail to deliver tangible outcomes.

Generative AI vs. Predictive AI – Understanding the Differences

There was a lot of discussion about Generative AI (GenAI) vs. Predictive AI, two dominant categories that serve very different purposes:

  • Predictive AI analyzes historical and real-time data to forecast trends, detect anomalies, and automate decision-making (e.g., fraud detection, supply chain optimization, predictive maintenance).
  • Generative AI creates new content based on trained data (e.g., text, images, or code), enabling applications like automated customer service, software development, and marketing content generation.

While GenAI has captured headlines, Predictive AI remains the backbone of AI-driven automation in enterprises. CIOs must carefully evaluate where each approach adds real business value.

Good Data Quality is Non-Negotiable

A critical takeaway: AI is only as good as the data that fuels it. Poor data quality leads to inaccurate AI models, bad predictions, and failed implementations.

To build trustworthy and effective AI solutions, organizations need:

✅ Accurate, complete, and well-governed data

✅ Real-time and historical data integration

✅ Continuous data validation and monitoring

Context Matters – AI Needs Real-Time Decision-Making

Many AI use cases rely on real-time decision-making. A machine learning model trained on historical data is useful, but without real-time context, it quickly becomes outdated.

For example, fraud detection systems need to analyze real-time transactions while comparing them to historical behavioral patterns. Similarly, AI-powered supply chain optimization depends on up-to-the-minute logistics data rather than just past trends.
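The fraud-detection pattern described here — scoring a live transaction against a per-customer historical baseline — can be sketched in a few lines. The statistics and threshold are illustrative; a real system would keep this per-customer state in a stream processor such as Flink and update it with every event.

```python
from statistics import mean, pstdev

def is_suspicious(amount: float, history: list, z_limit: float = 3.0) -> bool:
    """Flag a live transaction that deviates strongly from the customer's
    historical spending pattern (a simple z-score check)."""
    if len(history) < 2:
        return False                      # not enough history to judge
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > z_limit

history = [20.0, 25.0, 22.0, 30.0, 18.0]  # past purchases for one customer
flag_big = is_suspicious(950.0, history)  # far outside the usual range
flag_ok = is_suspicious(24.0, history)    # typical purchase
```

The historical baseline is exactly where real-time context matters: the model's usefulness depends on the history being current, which is what a streaming pipeline provides.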

The conclusion? Real-time data streaming is essential to unlocking AI’s full potential.

Automate First, Then Apply AI

One common theme among successful AI adopters: Optimize business processes before adding AI.

Organizations that try to retrofit AI onto inefficient, manual processes often struggle with adoption and ROI. Instead, the best approach is:

1⃣ Automate and optimize workflows using real-time data

2⃣ Apply AI to enhance automation and improve decision-making

By taking this approach, companies ensure that AI is applied where it actually makes a difference.

ROI Matters – AI Must Drive Business Value

CIOs are under pressure to deliver business-driven, NOT tech-driven AI projects. AI initiatives that lack a clear ROI roadmap often stall after pilot phases.

Two early success stories for Generative AI stand out:

  • Customer support – AI chatbots and virtual assistants enhance response times and improve customer experience.
  • Software engineering – AI-powered code generation boosts developer productivity and reduces time to market.

The lesson? Start with AI applications that deliver clear, measurable business impact before expanding into more experimental areas.

Data Streaming and AI – The Perfect Match

At the heart of AI’s success is data streaming. Why? Because modern AI requires a continuous flow of fresh, real-time data to make accurate predictions and generate meaningful insights.

Data streaming not only powers AI with real-time insights but also ensures that AI-driven decisions directly translate into measurable business value:

Business Value of Data Streaming with Apache Kafka and Flink in the free Confluent eBook

Here’s how data streaming powers both Predictive and Generative AI:

Predictive AI + Data Streaming

Predictive AI thrives on timely, high-quality data. Real-time data streaming enables AI models to process and react to events as they happen. Examples include:

✔ Fraud detection: AI analyzes real-time transactions to detect suspicious activity before fraud occurs.

✔ Predictive maintenance: Streaming IoT sensor data allows AI to predict equipment failures before they happen.

✔ Supply chain optimization: AI dynamically adjusts logistics routes based on real-time disruptions.

Here is an example from Capital One about real-time fraud detection and prevention, which prevents an average of $150 in fraud per customer per year:

Predictive AI for Fraud Detection and Prevention at Capital One Bank with Data Streaming
Source: Confluent

Generative AI + Data Streaming

Generative AI also benefits from real-time data. Instead of relying on static datasets, streaming data enhances GenAI applications by incorporating the latest information:

✔ AI-powered customer support: Chatbots analyze live customer interactions to generate more relevant responses.

✔ AI-driven marketing content: GenAI adapts promotional messaging in real-time based on customer engagement signals.

✔ Software development acceleration: AI assistants provide real-time code suggestions as developers write code.

In short, without real-time data, AI is limited to outdated insights.

Here is an example of GenAI with data streaming in the travel industry from Expedia, where 60% of travelers self-serve in chat, saving more than 40% of variable agent cost:

Generative AI at Expedia in Travel for Customer Service with Chatbots, GenAI and Data Streaming
Source: Confluent

The Future of AI: Agentic AI and the Role of Data Streaming

As AI evolves, we are moving toward Agentic AI – systems that autonomously take actions, learn from feedback, and adapt in real time.

For example:

✅ AI-driven cybersecurity systems that detect and respond to threats instantly

✅ Autonomous supply chains that dynamically adjust based on demand shifts

✅ Intelligent business operations where AI continuously optimizes workflows

But Agentic AI can only work if it has access to real-time operational AND analytical data. That’s why data streaming is becoming a critical foundation for the next wave of AI innovation.

The Path to AI Success

The CIO Summit reinforced one key message: AI is here to stay, but its success depends on strategy, data quality, and business value – not just hype.

Organizations that:

✅ Focus on AI applications with clear business ROI

✅ Automate before applying AI

✅ Prioritize real-time data streaming

… will be best positioned to drive AI success at scale.

As AI moves towards autonomous decision-making (Agentic AI), data streaming will become even more critical. The ability to process and act on real-time data will separate AI leaders from laggards.

Now the real question: Where is your AI strategy headed? Let’s discuss!

Stay ahead of the curve! Subscribe to my newsletter for insights into data streaming and connect with me on LinkedIn to continue the conversation. And make sure to download my free book focusing on data streaming use cases, industry stories and business value.
