Interviews

Confluent: Fueling Digital India with Real-Time Data at Scale

As companies in India accelerate their digital transformation efforts, real-time data streaming is becoming critical to gain insights and react quickly. In an interview with Rubal Sahni, Area Vice President & Country Manager – India for Confluent, we discussed how the company is empowering key industries like BFSI and internet companies with its data streaming solutions built on Apache Kafka. He outlined growth drivers like digital banking, e-commerce, and gaming, along with Confluent’s focus on boosting India’s startup ecosystem.

 

Data streaming and real-time data processing are becoming increasingly important as companies aim to gain insights and react quickly. Can you explain Confluent’s role in the data streaming space and how your technology is helping drive this evolution?

Confluent’s origin and mission is to set data in motion. In today’s business landscape, where competition is plentiful and the pace of innovation continues to accelerate, companies need to respond quickly and, ideally, glean insights instantaneously from real-time data.

We do this by focusing on delivering the four core capabilities of a data streaming platform. First, it starts with streaming: data is not static or passive; it is continuously moving, evolving, being processed, and being shared in real time to power the business. Second, we make connecting to static and historical data seamless and enable the continuous movement of all data, from wherever it originates to wherever it needs to go. Third, we stitch together and process data from across the organization to produce high-value data assets. Last, we make sure these high-value data assets are trustworthy, by implementing quality controls and governance policies, and discoverable, through catalogs.
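The four capabilities described above can be sketched as stages in a pipeline. The sketch below is purely illustrative: the function names, the event schema, and the INR-to-USD rate are hypothetical, and none of this is a Confluent API; it only makes the stream/connect/process/govern flow concrete.

```python
# Illustrative sketch of the four capabilities described above, modeled
# as stages over an in-memory stream. Hypothetical names and schema.
from dataclasses import dataclass


@dataclass
class Event:
    key: str
    value: dict


def connect(static_rows):
    """Capability 2: turn static/historical records into a stream of events."""
    for row in static_rows:
        yield Event(key=row.get("id", ""), value=row)


def process(events):
    """Capability 3: enrich events into higher-value assets (assumed rate: 83 INR/USD)."""
    for e in events:
        e.value["amount_usd"] = round(e.value["amount_inr"] / 83.0, 2)
        yield e


def govern(events, required_fields=("id", "amount_inr")):
    """Capability 4: enforce a simple quality rule before data is shared."""
    for e in events:
        if all(f in e.value for f in required_fields):
            yield e


rows = [{"id": "ord-1", "amount_inr": 8300.0}, {"amount_inr": 100.0}]
trusted = list(govern(process(connect(rows))))
print([e.value["amount_usd"] for e in trusted])  # → [100.0]
```

The second record is dropped by the governance stage because it lacks an `id`, which is the gist of the "trustworthy and discoverable" point: only events that pass quality checks reach downstream consumers.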

The Confluent solution is built on Apache Kafka, which the founders of Confluent created just over a decade ago while working at LinkedIn, and which has since become the de facto technology standard for data streaming in the industry. Confluent completes and enhances the value proposition of Apache Kafka with a fully managed service for streaming data. Our technology incorporates the infinite capacity, on-demand scalability, and global nature of public clouds into every aspect of the platform design. This enables use cases like elastically scaling up and down to match incoming demand in real time. The result is an enterprise-grade, cloud-native solution for managing data in motion at any scale. We call this cloud-native engine ‘Kora’.

We understand that with the increasing importance of real-time data in modern businesses, companies are leveraging distributed streaming platforms to process and analyze data streams in real time. They may be at different transition points in their cloud journey, or they may choose to stay on-premises. We continue to build on the innovation we have made in our Confluent Platform solution to provide users with a comprehensive, self-managed platform for Apache Kafka.

I mentioned earlier the importance of stream processing in the data streaming landscape. While streaming allows data to be shared across the organization, data is far more valuable when you can clean and enrich it. This is where stream processing comes in. As one of the top projects of the Apache Software Foundation, Flink has emerged as the de facto standard for stream processing. Flink’s high performance, rich feature set, and robust developer community make it one of the most popular choices for large-scale, high throughput, and low-latency stream processing. It is often used in tandem with Kafka to support companies’ mission-critical workloads. We’ve rearchitected Flink as a cloud-native service on Confluent Cloud and look to deliver a fully managed service for stream processing that makes it easier for companies to filter, join, and enrich data streams with Flink.
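The filter, join, and enrich operations mentioned above can be made concrete with a small stand-in. In practice Flink expresses these declaratively (for example in Flink SQL); the pure-Python sketch below, with hypothetical order and user data, shows only the shape of the data flow: drop low-value events, then join each remaining event against a profile table.

```python
# Pure-Python sketch of the filter / join / enrich pattern described
# above. Flink would express this declaratively; the data is made up.
orders = [
    {"order_id": 1, "user_id": "u1", "amount": 250.0},
    {"order_id": 2, "user_id": "u2", "amount": 40.0},
    {"order_id": 3, "user_id": "u1", "amount": 900.0},
]
users = {"u1": {"city": "Bengaluru"}, "u2": {"city": "Mumbai"}}


def filter_large(stream, threshold=100.0):
    """Filter: keep only orders at or above the threshold."""
    return (o for o in stream if o["amount"] >= threshold)


def enrich_with_user(stream, user_table):
    """Stream-table join: attach each order's user profile."""
    for o in stream:
        yield {**o, **user_table.get(o["user_id"], {})}


result = list(enrich_with_user(filter_large(orders), users))
print([o["order_id"] for o in result])  # → [1, 3]
```

The output events carry both the original order fields and the joined `city` field, which is the "high-value data asset" a downstream consumer would subscribe to.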

 

What are some of the key industries and use cases you see as important growth drivers for Confluent in India?

India is poised for strong economic growth, with projections that GDP will increase by at least 7% in 2024-25 on the back of robust momentum. As digital transformation accelerates across industries, Confluent sees immense potential to enable real-time data streaming use cases.

In India, we are witnessing strong traction in the BFSI sector, as banks and fintechs leverage streaming for services like UPI payments, fraud detection, and stock market data analytics. Internet companies and global capability centers are also major focus areas.
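To give a flavor of the fraud-detection use case mentioned above: one common pattern is a velocity check, flagging an account that transacts too many times within a short window. The sketch below is a hypothetical, simplified version; a real deployment would run this logic in a stream processor over a Kafka topic, with tuned thresholds.

```python
# Hypothetical sketch of velocity-based fraud detection over a payment
# stream: flag accounts exceeding a transaction count per time window.
from collections import defaultdict, deque

WINDOW_SECONDS = 60       # assumed sliding-window length
MAX_TXNS_PER_WINDOW = 3   # assumed velocity threshold


def flag_suspicious(payments):
    """Yield (account, timestamp) pairs that breach the velocity rule."""
    recent = defaultdict(deque)  # account -> timestamps inside the window
    flagged = []
    for account, ts in payments:
        window = recent[account]
        window.append(ts)
        # Evict timestamps that have slid out of the window.
        while window and ts - window[0] > WINDOW_SECONDS:
            window.popleft()
        if len(window) > MAX_TXNS_PER_WINDOW:
            flagged.append((account, ts))
    return flagged


payments = [("acct-1", t) for t in (0, 10, 20, 30)] + [("acct-2", 5)]
print(flag_suspicious(payments))  # → [('acct-1', 30)]
```

Because the state is keyed per account, the same logic partitions naturally across a Kafka topic keyed by account ID, which is what makes it practical at UPI-scale volumes.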

Some of the key use cases we empower include digital banking, payments, stock trading, OTT platforms, online food delivery, e-commerce, gaming, and more. Major Indian companies like Swiggy and Meesho are using Confluent to drive their businesses forward. Our strength is in helping our customers operate the data streaming platform at scale with the deep Kafka expertise we possess.

 

How is Confluent aiming to support the growth of India’s startup ecosystem through its data streaming solutions?

Startups, especially high-growth business ventures, often experience fluctuating demand and data volumes as they scale. User engagement and experience is paramount for them to succeed.

As ingestion rates increase significantly, these startups need to deal with varying workload peaks while maintaining the ability to quickly gain insights and improve end-user experiences. Confluent’s data platform provides the agility, scalability, elasticity and resiliency required to operate at their pace. Our infrastructure also enables confident decision making based on accurate real-time data.

Moreover, our multi-cloud architecture gives startups the ubiquity to build across environments, without restrictions.

To further boost the startup ecosystem, we initiated a global Data Streaming Challenge in which nearly 100 young startups from 22 countries, including India, took part. The winners of this competition, who showcased innovative streaming applications built with Confluent, will be provided up to $1M in funding, marketing exposure, and opportunities to pitch to VCs.

 

Data streaming and real-time data processing are becoming increasingly critical in today’s fast-paced business environment. What are some of the major trends you see shaping the data streaming space right now?

Data streaming enables the entire organization to access trusted real-time data and to innovate faster. It also breaks down the divide between the operational and analytical worlds, enabling better decisions. And it gives organizations the flexibility to create new data products and solve new business problems faster and more easily.

The beauty of creating data products is that it creates a virtuous cycle: each new product you add can immediately be mixed, matched, and remixed into new data products, new solutions, and new value. By continuously enriching, synchronizing, and sharing data so that every data product has a rich, up-to-date view, the data an organization holds begins to have a flywheel effect. This is the change that will help organizations become more resilient.

With AI top of mind and a priority for organizations everywhere, it is unsurprising that trends in the data streaming sector have become deeply intertwined with it.

Many AI models today are trained on large data lakes or stores: information accumulated over multiple years, or even decades. That data is essentially ‘at rest’, which means the majority of models are being trained on data that is months behind the true state, and which has been formatted or organized in ways that don’t reflect modern enterprise technologies.

Data streaming stands to enable a new paradigm of AI models that can continuously learn from and respond to real-time data flows. This emerging capability will allow for more adaptable AI-driven insights and actions that adjust seamlessly based on live changes in the data.

 

How has the open-source ecosystem, especially around Apache Kafka and Apache Flink, helped drive innovation and new solutions in the streaming data space?

The open-source ecosystem around Apache Kafka and Apache Flink has been instrumental in advancing innovation and new solutions in the streaming data space. Together they cover the two key components of event-driven architecture – streaming and processing – making data a first-class citizen in the organization.

Apache Kafka, one of the most successful open source projects, is used by over 70% of Fortune 500 companies today. Its vibrant developer community, with over 5,000 members in the Bangalore Kafka group alone, drives rapid innovation. Apache Flink is a unified stream and batch processing framework. It has been a top-five Apache project for many years and is on its way to becoming the de facto standard for stream processing. In fact, its growth rate is following a similar trend to Kafka’s from four years ago. Both projects have emerged as the bedrock for efficient data streaming and stream processing at scale.

The open-source nature of Kafka and Flink lowers the barrier to entry for working with streaming data, allowing companies to easily build use cases and solutions. Their wide adoption provides a standardized framework to build upon rather than reinventing the wheel, so most organizations likely already run Kafka and/or Flink somewhere.

However, there are some limitations with solely relying on open source streaming technology as well. Companies often end up spending more to efficiently manage, scale, secure and evolve the streaming infrastructure. This is where enterprise-grade streaming platforms like Confluent can complement open source with connectivity, security, management and optimization capabilities suitable for mission-critical workloads.