Pub/Sub for Tables redefines the publish-subscribe model by making tables the fundamental unit of publication and subscription, shifting the focus from event propagation to data propagation.
This article breaks down the above definition into its key components and contrasts it with traditional pub/sub systems such as Kafka and message queues. By the end, you will see how Pub/Sub for Tables offers a more scalable, governed, and decoupled foundation for data integration.
Take Action: If you would like to deploy a Pub/Sub for Tables solution, check out what we have to offer at Tabsdata. Hint: it’s as simple as `pip install tabsdata`.
The traditional publish-subscribe model decouples producers and consumers. Systems like Apache Kafka, RabbitMQ, and Google Pub/Sub let services exchange messages through shared topics without direct connections, making service integration more scalable and resilient.
However, this model was designed for event delivery, not data integration. What gets published is usually a small, stateless message that signals a change. Consumers must collect, process, and reconstruct these messages to build usable datasets. That complexity lives entirely on the consumer side.
This approach is service-centric. It solves for integration between microservices, not for integrating governed datasets across a data platform.
Pub/Sub for Tables reimagines the model for data integration. Instead of publishing messages, producers publish entire tables that are structured, versioned, and ready to query. Consumers subscribe to complete datasets, not streams of fragmented events.
This turns pub/sub into a stable contract between producers and consumers. It shifts the purpose from notifying about changes to delivering high-quality data assets that are immediately usable.
Traditional pub/sub systems operate at the level of individual messages. Each message represents a discrete event, often stripped of context. While this works well for signaling that something has changed, it falls short when the goal is to deliver meaningful, complete datasets.
Pub/Sub for Tables replaces messages with tables as the core unit of integration. What gets published is not a single change event or row, but a full dataset — structured, versioned, and governed. Tables are durable and queryable, with clearly defined schemas and semantics.
This introduces a higher-level contract between producers and consumers. Instead of working with fragments, consumers interact with entire assets that are immediately useful and trustworthy.
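To make that contract concrete, here is a minimal Python sketch of what a published table might carry. The `PublishedTable` name and its fields are assumptions invented for this article, not Tabsdata’s actual API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

import pandas as pd


@dataclass(frozen=True)
class PublishedTable:
    """Illustrative contract for a published table (hypothetical, not a real API)."""

    name: str               # stable identifier consumers subscribe to
    version: int            # monotonically increasing version number
    schema: dict[str, str]  # column name -> logical type
    data: pd.DataFrame      # the full, queryable dataset
    published_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self) -> None:
        # Enforce the contract: the data must contain every declared column.
        missing = set(self.schema) - set(self.data.columns)
        if missing:
            raise ValueError(f"data is missing declared columns: {missing}")


orders_v3 = PublishedTable(
    name="orders",
    version=3,
    schema={"order_id": "int64", "customer_id": "int64", "amount": "float64"},
    data=pd.DataFrame(
        {"order_id": [1, 2], "customer_id": [10, 11], "amount": [99.5, 42.0]}
    ),
)
```

The point is not the specific fields but that the unit being exchanged is a whole, self-describing asset rather than an anonymous payload.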
Why does this matter for data integration?
Tables are the native language of data platforms. They allow:

- Clearly defined schemas that make structure and semantics explicit
- Versioning, so consumers can rely on stable, reproducible views of the data
- Direct querying, with no reassembly or decoding required
- Governance and lineage applied to the asset itself, not scattered across pipelines
By elevating the table to the fundamental unit of integration, Pub/Sub for Tables simplifies how data is shared, understood, and trusted across teams.
Across the enterprise, data comes from a wide range of sources — databases, files, APIs, event streams, and more. These sources expose data in different formats such as JSON, CSV, XML, logs, or binary records.
Regardless of the source or original format, that data must be structured before it can be used. Whether for reporting, analytics, machine learning, or other uses, the first step is to impose a schema that organizes the data into fields and records.
This is why tables are such a powerful and universal abstraction. They provide a consistent way to represent datasets across all domains, even when the original source was unstructured or semi-structured.
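As a rough illustration, the sketch below normalizes a JSON payload and a CSV extract into one shared tabular schema using pandas; the column names and the `to_table` helper are hypothetical:

```python
import io
import json

import pandas as pd

# The same logical dataset arriving in two different source formats.
json_payload = '[{"id": 1, "name": "Ada", "signup": "2025-01-05"}]'
csv_payload = "id,name,signup\n2,Grace,2025-02-11\n"


def to_table(df: pd.DataFrame) -> pd.DataFrame:
    """Impose one shared schema: select, order, and type the declared columns."""
    out = df[["id", "name", "signup"]].copy()
    out["id"] = out["id"].astype("int64")
    out["name"] = out["name"].astype("string")
    out["signup"] = pd.to_datetime(out["signup"])
    return out


users_from_json = to_table(pd.DataFrame(json.loads(json_payload)))
users_from_csv = to_table(pd.read_csv(io.StringIO(csv_payload)))

# Both slices now share one schema and can be published as a single table.
users = pd.concat([users_from_json, users_from_csv], ignore_index=True)
```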
In the context of Pub/Sub for Tables, a “table” is a generic concept. It refers to any structured, versioned representation of data, regardless of its origin, enabling a consistent, interoperable layer for enterprise data integration.
Traditional pub/sub systems are built around event propagation. A producer emits an event to signal that something happened — a transaction was recorded, a file was uploaded, or a state changed. Consumers listen for those events and decide what to do with them. The message itself is usually small and stateless, and the value comes from triggering downstream processes or workflows.
This model works well when the goal is to react to a change. But it does not solve the problem of delivering complete, trustworthy data to the people and systems that need it.
Pub/Sub for Tables shifts the intent from event propagation to data propagation. It is not just about sending a signal that data has changed. It is about delivering the actual dataset, in full, with structure, semantics, and guarantees.
This change brings clarity to the roles involved:

- Producers are responsible for publishing complete, well-defined tables, with schema, semantics, and metadata.
- Consumers subscribe to those tables and use them directly, without reconstructing state from streams of fragments.
This shift also clarifies what makes integration scalable. In traditional ETL, data consumers depend on custom pipelines to ingest and transform upstream data. Those pipelines are tightly coupled, hard to govern, and difficult to scale. In message-based pub/sub, the burden of state reconstruction falls on each consumer, creating duplication of logic and inconsistency across teams.
By contrast, Pub/Sub for Tables gives consumers access to ready-to-use data assets. Producers do not need to know how each consumer will use the data. Consumers do not need to know how the data was generated. Both sides operate independently, through a shared interface.
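A toy in-memory broker makes that shared interface tangible. This is a conceptual sketch, not Tabsdata’s implementation; `TableBroker`, `publish`, and `subscribe` are names invented for illustration:

```python
import pandas as pd


class TableBroker:
    """Toy in-memory broker where tables, not messages, are the unit of exchange."""

    def __init__(self) -> None:
        self._versions: dict[str, list[pd.DataFrame]] = {}

    def publish(self, name: str, table: pd.DataFrame) -> int:
        """Producer side: publish a complete table; returns the new version number."""
        history = self._versions.setdefault(name, [])
        history.append(table.copy())
        return len(history)  # versions are 1-based

    def subscribe(self, name: str, version: int | None = None) -> pd.DataFrame:
        """Consumer side: fetch a full, ready-to-query dataset (latest by default)."""
        history = self._versions[name]
        index = -1 if version is None else version - 1
        return history[index].copy()


broker = TableBroker()
broker.publish("orders", pd.DataFrame({"order_id": [1], "amount": [99.5]}))
broker.publish("orders", pd.DataFrame({"order_id": [1, 2], "amount": [99.5, 42.0]}))

latest = broker.subscribe("orders")               # the complete current dataset
snapshot = broker.subscribe("orders", version=1)  # a reproducible earlier version
```

Note that the producer never learns who subscribes, and the consumer never learns how the table was built; the table name and its version history are the entire contract.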
This approach turns data propagation into a governed, self-service capability. It removes ambiguity, reduces integration overhead, and helps data teams move faster with confidence.
The most powerful idea behind Pub/Sub for Tables is decoupling — not just at the technical level, but also at the organizational and operational level. This model separates the concerns of data producers and data consumers, allowing each to operate independently without relying on tightly managed pipelines or shared orchestration layers.
In traditional data integration, data engineers must build and maintain connection logic tailored to each source and destination system, while also accounting for every consumer’s requirements. These custom pipelines are fragile, hard to govern, and expensive to operate. Any change on one side risks breaking something on the other, and can even affect unrelated workloads through the transitive dependencies between pipelines and the systems they integrate.
Pub/Sub for Tables breaks that dependency.
Producers are responsible for publishing well-defined tables, with schema, semantics, and metadata. They do not need to know who will consume the data or how it will be used. Consumers can subscribe to those tables through a stable interface and begin using the data immediately, without needing to ask for a custom pipeline or transformation job.
This decoupling delivers several benefits:

- Producers and consumers evolve independently, without coordinating releases or sharing orchestration.
- Changes on one side no longer ripple through custom pipelines into unrelated workloads.
- Governance improves, because data is shared through well-defined, versioned assets.
- Integration overhead drops, since no bespoke pipeline or transformation job is needed per consumer.
Most importantly, it simplifies the mental model of integration. Instead of managing pipeline complexity, teams can focus on publishing and consuming data assets. That shift is what makes Pub/Sub for Tables a true platform-level solution for data integration.
Although Pub/Sub for Tables builds on the foundational idea of publish-subscribe, it introduces a fundamentally different approach to integration. The differences lie in what gets published, how consumers interact with it, and the outcomes it enables.
Here’s how Pub/Sub for Tables (a table-based system) compares to traditional pub/sub systems (message-based systems):

| | Message-based pub/sub | Pub/Sub for Tables |
| --- | --- | --- |
| Unit of exchange | Individual messages or events | Complete, versioned tables |
| Intent | Event propagation: signal that something changed | Data propagation: deliver the dataset itself |
| Consumer responsibility | Collect, process, and reconstruct state from fragments | Query a ready-to-use, governed asset |
| Contract | Implicit; meaning is inferred by each consumer | Explicit schema, semantics, and versioning |
| Governance | Scattered across consumer pipelines | Built into the published asset, with lineage |
In message-based systems, each event is an isolated signal. Consumers must process many such messages in sequence to rebuild the full picture. This creates complexity and often leads to inconsistency across consumers.
Pub/Sub for Tables changes that dynamic. It delivers complete datasets, with structure and lineage, that reflect a specific version or point in time. Consumers do not need to infer meaning or stitch data together. They receive high-quality, trustworthy assets that are immediately usable.
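The difference is easy to see in code. In the hypothetical sketch below, the message-based consumer must fold an event stream into state, while the table-based consumer simply receives the finished dataset:

```python
import pandas as pd

# Message-based pub/sub: every consumer replays events to rebuild state.
events = [
    {"type": "order_created", "order_id": 1, "amount": 99.5},
    {"type": "order_created", "order_id": 2, "amount": 42.0},
    {"type": "order_cancelled", "order_id": 1},
]

state: dict[int, float] = {}
for event in events:  # each consumer re-implements this fold, in its own way
    if event["type"] == "order_created":
        state[event["order_id"]] = event["amount"]
    elif event["type"] == "order_cancelled":
        state.pop(event["order_id"], None)

rebuilt = pd.DataFrame({"order_id": list(state), "amount": list(state.values())})

# Table-based pub/sub: the producer publishes the finished dataset once.
published = pd.DataFrame({"order_id": [2], "amount": [42.0]})

# Both paths end at the same data, but only one had to be re-derived,
# correctly and consistently, by every single consumer.
assert rebuilt.equals(published)
```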
This shift enables more predictable behavior, better governance, and a simpler, more sustainable integration model. It reduces duplication of logic, improves auditability, and brings clarity to how data flows through the enterprise.
Data integration has become one of the most costly and complex functions inside modern data platforms. Teams are spending more time managing pipelines, debugging transformations, and navigating handoffs than actually delivering value from data. As data volumes grow and business requirements evolve, the old models are breaking down.
Pub/Sub for Tables introduces a way to simplify integration while improving control, scalability, and trust.
This model aligns directly with several critical shifts happening across enterprise data:

- The adoption of data contracts as explicit agreements between producers and consumers
- The move toward data products: governed, versioned assets built for consumption
- The demand for self-service access that does not require a custom pipeline per consumer
- The push for stronger governance, auditability, and lineage across the platform
All of this is possible because Pub/Sub for Tables changes the shape of integration. It shifts the responsibility from orchestrating movement to publishing and subscribing to assets, with structure and guarantees built in.
Pub/Sub for Tables redefines the publish-subscribe model by making tables the fundamental unit of publication and subscription, shifting the focus from event propagation to data propagation.
This simple shift unlocks a powerful new model for data integration — one that replaces brittle pipelines and complex orchestration with stable contracts, governed assets, and self-service access. By elevating tables as the shared interface between producers and consumers, organizations can simplify integration, scale delivery, and build trust in their data.
As enterprises continue to adopt data contracts, data products, and more rigorous governance practices, Pub/Sub for Tables provides the architectural foundation needed to support that shift.
Tune in to our ongoing mini-series, How Pub/Sub for Tables Fixes What Data Pipelines Broke. In the coming weeks we will take a closer look at Data Contracts and Data Products, and at how this model can serve as the backbone of a truly modern data platform.