Pub/Sub for Tables redefines the publish-subscribe model by making tables the fundamental unit of publication and subscription, shifting the focus from event propagation to data propagation.
This article breaks down the above definition into its key components and contrasts it with traditional pub/sub systems such as Kafka and message queues. By the end, you will see how Pub/Sub for Tables offers a more scalable, governed, and decoupled foundation for data integration.
Take Action: If you would like to deploy a Pub/Sub for Tables solution, check out what we have to offer at Tabsdata. Hint: it’s as simple as `pip install tabsdata`.
The traditional publish-subscribe model decouples producers and consumers. Systems like Apache Kafka, RabbitMQ, and Google Pub/Sub let services exchange messages through shared topics without direct connections, making service integration more scalable and resilient.
However, this model was designed for event delivery, not data integration. What gets published is usually a small, stateless message that signals a change. Consumers must collect, process, and reconstruct these messages to build usable datasets. That complexity lives entirely on the consumer side.
This approach is service-centric. It solves for integration between microservices, not for integrating governed datasets across a data platform.
Pub/Sub for Tables reimagines the model for data integration. Instead of publishing messages, producers publish entire tables that are structured, versioned, and ready to query. Consumers subscribe to complete datasets, not streams of fragmented events.
This turns pub/sub into a stable contract between producers and consumers. It shifts the purpose from notifying about changes to delivering high-quality data assets that are immediately usable.
Traditional pub/sub systems operate at the level of individual messages. Each message represents a discrete event, often stripped of context. While this works well for signaling that something has changed, it falls short when the goal is to deliver meaningful, complete datasets.
Pub/Sub for Tables replaces messages with tables as the core unit of integration. What gets published is not a single change event or row, but a full dataset — structured, versioned, and governed. Tables are durable and queryable, with clearly defined schemas and semantics.
This introduces a higher-level contract between producers and consumers. Instead of working with fragments, consumers interact with entire assets that are immediately useful and trustworthy.
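To make that contract concrete, here is a minimal Python sketch of what a published table might carry. The `PublishedTable` name and its fields are assumptions invented for this article, not Tabsdata’s actual API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

import pandas as pd


@dataclass(frozen=True)
class PublishedTable:
    """Illustrative contract for a published table (hypothetical, not a real API)."""

    name: str               # stable identifier consumers subscribe to
    version: int            # monotonically increasing version number
    schema: dict[str, str]  # column name -> logical type
    data: pd.DataFrame      # the full, queryable dataset
    published_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self) -> None:
        # Enforce the contract: the data must contain every declared column.
        missing = set(self.schema) - set(self.data.columns)
        if missing:
            raise ValueError(f"data is missing declared columns: {missing}")


orders_v3 = PublishedTable(
    name="orders",
    version=3,
    schema={"order_id": "int64", "customer_id": "int64", "amount": "float64"},
    data=pd.DataFrame(
        {"order_id": [1, 2], "customer_id": [10, 11], "amount": [99.5, 42.0]}
    ),
)
```

The point is not the specific fields but that the unit being exchanged is a whole, self-describing asset rather than an anonymous payload.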
Why does this matter for data integration?
Tables are the native language of data platforms. They allow:

- Clearly defined schemas that make structure and semantics explicit
- Versioning, so consumers can rely on stable, reproducible views of the data
- Direct querying, with no reassembly or decoding required
- Governance and lineage applied to the asset itself, not scattered across pipelines
By elevating the table to the fundamental unit of integration, Pub/Sub for Tables simplifies how data is shared, understood, and trusted across teams.
Across the enterprise, data comes from a wide range of sources — databases, files, APIs, event streams, and more. These sources expose data in different formats such as JSON, CSV, XML, logs, or binary records.
Regardless of the source or original format, that data must be structured before it can be used. Whether for reporting, analytics, machine learning, or other uses, the first step is to impose a schema that organizes the data into fields and records.
This is why tables are such a powerful and universal abstraction. They provide a consistent way to represent datasets across all domains, even when the original source was unstructured or semi-structured.
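As a rough illustration, the sketch below normalizes a JSON payload and a CSV extract into one shared tabular schema using pandas; the column names and the `to_table` helper are hypothetical:

```python
import io
import json

import pandas as pd

# The same logical dataset arriving in two different source formats.
json_payload = '[{"id": 1, "name": "Ada", "signup": "2025-01-05"}]'
csv_payload = "id,name,signup\n2,Grace,2025-02-11\n"


def to_table(df: pd.DataFrame) -> pd.DataFrame:
    """Impose one shared schema: select, order, and type the declared columns."""
    out = df[["id", "name", "signup"]].copy()
    out["id"] = out["id"].astype("int64")
    out["name"] = out["name"].astype("string")
    out["signup"] = pd.to_datetime(out["signup"])
    return out


users_from_json = to_table(pd.DataFrame(json.loads(json_payload)))
users_from_csv = to_table(pd.read_csv(io.StringIO(csv_payload)))

# Both slices now share one schema and can be published as a single table.
users = pd.concat([users_from_json, users_from_csv], ignore_index=True)
```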
In the context of Pub/Sub for Tables, a “table” is a generic concept. It refers to any structured, versioned representation of data, regardless of its origin, enabling a consistent, interoperable layer for enterprise data integration.
Traditional pub/sub systems are built around event propagation. A producer emits an event to signal that something happened — a transaction was recorded, a file was uploaded, or a state changed. Consumers listen for those events and decide what to do with them. The message itself is usually small and stateless, and the value comes from triggering downstream processes or workflows.
This model works well when the goal is to react to a change. But it does not solve the problem of delivering complete, trustworthy data to the people and systems that need it.
Pub/Sub for Tables shifts the intent from event propagation to data propagation. It is not just about sending a signal that data has changed. It is about delivering the actual dataset, in full, with structure, semantics, and guarantees.
This change brings clarity to the roles involved:

- Producers are responsible for publishing complete, well-defined tables, with schema, semantics, and metadata.
- Consumers subscribe to those tables and use them directly, without reconstructing state from streams of fragments.
This shift also clarifies what makes integration scalable. In traditional ETL, data consumers depend on custom pipelines to ingest and transform upstream data. Those pipelines are tightly coupled, hard to govern, and difficult to scale. In message-based pub/sub, the burden of state reconstruction falls on each consumer, creating duplication of logic and inconsistency across teams.
By contrast, Pub/Sub for Tables gives consumers access to ready-to-use data assets. Producers do not need to know how each consumer will use the data. Consumers do not need to know how the data was generated. Both sides operate independently, through a shared interface.
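A toy in-memory broker makes that shared interface tangible. This is a conceptual sketch, not Tabsdata’s implementation; `TableBroker`, `publish`, and `subscribe` are names invented for illustration:

```python
import pandas as pd


class TableBroker:
    """Toy in-memory broker where tables, not messages, are the unit of exchange."""

    def __init__(self) -> None:
        self._versions: dict[str, list[pd.DataFrame]] = {}

    def publish(self, name: str, table: pd.DataFrame) -> int:
        """Producer side: publish a complete table; returns the new version number."""
        history = self._versions.setdefault(name, [])
        history.append(table.copy())
        return len(history)  # versions are 1-based

    def subscribe(self, name: str, version: int | None = None) -> pd.DataFrame:
        """Consumer side: fetch a full, ready-to-query dataset (latest by default)."""
        history = self._versions[name]
        index = -1 if version is None else version - 1
        return history[index].copy()


broker = TableBroker()
broker.publish("orders", pd.DataFrame({"order_id": [1], "amount": [99.5]}))
broker.publish("orders", pd.DataFrame({"order_id": [1, 2], "amount": [99.5, 42.0]}))

latest = broker.subscribe("orders")               # the complete current dataset
snapshot = broker.subscribe("orders", version=1)  # a reproducible earlier version
```

Note that the producer never learns who subscribes, and the consumer never learns how the table was built; the table name and its version history are the entire contract.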
This approach turns data propagation into a governed, self-service capability. It removes ambiguity, reduces integration overhead, and helps data teams move faster with confidence.
The most powerful idea behind Pub/Sub for Tables is decoupling — not just at the technical level, but also at the organizational and operational level. This model separates the concerns of data producers and data consumers, allowing each to operate independently without relying on tightly managed pipelines or shared orchestration layers.
In traditional data integration, data engineers must build and maintain connection logic tailored to each source and destination system, while also accounting for every consumer’s requirements. These custom pipelines are fragile, hard to govern, and expensive to operate. Any change on one side risks breaking something on the other, and can even affect unrelated workloads through the transitive dependencies between pipelines and the systems they integrate.
Pub/Sub for Tables breaks that dependency.
Producers are responsible for publishing well-defined tables, with schema, semantics, and metadata. They do not need to know who will consume the data or how it will be used. Consumers can subscribe to those tables through a stable interface and begin using the data immediately, without needing to ask for a custom pipeline or transformation job.
This decoupling delivers several benefits:

- Producers and consumers evolve independently, without coordinating releases or sharing orchestration.
- Changes on one side no longer ripple through custom pipelines into unrelated workloads.
- Governance improves, because data is shared through well-defined, versioned assets.
- Integration overhead drops, since no bespoke pipeline or transformation job is needed per consumer.
Most importantly, it simplifies the mental model of integration. Instead of managing pipeline complexity, teams can focus on publishing and consuming data assets. That shift is what makes Pub/Sub for Tables a true platform-level solution for data integration.
Although Pub/Sub for Tables builds on the foundational idea of publish-subscribe, it introduces a fundamentally different approach to integration. The differences lie in what gets published, how consumers interact with it, and the outcomes it enables.
Here’s how Pub/Sub for Tables (a table-based system) compares to traditional pub/sub systems (message-based systems):

| | Message-based pub/sub | Pub/Sub for Tables |
| --- | --- | --- |
| Unit of exchange | Individual messages or events | Complete, versioned tables |
| Intent | Event propagation: signal that something changed | Data propagation: deliver the dataset itself |
| Consumer responsibility | Collect, process, and reconstruct state from fragments | Query a ready-to-use, governed asset |
| Contract | Implicit; meaning is inferred by each consumer | Explicit schema, semantics, and versioning |
| Governance | Scattered across consumer pipelines | Built into the published asset, with lineage |
In message-based systems, each event is an isolated signal. Consumers must process many such messages in sequence to rebuild the full picture. This creates complexity and often leads to inconsistency across consumers.
Pub/Sub for Tables changes that dynamic. It delivers complete datasets, with structure and lineage, that reflect a specific version or point in time. Consumers do not need to infer meaning or stitch data together. They receive high-quality, trustworthy assets that are immediately usable.
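The difference is easy to see in code. In the hypothetical sketch below, the message-based consumer must fold an event stream into state, while the table-based consumer simply receives the finished dataset:

```python
import pandas as pd

# Message-based pub/sub: every consumer replays events to rebuild state.
events = [
    {"type": "order_created", "order_id": 1, "amount": 99.5},
    {"type": "order_created", "order_id": 2, "amount": 42.0},
    {"type": "order_cancelled", "order_id": 1},
]

state: dict[int, float] = {}
for event in events:  # each consumer re-implements this fold, in its own way
    if event["type"] == "order_created":
        state[event["order_id"]] = event["amount"]
    elif event["type"] == "order_cancelled":
        state.pop(event["order_id"], None)

rebuilt = pd.DataFrame({"order_id": list(state), "amount": list(state.values())})

# Table-based pub/sub: the producer publishes the finished dataset once.
published = pd.DataFrame({"order_id": [2], "amount": [42.0]})

# Both paths end at the same data, but only one had to be re-derived,
# correctly and consistently, by every single consumer.
assert rebuilt.equals(published)
```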
This shift enables more predictable behavior, better governance, and a simpler, more sustainable integration model. It reduces duplication of logic, improves auditability, and brings clarity to how data flows through the enterprise.
Data integration has become one of the most costly and complex functions inside modern data platforms. Teams are spending more time managing pipelines, debugging transformations, and navigating handoffs than actually delivering value from data. As data volumes grow and business requirements evolve, the old models are breaking down.
Pub/Sub for Tables introduces a way to simplify integration while improving control, scalability, and trust.
This model aligns directly with several critical shifts happening across enterprise data:

- The adoption of data contracts as explicit agreements between producers and consumers
- The move toward data products: governed, versioned assets built for consumption
- The demand for self-service access that does not require a custom pipeline per consumer
- The push for stronger governance, auditability, and lineage across the platform
All of this is possible because Pub/Sub for Tables changes the shape of integration. It shifts the responsibility from orchestrating movement to publishing and subscribing to assets, with structure and guarantees built in.
Pub/Sub for Tables redefines the publish-subscribe model by making tables the fundamental unit of publication and subscription, shifting the focus from event propagation to data propagation.
This simple shift unlocks a powerful new model for data integration — one that replaces brittle pipelines and complex orchestration with stable contracts, governed assets, and self-service access. By elevating tables as the shared interface between producers and consumers, organizations can simplify integration, scale delivery, and build trust in their data.
As enterprises continue to adopt data contracts, data products, and more rigorous governance practices, Pub/Sub for Tables provides the architectural foundation needed to support that shift.
Tune in to our ongoing mini-series, How Pub/Sub for Tables Fixes What Data Pipelines Broke. In the coming weeks we will take a closer look at Data Contracts and Data Products, and at how this model can serve as the backbone of a truly modern data platform.