What is the publish-subscribe pattern?

Mondo Technology Updated on 2024-01-28

The publish-subscribe pattern (sometimes referred to as pub/sub) is an architectural design pattern that lets publishers and subscribers communicate without addressing each other directly. In this arrangement, both sides rely on an intermediary to relay messages from publishers to subscribers. The publisher (host) sends messages (events) to channels, and subscribers join the channels they are interested in.

Compared with older design patterns such as message queues and event-driven designs, pub/sub is more versatile and extensible. The key point is that pub/sub lets messages flow between system components without those components having to know each other's identities.

The pub/sub model is often used in social networks through features such as "following". To see how it works, consider a social network for sharing recipes.

In this network, users can share their own recipes and follow other users. When sharing a recipe, a user can categorize it by topic, such as type of meal or ingredient seasonality. When one user follows another, they are subscribing to the recipes that user posts.

Subscribers or followers can choose to see all the recipes posted by the people they follow or only those that match their interests. They can also create filters to exclude certain types of recipes, such as those that contain certain ingredients.

Users can follow as many other users as they like, so their timeline will be filled with recipes from a variety of sources. However, each recipe is published only once, by its original author.
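
The recipe network described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Recipe` and `Follower` classes, their fields, and the sample users are hypothetical names made up for this example, not part of any real service.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Recipe:
    author: str
    title: str
    topics: FrozenSet[str]
    ingredients: FrozenSet[str]


@dataclass
class Follower:
    name: str
    following: Set[str] = field(default_factory=set)
    interests: Set[str] = field(default_factory=set)  # empty set: accept every topic
    excluded_ingredients: Set[str] = field(default_factory=set)
    timeline: List[Recipe] = field(default_factory=list)

    def wants(self, recipe: Recipe) -> bool:
        """Apply the follower's own subscription and filters to one recipe."""
        if recipe.author not in self.following:
            return False
        if self.interests and not (self.interests & recipe.topics):
            return False
        if self.excluded_ingredients & recipe.ingredients:
            return False
        return True


def publish(recipe: Recipe, followers: List[Follower]) -> None:
    """The author publishes once; every matching follower receives the recipe."""
    for follower in followers:
        if follower.wants(recipe):
            follower.timeline.append(recipe)


ana = Follower("ana", following={"chef_bo"})
raj = Follower("raj", following={"chef_bo"}, interests={"dessert"})
kim = Follower("kim", following={"chef_bo"}, excluded_ingredients={"peanuts"})

satay = Recipe("chef_bo", "Peanut satay",
               frozenset({"dinner"}), frozenset({"chicken", "peanuts"}))
publish(satay, [ana, raj, kim])
```

Here only `ana` ends up with the recipe on her timeline: `raj` follows the author but is interested only in desserts, and `kim` has excluded peanuts. The author still published the recipe a single time.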

The event bus is the component responsible for routing each message to the appropriate subscribers. It does this by keeping track of the topics that each subscriber has subscribed to.

Publishers decide which topics their messages belong to, and the event bus filters messages by topic before delivering them to the relevant subscribers. For example, if a publisher sends a message for Topic A, the message will be sent to all subscribers who have subscribed to Topic A. Similarly, the message for Topic B will be given to the subscribers of Topic B.

It's important to note that if a publisher sends a message with an incorrect topic, the message will be delivered only to the subscribers of that wrong topic. Suppose Subscriber 2 and Subscriber 3 are the subscribers of Topic B: if a publisher sends a message meant for Topic A but mistakenly marks it as Topic B, only Subscriber 2 and Subscriber 3 will receive it.
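
To make the routing rule concrete, here is a minimal in-memory sketch of a topic-based event bus. The `EventBus` class and the topic and subscriber names are assumptions chosen for illustration; a production broker would add persistence, networking, and error handling.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[str], None]


class EventBus:
    """Hypothetical in-memory event bus that routes messages by topic."""

    def __init__(self) -> None:
        # topic name -> handlers subscribed to that topic
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message to every subscriber of the topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = EventBus()
inbox: Dict[str, List[str]] = {"subscriber_1": [], "subscriber_2": [], "subscriber_3": []}

bus.subscribe("topic_a", inbox["subscriber_1"].append)
bus.subscribe("topic_b", inbox["subscriber_2"].append)
bus.subscribe("topic_b", inbox["subscriber_3"].append)

bus.publish("topic_a", "hello A")
# A message meant for topic_a but mislabeled as topic_b
# reaches only the topic_b subscribers.
bus.publish("topic_b", "meant for A")
```

The bus never inspects the message body, so it cannot notice the mislabeled message. Choosing the right topic is entirely the publisher's responsibility.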

Software design patterns are built on reusable arrangements of building blocks and the connections between them. In UML design diagrams, these building blocks are typically classes or objects. Modern architectural patterns, on the other hand, treat them as larger, independently running processes spread across a distributed system.

To fully understand the benefits of the pub/sub design, you must first understand the basic pattern that underlies an information system and then trace its evolution into a distributed system. Information systems typically consist of a common set of software modules organized in a simple sequential structure: input, processing, and output.

Picture a simple piece of software made up of three parts. The input module receives user input, converts it into a message, and sends it to the processing module. The processing module processes the data and sends the result as a new message to the output module. The output module displays the data on the user's screen.

The real world, however, is rarely so simple. To handle a realistic number of concurrent requests, the system needs multiple input and output modules.

At this scale, the system faces the difficulty of routing messages from the input module to the appropriate output module. The input and output modules will require an addressing mechanism to address this challenge. The processing module will process the message based on the address and route it to the relevant recipient. To solve the routing problem, all three modules work together.
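
The addressed input, processing, and output arrangement can be sketched as follows. All class names, the `recipient` field, and the upper-casing used as stand-in "processing" are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Message:
    recipient: str  # address of the output module that should show the message
    payload: str


class OutputModule:
    def __init__(self, address: str) -> None:
        self.address = address
        self.displayed: List[str] = []

    def display(self, payload: str) -> None:
        # Stand-in for drawing on a user's screen.
        self.displayed.append(payload)


class ProcessingModule:
    def __init__(self) -> None:
        self._outputs: Dict[str, OutputModule] = {}

    def register(self, output: OutputModule) -> None:
        self._outputs[output.address] = output

    def handle(self, message: Message) -> bool:
        """Process the message, then route it by address. Returns False for an unknown address."""
        output = self._outputs.get(message.recipient)
        if output is None:
            return False
        output.display(message.payload.upper())  # placeholder "processing"
        return True


class InputModule:
    def __init__(self, processor: ProcessingModule) -> None:
        self._processor = processor

    def receive(self, recipient: str, raw_text: str) -> bool:
        # Convert raw user input into an addressed message.
        return self._processor.handle(Message(recipient, raw_text.strip()))


processor = ProcessingModule()
screen_alice = OutputModule("alice")
screen_bob = OutputModule("bob")
processor.register(screen_alice)
processor.register(screen_bob)

keyboard_1 = InputModule(processor)
keyboard_2 = InputModule(processor)
keyboard_1.receive("bob", "  hi bob ")
keyboard_2.receive("alice", "hi alice")
```

Note how every sender has to know the recipient's address up front, and the single processing module has to know every output module. That is the coupling the rest of the article sets out to remove.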

At internet scale, the system must manage thousands of concurrent connections, with users all over the world sending and receiving messages.

However, the modules as designed cannot operate at such a large scale, for two reasons.

First, the load is too large for a single processing module to handle. Because of the high volume and the geographical spread of users, the load must be distributed across multiple processing modules.

Second, the set of inputs and outputs changes constantly at this scale, so predefined addressing between modules adds a lot of overhead.

The first problem can be solved by using multiple processing modules, which amounts to scaling the system horizontally. However, this makes routing more complex: the input module must now route each message to the appropriate processing module.
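
One common way for an input module to choose among several processing modules is to hash a routing key. The article does not say which strategy it has in mind, so the hash-based choice below, along with the `pick_processor` and `route` helpers, is an assumption used purely as a sketch.

```python
import zlib
from typing import List


class ProcessingModule:
    def __init__(self, name: str) -> None:
        self.name = name
        self.handled: List[str] = []

    def handle(self, message: str) -> None:
        self.handled.append(message)


def pick_processor(key: str, processors: List[ProcessingModule]) -> ProcessingModule:
    """Map a routing key to one processing module, the same one every time."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(key.encode("utf-8")) % len(processors)
    return processors[index]


def route(key: str, message: str, processors: List[ProcessingModule]) -> ProcessingModule:
    """What the input module now has to do before it can hand a message over."""
    processor = pick_processor(key, processors)
    processor.handle(message)
    return processor


processors = [ProcessingModule(f"processor-{i}") for i in range(3)]
first = route("user-42", "login", processors)
second = route("user-42", "upload", processors)
```

Both messages for `user-42` land on the same processing module, but the input module now carries routing logic and must know the full list of processors. That extra knowledge is the added complexity the paragraph above refers to.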

At internet scale, attaching module-specific routing metadata to every message becomes a bottleneck. Under these conditions, the way messages travel from one module to the next needs a radical redesign. The pub/sub pattern provides that redesign, and it brings the following advantages.

Low coupling on the publisher side: The publisher does not need to know how many subscribers there are, who they are, or which message types they are interested in. It simply publishes data through its API when the relevant event occurs. This gives flexibility and scalability, because new subscribers can be added to the system without affecting publishers.

Reduced cognitive load for subscribers: Subscribers don't need to concern themselves with the publisher's inner workings or have access to its source code. They interact with publishers only through the publishers' public APIs, which makes the system easier to understand.

Separation of concerns: The simplicity of the pub/sub pattern, in which data flows one way from publisher to subscriber, lets developers practice fine-grained separation of concerns. Different message types can be divided into different topics, each serving one simple purpose. For example, data on the topic "cats" contains information about cats, while data on the topic "dogs" contains information about dogs.

Improved testability: Fine-grained control over topics makes it easy to verify that each event bus is delivering the expected messages.

Improved security: The pub/sub architecture is well suited to the security principle of least privilege. Developers can easily create modules that subscribe only to the minimum set of message types they need in order to run.
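
As a rough sketch of least privilege on a bus, the example below refuses subscriptions to topics a module has not been granted. The `RestrictedBus` class, the grants table, and the module and topic names are hypothetical; real brokers express this through their own access-control configuration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Set

Handler = Callable[[str], None]


class RestrictedBus:
    """Hypothetical bus that only lets a module subscribe to topics it was granted."""

    def __init__(self, grants: Dict[str, Set[str]]) -> None:
        self._grants = grants  # module name -> topics it may read
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, module: str, topic: str, handler: Handler) -> None:
        if topic not in self._grants.get(module, set()):
            raise PermissionError(f"{module} may not subscribe to {topic}")
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        for handler in self._handlers.get(topic, []):
            handler(message)


# The invoice mailer needs order events and nothing else.
bus = RestrictedBus({"invoice_mailer": {"orders"}})
received: List[str] = []
bus.subscribe("invoice_mailer", "orders", received.append)

bus.publish("orders", "order #1 placed")
bus.publish("payments", "card charged")  # never reaches invoice_mailer
```

An attempt such as `bus.subscribe("invoice_mailer", "payments", handler)` raises `PermissionError`, so a module cannot quietly widen what it listens to.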

The pattern also has disadvantages.

Inflexible data sent by publishers: The publish-subscribe pattern can introduce strong semantic coupling through the messages exchanged between publishers and subscribers. Once the data structure of these messages is established, it is very difficult to modify. To change the format of a message, all subscribers must be updated to accept the new format, which can be challenging or impossible if the subscribers are external. This is the same problem that versioned APIs face.

One solution is to use a versioned message format, which lets subscribers check the format of each message they receive. However, this only works if subscribers actually read the version information and handle it correctly.
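
A versioned message format might look like the sketch below, where every message carries a `version` field and the subscriber checks it before reading anything else. The JSON envelope, the field names, and the two example versions are assumptions made for illustration.

```python
import json


def encode_v1(name: str) -> str:
    """Original format: a single name field."""
    return json.dumps({"version": 1, "name": name})


def encode_v2(first_name: str, last_name: str) -> str:
    """Newer format: the name is split into two fields."""
    return json.dumps({"version": 2, "first_name": first_name, "last_name": last_name})


def handle(raw: str) -> str:
    """Subscriber side: check the version first, then read the matching fields."""
    message = json.loads(raw)
    version = message.get("version")
    if version == 1:
        return message["name"]
    if version == 2:
        return f'{message["first_name"]} {message["last_name"]}'
    # Refuse formats this subscriber was never written for instead of guessing.
    raise ValueError(f"unsupported message version: {version!r}")


old_style = handle(encode_v1("Ada Lovelace"))
new_style = handle(encode_v2("Ada", "Lovelace"))
```

The scheme only helps if every subscriber performs this check. A subscriber that ignores `version` and reads `name` directly will still break on version 2 messages.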

Another option is to use versioned endpoints, such as "apiv0" and "apiv1", to maintain backward compatibility. The downside of this approach is that it requires developers to support multiple versions, which can be very time-consuming.
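
The versioned-endpoint approach can be sketched by publishing the same event to both "apiv0" and "apiv1", each in its own format. The formatter functions and payload fields are hypothetical, and the endpoints are modeled here as plain in-memory channels.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Payload = Dict[str, str]
Handler = Callable[[Payload], None]

# endpoint name -> handlers subscribed to that endpoint
subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)


def subscribe(endpoint: str, handler: Handler) -> None:
    subscribers[endpoint].append(handler)


def to_v0(user: Payload) -> Payload:
    return {"name": f"{user['first']} {user['last']}"}


def to_v1(user: Payload) -> Payload:
    return {"first_name": user["first"], "last_name": user["last"]}


# Every supported version needs its own formatter, kept alive side by side.
FORMATTERS: Dict[str, Callable[[Payload], Payload]] = {"apiv0": to_v0, "apiv1": to_v1}


def publish_user(user: Payload) -> None:
    for endpoint, formatter in FORMATTERS.items():
        payload = formatter(user)
        for handler in subscribers[endpoint]:
            handler(payload)


legacy_inbox: List[Payload] = []
modern_inbox: List[Payload] = []
subscribe("apiv0", legacy_inbox.append)
subscribe("apiv1", modern_inbox.append)
publish_user({"first": "Ada", "last": "Lovelace"})
```

Old subscribers keep working untouched, but each entry in `FORMATTERS` is code the publisher's developers must maintain and test until its last subscriber has migrated.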

Instability: One disadvantage of the publish-subscribe pattern is that it is difficult to determine the health of subscribers. The publisher does not have complete information about the systems listening to its messages, so failures can go unnoticed.

For example, logging systems typically use a publish-subscribe model. If the logger subscribed to the "critical" message type crashes or gets stuck in an error state, important messages may be lost. Any service that relies on those error warnings will then be unaware of the publisher's problem.

This issue is not unique to the publish-subscribe model and can occur in any client-server system. However, one advantage of publish-subscribe is that multiple instances of the logger can run simultaneously with little additional effort, which provides a high level of redundancy.

To mitigate this risk, the design can be changed, for example by requiring subscribers to acknowledge each message they receive. This gives publishers feedback on subscriber status.
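
A minimal sketch of acknowledgements combined with redundant loggers is shown below. The `AckingBus` class and the convention that a handler returns `True` to acknowledge are assumptions for this example; real brokers define their own acknowledgement protocols.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], bool]  # returns True to acknowledge receipt


class AckingBus:
    """Hypothetical bus that reports which subscribers failed to acknowledge."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Handler] = {}

    def subscribe(self, name: str, handler: Handler) -> None:
        self._subscribers[name] = handler

    def publish(self, message: str) -> List[str]:
        """Deliver to every subscriber; return the names that did not acknowledge."""
        unacknowledged = []
        for name, handler in self._subscribers.items():
            try:
                acknowledged = handler(message)
            except Exception:
                acknowledged = False  # a crash counts as a missing acknowledgement
            if not acknowledged:
                unacknowledged.append(name)
        return unacknowledged


log_lines: List[str] = []


def healthy_logger(message: str) -> bool:
    log_lines.append(message)
    return True


def crashed_logger(message: str) -> bool:
    raise RuntimeError("logger is stuck in an error state")


bus = AckingBus()
bus.subscribe("logger-1", healthy_logger)
bus.subscribe("logger-2", crashed_logger)
failed = bus.publish("critical: disk full")
```

The publisher learns that `logger-2` is unhealthy from the returned list, while the redundant `logger-1` still records the critical message.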
