Tale of Software Architect(ure): Part 10 (Pipe-Filter Architecture)



Story:

Once upon a time, in a village, there was a great river called the River of Knowledge. This river flowed from the highest mountains, carrying messages, stories, and wisdom to the people of the land. But the river was wild and full of all sorts of information, some useful, some unnecessary, and some even misleading.

The wisest man of the village suggested a way to filter and organize the river’s knowledge, so that only the most important and relevant information reached the people. To solve this problem, the villagers decided to hire a group of clever engineers who were known for their ability to transform chaos into order. Each engineer performed a separate filtering step on the knowledge, and together their combined work produced the expected result.


Pipe Filter Architecture:

[Figure: Pipe and Filter architecture diagram (source: modernescpp)]

The Pipe and Filter architecture, also known as the Pipeline architecture, is a design pattern in which a task is divided into several sequential processing steps, each of which is called a filter. These filters process data and pass their output to the next step via a pipe. Each filter operates independently, transforming the input it receives from the previous stage before sending it along the pipe; a minimal code sketch of these pieces follows the component list below.

Key Components:

  1. Filter: A filter performs a specific operation, transforming input data into output. Filters are independent units and do not rely on each other’s internal processes.
  2. Pipe: A pipe transfers data from one filter to the next. It defines the flow of data through the system.
  3. Data stream: Data moves through pipes from one filter to another, allowing continuous flow and transformation.
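
As a minimal, framework-free sketch (the numbers and filter names below are illustrative, not part of any specific system), these three components map naturally onto Python generator functions: each filter is a generator, a pipe is simply the act of passing one generator into the next, and the data stream is the lazy iteration that pulls items through the chain.

# Minimal pipe-and-filter sketch using Python generators
def source(numbers):
    # Source filter: emits raw items into the pipeline
    for n in numbers:
        yield n

def square(stream):
    # Transformation filter: squares each item it receives
    for n in stream:
        yield n * n

def keep_even(stream):
    # Filtering step: passes along only even values
    for n in stream:
        if n % 2 == 0:
            yield n

# Pipes: the output of one filter becomes the input of the next
pipeline = keep_even(square(source(range(1, 11))))

# Data stream: items flow lazily through the pipes as we iterate
print(list(pipeline))  # [4, 16, 36, 64, 100]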


Context:

In the real world, some systems process a stream of data in a series of sequential or parallel stages, where each stage performs a specific transformation or operation on the data. We need a suitable architecture for these types of applications, where:

  • Data flows continuously or in chunks (e.g. logs, streams, files).
  • The process can be broken down into smaller, independent steps.
  • Flexibility and reusability of these independent components are important.
  • Scalability is needed, especially if the data flow needs to be handled by distributed or parallel systems.


Problem:

The problem arises when you need to process large or continuous streams of data in multiple stages. Without a structured architecture, such systems can become tightly coupled, making them hard to maintain, extend, or scale. Some of the key challenges include:

  • Tight Coupling: When the data processing stages are not clearly separated, it becomes difficult to modify or maintain individual steps.
  • Low Reusability: Without clear separation, individual processing steps (like filtering, sorting, etc.) cannot easily be reused in other contexts.
  • Limited Scalability: In monolithic data processing workflows, it's hard to scale the system for larger datasets or parallelize stages for better performance.
  • Maintenance Complexity: A system where steps are interdependent and intertwined is difficult to update, extend, or debug.


Solution:

The Pipe and Filter architecture addresses these issues by breaking the data processing task into a series of independent, well-defined filters, each of which performs a single transformation on the data. These filters are connected by pipes that allow data to flow from one filter to the next.

This structure solves the problem in several ways:

  • Modular Design: Each filter is an independent unit, making the system easier to maintain. If a filter needs to be modified, replaced, or removed, other filters remain unaffected.
  • Reusability: Since filters are designed to be independent, they can be reused in different pipelines or systems with minimal adjustments. For example, a filter designed to sort data can be used in various pipelines, regardless of the specific data being processed.
  • Scalability: Filters can be run in parallel, allowing the system to process large datasets more efficiently. For example, multiple filters can be executed concurrently in a distributed environment to increase throughput (see the sketch after this list).
  • Flexibility: New filters can be easily added, and the sequence of filters can be changed or rearranged to create new pipelines without affecting existing components.
  • Ease of Maintenance and Testing: Since each filter is a self-contained unit with well-defined input and output, it becomes easier to isolate issues during debugging. Filters can also be individually tested.
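
As one hedged illustration of the scalability point (the filter parse_line and the pool size are assumptions for this sketch, not a prescribed design), a stateless filter can be fanned out over a process pool while the rest of the pipeline stays unchanged:

from multiprocessing import Pool

def parse_line(line):
    # Pure, stateless transformation: safe to run in parallel worker processes
    return line.strip().lower()

def parallel_parse(lines):
    # Run this transformation stage across a pool of workers while
    # preserving the order of the stream for downstream filters
    with Pool(processes=4) as pool:
        for result in pool.imap(parse_line, lines, chunksize=64):
            yield result

if __name__ == "__main__":
    stream = ("LINE {}  ".format(i) for i in range(10))
    print(list(parallel_parse(stream)))  # ['line 0', 'line 1', ..., 'line 9']

The trade-off is that an order-preserving parallel stage adds coordination overhead, so this usually pays off only for CPU-heavy filters.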


Example Solution:

Let’s say you are building a log processing system that needs to:

  1. Read log files.
  2. Extract lines containing error messages.
  3. Sort them by timestamp.
  4. Remove duplicates.
  5. Output the cleaned log.

Using the Pipe and Filter architecture:

  • You would create independent filters for each of these tasks: one for reading logs, one for extracting errors, one for sorting, one for removing duplicates, and one for writing the cleaned output.
  • These filters would be connected by pipes to pass the data from one stage to the next.
  • Each filter can be independently developed, tested, and reused in different log processing pipelines if necessary.


Example Code (Python):

# Pipe and Filter Log Processing System (Python generators play the role of pipes)

# Step 1: Define individual filters (each performing a specific operation)

# Filter to read log data from a file
def read_log(file):
    for line in file:
        yield line.rstrip("\n")  # each yielded line flows down the pipe as a lazy stream

# Filter to extract only error lines
def filter_errors(lines):
    for line in lines:
        if "error" in line.lower():  # case-insensitive match; adjust to your log format
            yield line

# Filter to sort lines by timestamp
# (assumes each line starts with a sortable timestamp, e.g. ISO-8601, so plain
#  lexicographic sorting works; this filter must buffer the whole stream first)
def sort_by_timestamp(lines):
    for line in sorted(lines):
        yield line

# Filter to remove duplicate lines
def remove_duplicates(lines):
    seen = set()  # keep track of lines already emitted
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line

# Sink filter to write the cleaned data to the output file
def write_output(lines, output_file):
    for line in lines:
        output_file.write(line + "\n")

# Step 2: Create the pipeline

# Main function to set up the pipe and filter architecture
def process_logs(input_file, output_file):
    # Pipeline connecting filters via pipes (each generator feeds the next)
    lines = read_log(input_file)
    errors = filter_errors(lines)
    sorted_errors = sort_by_timestamp(errors)
    unique_errors = remove_duplicates(sorted_errors)

    # Write final output to the output file
    write_output(unique_errors, output_file)

# Step 3: Use the system

if __name__ == "__main__":
    # Open the input and output files, run the pipeline, and close them automatically
    with open("logs.txt", "r") as input_file, open("error_logs.txt", "w") as output_file:
        process_logs(input_file, output_file)
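
Because each filter is a self-contained generator with a plain iterable in and a plain iterable out, it can be unit-tested in isolation by feeding it an in-memory list instead of a real file; the checks below use made-up sample lines purely for illustration. Note also that sort_by_timestamp is the one stage that has to buffer the entire stream before emitting anything, a common trade-off for sorting filters.

# Illustrative unit tests for individual filters, using in-memory lists
# as the data stream (no files or other filters involved)
def test_filter_errors():
    stream = ["2024-01-01 ok", "2024-01-01 error: boom"]
    assert list(filter_errors(stream)) == ["2024-01-01 error: boom"]

def test_remove_duplicates():
    stream = ["a", "b", "a", "c", "b"]
    assert list(remove_duplicates(stream)) == ["a", "b", "c"]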

Summary:

The Pipe and Filter architecture is very well suited to stage-by-stage data processing applications, because we can easily create independent filters and join them with pipes. Functional programming is also a very handy choice for this style of architecture, since filters behave like composable functions over a data stream; a small sketch of that idea follows.
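
As a minimal illustration of that functional flavour (the helper build_pipeline and the tiny filters below are assumptions for this sketch, not part of the log example above), the whole pipeline can be expressed as a fold over an ordered list of filter functions, so adding, removing, or reordering a stage is a one-line change:

from functools import reduce

# Illustrative filters: each one is a small generator, independent of the others
def strip_blank(lines):
    # Drop empty lines from the stream
    for line in lines:
        if line.strip():
            yield line

def to_upper(lines):
    # Normalize every line to upper case
    for line in lines:
        yield line.upper()

def build_pipeline(source, filters):
    # Connect the filters with pipes: each filter consumes the previous output
    return reduce(lambda stream, f: f(stream), filters, source)

raw = ["info: started", "", "error: disk full"]
stages = [strip_blank, to_upper]  # reorder or extend this list to change the pipeline
print(list(build_pipeline(iter(raw), stages)))  # ['INFO: STARTED', 'ERROR: DISK FULL']

Because the composition is driven by plain data (the stages list), the same build_pipeline helper and the same filters can be reused in completely different pipelines.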

This architecture brings modularity, reusability, scalability and flexibility out of the box.


Previous Parts:

Part 1: Tale of software architect(ure): Part 1 (Software Architecture and Software Design)

Part 2: Tale of software architect(ure): Part 2 (Role of Software Architect and Knowledge To Have)

Part 3: Tale of Software Architect(ure): Part 3 (Characteristics of Software Architecture)

Part 4: Tale of Software Architect(ure): Part 4 (Things Should Consider When Design/Architect a Software System)

Part 5: Tale of Software Architect(ure): Part 5 (Wrong Assumption in Software Architecture and Fallacies of Distributed Computing)

Part 6: Tale of Software Architect(ure): Part 6 (Framework for System Design Interview)

Part 7: Tale of Software Architect(ure): Part 7 (Well Known Software Architectures Styles)

Part 8: Tale of Software Architect(ure): Part 8 (Architecture Patterns and Layered Architecture)

Part 9: Tale of Software Architect(ure): Part 9 (MVC Architecture Pattern)
