Tale of Software Architect(ure): Part 10 (Pipe-Filter Architecture)
Saiful Islam Rasel
Senior Engineer, SDE @ bKash | Ex: AsthaIT | Sports Programmer | Problem Solver | FinTech | Microservice | Java | Spring-boot | C# | .NET | PostgreSQL | DynamoDB | JavaScript | TypeScript | React.js | Next.js | Angular
Story:
Once upon a time, in a village, there was a great river called the River of Knowledge. This river flowed from the highest mountains, carrying messages, stories, and wisdom to the people of the land. But the river was wild and full of all sorts of information, some useful, some unnecessary, and some even misleading.
The wisest man of the village suggested finding a way to filter and organize the river’s knowledge, so that only the most important and relevant information reached the people. To solve this problem, the villagers decided to hire a group of clever engineers who were known for their ability to transform chaos into order. Each engineer performed a separate filtering step on the knowledge, and together they produced the expected result.
Pipe Filter Architecture:
The Pipe and Filter architecture, also known as Pipeline architecture, is a design pattern where a task is divided into several sequential processing steps, each of which is called a filter. Each filter processes its input and passes the output to the next step via a pipe. Filters operate independently: a filter only transforms the data it receives from the previous stage and sends the result along the pipe.
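As a minimal sketch of this idea, assume that Python generators stand in for filters and that ordinary iteration stands in for the pipe. The filter names (`strip_whitespace`, `drop_blank`, `to_upper`) and the sample data are made up for illustration.

```python
from typing import Iterable, Iterator


def strip_whitespace(items: Iterable[str]) -> Iterator[str]:
    """Filter 1: trim leading and trailing whitespace from each item."""
    for item in items:
        yield item.strip()


def drop_blank(items: Iterable[str]) -> Iterator[str]:
    """Filter 2: discard items that are empty."""
    for item in items:
        if item:
            yield item


def to_upper(items: Iterable[str]) -> Iterator[str]:
    """Filter 3: convert each item to upper case."""
    for item in items:
        yield item.upper()


# The "pipe" is simply the iterator handed from one filter to the next.
source = ["  river ", "", " knowledge", "   "]
result = list(to_upper(drop_blank(strip_whitespace(source))))
print(result)  # ['RIVER', 'KNOWLEDGE']
```

None of the filters knows which filter comes before or after it; each one only relies on receiving an iterable of strings.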
Key Components:
Context:
In the real world, some systems process a stream of data in a series of sequential or parallel stages. Each stage performs a specific transformation or operation on the data. So we need a suitable architecture for these types of applications where:
Problem:
The problem arises when you need to process large or continuous streams of data in multiple stages. Without a structured architecture, such systems can become tightly coupled, making them hard to maintain, extend, or scale. Some of the key challenges include:
Solution:
The Pipe and Filter architecture addresses these issues by breaking the data processing task into a series of independent, well-defined filters, each of which performs a single transformation on the data. These filters are connected by pipes that allow data to flow from one filter to the next.
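The pipes can also be made explicit. The sketch below is one possible arrangement, not a prescribed implementation: it assumes each filter runs in its own thread and that `queue.Queue` objects play the role of pipes. The helper names (`run_filter`, `run_pipeline`, `keep_even`, `square`) and the `_DONE` end-of-stream marker are hypothetical.

```python
import queue
import threading
from typing import Callable, List, Optional

_DONE = object()  # sentinel marking the end of the stream

Transform = Callable[[int], Optional[int]]


def run_filter(transform: Transform, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Read items from inbox, apply transform, and push results to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # pass the end-of-stream marker downstream
            return
        result = transform(item)
        if result is not None:  # None means "drop this item"
            outbox.put(result)


def run_pipeline(data: List[int], transforms: List[Transform]) -> List[int]:
    """Start one thread per filter and connect them with queues (the pipes)."""
    pipes = [queue.Queue() for _ in range(len(transforms) + 1)]
    workers = [
        threading.Thread(target=run_filter, args=(t, pipes[i], pipes[i + 1]))
        for i, t in enumerate(transforms)
    ]
    for worker in workers:
        worker.start()

    # Feed the source data into the first pipe, then signal the end.
    for item in data:
        pipes[0].put(item)
    pipes[0].put(_DONE)

    # Drain the last pipe until the end-of-stream marker arrives.
    output = []
    while True:
        item = pipes[-1].get()
        if item is _DONE:
            break
        output.append(item)
    for worker in workers:
        worker.join()
    return output


def keep_even(n: int) -> Optional[int]:
    return n if n % 2 == 0 else None


def square(n: int) -> Optional[int]:
    return n * n


squares_of_evens = run_pipeline([1, 2, 3, 4, 5, 6], [keep_even, square])
print(squares_of_evens)  # [4, 16, 36]
```

Because every filter only touches its own inbox and outbox, a stage can be replaced or a new one added by changing the `transforms` list, without editing the other filters.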
This structure solves the problem in several ways:
Example Solution:
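Let’s say you are building a log processing system that needs to read log data from a file, extract only the error lines, sort them by timestamp, remove duplicate lines, and write the result to an output file.
Using the Pipe and Filter architecture, each of these steps becomes a separate filter, and the output of one filter is piped into the next.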
Code (Python):
# Pipe and Filter Log Processing System

# Step 1: Define individual filters (each performing a specific operation)

# Filter to read log data from a file
def read_log(file):
    for line in file:
        # "yield" streams one line at a time through the pipe
        yield line.rstrip("\n")

# Filter to extract only error lines (case-sensitive match on "error")
def filter_errors(lines):
    for line in lines:
        if "error" in line:
            yield line

# Filter to sort lines based on timestamp
# Assumption: every line begins with a sortable timestamp such as
# 2024-01-01T10:00:00, so the first space-separated token is the sort key.
def sort_by_timestamp(lines):
    # Sorting has to consume the whole stream before it can emit anything
    sorted_lines = sorted(lines, key=lambda line: line.split(" ", 1)[0])
    for line in sorted_lines:
        yield line

# Filter to remove duplicate lines
def remove_duplicates(lines):
    seen = set()  # Keep track of lines already emitted
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line

# Filter to write data to output file
def write_output(lines, output_file):
    for line in lines:
        output_file.write(line + "\n")

# Step 2: Create the pipeline

# Main function to set up the pipe and filter architecture
def process_logs(input_file, output_file):
    # Pipeline connecting filters via pipes
    lines = read_log(input_file)
    errors = filter_errors(lines)
    sorted_errors = sort_by_timestamp(errors)
    unique_errors = remove_duplicates(sorted_errors)
    # Write final output to the output file
    write_output(unique_errors, output_file)

# Step 3: Use the system

# Open input and output files; "with" closes them even if an error occurs
with open("logs.txt", "r") as input_file, open("error_logs.txt", "w") as output_file:
    # Run the log processing pipeline
    process_logs(input_file, output_file)
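Since every filter takes a sequence of lines and produces a sequence of lines, each one can be tried out on its own with an in-memory list instead of a file. The sketch below restates `filter_errors` and `remove_duplicates` from the example in Python; the sample log lines are invented for illustration.

```python
from typing import Iterable, Iterator


def filter_errors(lines: Iterable[str]) -> Iterator[str]:
    """Keep only lines containing the substring "error"."""
    for line in lines:
        if "error" in line:
            yield line


def remove_duplicates(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct line once, preserving first-seen order."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


# Hypothetical sample data: an in-memory list replaces the log file.
sample = [
    "2024-01-01T10:00:00 info service started",
    "2024-01-01T10:00:05 error disk full",
    "2024-01-01T10:00:05 error disk full",
    "2024-01-01T10:00:09 error timeout",
]

# Each filter can be checked in isolation ...
assert list(filter_errors(sample)) == sample[1:]
assert list(remove_duplicates(["a", "b", "a"])) == ["a", "b"]

# ... and then recombined into a shorter pipeline than the full example.
unique_errors = list(remove_duplicates(filter_errors(sample)))
print(unique_errors)
```

The same two filters are reused here without the sorting or file-handling stages, which is the kind of recombination the architecture is meant to allow.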
Summary:
The Pipe and Filter architecture is well suited to applications that process data stage by stage, because independent filters are easy to create and to join with pipes. Functional programming is also a handy fit for this style of architecture.
This architecture brings modularity, reusability, scalability and flexibility out of the box.
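To illustrate the functional programming angle, here is a small sketch in which a pipeline is just function composition. It assumes each filter is a function from an iterable to an iterator; the `compose` helper and the three numeric filters are hypothetical names chosen for this example.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

Filter = Callable[[Iterable[int]], Iterator[int]]


def compose(*filters: Filter) -> Filter:
    """Build one filter that applies the given filters left to right."""
    def pipeline(items: Iterable[int]) -> Iterator[int]:
        # Feed the stream through each filter in turn.
        return reduce(lambda stream, f: f(stream), filters, iter(items))
    return pipeline


def only_positive(numbers: Iterable[int]) -> Iterator[int]:
    return (n for n in numbers if n > 0)


def doubled(numbers: Iterable[int]) -> Iterator[int]:
    return (n * 2 for n in numbers)


def running_total(numbers: Iterable[int]) -> Iterator[int]:
    total = 0
    for n in numbers:
        total += n
        yield total


process = compose(only_positive, doubled, running_total)
totals = list(process([3, -1, 4, 0, 5]))
print(totals)  # [6, 14, 24]
```

Reordering the arguments to `compose`, or dropping one of them, yields a different pipeline without touching any filter.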