Project Scheduling: An Exploration of PERT and CPM
Gantt Chart Art generated with Adobe Firefly

Project Scheduling: An Exploration of PERT and CPM

The Statement of Work (SOW) is a formal document that defines the scope of the work to be performed in a project. It outlines the project objectives, deliverables, milestones, timeline, resources, and responsibilities of all parties involved. It serves as a contract between the client and the service provider, ensuring that both parties have a clear understanding of the project requirements and expectations.

When the SOW is decomposed from high-level objectives into actionable tasks, PERT (Program Evaluation and Review Technique) and CPM (Critical Path Method), which are project management techniques, are used to schedule, organize, and manage tasks within a project. The SOW defines the scope of work and project requirements; PERT and CPM are used in planning and optimizing project schedules, identifying critical tasks, and managing project timelines. The SOW provides the foundation for PERT and CPM by defining the tasks and activities to be included in the project schedule.

From PERT and/or CPM analysis, project managers acquire valuable insights including task durations, dependencies, and the critical path. These findings serve as inputs for creating Gantt charts. Task durations and dependencies identified through PERT and CPM methodologies are transformed into visual representations within Gantt charts, often generated using Microsoft Project. Within this software, project managers can efficiently generate, customize, and manage Gantt charts to depict task sequences, durations, dependencies, milestones, and resource allocations.

This subject matter of this article is on PERT and CPM. The article delves into their origins, development, comparison, implementation in Python, and integration with Excel. The comparison between PERT and CPM encompasses their purposes, focuses, calculation methods, use of probabilities, and complexity.

The subsequent sections illustrate how to create project networks, execute forward and backward pass calculations, identify critical paths, and visualize project networks using Python code. Additionally, the article outlines the integration of Python-based PERT/CPM calculations with Excel. It includes an example demonstrating the generation of sample data and the population of an Excel file with task durations.

Evolution and Origins

PERT and CPM are both project management techniques used to schedule, organize, and manage tasks within a project. They both originated in the late 1950s and early 1960s in the United States. They emerged independently from different organizations and industries, but they share common principles and objectives. Both methodologies aimed to address the challenges of managing complex projects by providing tools and techniques for scheduling, organizing, and controlling project activities. Over time, PERT and CPM have been widely adopted across various industries and have become fundamental tools in project management practice.

PERT was developed by the United States Navy Special Projects Office in collaboration with the consulting firm Booz Allen Hamilton and the management consulting division of Lockheed Corporation. It originated in response to the need to manage the Polaris missile submarine program, which was a complex and highly uncertain project. PERT was first implemented in 1958 to support the development of the Polaris missile system. The primary challenge was to manage the scheduling and coordination of numerous tasks involved in the project, considering the uncertainty in task durations due to technological complexity and developmental risks. The key contributors to PERT included researchers from the Navy, Booz Allen Hamilton, and Lockheed Corporation. Notable individuals involved in its development include Admiral William A. Raborn Jr., Fred T. Wilkes, and Morgan R. Walker.

CPM was developed independently by DuPont Corporation and Remington Rand (later acquired by IBM). It emerged from the need to manage large-scale construction projects more effectively. CPM originated in the late 1950s, with its first application in the construction of chemical plants by DuPont. The primary objective was to optimize the scheduling of tasks to minimize project duration and costs. The key developers of CPM were James E. Kelley Jr. and Morgan R. Walker at DuPont, who formulated the method in 1957, and later independently by Morgan R. Walker at Remington Rand. Later, Kelley and Walker collaborated on refining and disseminating the CPM methodology.

Comparison

PERT and CPM share similarities and are often used together, they have distinct differences in regards to purpose, focus, calculation method, use of probabilities, and complexity.

PERT is primarily used in projects with uncertain or variable time estimates for each activity. It is useful for projects with high uncertainty or complexity. CPM, on the other hand, is used when time estimates for each activity are known with certainty. CPM is more suitable for projects with well-defined tasks and durations.

PERT focuses on estimating the time required to complete each task within a project, taking into account uncertainty and variability. It utilizes three time estimates for each activity: optimistic (O), pessimistic (P), and most likely (M). CPM focuses on identifying the critical path in a project, which is the longest sequence of dependent tasks that determines the overall duration of the project. CPM does not account for uncertainty in task durations but rather focuses on the logical sequence of tasks.

In PERT, the expected duration of each activity is calculated using the formula: (O + 4M + P) / 6, where O is the optimistic time estimate, M is the most likely time estimate, and P is the pessimistic time estimate. In CPM, the focus is on determining the earliest start time (ES), earliest finish time (EF), latest start time (LS), latest finish time (LF), and float (slack) for each activity. This is accomplished through forward and backward pass calculations within the network diagram.

PERT uses probability distributions to model the uncertainty in task durations and calculates the probability of completing the project within a certain time frame, while CPM does not directly incorporate probabilities into its calculations. CPM assumes deterministic task durations. PERT tends to be more complex due to its consideration of uncertainty and utilization of probability distributions, whereas CPM is simpler and more straightforward, focusing on the logical sequence of tasks and their durations.

In practice, PERT and CPM are often used together, especially in larger projects. PERT can be employed initially to estimate durations and evaluate project risks, while CPM can be utilized to formulate a comprehensive schedule based on those estimates and pinpoint the critical path.

Python Examples

PERT and CPM methodologies can be implemented using common tools such as Excel or programming languages like Python as shown on the approach below, which utilizes: (1) Python Libraries, (2) Network Representation, (3) Calculations, (4) Visualization, and (5) Integration with Excel.

Python Libraries

Python offers various libraries that can be used for project management tasks. For example, NetworkX can be used to create and analyze project networks, while numpy and pandas can handle calculations and data manipulation. Below are the instructions for utilizing the mentioned Python libraries for project management tasks:

NetworkX

  • Install the library using pip: pip install networkx.
  • Import NetworkX in your Python script: import networkx as nx.
  • Create a graph to represent your project network: G = nx.Graph() or G = nx.DiGraph() for directed graphs.
  • Add nodes to represent tasks: G.add_node(task_id).
  • Add edges to represent dependencies between tasks: G.add_edge(task1_id, task2_id).
  • You can then analyze the graph using various NetworkX functions to determine critical paths, calculate metrics, and visualize the project network.

NumPy and Pandas

  • Install the libraries using pip: pip install numpy pandas.
  • Import NumPy and Pandas in your Python script: import numpy as np and import pandas as pd.
  • Use NumPy for numerical calculations and array manipulation. For example, you can use NumPy arrays to store task durations or other project data.
  • Use Pandas for data manipulation and analysis. You can create DataFrames to organize project data, perform calculations, and generate reports.
  • Utilize NumPy and Pandas functions for various project management tasks such as calculating task durations, determining project schedules, and analyzing project performance.

The following is Python code combining these libraries for project management:

import networkx as nx
import numpy as np
import pandas as pd

# Create a project network using NetworkX
G = nx.DiGraph()
G.add_node("Task1")
G.add_node("Task2")
G.add_node("Task3")
G.add_edge("Task1", "Task2")
G.add_edge("Task1", "Task3")

# Define task durations using NumPy arrays
task_durations = {"Task1": 5, "Task2": 3, "Task3": 4}  # Using a dictionary for simplicity
df_task_durations = pd.DataFrame({"Task": list(task_durations.keys()), "Duration": list(task_durations.values())})

# Calculate project schedule
project_schedule = df_task_durations["Duration"].sum()

# Print project schedule
print("Project Schedule:", project_schedule)        

Network Representation

The project can be represented as a network or graph in Python using NetworkX or similar libraries. Nodes represent tasks, and edges represent dependencies between tasks. The following instructions incorporate the use of Matplotlib for visualization.

  • Install Matplotlib using pip: pip install matplotlib.
  • Import Matplotlib in your Python script: import matplotlib.pyplot as plt.

Visualization

  • After you've created your project network using NetworkX, use Matplotlib for visualization.
  • Use nx.draw() from NetworkX to draw the graph and then plt.show() from Matplotlib to display it.
  • Optionally, you can customize the appearance of the graph using various Matplotlib functions.

The following example includes Matplotlib for visualization:

import networkx as nx
import matplotlib.pyplot as plt

# Create a directed graph to represent the project network
G = nx.DiGraph()

# Add nodes representing tasks
G.add_node("Task1")
G.add_node("Task2")
G.add_node("Task3")

# Add edges representing dependencies between tasks
G.add_edge("Task1", "Task2")
G.add_edge("Task1", "Task3")

# Visualize the project network using NetworkX and Matplotlib
nx.draw(G, with_labels=True)
plt.show()        

The image below is the project network visualization generated by using NetworkX, use Matplotlib.

Project Network Visualization

Calculations

Once the project has been represented as a network, calculations can be performed to determine critical paths, task durations, and other metrics. It will be necessary to implement algorithms for forward pass, backward pass, and identifying critical paths.

Forward Pass

Implement the forward pass algorithm to calculate the earliest start time (ES) and earliest finish time (EF) for each task in the project. Start by setting the ES of the initial task(s) to zero. For each subsequent task, calculate its ES as the maximum EF of its preceding tasks. Calculate the EF for each task as the sum of its ES and duration. Repeat this process until all tasks have been processed.

Backward Pass

Implement the backward pass algorithm to calculate the latest start time (LS) and latest finish time (LF) for each task in the project. Start by setting the LF of the final task(s) to their EF calculated in the forward pass. For each preceding task, calculate its LF as the minimum LS of its succeeding tasks. Calculate the LS for each task as the difference between its LF and duration. Repeat this process until all tasks have been processed.

Critical Path Identification

Once you have calculated the ES, EF, LS, and LF for each task, identify the critical path(s) in the project. The critical path is the longest path(s) through the project network, i.e., the path(s) with zero slack. Tasks with zero slack have their ES equal to their LS and their EF equal to their LF. Any delay in tasks on the critical path will directly impact the project's overall duration.

Task Durations and Metrics

With the ES, EF, LS, and LF calculated for each task, you can determine task durations, total project duration, and other metrics. Task duration is simply the difference between EF and ES for each task. Total project duration is the maximum EF of all tasks (or the maximum LF if using LS and LF). You can also calculate slack for each task (slack = LS - ES or slack = LF - EF), which indicates the flexibility in task scheduling without impacting project duration.

Implementation

Implement these calculations in your Python script by traversing the project network and updating task attributes accordingly. You can use NetworkX functions to iterate over nodes and edges in the graph and access task attributes stored as node attributes. Verify your calculations by comparing results with manual calculations or established project management methodologies.

By implementing these algorithms, you can effectively analyze your project network, identify critical paths, determine task durations, and assess project metrics to manage and optimize project schedules.

The following code implements the forward pass, backward pass, and critical path identification algorithms.

import networkx as nx

# Create a directed graph to represent the project network
G = nx.DiGraph()

# Add nodes representing tasks with their durations
tasks = {
    "Task1": {"duration": 5},
    "Task2": {"duration": 3},
    "Task3": {"duration": 4},
    "Task4": {"duration": 2},
    "Task5": {"duration": 6},
    "Task6": {"duration": 3}
}

# Add nodes with durations to the graph
for task, attr in tasks.items():
    G.add_node(task, duration=attr["duration"])

# Add edges representing dependencies between tasks
G.add_edge("Task1", "Task2")
G.add_edge("Task1", "Task3")
G.add_edge("Task2", "Task4")
G.add_edge("Task3", "Task4")
G.add_edge("Task3", "Task5")
G.add_edge("Task4", "Task6")
G.add_edge("Task5", "Task6")

# Perform forward pass
for node in nx.topological_sort(G):
    es = max([0] + [G.nodes[predecessor]['EF'] for predecessor in G.predecessors(node)], default=0)
    G.nodes[node]['ES'] = es
    G.nodes[node]['EF'] = es + G.nodes[node]['duration']

# Perform backward pass
for node in reversed(list(nx.topological_sort(G))):
    lf = min([G.nodes[successor]['LS'] for successor in G.successors(node)], default=G.nodes[node]['EF'])
    G.nodes[node]['LF'] = lf
    G.nodes[node]['LS'] = lf - G.nodes[node]['duration']

# Identify critical path
critical_path = [node for node in G.nodes if G.nodes[node]['ES'] == G.nodes[node]['LS']]

# Print task durations, ES, EF, LS, LF, and critical path
print("Task\tDuration\tES\tEF\tLS\tLF")
for node in G.nodes:
    print(f"{node}\t{G.nodes[node]['duration']}\t\t{G.nodes[node]['ES']}\t{G.nodes[node]['EF']}\t{G.nodes[node]['LS']}\t{G.nodes[node]['LF']}")
print("Critical Path:", critical_path)        

The forward pass is performed in the first loop after the graph creation. It traverses the graph in topological order using nx.topological_sort(G). For each node in the topological order, it calculates the earliest start time (ES) based on the maximum EF of its predecessors. The calculated ES is stored in the node attribute 'ES', and the earliest finish time (EF) is calculated as the sum of ES and duration.

The backward pass is performed in the second loop, which iterates over the topological order in reverse. For each node, it calculates the latest finish time (LF) based on the minimum LS of its successors. The latest start time (LS) is then calculated, and the values are stored in the respective node attributes 'LS' and 'LF'.

After performing both passes, the critical path is identified by checking nodes where ES equals LS, indicating zero slack. These nodes are appended to the critical_path list.

The code then prints task durations, ES, EF, LS, LF, and the critical path.

Task    Duration    ES  EF  LS  LF
Task1   5           0   5   0   5
Task2   3           5   8   10  13
Task3   4           5   9   5   9
Task4   2           9   11  13  15
Task5   6           9   15  9   15
Task6   3           15  18  15  18
Critical Path: ['Task1', 'Task3', 'Task5', 'Task6']        

Visualization

The project network and critical paths can be visualized using libraries like matplotlib or plotly. The following code uses Matplotlib to visualize the project network and highlight the critical path.

import networkx as nx
import matplotlib.pyplot as plt

# Create a directed graph to represent the project network
G = nx.DiGraph()

# Add nodes representing tasks with their durations
tasks = {
    "Task1": {"duration": 5},
    "Task2": {"duration": 3},
    "Task3": {"duration": 4},
    "Task4": {"duration": 2},
    "Task5": {"duration": 6},
    "Task6": {"duration": 3}
}

# Add nodes with durations to the graph
for task, attr in tasks.items():
    G.add_node(task, duration=attr["duration"])

# Add edges representing dependencies between tasks
G.add_edge("Task1", "Task2")
G.add_edge("Task1", "Task3")
G.add_edge("Task2", "Task4")
G.add_edge("Task3", "Task4")
G.add_edge("Task3", "Task5")
G.add_edge("Task4", "Task6")
G.add_edge("Task5", "Task6")

# Perform forward pass
for node in nx.topological_sort(G):
    es = max([0] + [G.nodes[predecessor]['EF'] for predecessor in G.predecessors(node)], default=0)
    G.nodes[node]['ES'] = es
    G.nodes[node]['EF'] = es + G.nodes[node]['duration']

# Perform backward pass
for node in reversed(list(nx.topological_sort(G))):
    lf = min([G.nodes[successor]['LS'] for successor in G.successors(node)], default=G.nodes[node]['EF'])
    G.nodes[node]['LF'] = lf
    G.nodes[node]['LS'] = lf - G.nodes[node]['duration']

# Identify critical path
critical_path = [node for node in G.nodes if G.nodes[node]['ES'] == G.nodes[node]['LS']]

# Visualize the project network
pos = nx.spring_layout(G)  # Define layout for the nodes
nx.draw(G, pos, with_labels=True, node_size=800, node_color='skyblue', font_size=10, font_weight='bold')  # Draw nodes
nx.draw_networkx_edges(G, pos, width=1.0, alpha=0.5)  # Draw edges
nx.draw_networkx_nodes(G, pos, nodelist=critical_path, node_color='r', node_size=800)  # Highlight critical path nodes in red
plt.title('Project Network')  # Set title for the plot
plt.show()  # Show the plot        

In this code nx.spring_layout() is used to define the layout for the nodes. nx.draw() is used to draw the nodes with labels, and nx.draw_networkx_edges() is used to draw the edges. The critical path nodes are highlighted in red using nx.draw_networkx_nodes(). As shown on the below image, this code displays a visualization of the project network with critical paths highlighted in red.

Critical Path of Project Network

Integration with Excel

Python libraries like openpyxl or xlrd can be utilized to read from and write to Excel files, allowing the integration of the Python-based PERT/CPM calculations with Excel for visualization or further analysis.

The following is a guide that outlines a step-by-step process for integrating Python-based PERT/CPM calculations with Excel using libraries such as openpyxl or xlrd.

  • Install the required libraries (openpyxl and xlrd) using pip: pip install openpyxl xlrd.
  • Use openpyxl to read data from Excel files: 1. Import the library, 2. Open the Excel file using openpyxl.load_workbook(), 3. Access the specific worksheet within the workbook, and 4. Iterate through rows and columns to read data.
  • Similarly, use openpyxl to write data to Excel files: 1. Open the workbook and create a new worksheet if needed, 2. Write data to specific cells, and 3. Save the workbook.
  • Integrate PERT/CPM calculations with the data read from or written to Excel. Utilize the data from Excel as input for your calculations or write the results back to Excel.
  • Once the calculations are done and data is in Excel, use Excel's built-in visualization tools or further analyze the data. Create Gantt charts, critical path diagrams, or other visual representations of your project schedule directly within Excel.

By following this guide, you can effectively combine the capabilities of Python for complex calculations with Excel's powerful visualization and analysis features for efficient project management and analysis.

Excel Examples

Below is a Python code example to generate sample data and populate an Excel file (input.xlsx) with tasks and their durations. You can then use the data in this Excel file as input for PERT/CPM calculations.

import openpyxl
import random

# Function to generate sample data
def generate_sample_data(num_tasks):
    data = [["Task", "Duration"]]
    for i in range(1, num_tasks + 1):
        task_name = f"Task{i}"
        duration = random.randint(1, 10)  # Random duration between 1 and 10 days
        data.append([task_name, duration])
    return data

# Function to write data to Excel file
def write_data_to_excel(file_path, data):
    workbook = openpyxl.Workbook()
    worksheet = workbook.active
    for row_idx, row in enumerate(data, start=1):
        for col_idx, value in enumerate(row, start=1):
            worksheet.cell(row=row_idx, column=col_idx, value=value)
    workbook.save(file_path)

# Example usage
if __name__ == "__main__":
    num_tasks = 10  # Number of tasks
    data = generate_sample_data(num_tasks)
    write_data_to_excel("input.xlsx", data)
    print(f"Sample data for {num_tasks} tasks has been written to input.xlsx.")        

In this code:

  • The generate_sample_data() function generates sample data for tasks with random durations.
  • The write_data_to_excel() function writes the generated sample data to an Excel file named input.xlsx.
  • In the if __name__ == "__main__": block, the code generates sample data for a specified number of tasks (in this case, 10 tasks) and writes it to the input.xlsx file.

Adjust the num_tasks variable to specify the desired number of tasks. After running this code, you'll have an Excel file (input.xlsx) populated with sample task data.

The below code demonstrates how to read data from an Excel file, perform PERT/CPM calculations, and write the results back to another Excel file.

import openpyxl

# Function to perform PERT/CPM calculations
def perform_pert_cpm(tasks, predecessors, successors):
    es = {}
    ef = {}
    ls = {}
    lf = {}

    # Calculate ES and EF for each task
    for task, duration in tasks.items():
        es[task] = max([ef.get(pred, 0) for pred in predecessors.get(task, [])], default=0)
        ef[task] = es[task] + duration

    # Calculate LF and LS for each task in reverse topological order
    for task in reversed(topological_order):
        lf[task] = min([ls.get(succ, ef[task]) for succ in successors.get(task, [])], default=ef[task])
        ls[task] = lf[task] - tasks[task]

    return es, ef, ls, lf

# Load data from Excel file
def load_data_from_excel(file_path):
    workbook = openpyxl.load_workbook(file_path)
    worksheet = workbook.active
    tasks = {}
    predecessors = {}
    successors = {}
    topological_order = []

    # Skip header row
    iter_rows = iter(worksheet.iter_rows(values_only=True))
    next(iter_rows)

    # Read task data from Excel
    for row in iter_rows:
        task = row[0]
        duration = int(row[1])  # Convert duration to integer
        dependencies = row[2:]

        tasks[task] = duration

        if dependencies:
            predecessors[task] = dependencies
            for dep in dependencies:
                if dep not in successors:
                    successors[dep] = []
                successors[dep].append(task)

    # Generate topological order
    for task in tasks:
        if task not in predecessors:
            topological_order.append(task)

    return tasks, predecessors, successors, topological_order

# Write results to Excel file
def write_results_to_excel(file_path, es, ef, ls, lf):
    workbook = openpyxl.Workbook()
    worksheet = workbook.active
    worksheet.append(["Task", "Duration", "ES", "EF", "LS", "LF"])

    for task in es:
        worksheet.append([task, tasks[task], es[task], ef[task], ls[task], lf[task]])

    workbook.save(file_path)

# Example usage
if __name__ == "__main__":
    input_file = 'input.xlsx'
    tasks, predecessors, successors, topological_order = load_data_from_excel(input_file)
    es, ef, ls, lf = perform_pert_cpm(tasks, predecessors, successors)
    write_results_to_excel('output.xlsx', es, ef, ls, lf)
    print("PERT/CPM calculations completed and results written to output.xlsx.")        

Conclusion

This article provides an overview of PERT and CPM. It delineates the SOW's role in defining project scope, objectives, and responsibilities, while detailing PERT and CPM as project scheduling and management tools. PERT is typically employed when dealing with uncertain time estimates, while CPM is more commonly used in projects with well-defined timeframes. The article delves into the origins, comparison, implementation, and integration of both methodologies with Python and Excel. Through Python libraries like NetworkX, NumPy, and Pandas, project management tasks are executed, visualized, and analyzed, with further exploration into integrating these processes with Excel for enhanced visualization and analysis. Python code snippets demonstrate tasks ranging from generating sample data to performing PERT/CPM calculations and reading and writing results to Excel, ultimately streamlining project management and decision-making processes.

要查看或添加评论,请登录

Manuel Soto的更多文章

  • Modeling Ontology

    Modeling Ontology

    In enterprise modeling, the modeling ontology is the structured framework that helps differentiate and organize the…

  • EA Role in System Life Cycle

    EA Role in System Life Cycle

    As the Enterprise Architect (EA) for implementing a complex system such as a space debris management and analysis…

  • NASA's Space Debris Analysis Models

    NASA's Space Debris Analysis Models

    ?? ORDEM 3.2 and LEGEND: Two Important Models Used by NASA for Space Debris Analysis ?? Overview ORDEM 3.

  • Learn SQL on Ubuntu with MySQL

    Learn SQL on Ubuntu with MySQL

    Fresh MySQL Install on Ubuntu Uninstall MySQL and then reinstall it to start with a fresh installation of MySQL. Then…

    3 条评论
  • Very Large Scale Integration

    Very Large Scale Integration

    In the early days of integrated circuit (IC) design, engineers manually drew designs due to the limited complexity of…

  • Threat Analysis using UAF and SE Handbook Technical Processes

    Threat Analysis using UAF and SE Handbook Technical Processes

    ?? Understand the UAF Basics: ?? Familiarize yourself with the UAF concepts, elements, and relationships. ?? Review the…

  • Kessler Syndrome Use-Cases and Corresponding UAF Perspectives

    Kessler Syndrome Use-Cases and Corresponding UAF Perspectives

    1. Military Operations and National Security Stakeholders Scenario: A nation-state deliberately worsens the Kessler…

  • Software Engineering Balance: Avoiding Over and Under Engineering

    Software Engineering Balance: Avoiding Over and Under Engineering

    In software development, striking the right balance between over-engineering and under-engineering is crucial for…

  • Stakeholder Identification, System Boundary Definition, and Lifecycle Management

    Stakeholder Identification, System Boundary Definition, and Lifecycle Management

    When creating a solution using the INCOSE Systems Engineering Handbook and the UAF Domain Metamodel, the process…

  • UAF View Specifications

    UAF View Specifications

    In the UAF Grid (v1.2), View Specifications are comprised of viewpoints, which are the rows in the grid, and aspects…

社区洞察