Streamlining Stormwater: A Guide to Automating SWMM with GIS and Python

Mar 22, 2024

In the intricate dance of civil engineering and water management, time is a resource almost as precious as water itself. This post explores an alternative approach that leverages Python and GIS for automating the Storm Water Management Model (SWMM), highlighting its potential impact on urban infrastructure planning. Our journey will not only introduce a streamlined process for data preparation but also explore the broader implications of model automation in shaping the cities of tomorrow.

The Challenge:

Data preparation, especially for extensive urban or regional datasets, remains a complex and time-consuming task, even with advanced commercial software and GIS extensions. This article uses a sewer infrastructure dataset from the City of Vancouver as a case study to address these challenges.

A Step-by-Step Guide

1. GIS File Preparation

In this example, we will use the sewer infrastructure dataset publicly available from the City of Vancouver’s open data portal. Note that the sewer main “from” and “to” information exists in a separate file from the invert elevations, and the lines in the two files appear to be offset from each other by a few meters. The process therefore begins with consolidating the sewer infrastructure data to resolve these discrepancies. I snapped the polylines from each feature layer and spatially joined the two layers to create the ‘sewermains_joined.geojson’ file used below.
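To illustrate the idea behind joining two slightly offset line layers, here is a minimal sketch in plain Python. It is not the exact workflow I used: it simply matches each sewer main to the closest invert line by comparing endpoint midpoints, and accepts the match only within a tolerance. The field names ‘up_invert’ and ‘dn_invert’, the 5 m tolerance, and the sample coordinates are all assumptions made for the example. In GeoPandas, the `sjoin_nearest` function with its `max_distance` argument offers a similar nearest-feature join.

```python
import math

def midpoint(coords):
    """Return the point halfway between the first and last vertex of a line."""
    (x1, y1), (x2, y2) = coords[0], coords[-1]
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def join_nearest(mains, inverts, tolerance=5.0):
    """Copy invert attributes onto each main from the closest invert line.

    A match is accepted only if the two midpoints are within `tolerance`
    (map units, e.g. metres in a projected CRS). Unmatched mains are kept
    without invert attributes.
    """
    joined = []
    for main in mains:
        mx, my = midpoint(main["coords"])
        best, best_dist = None, tolerance
        for inv in inverts:
            ix, iy = midpoint(inv["coords"])
            dist = math.hypot(mx - ix, my - iy)
            if dist <= best_dist:
                best, best_dist = inv, dist
        record = dict(main)
        if best is not None:
            record["up_invert"] = best["up_invert"]
            record["dn_invert"] = best["dn_invert"]
        joined.append(record)
    return joined

# Hypothetical sample data: the first invert line is offset by 2 m,
# the second one belongs to a different street 50 m away.
mains = [{"facilityid": "P1", "frommh": "A", "tomh": "B",
          "coords": [(0.0, 0.0), (100.0, 0.0)]}]
inverts = [
    {"up_invert": 12.5, "dn_invert": 12.0, "coords": [(0.0, 2.0), (100.0, 2.0)]},
    {"up_invert": 30.0, "dn_invert": 29.0, "coords": [(0.0, 50.0), (100.0, 50.0)]},
]

joined = join_nearest(mains, inverts)
print(joined[0])
```

A midpoint comparison is a crude stand-in for a true geometric distance, so for real data a GIS tool or library is the safer choice.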

2. Sewer Network Creation.

Once you have created the sewermains_joined file with the attributes ‘facilityid’, ‘frommh’, ‘tomh’ and the invert elevations, we can proceed to creating the sewer network. In this example, we use NetworkX to generate the network logic between the nodes. In the code example below, we load ‘frommh’ and ‘tomh’ from the GeoJSON file. Each node represents a manhole in the sewer network. Each edge represents a conduit (sewer pipe) and tells the program the direction of flow (i.e., upstream vs. downstream). Note that the City’s GIS data occasionally contains ‘dead ends’, where a pipe is missing its ‘frommh’ or ‘tomh’ value. For now, we save the facility IDs of these pipes in a separate list. We can delve into resolving these discrepancies in future discussions.

import networkx as nx
import geopandas as gpd
import pickle
import json

gdf = gpd.read_file('sewermains_joined.geojson')

# Create a directed graph
G = nx.DiGraph()

# Initialize list to store facility IDs of dead-end nodes
dead_end_nodes = []

# Iterate over the GeoDataFrame
for idx, row in gdf.iterrows():
    # Get the 'frommh' and 'tomh' of the line
    frommh = row['frommh']
    tomh = row['tomh']

    # Check that 'frommh' and 'tomh' are not None
    if frommh is not None and tomh is not None:
        # Add the 'frommh' and 'tomh' as nodes
        G.add_node(frommh)
        G.add_node(tomh)

        # Add the edge between 'frommh' and 'tomh'
        G.add_edge(frommh, tomh)
    else:
        # If either 'frommh' or 'tomh' is None, store the facility ID
        dead_end_nodes.append(row['facilityid'])

with open("network.pickle", "wb") as f:
    pickle.dump(G, f)
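To make the direction logic more concrete, the sketch below rebuilds the same idea in plain Python on a toy network and lists the outfalls (nodes with nothing downstream) and the headwater nodes (nodes with nothing upstream). The manhole IDs are made up for illustration. On the NetworkX graph itself, the equivalent checks would be `G.out_degree(n) == 0` and `G.in_degree(n) == 0`.

```python
from collections import defaultdict

# Hypothetical (frommh, tomh) pairs; None marks a missing manhole ID.
pipes = [
    ("MH1", "MH2"),
    ("MH2", "MH4"),
    ("MH3", "MH4"),
    ("MH4", "MH5"),
    ("MH6", None),   # dead end: skipped, as in the loop above
]

downstream = defaultdict(set)  # node -> nodes it drains to
upstream = defaultdict(set)    # node -> nodes that drain into it
nodes = set()

for frommh, tomh in pipes:
    if frommh is None or tomh is None:
        continue
    nodes.update((frommh, tomh))
    downstream[frommh].add(tomh)
    upstream[tomh].add(frommh)

# Outfalls have no downstream pipe; headwaters have no upstream pipe.
outfalls = sorted(n for n in nodes if not downstream[n])
headwaters = sorted(n for n in nodes if not upstream[n])

print("Outfalls:", outfalls)
print("Headwaters:", headwaters)
```

An unexpectedly long list of outfalls is often a quick hint that some pipes are digitized in the wrong direction or are missing a manhole ID.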

3. Network Extraction

Once you have the network graph ready for the entire City, let’s pick a location and extract a subset of the network by specifying an outlet node ID. For this example, I selected a node on the 1800 mm trunk sewer at the intersection of Adanac and Rupert St. The upstream sewershed appears to have experienced repeated street flooding according to historical 311 records. In the example code below, I first extract all nodes upstream of node ‘128138’ in the network, then export the corresponding features to a separate GeoJSON file.


import pickle

import geopandas as gpd
import networkx as nx
import pandas as pd

with open("network.pickle", "rb") as f:
    G = pickle.load(f)
gdf = gpd.read_file('sewermains_joined.geojson')

# Specify the node for which you want to find all upstream nodes
node = '128138'  # Replace with your node

# Find all upstream nodes
upstream_nodes = nx.ancestors(G, node)


# Initialize a list to store the upstream features
upstream_features = []

# Iterate over the GeoDataFrame
for idx, row in gdf.iterrows():
    # Get the 'frommh' and 'tomh' of the line
    frommh = row['frommh']
    tomh = row['tomh']

    # Check if either 'frommh' or 'tomh' is in the upstream nodes
    if frommh in upstream_nodes or tomh in upstream_nodes:
        # If so, append the row to the upstream features list
        upstream_features.append(row)

# Convert the list to a GeoDataFrame
upstream_features_gdf = gpd.GeoDataFrame(pd.concat(upstream_features, axis=1).T)

# Set the active geometry column
upstream_features_gdf.set_geometry('geometry', inplace=True)
# Set the CRS of the GeoDataFrame
upstream_features_gdf.set_crs(gdf.crs, inplace=True)
# Save the upstream features to a new GeoJSON file
upstream_features_gdf.to_file("upstream_features1.geojson", driver='GeoJSON')

Here is what the output sewer mains (highlighted in yellow) look like compared to the rest of the sewer network. The red dot is the outlet node. Theoretically, we can create any subset of the sewer network by specifying an outlet node anywhere in the City’s sewer network or subdivide the City’s sewer network into several sewersheds using the Python library’s built-in functions.
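For readers curious about what the upstream search does under the hood, here is a minimal plain-Python sketch of the same idea: a breadth-first walk against the flow direction, starting from the outlet. Like `nx.ancestors`, it returns every node that drains to the outlet but not the outlet itself. The manhole IDs and the helper name `upstream_of` are hypothetical. Running it for several outlets gives a simple way to split a network into sewersheds.

```python
from collections import defaultdict, deque

def upstream_of(pipes, outlet):
    """Return every node that drains to `outlet`, excluding the outlet itself."""
    feeders = defaultdict(set)  # node -> nodes immediately upstream of it
    for frommh, tomh in pipes:
        feeders[tomh].add(frommh)

    seen, queue = set(), deque([outlet])
    while queue:
        node = queue.popleft()
        for parent in feeders.get(node, ()):
            if parent not in seen and parent != outlet:
                seen.add(parent)
                queue.append(parent)
    return seen

# Hypothetical (frommh, tomh) pairs forming two separate branches.
pipes = [
    ("MH1", "MH2"),
    ("MH2", "MH4"),
    ("MH3", "MH4"),
    ("MH4", "MH5"),
    ("MH7", "MH8"),
]

upstream_nodes = upstream_of(pipes, "MH4")

# One sewershed per chosen outlet node.
sewersheds = {outlet: upstream_of(pipes, outlet) for outlet in ["MH5", "MH8"]}

print(sorted(upstream_nodes))
print({k: sorted(v) for k, v in sewersheds.items()})
```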

4. From GIS to SWMM:

At this stage, if you have cleaned sewer manhole and sewer main layers, you can easily import them into the PCSWMM software and generate the .inp file. However, if you would like to explore how to create the .inp file programmatically from scratch for automation purposes, the code snippet below shows a “brute-force” way of writing the node coordinates into the .inp model file. Alternatively, you can take a look at the PySWMM library if you want to interact with an existing model file programmatically.


# merged_df is assumed to be a DataFrame prepared earlier that holds
# the columns 'Name', 'X' and 'Y' for each node.
vertice_df = merged_df[['Name','X','Y']].copy()

# Drop rows with NaN in 'Name' column
vertice_df = vertice_df.dropna(subset=['Name'])
# Round the 'X' and 'Y' coordinates to 4 decimal places
vertice_df['X'] = vertice_df['X'].round(4)
vertice_df['Y'] = vertice_df['Y'].round(4)
# Keep one row per node name (in-place)
vertice_df.drop_duplicates(subset='Name', inplace=True)

# print(vertice_df)
# Define the header
header = "\n[COORDINATES]\n;;Node           X-Coord            Y-Coord"
line = ";;-------------- ------------------ ------------------"

# Format the DataFrame to align with the header
formatted_vt_df = vertice_df.apply(lambda x: "{:<16} {:<18} {:<18}".format(*x), axis=1)

# Combine the header, line, and formatted DataFrame
table =  '\n'+ header + '\n' + line + '\n' + '\n'.join(formatted_vt_df)

# Open the file in read mode
with open('base.inp', 'r') as file:
    lines = file.readlines()

# Keep only the first 47 lines, which are the model headers
lines = lines[:47]

# Open the file in write mode and write the first 47 lines
with open('base.inp', 'w') as file:
    file.writelines(lines)

# Open the INP file in append mode
with open('base.inp', 'a') as file:
    # Write the table to the file
    file.write(table)
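The same brute-force idea extends to other sections of the .inp file. Below is a minimal sketch of a small helper that formats any section as fixed-width text, shown here for [JUNCTIONS] and [CONDUITS]. The helper name `format_section` and all the values (elevations, depths, length, roughness) are made up for illustration; the column names follow the usual SWMM 5 input layout as I understand it, so please check them against the SWMM manual before relying on them.

```python
def format_section(name, columns, rows, width=16):
    """Return one INP section as text with left-aligned, fixed-width columns."""
    header = ";;" + " ".join(f"{col:<{width}}" for col in columns)
    lines = [f"[{name}]", header.rstrip()]
    for row in rows:
        lines.append(" ".join(f"{str(value):<{width}}" for value in row).rstrip())
    return "\n".join(lines) + "\n"

# Hypothetical values for two manholes and the pipe between them.
junctions = [
    ("MH1", 12.5, 3.0, 0, 0, 0),
    ("MH2", 12.0, 3.2, 0, 0, 0),
]
conduits = [
    ("P1", "MH1", "MH2", 100.0, 0.013, 0, 0, 0, 0),
]

junctions_text = format_section(
    "JUNCTIONS",
    ["Name", "Elevation", "MaxDepth", "InitDepth", "SurDepth", "Aponded"],
    junctions,
)
conduits_text = format_section(
    "CONDUITS",
    ["Name", "FromNode", "ToNode", "Length", "Roughness",
     "InOffset", "OutOffset", "InitFlow", "MaxFlow"],
    conduits,
)

print(junctions_text)
print(conduits_text)
```

The resulting strings can be appended to the model file in the same way as the coordinates table above. Keep in mind that a model that actually runs needs further sections as well, for example a cross-section entry for each conduit.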

5. Model Verification:

Once you have completed the above process, you can verify your model file by opening it in PCSWMM (commercial) or SWMM (free). If everything has been done correctly, the program should open the file without any error messages. You can then start reviewing any anomalies in your elevation and pipe network data.
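Part of this review can also be scripted. As a minimal sketch, the function below flags two common anomalies: pipes with a missing invert elevation and pipes with an adverse slope, where the downstream invert is higher than the upstream one. The field names ‘up_invert’ and ‘dn_invert’ and the sample values are assumptions for the example, not the actual attribute names in the City’s dataset.

```python
def check_inverts(pipes):
    """Split pipe IDs into those missing an invert and those with adverse slope."""
    missing, adverse = [], []
    for pipe in pipes:
        up, dn = pipe.get("up_invert"), pipe.get("dn_invert")
        if up is None or dn is None:
            missing.append(pipe["facilityid"])
        elif dn > up:
            # Downstream end is higher than the upstream end.
            adverse.append(pipe["facilityid"])
    return missing, adverse

# Hypothetical sample records.
pipes = [
    {"facilityid": "P1", "up_invert": 12.5, "dn_invert": 12.0},  # normal
    {"facilityid": "P2", "up_invert": 12.0, "dn_invert": 12.4},  # adverse slope
    {"facilityid": "P3", "up_invert": None, "dn_invert": 11.0},  # missing invert
]

missing, adverse = check_inverts(pipes)
print("Missing inverts:", missing)
print("Adverse slopes:", adverse)
```

An adverse slope is not always an error, since some pipes are intentionally built that way, but it is usually worth a second look before running the model.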

Implications and Benefits:

The automation of SWMM model files offers several advantages for urban water management:

Efficiency: Automated processes reduce manual errors and save time, particularly in handling complex or large-scale models.
Scalability: Easier management of extensive datasets and multiple model scenarios.
Data Integration: Improved decision-making through the incorporation of real-time and external data sources.
Customization and Flexibility: Tailored solutions and workflows can be developed to meet specific needs.
Advanced Analysis: Facilitated model calibration and optimization for more effective stormwater infrastructure management.
AI and LLM Integration: With the ability to interact directly with engineering software, integration with a Large Language Model (LLM) such as GPT has the potential to improve accessibility and efficiency for professionals and the public.

Conclusion:

The integration of GIS and Python automation with SWMM presents a forward-thinking approach to urban water management. By streamlining data preparation and model generation, we can enhance the efficiency, accuracy, and scalability of stormwater management projects. This methodology not only addresses current challenges but also sets the stage for future advancements in urban infrastructure planning.

Contact Information:

For further information or to discuss potential collaborations, please contact eric.zhu@urbanlytics.ca. Additionally, explore the SWMM code repository by USEPA on GitHub for more insights into the methodologies behind SWMM.