POSEIDON: AI For Water Quality

Our Workflow for Generating Daily Concentration and Load Data (WQPredict)

We start by collecting water quality data from DataStream, an open-access platform that standardizes and combines records from community groups, researchers, and government agencies across Canada. Using the DataStream API, we download phosphorus and nitrogen concentration data, then filter the raw data to remove duplicates and irrelevant sites so that each observation represents a single location and day.
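The duplicate-removal step can be illustrated with a minimal sketch. It assumes records have already been downloaded into plain dictionaries; the field names (site_id, date, parameter, value) are hypothetical, and averaging same-day duplicates is one possible rule, not necessarily the one used in WQPredict.

```python
from collections import defaultdict
from statistics import mean


def one_value_per_site_day(records):
    """Collapse records that share (site, date, parameter) into one mean value.

    Field names are hypothetical; averaging is an assumed deduplication rule.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["site_id"], rec["date"], rec["parameter"])
        groups[key].append(rec["value"])
    # Emit one record per key, in a stable sorted order.
    return [
        {"site_id": site, "date": day, "parameter": param, "value": mean(values)}
        for (site, day, param), values in sorted(groups.items())
    ]


if __name__ == "__main__":
    sample = [
        {"site_id": "S1", "date": "2020-06-01", "parameter": "TP", "value": 0.10},
        {"site_id": "S1", "date": "2020-06-01", "parameter": "TP", "value": 0.14},
        {"site_id": "S1", "date": "2020-06-02", "parameter": "TP", "value": 0.20},
    ]
    for row in one_value_per_site_day(sample):
        print(row)
```

Keying on site, date, and parameter together keeps phosphorus and nitrogen results separate while still guaranteeing at most one value per location and day for each nutrient.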

We then clean and harmonize the data for consistency: invalid or incomplete records are removed and all units are converted to a common form. Only valid results from rivers and streams after 1980 are kept to build a reliable national dataset.
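A minimal sketch of this cleaning step follows. The field names, the unit table, and the waterbody labels are hypothetical, and treating 1980 itself as included is an assumption about the cutoff.

```python
from datetime import date

# Conversion factors to mg/L (hypothetical unit labels).
TO_MG_PER_L = {"mg/L": 1.0, "ug/L": 0.001, "µg/L": 0.001}
KEPT_WATERBODIES = {"river", "stream"}


def harmonize(records, start_year=1980):
    """Drop invalid records and express every kept value in mg/L."""
    cleaned = []
    for rec in records:
        factor = TO_MG_PER_L.get(rec.get("unit"))
        value = rec.get("value")
        # Skip unknown units, missing values, and negative concentrations.
        if factor is None or value is None or value < 0:
            continue
        if rec.get("waterbody") not in KEPT_WATERBODIES:
            continue
        # Assumption: the cutoff year itself is kept.
        if rec["date"].year < start_year:
            continue
        cleaned.append({**rec, "value": value * factor, "unit": "mg/L"})
    return cleaned


if __name__ == "__main__":
    sample = [
        {"date": date(1995, 7, 1), "unit": "ug/L", "value": 50.0, "waterbody": "river"},
        {"date": date(1975, 7, 1), "unit": "mg/L", "value": 0.2, "waterbody": "river"},
        {"date": date(2001, 7, 1), "unit": "mg/L", "value": 0.3, "waterbody": "lake"},
    ]
    print(harmonize(sample))
```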

Next, we use the open-source tool MGHydro to delineate the watershed for each monitoring site, outlining the land area that drains to that point and connecting the water quality data to the surrounding landscape. Each site is then linked to the nearest streamflow gauge in HYDAT, Canada’s national hydrometric database, keeping only gauges within 5 km and with strong watershed overlap to ensure reliable matches.
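The gauge-matching rule can be sketched as follows. This is an illustration only: it uses great-circle (haversine) distance, assumes the watershed overlap has already been computed as a fraction between 0 and 1, and uses a hypothetical overlap threshold of 0.8, since the exact definition of "strong" overlap is not stated here. The site and gauge identifiers are made up.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def match_gauge(site, gauges, max_km=5.0, min_overlap=0.8):
    """Return (gauge_id, distance_km) for the nearest acceptable gauge, or None.

    min_overlap is a hypothetical threshold on a precomputed overlap fraction.
    """
    best = None
    for gauge in gauges:
        dist = haversine_km(site["lat"], site["lon"], gauge["lat"], gauge["lon"])
        if dist > max_km or gauge["overlap"] < min_overlap:
            continue
        if best is None or dist < best[1]:
            best = (gauge["gauge_id"], dist)
    return best


if __name__ == "__main__":
    site = {"lat": 45.0, "lon": -75.0}
    gauges = [
        {"gauge_id": "G1", "lat": 45.01, "lon": -75.0, "overlap": 0.95},
        {"gauge_id": "G2", "lat": 45.10, "lon": -75.0, "overlap": 0.99},
        {"gauge_id": "G3", "lat": 45.005, "lon": -75.0, "overlap": 0.30},
    ]
    print(match_gauge(site, gauges))
```

In this example G3 is closest but fails the overlap check, and G2 lies beyond 5 km, so G1 is selected. Sites with no acceptable gauge return None and would be left unmatched.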

We then prepare and run the WRTDS-K model (Weighted Regressions on Time, Discharge, and Season, with a Kalman filter), which estimates daily and annual nutrient concentrations and loads in rivers by combining water quality and flow data. After modeling, we check the results for accuracy and keep only the reliable ones, summarizing them into annual nutrient and flow trends. Finally, the processed results are uploaded to WQPredict on the POSEIDON portal, providing continuous, accessible nutrient data for hundreds of streams across Canada.
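To show how concentration and flow combine into a load, here is a minimal sketch of the unit arithmetic only. It is not the WRTDS-K model itself: it assumes daily concentrations in mg/L and daily mean flows in cubic metres per second are already available, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date

SECONDS_PER_DAY = 86_400


def daily_load_kg(conc_mg_per_l, flow_m3_per_s):
    """Daily load in kg/day from concentration (mg/L) and flow (m^3/s)."""
    # mg/L equals g/m^3, so conc * flow is g/s; scale to a day, then g -> kg.
    return conc_mg_per_l * flow_m3_per_s * SECONDS_PER_DAY / 1000.0


def annual_loads_kg(daily):
    """Sum daily loads by calendar year.

    daily: iterable of (date, conc_mg_per_l, flow_m3_per_s) tuples.
    """
    totals = defaultdict(float)
    for day, conc, flow in daily:
        totals[day.year] += daily_load_kg(conc, flow)
    return dict(totals)


if __name__ == "__main__":
    series = [
        (date(2020, 6, 1), 0.05, 10.0),
        (date(2020, 6, 2), 0.04, 12.0),
        (date(2021, 6, 1), 0.06, 8.0),
    ]
    print(annual_loads_kg(series))
```

The factor of 86.4 (86,400 seconds per day divided by 1,000 grams per kilogram) is what turns a concentration in mg/L and a flow in cubic metres per second into kilograms per day.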