Analysis Tools
Tools for data retrieval, watershed characterisation, and hydrological analysis.
Source-specific vs source-agnostic
Data tools (source-specific) each fetch from one data system and inherit its coverage limits: delineate_watershed (USGS NLDI), fetch_streamflow_data (USGS NWIS), fetch_forcing_data (GridMET, CONUS only).
Analysis tools (source-agnostic) work on any data already in the session: extract_hydrological_signatures, extract_geomorphic_parameters, compute_twi, create_cn_grid.
For data not covered by built-in tools (GRDC, CWC, BOM, user CSV, remote sensing), write a Python script via mcp_python and store the result in the session with session.set(slot, data).
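As a minimal sketch of that pattern, the snippet below parses a user CSV into a plain record before it is stored. The column names (`date`, `discharge_cms`), the record layout, and the slot name `"streamflow"` are assumptions made for illustration; the `session.set` call is shown only as a comment because the `session` object is supplied by the runtime.

```python
import csv
import io


def parse_discharge_csv(text, date_col="date", value_col="discharge_cms"):
    """Parse CSV text into parallel lists of dates and discharge values.

    Blank or non-numeric discharge cells become None so gaps stay visible.
    """
    dates, values = [], []
    for row in csv.DictReader(io.StringIO(text)):
        dates.append(row[date_col])
        try:
            values.append(float(row[value_col]))
        except (TypeError, ValueError):
            values.append(None)
    return {"dates": dates, "discharge_cms": values, "source": "user CSV"}


sample = "date,discharge_cms\n2020-01-01,12.5\n2020-01-02,\n2020-01-03,11.0\n"
record = parse_discharge_csv(sample)

# Inside an mcp_python script the record would then be stored with
# something like: session.set("streamflow", record)
# ("streamflow" is a hypothetical slot name.)
```

Keeping missing values as None rather than dropping them preserves the date alignment that downstream analysis tools are likely to rely on.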
delineate_watershed
Delineate the upstream watershed for a USGS gauge using NHDPlus and the USGS NLDI API.
After delineation, the gauge ID is stored in session.site_id so all downstream tools resolve it automatically — you do not need to pass gauge_id again.
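The resolution order described above can be pictured with the following sketch. The `Session` class is a hypothetical stand-in for the real session object, and `resolve_gauge_id` is an illustrative helper, not part of the tool API; the 8-digit check follows the parameter description in this page.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical stand-in for the real session object."""
    site_id: Optional[str] = None


def resolve_gauge_id(session, gauge_id=None):
    """Return the explicit gauge_id if given, else fall back to session.site_id."""
    resolved = gauge_id or session.site_id
    if resolved is None:
        raise ValueError("No gauge_id given and session.site_id is not set")
    if not re.fullmatch(r"\d{8}", resolved):
        raise ValueError(f"Expected an 8-digit station number, got {resolved!r}")
    return resolved


session = Session(site_id="01031500")
gauge = resolve_gauge_id(session)  # falls back to the stored site_id
```

Station numbers are handled as strings throughout because many begin with a zero, which an integer would drop.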
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | str | Yes | Research session identifier; any string works (slug, UUID, or a gauge ID used as shorthand) |
| gauge_id | str | No | 8-digit USGS station number, e.g. "01031500". Resolved from session.site_id if omitted. |
| workspace_dir | str | No | Absolute path to the workspace; all output files are saved here automatically. Pass once; it is remembered for all subsequent tools. |
Returns: Watershed area (km²), gauge coordinates, HUC-02 code, bounding box. Geometry is stored in session and saved to disk — never passed through the LLM context.
Files saved automatically (when workspace_dir is set):
| File | Description |
|---|---|
| watershed_<gauge_id>.geojson | Full watershed polygon (WGS84) |
| watershed_<gauge_id>_map.png | Boundary map with gauge location marker |
Data source: USGS NLDI / NHDPlus via pynhd
Examples:
```python
# New study: create a named session
delineate_watershed('piscataquis-snowmelt-2020', gauge_id='01031500',
                    workspace_dir='/path/to/workspace')

# Gauge ID used directly as session_id (backward compatible)
delineate_watershed('01031500', workspace_dir='/path/to/workspace')
```
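Because the polygon is saved as standard GeoJSON in WGS84, it can be inspected with the standard library alone. The sketch below derives a bounding box from a GeoJSON object; the tiny polygon is made up for illustration, and the exact top-level type of the saved file (Feature, FeatureCollection, or bare geometry) is not stated here, so all three are handled.

```python
import json


def _positions(coords):
    """Yield (lon, lat) pairs from arbitrarily nested GeoJSON coordinates."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from _positions(part)


def geojson_bbox(doc):
    """Return (min_lon, min_lat, max_lon, max_lat) for a GeoJSON object."""
    if doc["type"] == "FeatureCollection":
        geometries = [feature["geometry"] for feature in doc["features"]]
    elif doc["type"] == "Feature":
        geometries = [doc["geometry"]]
    else:
        geometries = [doc]
    lons, lats = [], []
    for geom in geometries:
        for lon, lat in _positions(geom["coordinates"]):
            lons.append(lon)
            lats.append(lat)
    return min(lons), min(lats), max(lons), max(lats)


# Made-up polygon standing in for watershed_<gauge_id>.geojson
example = json.loads("""
{"type": "Feature", "properties": {},
 "geometry": {"type": "Polygon", "coordinates":
   [[[-69.5, 45.1], [-69.0, 45.1], [-69.0, 45.4], [-69.5, 45.4], [-69.5, 45.1]]]}}
""")
bbox = geojson_bbox(example)
```

For a real file, replace the inline string with `json.load` on the saved path.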
fetch_streamflow_data
Retrieve daily discharge time series from the USGS National Water Information System (NWIS).
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | str | Yes | Research session identifier |
| gauge_id | str | No | 8-digit USGS station number. Resolved from session.site_id if omitted (set by delineate_watershed). |
| start_date | str | Yes | ISO date, e.g. "2000-01-01" |
| end_date | str | Yes | ISO date, e.g. "2020-12-31" |
| interval | str | No | "daily" (default) or "hourly" |
Returns: Daily discharge array (m³/s), date range, record count, missing-data statistics.
Files saved automatically (when workspace_dir is set):
| File | Description |
|---|---|
| streamflow_<gauge_id>.json | Full time series with dates and discharge values |
| hydrograph_<gauge_id>.png | Daily hydrograph with 30-day rolling mean overlay |
Data source: USGS NWIS via dataretrieval
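To make the missing-data statistics and the 30-day rolling mean concrete, here is a small standard-library sketch. It assumes missing days are represented as None and uses a trailing window; whether the tool uses a trailing or centred window, and how it encodes gaps, is not stated in this page.

```python
def missing_stats(values):
    """Count missing (None) entries and the longest consecutive gap."""
    n_missing = longest = run = 0
    for v in values:
        if v is None:
            n_missing += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    fraction = n_missing / len(values) if values else 0.0
    return {"count": len(values), "missing": n_missing,
            "missing_fraction": fraction, "longest_gap": longest}


def rolling_mean(values, window=30):
    """Trailing rolling mean; None until a full window of valid values exists."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            out.append(None)
        else:
            out.append(sum(chunk) / window)
    return out


flows = [10.0, 12.0, None, None, 9.0, 8.0, 7.0]
stats = missing_stats(flows)
smooth = rolling_mean(flows, window=3)  # short window so the toy data shows a value
```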
extract_hydrological_signatures
Compute 15+ flow statistics from the session's streamflow record.
Requires: delineate_watershed (and ideally fetch_streamflow_data) to have been called for this session first.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | str | Yes | Research session identifier |
| start_date | str | No | Analysis start (default: "1989-10-01", CAMELS period) |
| end_date | str | No | Analysis end (default: "2009-09-30", CAMELS period) |
Signatures returned:
| Signature | Description |
|---|---|
| baseflow_index (BFI) | Fraction of streamflow from baseflow (Eckhardt filter) |
| runoff_ratio | Mean annual runoff / mean annual precipitation |
| q_mean | Mean daily discharge (mm/day) |
| q_cv | Coefficient of variation of daily discharge |
| q5, q95 | High-flow (5% exceedance) and low-flow (95% exceedance) |
| slope_fdc | Slope of FDC between Q33 and Q66 (flashiness indicator) |
| high_q_freq | Days per year above 9× median flow |
| low_q_freq | Days per year below 0.2× mean flow |
| high_q_dur | Mean duration of high-flow events (days) |
| low_q_dur | Mean duration of low-flow events (days) |
| zero_q_freq | Fraction of days with zero flow |
| hfd_mean | Half-flow date: day of year by which 50% of annual flow has passed |
| stream_elas | Streamflow elasticity to precipitation |
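Several of these signatures follow directly from their definitions in the table. The sketch below computes a handful of them from a list of daily flows with no gaps. It is an illustration, not the tool's implementation: the nearest-rank exceedance estimate, the population standard deviation, and the 365.25 days-per-year factor are all assumptions.

```python
import math
import statistics


def cms_to_mm_per_day(q_cms, area_km2):
    """Convert discharge in m³/s to runoff depth in mm/day over the basin."""
    return q_cms * 86.4 / area_km2


def exceedance_flow(flows, pct):
    """Flow equalled or exceeded pct% of the time (nearest rank)."""
    ranked = sorted(flows, reverse=True)
    idx = max(0, math.ceil(pct * len(ranked) / 100) - 1)
    return ranked[idx]


def basic_signatures(flows, days_per_year=365.25):
    """Compute a few signatures from gap-free daily flows."""
    mean_q = statistics.fmean(flows)
    median_q = statistics.median(flows)
    n_years = len(flows) / days_per_year
    return {
        "q_mean": mean_q,
        "q_cv": statistics.pstdev(flows) / mean_q,
        "q5": exceedance_flow(flows, 5),    # high flow
        "q95": exceedance_flow(flows, 95),  # low flow
        "high_q_freq": sum(q > 9 * median_q for q in flows) / n_years,
        "low_q_freq": sum(q < 0.2 * mean_q for q in flows) / n_years,
        "zero_q_freq": sum(q == 0 for q in flows) / len(flows),
    }


# Toy record: 18 ordinary days, one flood day, one dry day
flows = [1.0] * 18 + [20.0, 0.0]
sig = basic_signatures(flows)
```

`cms_to_mm_per_day` is included because q_mean is reported in mm/day, which requires the watershed area from the delineation step.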
Files saved automatically (when workspace_dir is set):
| File | Description |
|---|---|
| signatures_<session_id>.json | All computed signatures with metadata |
| fdc_<session_id>.png | Log-scale flow duration curve + signature summary table |
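The baseflow_index row in the signature table names the Eckhardt filter. As a rough sketch of that two-parameter recursive filter: the defaults below (recession constant 0.98, maximum BFI 0.80) are commonly cited values, and the initial condition is an assumption; the parameters actually used by the tool are not documented on this page.

```python
def eckhardt_baseflow(flows, alpha=0.98, bfi_max=0.80):
    """Eckhardt two-parameter recursive digital filter.

    alpha is the recession constant and bfi_max the maximum baseflow index.
    """
    baseflow = [flows[0]]  # assumption: treat the first value as all baseflow
    for q in flows[1:]:
        b = ((1 - bfi_max) * alpha * baseflow[-1]
             + (1 - alpha) * bfi_max * q) / (1 - alpha * bfi_max)
        baseflow.append(min(b, q))  # baseflow cannot exceed total flow
    return baseflow


def baseflow_index(flows, **kwargs):
    """Ratio of filtered baseflow volume to total flow volume."""
    total = sum(flows)
    if not total:
        return float("nan")
    return sum(eckhardt_baseflow(flows, **kwargs)) / total


flows = [5.0, 5.0, 20.0, 10.0, 6.0, 5.0]
base = eckhardt_baseflow(flows)
bfi = baseflow_index(flows)
```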
extract_geomorphic_parameters
Compute 28 basin morphometry metrics from the watershed geometry and DEM.
Requires: delineate_watershed to have been called for this session first.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | str | Yes | Research session identifier |
| dem_resolution | int | No | DEM resolution in metres (default: 30) |
Selected parameters:
| Parameter | Description |
|---|---|
| area_km2 | Watershed area |
| perimeter_km | Watershed perimeter |
| mean_elevation_m | Basin-averaged elevation (3DEP) |
| mean_slope_deg | Basin-averaged slope |
| relief_m | Max − min elevation |
| elongation_ratio | Diameter of a circle with the same area as the basin, divided by basin length; values near 1 indicate a compact, near-circular basin |
| drainage_density | Total stream length / area |
| form_factor | Area / (main channel length²) |
Data source: 3DEP 10m DEM via py3dep
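The shape metrics in the table reduce to simple ratios once area, length, and elevation range are known. The helper below is a hypothetical illustration of those formulas; as a simplification it uses one length value for both elongation_ratio and form_factor, whereas the table ties form_factor to main channel length.

```python
import math


def shape_metrics(area_km2, length_km, stream_length_km, z_min_m, z_max_m):
    """Basic morphometry ratios from scalar basin properties."""
    equivalent_diameter_km = 2 * math.sqrt(area_km2 / math.pi)
    return {
        "relief_m": z_max_m - z_min_m,
        "elongation_ratio": equivalent_diameter_km / length_km,
        "form_factor": area_km2 / length_km ** 2,
        "drainage_density": stream_length_km / area_km2,  # km per km²
    }


# A circular basin of radius 10 km, measured across its 20 km diameter
metrics = shape_metrics(
    area_km2=math.pi * 100,
    length_km=20.0,
    stream_length_km=math.pi * 50,
    z_min_m=120.0,
    z_max_m=870.0,
)
```

For this circular test basin the elongation ratio comes out at 1, the upper end of its usual range; elongated basins give smaller values.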
compute_twi
Compute the Topographic Wetness Index (TWI) raster from the 3DEP DEM.
TWI = ln(α / tan(β)) where α is specific catchment area and β is local slope.
Requires: delineate_watershed to have been called for this session first.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | str | Yes | Research session identifier |
| resolution | int | No | DEM resolution in metres (default: 30) |
| create_map | bool | No | Generate PNG + interactive HTML map (default: True) |
Returns: TWI statistics (mean, std, percentiles, high-wetness area fraction). Raster and map files written to workspace.
Data source: 3DEP via py3dep + xrspatial
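The TWI formula given above can be evaluated per cell as in the sketch below. The minimum-slope floor (to avoid dividing by zero on flat cells) and the conversion from a flow-accumulation cell count to specific catchment area are assumptions for illustration; conventions differ on whether the accumulation count includes the cell itself.

```python
import math


def specific_catchment_area(n_upslope_cells, cell_size_m):
    """Upslope area per unit contour width (m), counting the cell itself."""
    return (n_upslope_cells + 1) * cell_size_m ** 2 / cell_size_m


def twi(alpha_m, slope_deg, min_slope_deg=0.1):
    """TWI = ln(alpha / tan(beta)) for one cell, with a floor on the slope."""
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(alpha_m / math.tan(beta))


alpha = specific_catchment_area(n_upslope_cells=99, cell_size_m=30)  # 3000 m
wet_valley = twi(alpha, slope_deg=1.0)
dry_ridge = twi(specific_catchment_area(0, 30), slope_deg=25.0)
```

Large contributing area and gentle slope both raise the index, so valley-bottom cells score higher than ridge cells.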
create_cn_grid
Generate an NRCS Curve Number grid by combining NLCD land cover and Polaris soil texture data.
Requires: delineate_watershed to have been called for this session first.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | str | Yes | Research session identifier |
| year | int | No | NLCD land cover year (default: 2019) |
| resolution | int | No | Grid resolution in metres (default: 30) |
| create_map | bool | No | Generate PNG + interactive HTML map (default: True) |
Returns: Mean CN, area-weighted CN statistics by land cover class, soil group percentages.
Data sources: NLCD (land cover) + Polaris (soil texture) via pygeohydro
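Conceptually, a CN grid is a lookup on (land cover class, hydrologic soil group) followed by area weighting. The sketch below shows that idea with a three-class table. The CN values follow commonly cited NRCS TR-55 figures for good hydrologic condition, but the mapping from NLCD classes to TR-55 cover types is an assumption, and the step that derives soil groups from Polaris texture is omitted; the tool's own lookup may differ.

```python
# Illustrative lookup: (NLCD class, hydrologic soil group) -> curve number.
# The NLCD-to-cover-type pairing is an assumption for this sketch.
CN_LOOKUP = {
    # 41 deciduous forest, treated as woods in good condition
    (41, "A"): 30, (41, "B"): 55, (41, "C"): 70, (41, "D"): 77,
    # 81 pasture/hay, treated as pasture in good condition
    (81, "A"): 39, (81, "B"): 61, (81, "C"): 74, (81, "D"): 80,
    # 82 cultivated crops, treated as straight-row crops in good condition
    (82, "A"): 67, (82, "B"): 78, (82, "C"): 85, (82, "D"): 89,
}


def mean_cn(cells):
    """Area-weighted mean CN from (nlcd_class, soil_group, area) tuples."""
    total_area = sum(area for _, _, area in cells)
    weighted = sum(CN_LOOKUP[(lc, hsg)] * area for lc, hsg, area in cells)
    return weighted / total_area


# Hypothetical basin: 60% forest on B soils, 30% pasture on C, 10% crops on D
cells = [(41, "B", 60.0), (81, "C", 30.0), (82, "D", 10.0)]
basin_cn = mean_cn(cells)
```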
fetch_forcing_data
Retrieve basin-averaged daily climate data from the GridMET dataset (CONUS only).
Requires: delineate_watershed to have been called for this session first.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | str | Yes | Research session identifier |
| start_date | str | Yes | ISO date |
| end_date | str | Yes | ISO date |
| variables | list[str] | No | Subset of GridMET variables (default: all). Options: pr, tmmx, tmmn, srad, vs, rmax, rmin, pet, erc |
Variables returned: precipitation, tmax, tmin, reference ET (PET), solar radiation, wind speed, humidity.
Data source: GridMET via pygridmet
Note: GridMET covers the contiguous United States (CONUS) only. For other regions, retrieve forcing data via mcp_python using ERA5, MSWEP, or other global datasets and store in the session.
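Before requesting forcing data it can help to check the inputs locally. The sketch below validates a variable subset against the options listed in the parameter table and tests a watershed bounding box against a rough CONUS envelope. Both helpers are hypothetical, and the envelope coordinates are approximate values chosen for a quick pre-check, not the exact GridMET grid extent.

```python
GRIDMET_OPTIONS = ("pr", "tmmx", "tmmn", "srad", "vs", "rmax", "rmin", "pet", "erc")

# Rough CONUS envelope (west, south, east, north) in WGS84 degrees; approximate.
CONUS_BBOX = (-125.0, 24.0, -66.0, 50.0)


def check_variables(variables=None):
    """Return the requested variables, or all options when none are given."""
    if variables is None:
        return list(GRIDMET_OPTIONS)
    unknown = [v for v in variables if v not in GRIDMET_OPTIONS]
    if unknown:
        raise ValueError(f"Unknown GridMET variables: {unknown}")
    return list(variables)


def bbox_within_conus(bbox, conus=CONUS_BBOX):
    """True if (west, south, east, north) lies inside the rough envelope."""
    west, south, east, north = bbox
    return (conus[0] <= west and conus[1] <= south
            and east <= conus[2] and north <= conus[3])


selected = check_variables(["pr", "tmmx", "tmmn"])
maine_ok = bbox_within_conus((-69.5, 45.1, -69.0, 45.4))
delhi_ok = bbox_within_conus((77.0, 28.4, 77.4, 28.9))
```

A failed envelope check is a signal to fall back to a global dataset via mcp_python, as the note above describes.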
get_library_reference
Look up field-name gotchas, API quirks, unit assumptions, and copy-paste code patterns for a core hydrological Python library.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| library | str | Yes | Library name (case-insensitive). See supported libraries below. |
Supported libraries:
| Library | Purpose |
|---|---|
| pynhd | NLDI watershed polygons and NHD data |
| pygeohydro | USGS NWIS streamflow and NLCD land cover |
| pygridmet | GridMET daily climate (precipitation, temperature) |
| py3dep | 3DEP elevation (DEM) access |
| hydrofunctions | Simple NWIS streamflow client |
| pysheds | DEM-based flow direction, accumulation, TWI |
| rasterio | Raster I/O, masking, reprojection |
| xarray | N-dimensional labeled arrays for gridded data |
When to use: Call this before writing any Python script that uses one of these libraries. It returns the exact field names, CRS requirements, unit conventions, and common mistakes — preventing hallucination errors in generated scripts.
Returns: library, purpose, field_mappings (function-level notes), gotchas (list), common_patterns (copy-paste snippets).
Community plugins can extend this via the aihydro.knowledge entry point — see Plugin Guide.
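To see which plugins are registered under that entry point group in the current environment, the standard library's `importlib.metadata` can list them. This sketch only enumerates entry point names; what each plugin must provide is defined in the Plugin Guide and is not assumed here. The helper name is illustrative.

```python
from importlib.metadata import entry_points


def list_knowledge_plugins(group="aihydro.knowledge"):
    """Return the sorted names of installed entry points in the given group."""
    eps = entry_points()
    if hasattr(eps, "select"):   # Python 3.10 and later
        selected = eps.select(group=group)
    else:                        # Python 3.8 and 3.9 return a dict of lists
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


plugins = list_knowledge_plugins()
```

An empty list simply means no community plugin is installed alongside the built-in library references.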