pywatershed.analysis.time_stats

Time series statistical functions for pywatershed analysis.

This module provides statistical functions designed to work with xarray DataArrays, particularly for hydrological time series analysis. These functions can be used with CustomOutput or independently for analysis.

Except for the factory percentile_resamp_enclosing, which returns a callable, all functions accept an xarray DataArray and return an xarray DataArray, preserving coordinates and metadata where appropriate.

Examples

>>> import xarray as xr
>>> from pywatershed.analysis.time_stats import (
...     mean,
...     seven_day_mean_water_year_min,
... )
>>>
>>> # Load some streamflow data
>>> flow_data = xr.open_dataset("streamflow.nc")["seg_outflow"]
>>>
>>> # Calculate mean over time
>>> mean_flow = mean(flow_data, dim="time")
>>>
>>> # Calculate 7-day low flows for each water year
>>> low_flows = seven_day_mean_water_year_min(flow_data)
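
To try the functions without a NetCDF file on disk, a synthetic daily series can stand in for `flow_data`. The sketch below is only an illustration: the sinusoidal shape, the `cfs` units, and the date range are assumptions, not values taken from a pywatershed dataset.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two full water years of daily steps (Oct 1, 2000 through Sep 30, 2002).
time = pd.date_range("2000-10-01", "2002-09-30", freq="D")

# Hypothetical flow: a constant baseflow plus a seasonal sine wave.
day_of_year = time.dayofyear.to_numpy()
values = 10.0 + 5.0 * np.sin(2.0 * np.pi * day_of_year / 365.25)

flow_data = xr.DataArray(
    values,
    dims=["time"],
    coords={"time": time},
    name="seg_outflow",
    attrs={"units": "cfs"},  # assumed units, for illustration only
)
```

A DataArray built this way has the single `time` dimension that the resampling and rolling functions on this page expect.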

Functions

max_5day(da)

Calculate maximum resampled to 5-day periods.

max_yearly(da)

Calculate maximum resampled to yearly frequency.

mean(da[, dim, skipna, keep_attrs])

Calculate mean along specified dimension.

mean_monthly(da)

Calculate mean resampled to monthly frequency.

median(da[, dim, skipna, keep_attrs])

Calculate median along specified dimension.

median_by_month(da)

Calculate median grouped by month.

median_monthly(da)

Calculate median resampled to monthly frequency.

percentile_resamp_enclosing(q, freq)

Create a percentile function with specific quantile and resample frequency.

rolling_mean_seasonal_min_max(da[, ...])

Calculate seasonal extremes of rolling mean.

seasonal_min_max(da[, seasons, stat_name])

Calculate seasonal extremes with dates of occurrence.

seven_day_mean_calendar_year_max(da[, ...])

Calendar year 7-day high flow statistic.

seven_day_mean_calendar_year_min(da[, ...])

Calendar year 7-day low flow statistic.

seven_day_mean_water_year_max(da[, ...])

Water year 7-day high flow statistic.

seven_day_mean_water_year_min(da[, ...])

Water year 7-day low flow statistic.

std(da[, dim, skipna, keep_attrs])

Calculate standard deviation along specified dimension.

pywatershed.analysis.time_stats.max_5day(da)

Calculate maximum resampled to 5-day periods.

Resamples data to 5-day periods and calculates the maximum for each period. Useful for identifying short-term peaks.

Parameters:

da (xr.DataArray) – Input data array with time dimension

Return type:

DataArray

Returns:

xr.DataArray – Maximum values for each 5-day period

Examples

>>> # Get peak flow for each 5-day period
>>> five_day_peaks = max_5day(streamflow_data)
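
As a rough illustration of what a 5-day resampled maximum produces, here is a plain-xarray sketch. It assumes that `max_5day` behaves like `resample(time="5D").max()`, which the description suggests but this page does not confirm; the ten-value `streamflow_data` is made up.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Ten days of hypothetical daily flow with values 0..9.
time = pd.date_range("2001-01-01", periods=10, freq="D")
streamflow_data = xr.DataArray(
    np.arange(10.0), dims=["time"], coords={"time": time}
)

# Assumed equivalent of max_5day: bin the series into consecutive
# 5-day periods and keep the largest value in each bin.
five_day_peaks = streamflow_data.resample(time="5D").max()
```

Here the ten daily values collapse to two, one per 5-day bin, so the output time axis is shorter than the input.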

pywatershed.analysis.time_stats.max_yearly(da)

Calculate maximum resampled to yearly frequency.

Resamples data to year start frequency and calculates the maximum for each year. Returns annual peak values.

Parameters:

da (xr.DataArray) – Input data array with time dimension

Return type:

DataArray

Returns:

xr.DataArray – Annual maximum values

Examples

>>> # Get annual peak flow
>>> annual_peaks = max_yearly(streamflow_data)
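
The phrase "year start frequency" refers to how each result is labeled. The sketch below uses plain xarray with the pandas `"YS"` alias, on the assumption that `max_yearly` does something comparable; the input data are synthetic.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two calendar years of hypothetical, steadily increasing daily values.
time = pd.date_range("2001-01-01", "2002-12-31", freq="D")
streamflow_data = xr.DataArray(
    np.arange(time.size, dtype=float), dims=["time"], coords={"time": time}
)

# "YS" bins by calendar year and labels each bin with January 1 of that year.
annual_peaks = streamflow_data.resample(time="YS").max()
```

Each annual value carries the timestamp of the start of its year, not the date on which the peak occurred.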

pywatershed.analysis.time_stats.mean(da, dim=None, *, skipna=None, keep_attrs=None, **kwargs)

Calculate mean along specified dimension.

Parameters:
  • da (xr.DataArray) – Input data array

  • dim (str, optional) – Dimension(s) over which to calculate mean

  • skipna (bool, optional) – Whether to skip NaN values

  • keep_attrs (bool, optional) – Whether to preserve attributes

  • **kwargs – Additional keyword arguments passed to xarray mean

Return type:

DataArray

Returns:

xr.DataArray – Mean values
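
The effect of `dim`, `skipna`, and `keep_attrs` can be seen with the underlying xarray reduction. This sketch calls `DataArray.mean` directly, assuming the wrapper forwards these arguments unchanged; the `nsegment` dimension name and the values are hypothetical.

```python
import numpy as np
import xarray as xr

# Hypothetical array: 3 time steps by 2 segments, with one missing value.
da = xr.DataArray(
    [[1.0, 2.0], [np.nan, 4.0], [3.0, 6.0]],
    dims=["time", "nsegment"],
    attrs={"units": "cfs"},
)

# Reduce over time only; NaNs are skipped and attributes are kept.
mean_skip = da.mean(dim="time", skipna=True, keep_attrs=True)

# With skipna=False, a single NaN propagates into the result.
mean_strict = da.mean(dim="time", skipna=False)
```

The reduced dimension disappears from the result, while `nsegment` is preserved.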

pywatershed.analysis.time_stats.mean_monthly(da)

Calculate mean resampled to monthly frequency.

Resamples the data to month-start frequency and calculates the mean for each month. Returns a time series with one value per month.

Parameters:

da (xr.DataArray) – Input data array with time dimension

Return type:

DataArray

Returns:

xr.DataArray – Monthly mean values

Examples

>>> # Get mean flow for each month in the time series
>>> monthly_means = mean_monthly(streamflow_data)

pywatershed.analysis.time_stats.median(da, dim=None, *, skipna=None, keep_attrs=None, **kwargs)

Calculate median along specified dimension.

Parameters:
  • da (xr.DataArray) – Input data array

  • dim (str, optional) – Dimension(s) over which to calculate median

  • skipna (bool, optional) – Whether to skip NaN values

  • keep_attrs (bool, optional) – Whether to preserve attributes

  • **kwargs – Additional keyword arguments passed to xarray median

Return type:

DataArray

Returns:

xr.DataArray – Median values

pywatershed.analysis.time_stats.median_by_month(da)

Calculate median grouped by month.

Groups data by calendar month and calculates the median for each month across all years. Useful for understanding seasonal patterns.

Parameters:

da (xr.DataArray) – Input data array with time dimension

Return type:

DataArray

Returns:

xr.DataArray – Median values for each calendar month (1-12)

Examples

>>> # Get median flow for each month across all years
>>> monthly_median = median_by_month(streamflow_data)

pywatershed.analysis.time_stats.median_monthly(da)

Calculate median resampled to monthly frequency.

Resamples the data to month-start frequency and calculates the median for each month. Returns a time series with one value per month.

Parameters:

da (xr.DataArray) – Input data array with time dimension

Return type:

DataArray

Returns:

xr.DataArray – Monthly median values

Examples

>>> # Get median flow for each month in the time series
>>> monthly_series = median_monthly(streamflow_data)
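
The difference between `median_monthly` (resampling) and `median_by_month` (grouping) is easiest to see in the shape of the output. The following plain-xarray sketch assumes the two functions correspond to `resample(time="MS")` and `groupby("time.month")` respectively; the random input is synthetic.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two calendar years of hypothetical daily data with a fixed seed.
time = pd.date_range("2001-01-01", "2002-12-31", freq="D")
streamflow_data = xr.DataArray(
    np.random.default_rng(0).gamma(2.0, 5.0, time.size),
    dims=["time"],
    coords={"time": time},
)

# Resampling keeps a time axis: one value per month in the record (24 here).
monthly_series = streamflow_data.resample(time="MS").median()

# Grouping pools each calendar month across years: 12 values on a "month" axis.
monthly_climatology = streamflow_data.groupby("time.month").median()
```

Resampling suits questions about a particular month of a particular year, while grouping gives a seasonal climatology.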

pywatershed.analysis.time_stats.percentile_resamp_enclosing(q, freq)

Create a percentile function with specific quantile and resample frequency.

This factory function returns a closure that calculates percentiles over resampled time periods.

Parameters:
  • q (float) – Quantile to compute, value between 0 and 1 (e.g., 0.95 for 95th percentile)

  • freq (str) – Resampling frequency string (e.g., ‘1D’ for daily, ‘1MS’ for monthly, ‘5D’ for 5-day periods, ‘1YS’ for yearly)

Return type:

callable

Returns:

callable – A function that takes an xr.DataArray and returns percentile values resampled at the specified frequency

Examples

>>> # Create a function for 95th percentile by month
>>> p95_monthly = percentile_resamp_enclosing(q=0.95, freq="1MS")
>>> result = p95_monthly(streamflow_data)
>>>
>>> # Create a function for 10th percentile by year
>>> p10_yearly = percentile_resamp_enclosing(q=0.10, freq="1YS")
>>> result = p10_yearly(streamflow_data)
>>>
>>> # Create a dictionary of percentile functions for common quantiles
>>> quantiles = [1, 5, 10, 25, 50, 90, 95, 99]
>>> percentile_funcs = {
...     f"q{qq}": percentile_resamp_enclosing(q=qq / 100, freq="1MS")
...     for qq in quantiles
... }
>>> monthly_median = percentile_funcs["q50"](streamflow_data)
>>> monthly_95th = percentile_funcs["q95"](streamflow_data)
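
The factory-and-closure pattern described above can be sketched in a few lines. `make_percentile_resampler` is a hypothetical stand-in, not the pywatershed implementation; it assumes the returned function simply resamples along `time` and takes a quantile.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_percentile_resampler(q, freq):
    """Hypothetical stand-in for percentile_resamp_enclosing."""

    def percentile(da):
        # q and freq are captured from the enclosing scope when the
        # factory is called, so each returned function keeps its own values.
        return da.resample(time=freq).quantile(q)

    return percentile


# Three months of hypothetical, steadily increasing daily values.
time = pd.date_range("2001-01-01", "2001-03-31", freq="D")
streamflow_data = xr.DataArray(
    np.arange(time.size, dtype=float), dims=["time"], coords={"time": time}
)

p50_monthly = make_percentile_resampler(q=0.5, freq="1MS")
result = p50_monthly(streamflow_data)
```

Because each call to the factory binds its own `q` and `freq`, building many functions in a loop or comprehension, as in the dictionary example above, does not run into Python's late-binding behavior for closures.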

pywatershed.analysis.time_stats.rolling_mean_seasonal_min_max(da, time_window=7, min_periods=None, center=True, seasons='ONDJFMAMJJAS', stat_name='max')

Calculate seasonal extremes of rolling mean.

Computes a rolling mean, then finds the seasonal minimum or maximum of that rolling mean. Used for hydrological statistics such as 7-day low and high flows.

Parameters:
  • da (xr.DataArray) – Input data array with time dimension

  • time_window (int, default 7) – Size of the rolling window in time steps (typically days)

  • min_periods (int, optional) – Minimum number of observations required to have a value

  • center (bool, default True) – Whether to center the rolling window

  • seasons (str or list, default "ONDJFMAMJJAS") – Season string(s) defining period(s). Default: water year (Oct-Sep). Use “JFMAMJJASOND” for calendar year (Jan-Dec). Can be a list for multiple seasons.

  • stat_name (str, default "max") – Statistic to calculate: “max” or “min”

Return type:

DataArray

Returns:

xr.DataArray – Seasonal extreme values of the rolling mean, with the dates on which they occurred

See also

seven_day_mean_calendar_year_max

Calendar year 7-day high flows

seven_day_mean_water_year_min

Water year 7-day low flows
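
The two-step computation (rolling mean first, seasonal extreme second) can be sketched with plain xarray. This is not the library's code: the water-year labeling by ending year and the synthetic `flow` series are assumptions, and the dates of occurrence that the real function returns are omitted here.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two water years of hypothetical daily flow.
time = pd.date_range("2000-10-01", "2002-09-30", freq="D")
flow = xr.DataArray(
    10.0 + 5.0 * np.sin(2.0 * np.pi * time.dayofyear.to_numpy() / 365.25),
    dims=["time"],
    coords={"time": time},
)

# Step 1: centered 7-day rolling mean. With min_periods left at its default,
# the first and last 3 steps are NaN because their windows are incomplete.
rolled = flow.rolling(time=7, center=True).mean()

# Step 2: label each step with a water year (Oct-Sep, named for the year
# in which it ends; this labeling is an assumption).
is_oct_to_dec = (flow["time"].dt.month >= 10).astype(int)
water_year = flow["time"].dt.year + is_oct_to_dec
rolled = rolled.assign_coords(water_year=("time", water_year.values))

# Step 3: take the extreme of the rolling mean within each water year.
seven_day_min = rolled.groupby("water_year").min()
```

Setting `min_periods` below the window size would fill the edge values at the cost of averaging over fewer steps there.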

pywatershed.analysis.time_stats.seasonal_min_max(da, seasons='ONDJFMAMJJAS', stat_name='max')

Calculate seasonal extremes with dates of occurrence.

Finds the minimum or maximum over custom seasons and returns both the values and the dates on which the extremes occurred.

Parameters:
  • da (xr.DataArray) – Input data array with time dimension

  • seasons (str or list, default "ONDJFMAMJJAS") – Season string(s) defining period(s). Default: water year (Oct-Sep). Use “JFMAMJJASOND” for calendar year (Jan-Dec). Can be a list for multiple seasons.

  • stat_name (str, default "max") – Statistic to calculate: “max” or “min”

Return type:

DataArray

Returns:

xr.DataArray – Seasonal extreme values with time coordinate replaced by dates when extremes occurred

Examples

>>> # Water year maximum
>>> seasonal_max = seasonal_min_max(flow_data, stat_name="max")
>>> # Calendar year minimum
>>> seasonal_min = seasonal_min_max(
...     flow_data, seasons="JFMAMJJASOND", stat_name="min"
... )
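
To show what "time coordinate replaced by dates when extremes occurred" can look like, here is a plain-xarray sketch. It groups by calendar year for simplicity and is not the pywatershed implementation; the two artificial peaks are placed at known dates so the result is easy to check.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two calendar years of zeros with one hypothetical peak per year.
time = pd.date_range("2001-01-01", "2002-12-31", freq="D")
values = np.zeros(time.size)
values[100] = 50.0  # 2001-04-11
values[500] = 80.0  # 2002-05-16
flow_data = xr.DataArray(values, dims=["time"], coords={"time": time})

# Extreme value and the date on which it occurred, per calendar year.
yearly = flow_data.groupby("time.year")
peak_values = yearly.max()
peak_dates = yearly.map(lambda group: group.idxmax("time"))

# Re-index the extremes by their dates of occurrence.
result = xr.DataArray(
    peak_values.values, dims=["time"], coords={"time": peak_dates.values}
)
```

The resulting time axis is irregular, since each entry sits on the day its extreme was observed.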

pywatershed.analysis.time_stats.seven_day_mean_calendar_year_max(da, time_window=7, min_periods=None, center=True)

Calendar year 7-day high flow statistic.

Computes highest 7-day average for each calendar year (Jan-Dec). Common for characterizing annual high flow conditions.

Parameters:
  • da (xr.DataArray) – Input data array with time dimension (typically daily streamflow)

  • time_window (int, default 7) – Size of the rolling window in days

  • min_periods (int, optional) – Minimum number of observations required to have a value

  • center (bool, default True) – Whether to center the rolling window

Return type:

DataArray

Returns:

xr.DataArray – Annual maximum 7-day mean values with dates when they occurred

Examples

>>> # Calculate 7-day high flows for each calendar year
>>> high_flows = seven_day_mean_calendar_year_max(streamflow_data)

pywatershed.analysis.time_stats.seven_day_mean_calendar_year_min(da, time_window=7, min_periods=None, center=True)

Calendar year 7-day low flow statistic.

Computes the lowest 7-day average for each calendar year (Jan-Dec). Related to the “7Q10” statistic used for water quality standards.

Parameters:
  • da (xr.DataArray) – Input data array with time dimension (typically daily streamflow)

  • time_window (int, default 7) – Size of the rolling window in days

  • min_periods (int, optional) – Minimum number of observations required to have a value

  • center (bool, default True) – Whether to center the rolling window

Return type:

DataArray

Returns:

xr.DataArray – Annual minimum 7-day mean values with dates when they occurred

Notes

The 7-day low flow is an important metric for water quality regulations and ecological flow requirements. The “7Q10” statistic (the lowest 7-day average flow with a 10-year recurrence interval) is commonly used as a design flow for wastewater discharge permits.

Examples

>>> # Calculate 7-day low flows for each calendar year
>>> low_flows = seven_day_mean_calendar_year_min(streamflow_data)
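
The relationship between annual 7-day minima and 7Q10 mentioned in the Notes can be illustrated with a crude empirical estimate. This sketch is not part of pywatershed and is not a regulatory method: formal 7Q10 estimates usually fit a probability distribution to the annual minima, whereas the code below simply takes their 10th percentile. The lognormal input is synthetic.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Twenty calendar years of hypothetical daily flow (fixed seed).
time = pd.date_range("1981-01-01", "2000-12-31", freq="D")
rng = np.random.default_rng(42)
streamflow_data = xr.DataArray(
    rng.lognormal(mean=2.0, sigma=0.5, size=time.size),
    dims=["time"],
    coords={"time": time},
)

# Annual minimum of the centered 7-day rolling mean, one value per year.
rolled = streamflow_data.rolling(time=7, center=True).mean()
annual_7day_min = rolled.groupby("time.year").min()

# Empirical stand-in for 7Q10: the level that annual 7-day minima fall
# below in roughly 1 year out of 10.
q7_10_empirical = float(annual_7day_min.quantile(0.1))
```

With only a couple of decades of data, an empirical percentile like this is sensitive to individual dry years.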

pywatershed.analysis.time_stats.seven_day_mean_water_year_max(da, time_window=7, min_periods=None, center=True)

Water year 7-day high flow statistic.

Computes highest 7-day average for each water year (Oct-Sep). Common in hydrological analysis and ecological flow assessments.

Parameters:
  • da (xr.DataArray) – Input data array with time dimension (typically daily streamflow)

  • time_window (int, default 7) – Size of the rolling window in days

  • min_periods (int, optional) – Minimum number of observations required to have a value

  • center (bool, default True) – Whether to center the rolling window

Return type:

DataArray

Returns:

xr.DataArray – Water year maximum 7-day mean values with dates when they occurred

Notes

The water year runs from October 1 through September 30, which better captures the hydrologic cycle in snow-dominated watersheds.

Examples

>>> # Calculate 7-day high flows for each water year
>>> high_flows = seven_day_mean_water_year_max(streamflow_data)
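
The October 1 boundary means that dates late in a calendar year belong to the following water year. The snippet below shows one common way to assign water-year labels, naming each water year for the calendar year in which it ends; whether pywatershed labels its output this way is an assumption.

```python
import numpy as np
import pandas as pd

# Dates on either side of the water-year and calendar-year boundaries.
dates = pd.to_datetime(["2001-09-30", "2001-10-01", "2001-12-31", "2002-01-01"])
months = dates.month.to_numpy()
years = dates.year.to_numpy()

# October through December roll forward into the next water year.
water_year = np.where(months >= 10, years + 1, years)
```

Under this convention, September 30, 2001 falls in water year 2001, while the other three dates fall in water year 2002.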

pywatershed.analysis.time_stats.seven_day_mean_water_year_min(da, time_window=7, min_periods=None, center=True)

Water year 7-day low flow statistic.

Computes lowest 7-day average for each water year (Oct-Sep). Used for low flow analysis and drought assessment.

Parameters:
  • da (xr.DataArray) – Input data array with time dimension (typically daily streamflow)

  • time_window (int, default 7) – Size of the rolling window in days

  • min_periods (int, optional) – Minimum number of observations required to have a value

  • center (bool, default True) – Whether to center the rolling window

Return type:

DataArray

Returns:

xr.DataArray – Water year minimum 7-day mean values with dates when they occurred

Notes

The water year runs from October 1 through September 30. Using the water year for low flow analysis is common in regions where summer low flows are the critical period for aquatic ecosystems.

Examples

>>> # Calculate 7-day low flows for each water year
>>> low_flows = seven_day_mean_water_year_min(streamflow_data)

pywatershed.analysis.time_stats.std(da, dim=None, *, skipna=None, keep_attrs=None, **kwargs)

Calculate standard deviation along specified dimension.

Parameters:
  • da (xr.DataArray) – Input data array

  • dim (str, optional) – Dimension(s) over which to calculate standard deviation

  • skipna (bool, optional) – Whether to skip NaN values

  • keep_attrs (bool, optional) – Whether to preserve attributes

  • **kwargs – Additional keyword arguments passed to xarray std

Return type:

DataArray

Returns:

xr.DataArray – Standard deviation values
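
One practical use of the extra keyword arguments is choosing between population and sample standard deviation. The sketch below calls xarray's `DataArray.std` directly and assumes the wrapper would forward `ddof` through `**kwargs` in the same way; the eight values are made up.

```python
import xarray as xr

# Hypothetical values with mean 5 and sum of squared deviations 32.
da = xr.DataArray([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], dims=["time"])

# Population standard deviation (ddof=0 is the xarray default).
std_population = da.std(dim="time")

# Sample standard deviation: divide by N - 1 instead of N.
std_sample = da.std(dim="time", ddof=1)
```

For long daily records the two differ very little, but for short annual series the choice of `ddof` can matter.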