pywatershed.base.DatasetDict

class pywatershed.base.DatasetDict(dims=None, coords=None, data_vars=None, metadata=None, encoding=None, validate=True)

DatasetDict: a data model following NetCDF-like conventions

This is the core class in the data model adopted by pywatershed.

The DatasetDict handles dimensions, coordinates, data, and metadata in a manner similar to NetCDF and xarray, and provides invertible mappings to and from both the netCDF4 and xarray Python packages.

Where metadata is typically stored on each variable in NetCDF and in xarray, a DatasetDict maintains metadata in a dictionary collocated with the coordinate and data variables. The data model is a DatasetDict with dims, coords, data_vars, and metadata keys. The dims track the length of each dimension. The coords are the discrete locations along dimensions or sets of dimensions. The data_vars contain the data located on dims and coords. The metadata describes the relationship between the coords and data_vars and their dims.

Together, the coords and data_vars are the variables of the DatasetDict. Every key in the variables must be present in the metadata dictionary, and each such metadata entry contains two further keys: dims and attrs. The dims entry is a tuple of the variable’s dimension names, and attrs holds more general attributes.
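The layout described above can be sketched with plain dictionaries. This is only an illustration of the four keys and of the rule that every variable has a metadata entry; it does not construct a DatasetDict, and the variable names and values are made up for the example.

```python
import numpy as np

# Plain-dict stand-in for the four keys described above (not a DatasetDict).
model = {
    "dims": {"ntime": 2, "nspace": 3},
    "coords": {
        "time": np.array(["2005-02-01", "2005-02-02"], dtype="datetime64[D]"),
        "space": np.arange(3),
    },
    "data_vars": {"precip": np.zeros((2, 3))},
    "metadata": {
        "time": {"dims": ("ntime",), "attrs": {"description": "days"}},
        "space": {"dims": ("nspace",), "attrs": {}},
        "precip": {"dims": ("ntime", "nspace"), "attrs": {"units": "mm/day"}},
    },
}

# The variables are the coords and the data_vars taken together.
variables = {**model["coords"], **model["data_vars"]}

# Every variable needs a metadata entry holding the "dims" and "attrs" keys.
for name in variables:
    entry = model["metadata"][name]
    assert {"dims", "attrs"} <= set(entry)

# Metadata lives beside the data, so it is looked up by variable name.
precip_dims = model["metadata"]["precip"]["dims"]
precip_units = model["metadata"]["precip"]["attrs"]["units"]
```

Keeping metadata in its own dictionary means the arrays themselves stay plain np.ndarray objects, which is the main difference from the per-variable attributes used by NetCDF and xarray.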

When a NetCDF file is read from disk, its encoding properties may be carried along. Alternatively, encodings may be specified before writing to file.

Parameters:
  • dims (Optional[dict[int]]) – A dictionary of pairs of dim_names: dim_len where dim_len is an integer value.

  • coords (Optional[dict]) – A dictionary of pairs of coord_names: coord_data where coord_data is an np.ndarray.

  • data_vars (Optional[dict]) – A dictionary of pairs of var_names: var_data where var_data is an np.ndarray.

  • metadata (Optional[dict]) –

    For all names in coords and data_vars, metadata entries with the required fields:

    • dims: tuple of names in dims,

    • attrs: dictionary whose values may be strings, ints, floats

    The metadata argument may also contain a special global key paired with a dictionary of global metadata with arbitrary names and values of string, integer, or float types.

  • encoding (Optional[dict]) – (to document)

  • validate (bool) – If True (the default), enforce the consistency of the supplied dictionaries.
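As a rough illustration of what consistency between the supplied dictionaries could mean, the sketch below checks plain dictionaries against the rules stated in this section. The helper check_consistency is hypothetical and is not the library's validate implementation; the exact checks pywatershed performs are not listed here.

```python
import numpy as np


def check_consistency(dims, coords, data_vars, metadata):
    """Hypothetical helper: raise ValueError on the first inconsistency."""
    variables = {**coords, **data_vars}
    for name, values in variables.items():
        if name not in metadata:
            raise ValueError(f"no metadata entry for variable '{name}'")
        var_dims = metadata[name]["dims"]
        unknown = [d for d in var_dims if d not in dims]
        if unknown:
            raise ValueError(f"'{name}' uses undeclared dims {unknown}")
        # The array shape must match the lengths of the declared dims.
        expected = tuple(dims[d] for d in var_dims)
        if np.shape(values) != expected:
            raise ValueError(
                f"'{name}' has shape {np.shape(values)}, expected {expected}"
            )


dims = {"ntime": 2, "nspace": 3}
coords = {"space": np.arange(3)}
data_vars = {"precip": np.zeros((2, 3))}
metadata = {
    "space": {"dims": ("nspace",), "attrs": {}},
    "precip": {"dims": ("ntime", "nspace"), "attrs": {"units": "mm/day"}},
    "global": {"title": "toy example"},  # global entry, not a variable
}

check_consistency(dims, coords, data_vars, metadata)  # passes silently

# A data array whose shape disagrees with its declared dims is rejected.
try:
    check_consistency(dims, coords, {"precip": np.zeros((3, 2))}, metadata)
    error_message = None
except ValueError as err:
    error_message = str(err)
```

Only names in coords and data_vars are checked, so the special global key in metadata is left alone.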

Examples

>>> from pprint import pprint
>>> import pywatershed as pws
>>> import numpy as np
>>> coords = {
...     "time": np.arange(
...         "2005-02-01", "2005-02-03", dtype="datetime64[D]"
...     ),
...     "space": np.arange(3),
... }
>>> dims = {"ntime": len(coords["time"]), "nspace": len(coords["space"])}
>>> data = {"precip": 10 * np.random.rand(dims["ntime"], dims["nspace"])}
>>> metadata = {
...     "time": {"dims": ("ntime",), "attrs": {"description": "days"}},
...     "space": {
...         "dims": ("nspace",),
...         "attrs": {"description": "points of interest"},
...     },
...     "precip": {
...         "dims": (
...             "ntime",
...             "nspace",
...         ),
...         "attrs": {
...             "description": "precipitation rate of all phases",
...             "units": "mm/day",
...         },
...     },
... }
>>> dd = pws.base.DatasetDict(
...     dims=dims, coords=coords, data_vars=data, metadata=metadata
... )
>>> dd.dims.keys()
dict_keys(['ntime', 'nspace'])
>>> dd.variables.keys()
dict_keys(['time', 'space', 'precip'])
>>> ds = dd.to_xr_ds()
>>> print(ds)
<xarray.Dataset>
Dimensions:  (ntime: 2, nspace: 3)
Coordinates:
    time     (ntime) datetime64[ns] 2005-02-01 2005-02-02
    space    (nspace) int64 0 1 2
Dimensions without coordinates: ntime, nspace
Data variables:
    precip   (ntime, nspace) float64 8.835 5.667 9.593 7.239 3.92 0.4195
__init__(dims=None, coords=None, data_vars=None, metadata=None, encoding=None, validate=True)

Methods

  • __init__([dims, coords, data_vars, ...])

  • drop_var(var_names) – Drop variables.

  • from_dict(dict_in[, copy]) – Return this class from a passed dictionary.

  • from_ds(ds) – Get this class from a dataset (nc4 or xarray).

  • from_netcdf(nc_file[, use_xr, encoding]) – Load this class from a netcdf file.

  • merge(*dd_list[, copy, del_global_src]) – Merge a list of this class into a single instance.

  • rename_dim(name_maps[, in_place]) – Rename dimensions.

  • rename_var(name_maps[, in_place]) – Rename variables.

  • subset(keys[, copy, keep_global, ...]) – Subset a DatasetDict to keys in data_vars or coordinates.

  • subset_on_coord(coord_name, where) – Subset the DatasetDict in place using the result of an np.where along a named coordinate.

  • to_nc4_ds(filename) – Export to a netcdf file via netcdf4.

  • to_netcdf(filename[, use_xr]) – Write parameters to a netcdf file.

  • to_xr_dd() – Export to an xarray DatasetDict (xr.Dataset.to_dict()).

  • to_xr_ds() – Export to an xarray Dataset.

  • validate() – Check that a DatasetDict is internally consistent.

Attributes

  • coords – Return the coordinates.

  • data – Return a dict of dicts: dims, coords, data_vars, metadata, encoding.

  • data_vars – Return the data_vars.

  • dims – Return the dimensions.

  • encoding – Return the encoding.

  • metadata – Return the metadata.

  • spatial_coord_names – Return the spatial coordinate names.

  • variables – Return coords and data_vars together.

property coords: dict

Return the coordinates

property data: dict

Return a dict of dicts: dims, coords, data_vars, metadata, encoding

Parameters:

copy – boolean if a deepcopy is desired

Returns:

A dict of dicts containing all the data

property data_vars: dict

Return the data_vars.

property dims: dict

Return the dimensions

drop_var(var_names)

Drop variables

property encoding: dict

Return the encoding

classmethod from_dict(dict_in, copy=False)

Return this class from a passed dictionary.

Parameters:
  • dict_in – a dictionary from which to create an instance of this class

  • copy – boolean if the passed dictionary should be deep copied

Returns:

An object of this class.

classmethod from_ds(ds)

Get this class from a dataset (nc4 or xarray).

classmethod from_netcdf(nc_file, use_xr=False, encoding=False)

Load this class from a netcdf file.

Return type:

DatasetDict

classmethod merge(*dd_list, copy=True, del_global_src=True)

Merge a list of this class into a single instance

Parameters:
  • dd_list – a list of objects of this class

  • copy – boolean if a deep copy of inputs is desired

  • del_global_src – boolean to delete encodings’ global source attribute prior to merging (as these often conflict)

Returns:

An object of this class.

property metadata: dict

Return the metadata

rename_dim(name_maps, in_place=True)

Rename dimensions.

rename_var(name_maps, in_place=True)

Rename variables.

property spatial_coord_names: dict

Return the spatial coordinate names.

Returns:

Dictionary of spatial coordinates with names.

subset(keys, copy=False, keep_global=False, keep_global_metadata=None, keep_global_encoding=None, strict=False)

Subset a DatasetDict to keys in data_vars or coordinates

Parameters:
  • keys (Iterable) – Iterable to subset on

  • copy (bool) – bool to copy the input rather than edit it in place

  • keep_global (bool) – bool that sets both keep_global_metadata and keep_global_encoding

  • keep_global_metadata (Optional[bool]) – bool retain the global metadata in the subset

  • keep_global_encoding (Optional[bool]) – bool retain the global encoding in the subset

Return type:

DatasetDict

Returns:

A subset DatasetDict on the passed keys.
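The idea of subsetting to a set of keys can be sketched on plain dictionaries. The helper subset_plain below is hypothetical; in particular, dropping dims that no remaining variable uses is an assumption of this sketch, not documented behavior of DatasetDict.subset, and global metadata and encoding are ignored entirely.

```python
def subset_plain(model, keys):
    """Hypothetical helper: keep the named variables and the dims they use."""
    keys = set(keys)
    coords = {k: v for k, v in model["coords"].items() if k in keys}
    data_vars = {k: v for k, v in model["data_vars"].items() if k in keys}
    # Keep metadata only for the variables that survive.
    metadata = {k: model["metadata"][k] for k in (*coords, *data_vars)}
    # Assumption: dims not referenced by any kept variable are dropped.
    used_dims = {d for entry in metadata.values() for d in entry["dims"]}
    dims = {d: n for d, n in model["dims"].items() if d in used_dims}
    return {
        "dims": dims,
        "coords": coords,
        "data_vars": data_vars,
        "metadata": metadata,
    }


model = {
    "dims": {"ntime": 2, "nspace": 3},
    "coords": {"time": [0, 1], "space": [0, 1, 2]},
    "data_vars": {"precip": [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]},
    "metadata": {
        "time": {"dims": ("ntime",), "attrs": {}},
        "space": {"dims": ("nspace",), "attrs": {}},
        "precip": {"dims": ("ntime", "nspace"), "attrs": {"units": "mm/day"}},
    },
}

result = subset_plain(model, ["space"])
```

The sketch always builds a new dictionary; the real method's copy argument controls whether the input is copied or edited.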

subset_on_coord(coord_name, where)

Subset the DatasetDict in place using the result of an np.where along a named coordinate

Parameters:
  • coord_name (str) – string name of a coordinate

  • where (ndarray) – the result of an np.where along that coordinate (or likewise constructed)

Return type:

None

Returns:

None
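For reference, the following NumPy-only sketch shows what the result of an np.where along a coordinate looks like and how such a selection applies to data sharing that coordinate's dimension. It does not call subset_on_coord; the arrays are made up, and whether the method expects the tuple returned by np.where or its first element is not stated here.

```python
import numpy as np

space = np.arange(3)                    # coordinate along "nspace"
precip = np.arange(6.0).reshape(2, 3)   # data on dims ("ntime", "nspace")

# np.where with a single condition returns a tuple of index arrays.
where = np.where(space >= 1)

# Apply the selection to the coordinate and to the matching axis of the data.
space_sub = space[where]
axis = ("ntime", "nspace").index("nspace")
precip_sub = np.take(precip, where[0], axis=axis)
```

After such a selection the length of the subsetted dimension changes as well, here from 3 to 2 along nspace.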

to_nc4_ds(filename)

Export to a netcdf file via netcdf4

Return type:

None

to_netcdf(filename, use_xr=False)

Write parameters to a netcdf file

Return type:

None

to_xr_dd()

Export to an xarray DatasetDict (xr.Dataset.to_dict()).

Return type:

dict

to_xr_ds()

Export to an xarray Dataset

Return type:

Dataset

validate()

Check that a DatasetDict is internally consistent.

Return type:

None

Returns:

None

property variables: dict

Return coords and data_vars together