pywatershed.base.DatasetDict
- class pywatershed.base.DatasetDict(dims=None, coords=None, data_vars=None, metadata=None, encoding=None, validate=True)
DatasetDict: a data model following NetCDF-like conventions
This is the core class in the data model adopted by pywatershed.
The DatasetDict handles dimensions, coordinates, data, and metadata in a way similar to NetCDF and xarray, and provides invertible mappings to and from both the netCDF4 and xarray Python packages.
Where metadata is typically stored on a variable in NetCDF and in xarray, a DatasetDict maintains metadata in a dictionary collocated with the coordinate and data variables. The data model is a DatasetDict with dims, coords, data_vars, and metadata keys. The dims track the length of each dimension. The coordinates are the discrete locations along dimensions or sets of dimensions. The data_vars contain the data located on dims and coordinates. The metadata describes the relationship between both coords and data_vars and their dims. Together the coords and data_vars are the variables of the DatasetDict. All keys in the variables must be present in the metadata dictionary, and each key’s entry contains two more keys: dims and attrs. dims is a tuple of the variable’s dimension names, and attrs holds more general attributes.
When a NetCDF file is read from disk, its encoding properties may be carried along. Alternatively, encodings may be specified before writing to file.
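The layout described above can be sketched with plain Python dictionaries. This is only an illustration: it does not import pywatershed, the name `check_metadata` is hypothetical, and the checks shown are an assumption about what "consistent" means here, not the library's own validation code.

```python
# Hypothetical plain-dict sketch of the data model; not pywatershed code.
dataset = {
    "dims": {"ntime": 2, "nspace": 3},
    "coords": {
        "time": ["2005-02-01", "2005-02-02"],
        "space": [0, 1, 2],
    },
    "data_vars": {
        "precip": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    },
    "metadata": {
        "time": {"dims": ("ntime",), "attrs": {"description": "days"}},
        "space": {"dims": ("nspace",), "attrs": {"description": "points"}},
        "precip": {"dims": ("ntime", "nspace"), "attrs": {"units": "mm/day"}},
    },
}


def check_metadata(ds):
    """Return a list of problems; an empty list means the sketch is consistent."""
    problems = []
    # coords and data_vars together are the "variables" of the data model.
    variables = {**ds["coords"], **ds["data_vars"]}
    for name in variables:
        meta = ds["metadata"].get(name)
        if meta is None:
            problems.append(f"{name}: no metadata entry")
            continue
        # Each metadata entry is expected to carry both 'dims' and 'attrs'.
        for key in ("dims", "attrs"):
            if key not in meta:
                problems.append(f"{name}: metadata lacks '{key}'")
        # Every dimension a variable names must be declared in ds["dims"].
        for dim in meta.get("dims", ()):
            if dim not in ds["dims"]:
                problems.append(f"{name}: unknown dimension '{dim}'")
    return problems


problems = check_metadata(dataset)
```

Keeping the metadata beside the variables, rather than on them, means one lookup by variable name gives both its dimensions and its attributes.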
- Parameters:
dims (Optional[dict[int]]) – A dictionary of pairs of dim_names: dim_len where dim_len is an integer value.
coords (Optional[dict]) – A dictionary of pairs of coord_names: coord_data where coord_data is an np.ndarray.
data_vars (Optional[dict]) – A dictionary of pairs of var_names: var_data where var_data is an np.ndarray.
metadata – For all names in coords and data_vars, metadata entries with the required fields:
dims: tuple of names in dims,
attrs: dictionary whose values may be strings, ints, floats
The metadata argument may also contain a special global key paired with a dictionary of global metadata of arbitrary name and values of string, integer, or float types.
validate (bool) – A bool that defaults to True, enforcing the consistency of the supplied dictionaries.
Examples
>>> from pprint import pprint
>>> import pywatershed as pws
>>> import numpy as np
>>> coords = {
...     "time": np.arange(
...         "2005-02-01", "2005-02-03", dtype="datetime64[D]"
...     ),
...     "space": np.arange(3),
... }
>>> dims = {"ntime": len(coords["time"]), "nspace": len(coords["space"])}
>>> data = {"precip": 10 * np.random.rand(dims["ntime"], dims["nspace"])}
>>> metadata = {
...     "time": {"dims": ("ntime",), "attrs": {"description": "days"}},
...     "space": {
...         "dims": ("nspace",),
...         "attrs": {"description": "points of interest"},
...     },
...     "precip": {
...         "dims": (
...             "ntime",
...             "nspace",
...         ),
...         "attrs": {
...             "description": "precipitation rate of all phases",
...             "units": "mm/day",
...         },
...     },
... }
>>> dd = pws.base.DatasetDict(
...     dims=dims, coords=coords, data_vars=data, metadata=metadata
... )
>>> dd.dims.keys()
dict_keys(['ntime', 'nspace'])
>>> dd.variables.keys()
dict_keys(['time', 'space', 'precip'])
>>> ds = dd.to_xr_ds()
>>> print(ds)
<xarray.Dataset>
Dimensions:  (ntime: 2, nspace: 3)
Coordinates:
    time     (ntime) datetime64[ns] 2005-02-01 2005-02-02
    space    (nspace) int64 0 1 2
Dimensions without coordinates: ntime, nspace
Data variables:
    precip   (ntime, nspace) float64 8.835 5.667 9.593 7.239 3.92 0.4195
- __init__(dims=None, coords=None, data_vars=None, metadata=None, encoding=None, validate=True)
Methods
__init__([dims, coords, data_vars, ...])
drop_var(var_names) – Drop variables
from_dict(dict_in[, copy]) – Return this class from a passed dictionary.
from_ds(ds) – Get this class from a dataset (nc4 or xarray).
from_netcdf(nc_file[, use_xr, encoding]) – Load this class from a NetCDF file.
merge(*dd_list[, copy, del_global_src]) – Merge a list of this class into a single instance
rename_dim(name_maps[, in_place]) – Rename dimensions.
rename_var(name_maps[, in_place]) – Rename variables.
subset(keys[, copy, keep_global, ...]) – Subset a DatasetDict to keys in data_vars or coordinates
subset_on_coord(coord_name, where) – Subset DatasetDict to a np.where along a named coordinate in-place
to_nc4_ds(filename) – Export to a NetCDF file via netCDF4
to_netcdf(filename[, use_xr]) – Write parameters to a NetCDF file
to_xr_dd() – Export to an xarray DatasetDict (xr.Dataset.to_dict()).
to_xr_ds() – Export to an xarray Dataset
validate() – Check that a DatasetDict is internally consistent.
Attributes
coords – Return the coordinates
data – dims, coords, data_vars, metadata, encoding
data_vars – Return the data_vars.
dims – Return the dimensions
encoding – Return the encoding
metadata – Return the metadata
spatial_coord_names – Return the spatial coordinate names.
variables – Return coords and data_vars together
- property data: dict
Return a dict of dicts: dims, coords, data_vars, metadata, encoding
- Parameters:
copy – boolean if a deepcopy is desired
- Returns:
A dict of dicts containing all the data
- classmethod from_dict(dict_in, copy=False)
Return this class from a passed dictionary.
- Parameters:
dict_in – a dictionary from which to create an instance of this class
copy – boolean if the passed dictionary should be deep copied
- Returns:
An object of this class.
- classmethod from_netcdf(nc_file, use_xr=False, encoding=False)
Load this class from a NetCDF file.
- classmethod merge(*dd_list, copy=True, del_global_src=True)
Merge a list of this class into a single instance
- Parameters:
dd_list – a list of objects of this class
copy – boolean if a deep copy of inputs is desired
del_global_src – boolean to delete encodings’ global source attribute prior to merging (as these often conflict)
- Returns:
An object of this class.
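As a rough illustration of what merging several instances involves, the sketch below combines plain dictionaries section by section. It does not use pywatershed; the function name `merge_sections` is hypothetical, and the rule of raising on conflicting values is an assumption made for the example, so the library's actual conflict handling may differ.

```python
# Hypothetical plain-dict sketch of a merge; not the pywatershed implementation.
def merge_sections(*datasets):
    """Combine dims, coords, data_vars and metadata from several dict-of-dicts."""
    merged = {"dims": {}, "coords": {}, "data_vars": {}, "metadata": {}}
    for ds in datasets:
        for section, target in merged.items():
            for key, value in ds.get(section, {}).items():
                # Assumed rule: a repeated key is fine only if its value agrees.
                if key in target and target[key] != value:
                    raise ValueError(f"conflicting '{key}' in {section}")
                target[key] = value
    return merged


a = {
    "dims": {"nspace": 3},
    "coords": {"space": [0, 1, 2]},
    "data_vars": {"precip": [1.0, 2.0, 3.0]},
    "metadata": {
        "space": {"dims": ("nspace",), "attrs": {}},
        "precip": {"dims": ("nspace",), "attrs": {"units": "mm/day"}},
    },
}
b = {
    "dims": {"nspace": 3},
    "coords": {"space": [0, 1, 2]},
    "data_vars": {"temp": [10.0, 11.0, 12.0]},
    "metadata": {
        "space": {"dims": ("nspace",), "attrs": {}},
        "temp": {"dims": ("nspace",), "attrs": {"units": "degC"}},
    },
}

merged = merge_sections(a, b)
```

The shared dimension and coordinate appear once in the result, while the two data variables sit side by side.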
- property spatial_coord_names: dict
Return the spatial coordinate names.
- Returns:
Dictionary of spatial coordinates with names.
- subset(keys, copy=False, keep_global=False, keep_global_metadata=None, keep_global_encoding=None, strict=False)
Subset a DatasetDict to keys in data_vars or coordinates
- Parameters:
keys (Iterable) – Iterable to subset on
copy (bool) – bool to copy the input or edit it
keep_global (bool) – bool that sets both keep_global_metadata and keep_global_encoding
keep_global_metadata (Optional[bool]) – bool to retain the global metadata in the subset
keep_global_encoding (Optional[bool]) – bool to retain the global encoding in the subset
- Returns:
A subset Parameter object on the passed keys.
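The idea of subsetting to a set of keys can be sketched on plain dictionaries as follows. This is not the library's code: `subset_sketch` is a hypothetical name, and the choice to keep only the requested variables, their metadata, and the dimensions they use is an assumption for illustration. The real method's handling of copying, global entries, and the strict option is not reproduced here.

```python
# Hypothetical plain-dict sketch of subsetting; not the pywatershed implementation.
def subset_sketch(ds, keys):
    """Keep only the named variables, their metadata, and the dims they use."""
    out = {"dims": {}, "coords": {}, "data_vars": {}, "metadata": {}}
    for key in keys:
        for section in ("coords", "data_vars"):
            if key in ds[section]:
                out[section][key] = ds[section][key]
                out["metadata"][key] = ds["metadata"][key]
                # Carry along the length of every dimension the variable uses.
                for dim in ds["metadata"][key]["dims"]:
                    out["dims"][dim] = ds["dims"][dim]
    return out


dataset = {
    "dims": {"ntime": 2, "nspace": 3},
    "coords": {"time": ["2005-02-01", "2005-02-02"], "space": [0, 1, 2]},
    "data_vars": {"precip": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]},
    "metadata": {
        "time": {"dims": ("ntime",), "attrs": {}},
        "space": {"dims": ("nspace",), "attrs": {}},
        "precip": {"dims": ("ntime", "nspace"), "attrs": {"units": "mm/day"}},
    },
}

sub = subset_sketch(dataset, ["precip"])
```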
- subset_on_coord(coord_name, where)
Subset DatasetDict to a np.where along a named coordinate in-place