
DatasetView

A typed handle into a single dataset within an Atlas. Mutations go through define_array / write_array / set_attribute / delete_array and are buffered into the parent atlas's in-memory state until Atlas.flush().

atlas.DatasetView

A handle to a single dataset within an Atlas store.

Holds the per-dataset array schemas and attributes. Mutations are buffered in-memory until flush() is called.

__module__ class-attribute

__module__ = 'atlas._atlas'

name property

name: str

__repr__ method descriptor

__repr__() -> str

Return repr(self).

array_fill_value method descriptor

array_fill_value(array) -> Optional[Any]

Returns the fill value for array, or None if the array doesn't exist in this dataset or was defined without one.
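Because None covers two cases here, a caller may want to tell them apart. The sketch below is one way to do that using only array_meta and array_fill_value as documented on this page; the helper name fill_value_status and its three return strings are hypothetical, not part of the library.

```python
from typing import Any


def fill_value_status(view: Any, array: str) -> str:
    """Classify an array as 'missing', 'unset' or 'set' with respect to its fill value.

    `view` is assumed to be a DatasetView (or anything exposing the same methods).
    """
    # array_meta is documented to return None when the array does not exist.
    if view.array_meta(array) is None:
        return "missing"
    # Compare against None explicitly so that falsy fill values such as 0 count as set.
    return "unset" if view.array_fill_value(array) is None else "set"
```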

array_meta method descriptor

array_meta(array) -> Optional[dict[str, Any]]

Returns {"dtype", "shape", "chunk_shape", "dimension_names"} for array, or None if the array doesn't exist in this dataset.
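As an illustration of working with the returned dict, the following sketch derives how many chunks an array has along each axis. It assumes that "shape" and "chunk_shape" are both sequences of positive ints of equal length; the helper name chunk_grid is hypothetical.

```python
import math
from typing import Any, Optional


def chunk_grid(view: Any, array: str) -> Optional[tuple[int, ...]]:
    """Return the number of chunks along each axis, or None if the array is missing."""
    meta = view.array_meta(array)
    if meta is None:
        return None
    # Ceiling division: a partial chunk at the edge still counts as one chunk.
    return tuple(
        math.ceil(extent / chunk)
        for extent, chunk in zip(meta["shape"], meta["chunk_shape"])
    )
```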

array_stats method descriptor

array_stats(array) -> Optional[dict[str, Any]]

Returns {"row_count", "null_count", "min", "max"}, or None if the array doesn't exist in this dataset or stats haven't been computed yet (call flush() first).
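A small sketch of consuming these stats: it computes the fraction of null rows and treats "no stats yet" the same as "no array". It assumes "row_count" and "null_count" are plain numbers; the helper name null_fraction is hypothetical.

```python
from typing import Any, Optional


def null_fraction(view: Any, array: str) -> Optional[float]:
    """Return null_count / row_count, or None when stats are unavailable or empty."""
    stats = view.array_stats(array)
    # None means the array is missing or stats have not been computed yet.
    if stats is None or not stats["row_count"]:
        return None
    return stats["null_count"] / stats["row_count"]
```

If this returns None for an array that is known to exist, calling flush() on the parent atlas first may be what is missing.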

attributes method descriptor

attributes() -> dict[str, Any]

Returns a dict of attribute name -> Python value.

define_array method descriptor

define_array(name: str, dtype: str, dims: Sequence[str], shape: Sequence[int], chunk_shape: Optional[Sequence[int]] = None, fill_value: Optional[Any] = None) -> None

Declare a new N-dimensional array.

Args:
    name: Array name (no /, no leading _, non-empty).
    dtype: e.g. "float32", "int64", "uint8". See the module README for the full list.
    dims: Named dimensions, one per axis.
    shape: Logical shape, one entry per axis.
    chunk_shape: Optional chunk shape; defaults to shape (a single chunk).
    fill_value: Optional scalar returned for unwritten cells. Must match the array dtype: a Python int for int/uint/timestamp arrays (range-checked), a float (or int) for float arrays, a bool for bool arrays, a str for string arrays. Raises TypeError on a mismatch and OverflowError if the value is out of range.
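The naming and rank rules above can be checked on the Python side before the call, which gives an early, readable error. The sketch below is a hypothetical wrapper (define_checked is not part of the library), and the rule that chunk_shape has one entry per axis is an assumption not stated on this page.

```python
from typing import Any, Optional, Sequence


def define_checked(
    view: Any,
    name: str,
    dtype: str,
    dims: Sequence[str],
    shape: Sequence[int],
    chunk_shape: Optional[Sequence[int]] = None,
    fill_value: Optional[Any] = None,
) -> None:
    """Validate the documented naming and rank rules, then call define_array."""
    if not name or "/" in name or name.startswith("_"):
        raise ValueError(f"invalid array name: {name!r}")
    if len(dims) != len(shape):
        raise ValueError("dims and shape need one entry per axis")
    # Assumption: chunk_shape, when given, also has one entry per axis.
    if chunk_shape is not None and len(chunk_shape) != len(shape):
        raise ValueError("chunk_shape needs one entry per axis")
    view.define_array(
        name, dtype, dims, shape, chunk_shape=chunk_shape, fill_value=fill_value
    )
```

A call might look like define_checked(view, "temperature", "float32", ["time", "lat"], [365, 180], chunk_shape=[1, 180], fill_value=0.0), where the array name and sizes are made up for illustration.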

delete_array method descriptor

delete_array(name: str) -> None

Remove the array from this dataset (tombstone).

get_attribute method descriptor

get_attribute(key: str) -> Any

Returns the attribute value or None if not set.
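Since get_attribute returns None for an unset key, a caller that wants a different default can look the key up in the dict returned by attributes() instead. A minimal sketch, with the hypothetical helper name attribute_or:

```python
from typing import Any


def attribute_or(view: Any, key: str, default: Any) -> Any:
    """Return the attribute stored under `key`, or `default` if the key is not set."""
    # attributes() is documented to return a plain dict of name -> Python value.
    return view.attributes().get(key, default)
```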

list_arrays method descriptor

list_arrays() -> list[str]

Names of all arrays defined in this dataset.

read_array method descriptor

read_array(name: str, start: Optional[Sequence[int]] = None, shape: Optional[Sequence[int]] = None) -> Optional[NDArray[Any]]

Read an array. If start and shape are omitted, reads the full array. Returns None if the array doesn't exist in this dataset.
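Callers who think in half-open index ranges (start, stop) rather than (start, shape) can convert before calling read_array. The sketch below shows that conversion; read_between is a hypothetical helper, and it assumes start and shape are passed together with one entry per axis.

```python
from typing import Any, Sequence


def read_between(view: Any, name: str, start: Sequence[int], stop: Sequence[int]) -> Any:
    """Read the half-open window [start, stop) along every axis."""
    if len(start) != len(stop):
        raise ValueError("start and stop need one entry per axis")
    # read_array takes a per-axis extent, so convert stop indices to lengths.
    shape = [hi - lo for lo, hi in zip(start, stop)]
    if any(extent <= 0 for extent in shape):
        raise ValueError("stop must be greater than start on every axis")
    # The result is None when the array does not exist in this dataset.
    return view.read_array(name, start=list(start), shape=shape)
```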

read_arrays method descriptor

read_arrays(names: Sequence[str], start: Optional[Sequence[int]] = None, shape: Optional[Sequence[int]] = None) -> dict[str, Optional[NDArray[Any]]]

Bulk-read multiple arrays from this dataset in one PyO3 call. Returns {name: ndarray | None}, with None for arrays not in this dataset. The same start / shape apply to every array.

Fast path for "give me these N variables, optionally sliced": it skips the Python-side xr.Dataset construction and dask graph build that to_xarray pays per dataset, while still doing one Rust round-trip per variable. Use this from dask workers (or any per-dataset loop) where the overhead of the natural xarray API dominates the actual I/O cost. In the gridded benchmark, switching the dask branch to call this instead of to_xarray(name).isel(...).load() takes the run from ~7.8s to <3s.
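Because the result may contain None entries, a per-dataset loop usually wants to separate the arrays that were found from the names that were not. A minimal sketch of that step, with the hypothetical helper name read_present:

```python
from typing import Any, Optional, Sequence


def read_present(
    view: Any,
    names: Sequence[str],
    start: Optional[Sequence[int]] = None,
    shape: Optional[Sequence[int]] = None,
) -> tuple[dict[str, Any], list[str]]:
    """Bulk-read `names` and split the result into (found arrays, missing names)."""
    results = view.read_arrays(names, start=start, shape=shape)
    # Keep only arrays that exist in this dataset.
    found = {name: arr for name, arr in results.items() if arr is not None}
    # Preserve the caller's ordering for the names that came back as None.
    missing = [name for name in names if results.get(name) is None]
    return found, missing
```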

set_attribute method descriptor

set_attribute(key: str, value: Any, dtype: Optional[str] = None) -> None

Set a typed attribute.

Type is inferred from the Python type by default. Pass dtype to force a narrower variant (e.g. dtype="int8").
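To show how inference and the dtype override combine, the sketch below sets several attributes at once and forces a type only for the keys listed in an override mapping. The helper name set_attributes and the example keys are hypothetical.

```python
from typing import Any, Mapping, Optional


def set_attributes(
    view: Any,
    values: Mapping[str, Any],
    dtypes: Optional[Mapping[str, str]] = None,
) -> None:
    """Set each attribute, forcing a dtype only for keys present in `dtypes`."""
    overrides = dtypes or {}
    for key, value in values.items():
        # dtype=None is the documented default: infer from the Python type.
        view.set_attribute(key, value, dtype=overrides.get(key))
```

For example, set_attributes(view, {"units": "K", "level": 3}, {"level": "int8"}) would let "units" be inferred and store "level" as int8.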

write_array method descriptor

write_array(name: str, start: Sequence[int], data: NDArray[Any]) -> None

Write a numpy array at the given starting index.

The numpy dtype must match the stored dtype and the array must be C-contiguous.
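Both requirements can be checked with NumPy before the call. The sketch below rejects a dtype mismatch rather than casting silently, and copies only when the input is not already C-contiguous. It assumes the stored dtype string (for instance from array_meta) is a name NumPy understands, which holds for "float32", "int64" and "uint8" but may not hold for every dtype the library supports; prepare_for_write is a hypothetical helper.

```python
import numpy as np


def prepare_for_write(data, stored_dtype: str) -> np.ndarray:
    """Return a C-contiguous array whose dtype matches `stored_dtype`, or raise."""
    # Assumption: the stored dtype string is a valid NumPy dtype name.
    target = np.dtype(stored_dtype)
    arr = np.asarray(data)
    if arr.dtype != target:
        raise TypeError(f"expected dtype {target}, got {arr.dtype}")
    # No copy is made if the array is already C-contiguous.
    return np.ascontiguousarray(arr)
```

A typical use would be view.write_array(name, start, prepare_for_write(data, view.array_meta(name)["dtype"])), followed by a flush on the parent atlas once all writes are buffered.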