Bulk reads
Atlas exposes four APIs that read multiple datasets or arrays in one PyO3 round-trip. Each one trades a different chunk of Python-side overhead for raw Rust throughput.
The four APIs at a glance
| API | Returns | Use when |
|---|---|---|
| Atlas.to_xarray_many(names, concat_dim, parallel=True) | xr.Dataset of shape (N, *original) | You want xarray ergonomics on top of a fleet of identically-shaped datasets. |
| Atlas.read_array_across(array, names, start, shape) | list[np.ndarray \| None] | Per-dataset numpy arrays; some datasets may not declare array. |
| Atlas.read_array_across_stacked(array, names, start, shape) | np.ndarray of shape (N, *slice) | You'll immediately stack, so skip the np.stack copy. Errors if any listed dataset doesn't declare array. |
| DatasetView.read_arrays(names, start, shape) | dict[str, np.ndarray \| None] | Many arrays out of one dataset (e.g. inside a dask worker). |
Cross-dataset, one variable: read_array_across / _stacked
The classic "give me the same slice of temperature from every dataset"
query:
names = atlas.list_datasets() # ["sensor_000", "sensor_001", ...]
# Per-dataset list, with None for missing entries.
per_ds = atlas.read_array_across("temperature", names, start=[0], shape=[24])
# [np.ndarray, np.ndarray, None, np.ndarray, ...]
# Pre-stacked into one (N, *slice_shape) array; faster on the stacking path.
stacked = atlas.read_array_across_stacked("temperature", names, start=[0], shape=[24])
# stacked.shape == (len(names), 24)
What makes these fast:
- One PyO3 round-trip for N datasets, not N.
- One RwLock::read guard on the shared physical file for the whole batch; atlas's defining shared-array layout makes this cheap.
- N per-dataset reads dispatched concurrently on the tokio runtime via a JoinSet capped at num_cpus in-flight tasks. The GIL is released for the duration.
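The bounded fan-out described in the last bullet can be pictured with a pure-Python analogy. This is only a sketch of the scheduling pattern, not atlas's Rust implementation: `read_one` is a hypothetical stand-in for a single per-dataset read, and a thread pool capped at the CPU count plays the role of the capped JoinSet.

```python
import os
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def read_one(name: str) -> np.ndarray:
    """Hypothetical stand-in for one per-dataset read (not an atlas API)."""
    # Derive a deterministic fake payload from the dataset name.
    seed = int(name.split("_")[1])
    return np.full(4, seed, dtype=np.float64)


def read_many(names: list[str]) -> list[np.ndarray]:
    """Run one read per dataset with at most cpu_count tasks in flight."""
    max_in_flight = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # pool.map yields results in input order, whatever order tasks finish in.
        return list(pool.map(read_one, names))


names = [f"sensor_{i:03d}" for i in range(8)]
results = read_many(names)
```

The point of the cap is that the number of reads in flight stays bounded no matter how long `names` is, while the caller still gets results back in the order it asked for them.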
read_array_across_stacked additionally pre-allocates the output
buffer in Rust and writes each row in as the per-dataset task completes,
so you skip the ~N × per_dataset_size of memory copies that
np.stack(read_array_across(...)) would do. The trade-off: it errors if
any listed dataset doesn't declare the array (there's no positional
"missing" sentinel in the stacked representation).
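If some datasets may lack the array but you still want a single stacked result, one option is to take the list form and stack only the entries that are present. The following is a minimal numpy-only sketch: `stack_present` is a hypothetical helper, not part of atlas, and the hand-built `per_ds` list stands in for the return value of read_array_across.

```python
import numpy as np


def stack_present(names, per_ds):
    """Stack the non-None entries of a read_array_across-style list.

    Returns (kept_names, stacked). Hypothetical helper, not an atlas API.
    """
    kept = [(n, a) for n, a in zip(names, per_ds) if a is not None]
    if not kept:
        raise ValueError("no listed dataset declares the array")
    kept_names = [n for n, _ in kept]
    first = kept[0][1]
    # Pre-allocate the output once and fill it row by row, mirroring
    # (in Python) the idea behind the stacked API's single output buffer.
    stacked = np.empty((len(kept), *first.shape), dtype=first.dtype)
    for row, (_, arr) in enumerate(kept):
        stacked[row] = arr
    return kept_names, stacked


# Stand-in for atlas.read_array_across(...): the third dataset is missing.
names = ["sensor_000", "sensor_001", "sensor_002", "sensor_003"]
per_ds = [np.arange(24.0), np.arange(24.0) + 1, None, np.arange(24.0) + 3]

kept_names, stacked = stack_present(names, per_ds)
```

Returning `kept_names` alongside the array keeps the row-to-dataset mapping explicit, which the stacked representation cannot do on its own once entries are dropped.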
Cross-dataset, all variables: to_xarray_many
This is the atlas-native equivalent of xr.open_mfdataset(...):
combined = atlas.to_xarray_many(
["jan_2024", "feb_2024", "mar_2024"],
concat_dim="month",
)
# Each variable comes back shape (3, *original), eager numpy.
# Coords + dataset-level attrs are taken from the first dataset.
Internally to_xarray_many calls Atlas.read_array_across once per
variable, so you get the same shared-file-handle / tokio fan-out as the
single-variable bulk APIs but with the xarray ergonomics.
Constraints:
- Variable names and dtypes must match across every listed dataset.
- The result is eager numpy. Wrap with .chunk(...) downstream if you need dask laziness.
- The parallel parameter is accepted for API compatibility but the bulk path is always taken; it doesn't switch implementations.
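Since mismatched variable names or dtypes violate the first constraint, it can help to check them before issuing the bulk call. The sketch below assumes you already have, for each dataset, a mapping of variable name to dtype (how to obtain that from atlas is not shown here), and `find_mismatches` is a hypothetical helper rather than an atlas API.

```python
import numpy as np


def find_mismatches(schemas):
    """Compare every dataset's {variable: dtype} mapping to the first one.

    `schemas` maps dataset name -> {variable name: dtype}. Returns a list of
    human-readable problems; an empty list means the schemas agree.
    Hypothetical helper, not an atlas API.
    """
    problems = []
    items = list(schemas.items())
    ref_name, ref = items[0]
    for name, schema in items[1:]:
        if set(schema) != set(ref):
            diff = sorted(set(schema) ^ set(ref))
            problems.append(f"{name}: variables differ from {ref_name}: {diff}")
            continue
        for var, dtype in schema.items():
            if np.dtype(dtype) != np.dtype(ref[var]):
                problems.append(
                    f"{name}: {var} is {np.dtype(dtype)}, "
                    f"expected {np.dtype(ref[var])}"
                )
    return problems


# Illustrative schemas: feb has a dtype mismatch, mar lacks a variable.
schemas = {
    "jan_2024": {"temperature": "float32", "pressure": "float32"},
    "feb_2024": {"temperature": "float32", "pressure": "float64"},
    "mar_2024": {"temperature": "float32"},
}
problems = find_mismatches(schemas)
```

Using the first dataset as the reference matches the way coords and dataset-level attrs are taken from the first dataset in the combined result.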
Per-dataset multi-array: view.read_arrays
Inside a dask worker (or any hot loop over one dataset), the bottleneck is
usually the per-call to_xarray + dask-graph build, not the I/O. Skip
both:
view = atlas.open_dataset("jan_2024")
result = view.read_arrays(["temperature", "pressure"], start=[0, 0], shape=[4, 8])
# {"temperature": np.ndarray, "pressure": np.ndarray}
read_arrays is what bench_collection.py --use-dask calls inside each
delayed task — the ~3–4× speedup over to_xarray(name).isel(...).load()
on chunked storage comes from skipping the per-chunk dask graph build.
Same start/shape applies to every array; missing arrays come back as
None (so this is safe to call against a dataset that may or may not
declare every variable).
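Because missing arrays come back as None rather than raising, callers will usually want to separate present from missing entries before computing on them. A small sketch, with a hand-written `result` dict standing in for the return value of view.read_arrays and `split_present` as a hypothetical helper:

```python
import numpy as np


def split_present(result):
    """Split a read_arrays-style dict into present arrays and missing names."""
    present = {k: v for k, v in result.items() if v is not None}
    missing = [k for k, v in result.items() if v is None]
    return present, missing


# Stand-in for view.read_arrays(["temperature", "pressure", "humidity"],
#                               start=[0, 0], shape=[4, 8])
result = {
    "temperature": np.zeros((4, 8)),
    "pressure": np.ones((4, 8)),
    "humidity": None,  # this dataset does not declare humidity
}

present, missing = split_present(result)
# Compute only on the arrays that actually exist.
means = {name: float(arr.mean()) for name, arr in present.items()}
```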
API picker (in rough order of speed)
- Cross-dataset slice of the same vars across many datasets → Atlas.to_xarray_many / Atlas.read_array_across_stacked (the atlas-bulk benchmark path; one Rust call per variable).
- Per-dataset slice reads inside a dask worker → view.read_arrays(vars, start, shape) (returns dict[str, np.ndarray | None]; skips the xr.Dataset + per-chunk dask graph).
- Natural xarray code, per dataset → to_xarray(name).isel(...).load(). Most ergonomic, but pays per-chunk dask graph build overhead on chunked storage.
See Benchmarks for what those differences look like on real workloads.