Dataset reference
Bases: Generic[_FormatT]
File or object-store resource that Beacon can scan directly.
The class acts as a lightweight descriptor that holds the user's
original file_path and provides convenience methods to inspect schema
information and start JSON query builders.
Source code in beacon_api/dataset.py
__init__(http_session, file_path, file_format)
Create a dataset descriptor.
Args:
http_session: Session that knows how to communicate with the Beacon Node.
file_path: Absolute/relative path or URI that Beacon can read.
file_format: File format string supported by Beacon (e.g. parquet).
get_file_extension()
Return the lowercase file extension without the leading dot.
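The behavior described for get_file_extension() can be illustrated with a small standalone function. This is only a sketch of the documented contract (lowercase, no leading dot), not the library's implementation; the name file_extension is hypothetical, and it does not handle URIs that carry a query string.

```python
import os.path


def file_extension(file_path: str) -> str:
    """Return the lowercase extension of file_path without the leading dot.

    Hypothetical helper, not part of beacon_api. Returns an empty string
    when the path has no extension.
    """
    # splitext returns (root, ext), where ext includes the leading dot
    # or is empty when the last path component has no extension.
    _, ext = os.path.splitext(file_path)
    return ext[1:].lower()


print(file_extension("data/Observations.PARQUET"))
print(file_extension("s3://bucket/argo.zarr"))
```

The paths in the usage lines are made-up examples; any path or URI string that ends in a file name works the same way.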
get_file_format()
get_file_name()
get_file_path()
get_schema()
Fetch the dataset schema by calling the Beacon Node.
Returns:
SchemaType: JSON-compatible schema description mirroring the
server's /api/dataset-schema payload.
Raises:
RuntimeError: If the HTTP request fails.
ValueError: When the response body is not valid JSON or the decoded value is not a JSON object.
Exception: For unsupported field types surfaced by Beacon.
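The ValueError conditions listed for get_schema() can be sketched as a small decoding step. This is a minimal illustration under assumptions: decode_schema_body is a hypothetical name, it works on a response body that has already been fetched, and the "fields" key in the usage line is an invented payload, not the server's actual schema layout.

```python
import json
from typing import Any, Dict


def decode_schema_body(body: str) -> Dict[str, Any]:
    """Decode a schema response body, enforcing that it is a JSON object.

    Hypothetical helper, not part of beacon_api.
    """
    try:
        decoded = json.loads(body)
    except json.JSONDecodeError as exc:
        # The body is not valid JSON at all.
        raise ValueError("Schema response is not valid JSON") from exc
    if not isinstance(decoded, dict):
        # Valid JSON, but a list, string, number, etc. rather than an object.
        raise ValueError("Schema response is not a JSON object")
    return decoded


print(decode_schema_body('{"fields": []}'))
```

Callers that want to separate transport failures from malformed payloads can catch RuntimeError and ValueError separately around get_schema().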
query(*, delimiter=None, statistics_columns=None, **kwargs)
Build a beacon_api.query.JSONQuery starting from this dataset.
Args:
delimiter: Optional CSV delimiter override (only valid for CSV datasets).
statistics_columns: Optional Zarr statistics column names (only valid for Zarr datasets).
**kwargs: Additional format-specific options forwarded to the query builder.
Returns:
JSONQuery: Query builder tied to this dataset source.
Raises:
ValueError: If a format-specific option is passed to the wrong dataset type or the format is not supported.
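The validation rules described for query() might look roughly like the following. This is a sketch under stated assumptions, not the library's code: check_query_options and SUPPORTED_FORMATS are hypothetical names, and the set of formats is assumed from the ones this page mentions (CSV, Parquet, Zarr).

```python
from typing import Any, Dict, Optional, Sequence

# Assumption: only the formats named on this page; the real list may differ.
SUPPORTED_FORMATS = {"csv", "parquet", "zarr"}


def check_query_options(
    file_format: str,
    delimiter: Optional[str] = None,
    statistics_columns: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Validate format-specific options and return the ones that apply.

    Hypothetical helper, not part of beacon_api.
    """
    fmt = file_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file format: {file_format!r}")
    if delimiter is not None and fmt != "csv":
        raise ValueError("delimiter is only valid for CSV datasets")
    if statistics_columns is not None and fmt != "zarr":
        raise ValueError("statistics_columns is only valid for Zarr datasets")

    options: Dict[str, Any] = {}
    if delimiter is not None:
        options["delimiter"] = delimiter
    if statistics_columns is not None:
        options["statistics_columns"] = list(statistics_columns)
    return options


print(check_query_options("csv", delimiter=";"))
```

Rejecting a mismatched option up front, rather than silently ignoring it, surfaces the mistake before any request is built.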