Exploring the Data Lake (REST API)
Use these endpoints to discover what data is available on a running Beacon instance — without running a full query.
Concepts:
- Datasets: individual files (a single .nc file, a .parquet file, a Zarr group, etc.)
- Tables: named logical tables registered in Beacon, often spanning many datasets
- Schemas: Arrow field lists (name + type) describing the columns available for select and filter
System info
GET /api/info
Returns the Beacon version, a configuration summary, and the registered table count.
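
As a minimal sketch, the endpoint can be called from Python using only the standard library. The base URL http://localhost:5001 is a placeholder, the helper names build_url and get_json are hypothetical, and the response body is assumed to be JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder address; replace with your Beacon instance.
BASE_URL = "http://localhost:5001"


def build_url(path, params=None, base_url=BASE_URL):
    """Join the base URL, an endpoint path, and optional query parameters."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urlencode(params)
    return url


def get_json(path, params=None, base_url=BASE_URL, timeout=10):
    """Send a GET request and decode the body (assumed to be JSON)."""
    with urlopen(build_url(path, params, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling get_json("/api/info") against a running instance would then return the decoded system info. The same helper works for the other GET endpoints on this page.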
Datasets
List datasets
GET /api/list-datasets
Optional query parameters:
| Parameter | Description |
|---|---|
| pattern | Glob to filter paths (e.g. *.nc, **/*.parquet) |
| offset | Pagination offset |
| limit | Pagination limit |
GET /api/list-datasets?pattern=argo/**/*.nc&limit=50&offset=0
Dataset count
GET /api/total-datasets
Dataset schema
Returns the Arrow schema (fields + types) for a single path:
GET /api/dataset-schema?file=argo/profile_001.nc
To infer a merged schema across multiple files, pass a glob:
GET /api/dataset-schema?file=argo/**/*.nc
The response contains the Arrow schema as JSON. Column names are under .fields[].name.
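
The pagination parameters and the schema layout described above can be combined into a small client-side sketch. It assumes that /api/list-datasets returns a JSON array of entries, which this page does not confirm. The names http_fetch_page, iter_datasets, and column_names are hypothetical helpers, not part of Beacon.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_fetch_page(base_url):
    """Return a function that requests one page of /api/list-datasets."""
    def fetch(params):
        url = f"{base_url.rstrip('/')}/api/list-datasets?{urlencode(params)}"
        with urlopen(url, timeout=10) as resp:
            return json.load(resp)  # assumption: the body is a JSON array
    return fetch


def iter_datasets(fetch_page, pattern=None, page_size=50):
    """Yield dataset entries page by page using limit and offset."""
    offset = 0
    while True:
        params = {"limit": page_size, "offset": offset}
        if pattern:
            params["pattern"] = pattern
        page = fetch_page(params)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return  # a short page means there is nothing left to fetch
        offset += len(page)


def column_names(schema):
    """Collect the column names found under .fields[].name."""
    return [field["name"] for field in schema.get("fields", [])]
```

For example, iter_datasets(http_fetch_page("http://localhost:5001"), pattern="argo/**/*.nc") would walk every matching dataset, where the address is a placeholder. Passing the fetch function in as an argument keeps the paging logic testable without a running instance.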
Tables
List tables
GET /api/tables
Default table
Beacon uses this table when a query omits from:
GET /api/default-table
Table schema
GET /api/table-schema?table_name=default
All tables with schemas
Convenient for UI discovery, but can be slow on large installations:
GET /api/tables-with-schema
Table configuration
Shows how a table was constructed — paths, file format, statistics settings, etc.:
GET /api/table-config?table_name=default
Functions
List all registered DataFusion scalar functions:
GET /api/functions
List all registered Beacon table functions (e.g. read_netcdf, read_zarr):
GET /api/table-functions
See the Function Reference for descriptions and signatures.
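
One possible use of these listings is to check that the functions a client depends on are registered before it sends a query. The sketch below assumes each entry in the response is either a plain string or an object with a "name" key. The actual response shape is not documented here, so adjust the parsing to what your instance returns. Both helper names are hypothetical.

```python
def function_names(payload):
    """Extract function names from a decoded function listing.

    Assumption: each entry is a plain string or a dict with a "name" key.
    Entries of any other shape are ignored.
    """
    names = set()
    for entry in payload:
        if isinstance(entry, str):
            names.add(entry)
        elif isinstance(entry, dict) and "name" in entry:
            names.add(entry["name"])
    return names


def missing_functions(payload, required):
    """Return the required names that are absent from the listing, sorted."""
    return sorted(set(required) - function_names(payload))
```

Given the decoded body of GET /api/table-functions as payload, missing_functions(payload, ["read_netcdf", "read_zarr"]) returns an empty list when both table functions are available.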
Admin (file management)
File upload, download, and deletion endpoints are available under /api/admin/* and are protected by HTTP Basic Auth.
Table lifecycle is SQL-only. Create, replace, or remove tables by sending SQL DDL to the query endpoint:
POST /api/query
Content-Type: application/json
{ "sql": "CREATE EXTERNAL TABLE argo STORED AS PARQUET LOCATION 'argo/'" }
POST /api/query
Content-Type: application/json
{ "sql": "DROP TABLE argo" }
Browse /swagger for the full admin request and response shapes.
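
The DDL requests above can be assembled in Python with the standard library. This sketch only builds the request object. The helper name ddl_request and the base URL in the usage note are assumptions, and any authentication your deployment requires is not shown.

```python
import json
from urllib.request import Request


def ddl_request(base_url, sql):
    """Build a POST /api/query request carrying one SQL statement."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/api/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to urllib.request.urlopen, for example urlopen(ddl_request("http://localhost:5001", "DROP TABLE argo")), where the address is a placeholder. Serializing with json.dumps rather than string formatting keeps quotes inside the SQL, such as 'argo/', correctly escaped.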