Skip to content

JSON Query DSL

The JSON DSL lets you express queries as a typed object — no SQL string building required. It is the preferred interface for programmatic clients and query builders.

http
POST /api/query
Content-Type: application/json

TIP

Discover available columns before querying:

  • Default table: GET /api/table-schema?table_name=default
  • From a file glob: GET /api/dataset-schema?file=argo/**/*.nc

Request shape

FieldRequiredDescription
selectYesColumns (and expressions) to return
fromNoData source — table name or inline file source
filtersNoRow filters, combined with AND by default
sort_byNoSort expressions
limitNoMaximum rows to return
offsetNoRows to skip
distinctNoDISTINCT ON expression
outputNoOutput format (default: Arrow IPC stream)

Selecting columns

Plain column

json
{ "select": ["time", "latitude", "longitude"] }

Column with alias

json
{
  "select": [
    { "column": "sea_surface_temperature", "alias": "sst" }
  ]
}

Function call

json
{
  "select": [
    { "function": "round", "args": ["temperature", { "value": 2 }], "alias": "temperature_rounded" }
  ]
}

args entries are either a column name string or a literal { "value": … } object.

Choosing the data source (from)

Query a registered table

http
POST /api/query
Content-Type: application/json

{
  "from": "default",
  "select": ["time", "temperature"],
  "limit": 100,
  "output": { "format": "csv" }
}

Use GET /api/tables to list available table names. When from is omitted, Beacon uses the configured default table.

Query files directly

Pass a format key with a paths array. Paths are resolved relative to Beacon's dataset root and support glob patterns.

NetCDF:

http
POST /api/query
Content-Type: application/json

{
  "from": { "netcdf": { "paths": ["argo/**/*.nc"] } },
  "select": ["time", "latitude", "longitude", "temperature"],
  "limit": 100,
  "output": { "format": "csv" }
}

Zarr:

http
POST /api/query
Content-Type: application/json

{
  "from": { "zarr": { "paths": ["sst/*/zarr.json"] } },
  "select": ["time", "sst"],
  "limit": 100,
  "output": { "format": "csv" }
}

To enable 1D slice pushdown for large Zarr stores, supply statistics_columns:

http
POST /api/query
Content-Type: application/json

{
  "from": {
    "zarr": {
      "paths": ["sst/*/zarr.json"],
      "statistics_columns": ["time", "latitude", "longitude"]
    }
  },
  "select": ["time", "latitude", "longitude", "sst"],
  "filters": [{ "column": "time", "min": "2025-01-01" }],
  "limit": 1000,
  "output": { "format": "csv" }
}

Parquet:

http
POST /api/query
Content-Type: application/json

{
  "from": { "parquet": { "paths": ["obs/**/*.parquet"] } },
  "select": ["time", "latitude", "longitude"],
  "limit": 100,
  "output": { "format": "csv" }
}

Other supported format keys: csv, arrow, odv.

Filters

filters is an array of filter objects. Multiple entries are combined with AND. A filter can be placed on any column in the schema.

Range (min / max)

json
{ "filters": [{ "column": "temperature", "min": 2, "max": 10 }] }

Either min or max can be omitted for an open-ended range.

Equality

json
{ "filters": [{ "column": "platform", "eq": "SHIP" }] }

AND (multiple filters)

http
POST /api/query
Content-Type: application/json

{
  "select": ["time", "latitude", "longitude", "temperature"],
  "filters": [
    { "column": "temperature", "min": 2, "max": 10 },
    { "column": "latitude", "min": -10, "max": 10 }
  ],
  "limit": 10000,
  "output": { "format": "csv" }
}

OR

Wrap OR branches in a single or filter object:

http
POST /api/query
Content-Type: application/json

{
  "select": ["time", "platform", "temperature"],
  "filters": [
    {
      "or": [
        { "column": "platform", "eq": "SHIP" },
        { "column": "platform", "eq": "BUOY" }
      ]
    }
  ],
  "limit": 1000,
  "output": { "format": "csv" }
}

GeoJSON spatial filter

Tests whether a point (lon/lat columns) falls within a GeoJSON geometry:

http
POST /api/query
Content-Type: application/json

{
  "select": ["time", "longitude", "latitude", "temperature"],
  "filters": [
    {
      "longitude_column": "longitude",
      "latitude_column": "latitude",
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[4.0, 52.0], [6.0, 52.0], [6.0, 54.0], [4.0, 54.0], [4.0, 52.0]]]
      }
    }
  ],
  "limit": 10000,
  "output": { "format": "csv" }
}

Sorting and pagination

FieldDescription
sort_byArray of {"Asc": "col"} or {"Desc": "col"} objects
limitMaximum number of rows
offsetNumber of rows to skip

WARNING

sort_by enum keys are case-sensitive: "Asc" and "Desc", not "asc" / "desc".

http
POST /api/query
Content-Type: application/json

{
  "select": ["time", "temperature"],
  "sort_by": [{ "Desc": "time" }],
  "offset": 100,
  "limit": 50,
  "output": { "format": "csv" }
}

DISTINCT ON

Return one row per unique combination of the on columns:

http
POST /api/query
Content-Type: application/json

{
  "distinct": {
    "on": ["platform"],
    "select": ["platform", "time", "temperature"]
  },
  "sort_by": [{ "Desc": "time" }],
  "limit": 100,
  "output": { "format": "csv" }
}

Output formats

See Querying — Output formats for the full list. The output field is identical for JSON DSL and SQL queries.