Querying block model data
Introduction
When querying a block model, multiple endpoints need to be called to complete the workflow:
Initial query request
- This starts the block model query workflow.
- This is done via the Start a block model data query endpoint. This initial request is where the server validates that the request body is well formed and references an existing block model with valid columns.
- The POST request body contains details on the return data format, which columns to select, and the spatial bounds of the query.
- Unlike a block model update, the initial request to start the query is all that is needed for the job to be created and eventually completed by the server.
- A successful response from the server contains a reflection of the query criteria made in the initial request, as well as a `job_url`, which can be polled to check the status of the query job.
The following is an example request body for making this initial request.

```json
{
  "bbox": {
    "i_minmax": { "min": 2, "max": 15 },
    "j_minmax": { "min": 2, "max": 21 },
    "k_minmax": { "min": 2, "max": 27 }
  },
  "columns": ["*"],
  "geometry_columns": "indices",
  "output_options": {
    "file_format": "csv",
    "column_headers": "name",
    "exclude_null_rows": true
  },
  "version_uuid": "11111111-2222-3333-4444-555555555555"
}
```
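The request body above can be assembled programmatically. Below is a minimal sketch in Python; the helper name `build_query_body` is an illustration, not part of the API, and the actual HTTP call to the Start a block model data query endpoint is omitted.

```python
import json

def build_query_body(version_uuid, columns=("*",), bbox=None):
    """Assemble the request body for the initial query request.

    Hypothetical helper: field names follow the documented payload,
    but this function is not part of the BMS API itself.
    """
    body = {
        "columns": list(columns),
        "geometry_columns": "indices",
        "output_options": {
            "file_format": "csv",
            "column_headers": "name",
            "exclude_null_rows": True,
        },
        "version_uuid": version_uuid,
    }
    if bbox is not None:  # bbox is optional; omitting it queries the whole model
        body["bbox"] = bbox
    return body

body = build_query_body(
    "11111111-2222-3333-4444-555555555555",
    bbox={"i_minmax": {"min": 2, "max": 15},
          "j_minmax": {"min": 2, "max": 21},
          "k_minmax": {"min": 2, "max": 27}},
)
print(json.dumps(body, indent=2))
```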
Definitions of the payload fields are as follows:
- `bbox` (Optional)
- `columns` (Required)
- `geometry_columns` (Optional)
- `output_options` (Optional)
- `version_uuid` (Optional)
bbox (Optional)
The `bbox` field refers to the bounding box to be used for the query, and can be defined in terms of either IJK or XYZ. If `bbox` is not provided, the query automatically defaults to encompassing the entire block model.
IJK bounding box
When using an IJK (block index) bounding box, it should be of the following form.

```json
{
  "i_minmax": { "min": "int", "max": "int" },
  "j_minmax": { "min": "int", "max": "int" },
  "k_minmax": { "min": "int", "max": "int" }
}
```
- An IJK bounding box must be within the index extents of the model, and must make a non-empty selection of blocks to be considered valid by the server.
- The corners of the bounding box, defined by `min_corner = (i_min, j_min, k_min)` and `max_corner = (i_max, j_max, k_max)`, must satisfy `min_corner <= max_corner` to result in a non-empty selection of blocks.
- The index extents of a block model are defined in terms of the `n_blocks` of the model, `dx`, `dy`, and `dz`, and refer to the range of valid integer values an I, J, or K index can have. For example, if we have `dx = 10`, `dy = 5`, and `dz = 7` for some block model, then the valid ranges of values for the `i, j, k` min/maxes are as follows:
  - `0 <= i_min < dx`, or `0 <= i_min <= 10 - 1`
  - `0 <= j_min < dy`, or `0 <= j_min <= 5 - 1`
  - `0 <= k_min < dz`, or `0 <= k_min <= 7 - 1`
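The in-bounds rule can be sketched as a small validation function. This is an illustration of the rule, not server code; `dx`, `dy`, `dz` are the block counts along each axis, as in the example above.

```python
# Minimal sketch of the IJK bounds rule: each index range must satisfy
# min <= max and lie within [0, n_blocks - 1] on its axis.
def ijk_bbox_is_valid(bbox, dx, dy, dz):
    """True if the bbox selects a non-empty, in-bounds range of blocks."""
    for axis, n_blocks in (("i_minmax", dx), ("j_minmax", dy), ("k_minmax", dz)):
        lo, hi = bbox[axis]["min"], bbox[axis]["max"]
        # min_corner <= max_corner and both indices within [0, n_blocks - 1]
        if not (0 <= lo <= hi <= n_blocks - 1):
            return False
    return True
```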
XYZ bounding box
When using an XYZ (centroid coordinate) bounding box, it should be of the following form.

```json
{
  "x_minmax": { "min": "float", "max": "float" },
  "y_minmax": { "min": "float", "max": "float" },
  "z_minmax": { "min": "float", "max": "float" }
}
```
- An XYZ bounding box should be model aligned. If the model is rotated, then the XYZ bounding box should also be rotated.
- An XYZ bounding box must have `min < max`, and at least one centroid must be contained by the bounding box.
- An XYZ bounding box can exceed the bounds of the model, as centroids are selected based on whether they are contained by the bounding box.
- For more details on XYZ bounding boxes, see XYZ bounding boxes and rotations.
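The centroid-selection rule above can be sketched as a simple containment test. This assumes an unrotated, model-aligned box; rotation handling is out of scope here, and the function is illustrative only.

```python
# A block is selected when its centroid lies inside the XYZ bounding box.
def centroid_in_bbox(centroid, bbox):
    """True if the (x, y, z) centroid is contained by the bounding box."""
    x, y, z = centroid
    return (bbox["x_minmax"]["min"] <= x <= bbox["x_minmax"]["max"]
            and bbox["y_minmax"]["min"] <= y <= bbox["y_minmax"]["max"]
            and bbox["z_minmax"]["min"] <= z <= bbox["z_minmax"]["max"])
```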
columns (Required)
- The `columns` field denotes a list of string values for extra columns of the block model to be included in the output file.
- The `columns` field supports selecting columns by either their title or ID, and can also include a wildcard (`"*"`) placeholder, which expands to all user columns, ordered alphabetically by title in a case-insensitive manner.
- Note that the wildcard does not cover the system column `version_id`. To include `version_id` in the output file, you must explicitly specify it in the `columns` field alongside the wildcard.
- The order of columns in the output file will match the order in the `columns` field.
- Columns that are part of the initial "geometry" columns will be ignored if specified.
For example, given a model that has user columns `A` (with a column ID of `d718abe4-56a5-4e27-ad51-813e69eb8aac`), `B`, and `C`, the table below shows the output file columns for different model types and parameters.

| Model Type | `geometry_columns` field | `columns` field | Output File Columns |
|---|---|---|---|
| Regular | `"indices"` | `["d718abe4-56a5-4e27-ad51-813e69eb8aac", "B"]` | `i`, `j`, `k`, `A`, `B` |
| Regular | `"coordinates"` | `[]` | `x`, `y`, `z` |
| Fully sub-blocked | `"indices"` | `["*", "version_id"]` | `i`, `j`, `k`, `sidx`, `A`, `B`, `C`, `version_id` |
| Flexible | `"coordinates"` | `["*", "start_si"]` | `x`, `y`, `z`, `dx`, `dy`, `dz`, `A`, `B`, `C`, `start_si` |
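The wildcard expansion described above can be sketched as follows. This is an illustration of the documented ordering rule, not the server's implementation.

```python
# "*" expands to all user columns sorted alphabetically by title
# (case-insensitive); the system column version_id is never pulled in
# by the wildcard and must be requested explicitly.
def expand_columns(requested, user_column_titles):
    out = []
    for col in requested:
        if col == "*":
            out.extend(sorted(user_column_titles, key=str.lower))
        else:
            out.append(col)
    return out
```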
geometry_columns (Optional)
If `geometry_columns` is set to `"indices"` (the default), then the first columns within the output file will be `i`, `j`, `k`, followed by the sub-block index columns (if applicable).
If `geometry_columns` is set to `"coordinates"`, then the output file will contain the columns `x`, `y`, `z` for regular block models and `x`, `y`, `z`, `dx`, `dy`, `dz` for sub-blocked block models. These columns are referred to as the "geometry" columns.
output_options (Optional)
The `output_options` field determines the format of the file output by BMS, and each file type has additional properties that can be set by the user. `output_options` defaults in the server to the following.

```json
{
  "file_format": "parquet",
  "column_headers": "name",
  "exclude_null_rows": true
}
```

The `exclude_null_rows` field specifies whether rows whose queried user columns are all null should be excluded from the output file.
- Set `exclude_null_rows` to `true` to exclude rows with entirely null values for the specified user columns from the output, as outlined in the columns section.
- Set `exclude_null_rows` to `false` to include every block within the specified `bbox` in the output file, including rows containing only null values.
By default, `exclude_null_rows` is enabled (`true`).
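The exclude_null_rows behaviour can be sketched with a small filter. This is an illustration of the documented rule under the assumption that only the queried user columns (not geometry columns) are considered.

```python
# A row is dropped only when every queried user column is null.
def keep_row(row, user_columns):
    return any(row.get(c) is not None for c in user_columns)

rows = [
    {"i": 0, "j": 0, "k": 0, "A": 1.5, "B": None},
    {"i": 0, "j": 0, "k": 1, "A": None, "B": None},  # all user columns null
]
filtered = [r for r in rows if keep_row(r, ["A", "B"])]  # second row excluded
```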
parquet
When setting the `file_format` field to `parquet`, there is an optional field `column_headers`, which can be set to either `name` or `id` (or omitted).

```
{
  "file_format": "parquet",
  "column_headers": "string" | null,
  "exclude_null_rows": "boolean" | null
}
```

The queried file will be output in the `.parquet` file format, which is written using the following parameters:
- Row group size of 100,000.
- Compressed using Zstd.
- Parquet data page version set to `1.0` and Parquet version set to `2.6`. More details are available in the Apache Arrow documentation.
- Each of the file schema fields (or column headers) will have their `name` set to the column ID or the column title, depending on whether `column_headers` is set to `"id"` or `"name"` respectively.
CSV
When `file_format` is set to `csv`, we have the following fields.

```
{
  "file_format": "csv",
  "column_headers": "string" | null,
  "delimiter": "string" | null,
  "exclude_null_rows": "boolean" | null
}
```

`column_headers` behaves exactly the same way as in `file_format: parquet`, and sets the header row values of the CSV to the selected option.
The `delimiter` field defaults to `","`, but can be set to any single-character string.
version_uuid (Optional)
A version UUID can be specified, which will result in the query running against that specific version of the model. Be mindful when using this: only columns that exist at the specified version can be queried, and when using the wildcard `"*"`, results can differ significantly from version to version.
Polling the job URL
A successful initial request will result in the following response from the server.

```
{
  "bm_uuid": "string",
  "version_id": "Integer specifying the version sequence number of the version queried",
  "version_uuid": "Same as in initial request if specified, otherwise will be version UUID of latest",
  "bbox": {"Same as in initial request"},
  "mapping": {
    "columns": [
      {
        "col_id": "string",
        "title": "string",
        "data_type": "string"
      } // one entry for each of the index columns, followed by the selected columns
    ]
  },
  "columns": ["columns queried in initial request"],
  "job_url": "https://example.seequent.com/blockmodel/orgs/{org_uuid}/workspaces/{workspace_uuid}/block-models/{bm_uuid}/jobs/{job_uuid}"
}
```
The field of the response body we care about here is `job_url`, which can be polled via a `GET` request to retrieve the current status of the query job. The server's response is largely determined by the value of the `job_status` field, so each status is described individually below.
Job status: QUEUED or PROCESSING
- `QUEUED`: The query job is currently queued and waiting to be picked up by the compute service for processing.
- `PROCESSING`: A compute node is currently processing the query.

```
{
  "job_status": "QUEUED" | "PROCESSING"
}
```
Job status: COMPLETE
The query job has finished and the file is ready to be downloaded. The response will also contain the field `payload`, which contains a SAS-signed blob URL pointing to the file containing the queried blocks/columns of the block model.

```json
{
  "job_status": "COMPLETE",
  "payload": {
    "download_url": "https://example.seequent.com/blocksync/{org_id}/query/{criteria_hash}.{file_format}?{additional_blob_query_params}"
  }
}
```
Job status: FAILED
This means that the query job failed during processing. If this happens, a support ticket should be raised, as no change on the user's side will result in a different outcome. The response will also contain the field `payload`, which contains an error message from the compute service explaining why the query failed.

```json
{
  "job_status": "FAILED",
  "payload": {
    "status": 500,
    "title": "Internal Service Error",
    "detail": "A service error occurred when processing this job",
    "type": "https://seequent.com/error-codes/block-model-service/compute-service-error"
  }
}
```
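A typical client polls the job URL until it reaches a terminal status. The sketch below assumes a `get_json` callable that performs an authenticated `GET` on the job URL and returns the parsed JSON body; that helper is an assumption here, not part of the API.

```python
import time

def poll_query_job(job_url, get_json, interval_seconds=5, max_attempts=60):
    """Poll job_url until COMPLETE (return download_url) or FAILED (raise)."""
    for _ in range(max_attempts):
        job = get_json(job_url)
        status = job["job_status"]
        if status == "COMPLETE":
            return job["payload"]["download_url"]
        if status == "FAILED":
            raise RuntimeError(job["payload"]["detail"])
        time.sleep(interval_seconds)  # QUEUED or PROCESSING: wait, then retry
    raise TimeoutError("query job did not finish within the polling budget")
```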
Downloading the query file
When the `download_url` is retrieved from the payload of a complete job, the file can be downloaded using either a `GET` request to the download URL or the Azure SDK.
The download URL generated each time the completed `job_url` is polled has a 30-minute time-to-live (TTL), so you must start the download within 30 minutes. The download itself can take longer than 30 minutes as long as the connection remains open.
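For the plain `GET` option, a minimal sketch using only the Python standard library looks like this; reading in chunks avoids holding the whole file in memory for large models.

```python
from urllib.request import urlopen

def download_query_file(download_url, out_path, chunk_size=1 << 20):
    """Stream the file at download_url to out_path in 1 MiB chunks."""
    with urlopen(download_url) as resp, open(out_path, "wb") as f:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            f.write(chunk)
```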
Azure blob SDK Python example
Downloading using the Azure SDK for Python can be easier than dealing with file streams directly. For quick reference, the example below uses the SDK function download_blob_from_url.

```python
from azure.storage.blob import download_blob_from_url

file_extension = 'csv'  # or 'parquet', matching the requested file_format
download_file_path = f'queried_block_model.{file_extension}'

# Stream the blob at download_url straight into a local file.
with open(download_file_path, 'wb') as f:
    download_blob_from_url(download_url, output=f)
```
Additional functionality
Once a query has been carried out by the server, BMS will cache the resultant file in blob storage indefinitely (subject to change). This file can be retrieved again in two ways:
- Make another query request with the exact same body and URL.
- Save the job URL from the query for later use. Polling it again will return a new blob download URL.