Block Model Service (BMS) Lineage Support
What is lineage in BMS?
Lineage in the Block Model Service (BMS) describes how a block model was produced: its origins, the transformations applied, and the relationships to previous data. Lineage supports provenance (tracking data sources) and reproducibility (enabling the same results from the same inputs). It is a key part of data governance, letting you trace, audit, and automate workflows across the system.
The lineage property follows a defined schema and is stored in the block model’s reference object. The BMS endpoints for block model creation, data updates, and metadata updates, described in the sections below, each accept an optional lineage property in their request body.
Reference object integration
- If you provide a lineage field in your request, BMS validates it against the lineage schema and, if valid, passes it to the reference object.
  - Note: Lineage for block model objects can only be viewed via the Goose APIs.
- If no reference object exists, BMS ignores the lineage property.
- If you provide an invalid lineage field in your request, BMS responds with a 422 status and the block model operation is aborted.
- If you do not provide a lineage field in your request, the reference object will also have no lineage.
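The rules above can be read as a small decision table. The following is a minimal sketch of that table, not BMS code: the function name and the outcome labels are hypothetical, and the order of the checks (validation before the reference-object check) is an assumption, since the rules do not say which applies first when a lineage is invalid and no reference object exists.

```python
from typing import Optional


def lineage_outcome(
    lineage: Optional[dict],
    is_valid: bool,
    has_reference_object: bool,
) -> str:
    """Return a label for how a request's lineage property is handled.

    Hypothetical helper for illustration only. `is_valid` stands in for
    the result of validating against the lineage schema.
    """
    if lineage is None:
        # No lineage supplied: the reference object has no lineage either.
        return "no_lineage"
    if not is_valid:
        # Invalid lineage: 422 response, block model operation aborted.
        # Checking validity first is an assumption about ordering.
        return "rejected_422"
    if not has_reference_object:
        # Nothing to attach the lineage to, so it is ignored.
        return "ignored"
    # Valid lineage and a reference object: passed through.
    return "passed_to_reference_object"
```

A client can use the same reading to decide how to react: a 422 means the whole operation did not happen, so the request needs to be corrected and resent rather than patched afterwards.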
Block model creation
The Create a block model endpoint accepts an optional lineage property in the request body. The lineage property is passed through to the reference object's lineage property unmodified.
Block model updates
The Start a block model data update endpoint accepts an optional lineage property in the request body. For full updates, lineage is passed through to the reference object's lineage property unmodified. For partial updates, see below.
What is a metadata update?
A metadata update modifies only metadata, such as renaming columns or updating block model properties, without changing the underlying block model data.
- Note: For lineage purposes, an update call that only deletes columns (and does not add/update any columns) is treated as a metadata update.
What is a partial update?
A partial update creates a new block model version by merging user-supplied data with existing data from the previous version.
What is a full update?
A full update creates a new block model version using only the user-supplied data. All columns in the new version are provided by the user, with no data merged from the previous version.
Metadata updates
The Update a block model's metadata endpoint accepts an optional lineage property.
- If you provide lineage: BMS passes it through to the reference object's lineage property as provided.
- If you do not provide lineage: BMS generates a single-event lineage that records:
  - The previous version of the block model as input
  - The BMS job (e.g., update or restore) as the event
When you restore a block model and it is renamed to avoid a conflict, BMS creates a new reference object version and generates lineage with a single event that references the previous version and the restore job.
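As a rough illustration of what a generated single-event lineage might look like, the sketch below builds one as a Python dictionary. The top-level keys (self_link, events, job, run, eventTime, inputs, outputs) follow the lineage examples in this document; everything else is an assumption made for illustration: the function name, the runId key, the use of the blocksync job namespace for generated events, and the $self output entry.

```python
def generated_lineage(
    previous_version: dict,
    job_name: str,
    run_id: str,
    event_time: str,
    object_namespace: str,
) -> dict:
    """Build a single-event lineage for a change made without user lineage.

    Hypothetical sketch: the exact fields BMS writes are not specified here.
    """
    event = {
        # The BMS job (for example "update" or "restore") is the event.
        "job": {"name": job_name, "namespace": "blocksync"},  # namespace assumed
        "run": {"runId": run_id},  # key name assumed
        "eventTime": event_time,
        # The previous block model version is the only input.
        "inputs": [previous_version],
        # Output entry assumed to point at the new version via $self.
        "outputs": [{"name": "$self", "namespace": object_namespace}],
    }
    return {"self_link": "/events/0/outputs/0", "events": [event]}
```

The shape of previous_version is left to the caller because this document does not show how a previous version is referenced inside inputs.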
Partial updates
If you provide a lineage field in your request for a partial update, BMS modifies it before passing it through:
- Update the terminal event’s output: BMS updates the last event (referenced by self_link) so its output is:
  {
    "name": "<upload_blob_name>",
    "namespace": "evoblockmodelupload://<hub>/<org>/<workspace>"
  }
- Add a combining event: BMS adds a new event to the events array in lineage to represent the combining event:
  - Sets self_link to point to the new output.
  - Adds a job with name: "update" and namespace: "blocksync".
  - Adds run for the update job with its run ID.
  - Sets eventTime to the update start time.
  - Inputs:
    - The upload blob (output of the previous event).
    - The previous block model version (from the original self_link).
  - Outputs:
    - Sets a single entry pointing at the new version, with a namespace of evogeoscienceobject and a name of $self.
Example lineage modification
Before (lineage with a single event):
{
  "self_link": "/events/0/outputs/0",
  "events": [
    {
      "job": {...},
      "run": {...},
      "eventTime": {...},
      "inputs": [{...}],
      "outputs": [
        {
          "name": "$self",
          "namespace": "evogeoscienceobject://<hub>/<org>/<workspace>"
        }
      ]
    }
  ]
}
After (simplified lineage with 2 events on the reference object):
{
  "self_link": "/events/1/outputs/0", # Updated to point at the second event
  "events": [
    {
      # The original event
      "job": {...},
      "run": {...},
      "eventTime": {...},
      "inputs": [{...}],
      "outputs": [
        {
          "name": "<upload_blob_name>", # Modified to indicate the actual uploaded blob
          "namespace": "evoblockmodelupload://<hub>/<org>/<workspace>"
        }
      ]
    },
    {
      # A new event representing the combining of the upload with the previous model version
      "job": {/* Update job */},
      "run": {/* Run ID of the update job */},
      "eventTime": {/* Start time of update */},
      "inputs": [
        {/* The uploaded blob */},
        {/* Previous version of the block model */}
      ],
      "outputs": [
        {
          "name": "$self",
          "namespace": "evogeoscienceobject://<hub>/<org>/<workspace>"
        }
      ]
    }
  ]
}
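The before and after documents above can be reproduced with a short transformation. This is a minimal sketch of the described steps, not the BMS implementation: the function names, the parameter list, and the runId key are hypothetical, self_link is assumed to be a slash-separated path into the lineage document, and the entry for the previous block model version is supplied by the caller because its exact shape is not shown here.

```python
import copy


def resolve_pointer(doc, pointer):
    """Follow a slash-separated path such as /events/0/outputs/0."""
    node = doc
    for token in pointer.strip("/").split("/"):
        # List steps use integer indices, dict steps use string keys.
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def apply_partial_update(lineage, upload_blob_name, upload_namespace,
                         object_namespace, previous_version, run_id,
                         start_time):
    """Return a modified copy of a user-supplied lineage for a partial update."""
    result = copy.deepcopy(lineage)  # leave the caller's lineage untouched

    # Step 1: the terminal event's output now names the uploaded blob.
    terminal_output = resolve_pointer(result, result["self_link"])
    terminal_output.clear()
    terminal_output.update(
        {"name": upload_blob_name, "namespace": upload_namespace}
    )

    # Step 2: append the combining event for the update job.
    result["events"].append({
        "job": {"name": "update", "namespace": "blocksync"},
        "run": {"runId": run_id},  # key name is an assumption
        "eventTime": start_time,
        "inputs": [dict(terminal_output), previous_version],
        "outputs": [{"name": "$self", "namespace": object_namespace}],
    })

    # Point self_link at the combining event's single output.
    last_index = len(result["events"]) - 1
    result["self_link"] = f"/events/{last_index}/outputs/0"
    return result
```

Working on a deep copy keeps the user-supplied lineage available for comparison or logging, which is convenient when checking that only the terminal output and self_link changed.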