Block Model Service (BMS) Lineage Support
What is lineage in BMS?
Lineage in the Block Model Service (BMS) describes how a block model was produced: its origins, the transformations applied, and the relationships to previous data. Lineage supports provenance (tracking data sources) and reproducibility (enabling the same results from the same inputs). It is a key part of data governance, letting you trace, audit, and automate workflows across the system.
The lineage property follows a defined schema and is stored in the block model’s reference object. The following BMS endpoints support an optional lineage property in their request body:
- Create a block model
- Start a block model data update
- Update a block model's metadata
Reference object integration
- If you provide a lineage field in your request, BMS validates it against the lineage schema and, if valid, passes it to the reference object.
  - Note: Lineage for block model objects can only be viewed via the Goose APIs.
- If no reference object exists, BMS ignores the lineage property.
- If you provide an invalid lineage field in your request, BMS responds with a 422 status and the block model operation is aborted.
- If you do not provide a lineage field in your request, the reference object will also have no lineage.
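The rules in this list can be summarised as a small decision function. This is a minimal sketch for reasoning about outcomes, not BMS code: the function name, its parameters, and the returned strings are hypothetical, and it assumes validation is checked before the existence of the reference object, which the text above does not state. Only the 422 status comes from this page.

```python
def lineage_outcome(lineage_provided: bool,
                    lineage_valid: bool,
                    reference_object_exists: bool) -> str:
    """Describe what happens to the lineage in a request (hypothetical helper)."""
    if not lineage_provided:
        # Nothing to validate or pass through.
        return "no lineage on reference object"
    if not lineage_valid:
        # Assumption: validation happens before any other check.
        return "422: block model operation aborted"
    if not reference_object_exists:
        return "lineage ignored"
    return "lineage passed to reference object"


if __name__ == "__main__":
    print(lineage_outcome(True, False, True))
```

Metadata updates are an exception to the first branch, as described later on this page: BMS generates a lineage when none is supplied.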
Block model creation
The Create a block model endpoint accepts an optional lineage property in the request body. The lineage property is passed through to the reference object's lineage property unmodified.
Block model updates
The Start a block model data update endpoint accepts an optional lineage property in the request body. For full updates, lineage is passed through to the reference object's lineage property unmodified. For partial updates, BMS modifies the lineage before passing it through; see the partial updates section below.
What is a metadata update?
A metadata update modifies only metadata, such as renaming columns or updating block model properties, without changing the underlying block model data.
- Note: For lineage purposes, an update call that only deletes columns (and does not add/update any columns) is treated as a metadata update.
What is a partial update?
A partial update creates a new block model version by merging user-supplied data with existing data from the previous version.
What is a full update?
A full update creates a new block model version using only the user-supplied data. All columns in the new version are provided by the user, with no data merged from the previous version.
Metadata updates
The Update a block model's metadata endpoint accepts an optional lineage property.
- If you provide lineage: BMS passes it through to the reference object's lineage property as provided.
- If you do not provide lineage: BMS generates a single-event lineage that records:
  - The previous version of the block model as input
  - The BMS job (e.g., update or restore) as the event
When you restore a block model and it is renamed to avoid a conflict, BMS creates a new reference object version and generates lineage with a single event referencing the previous version and the restore job.
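As an illustration of what such a generated single-event lineage might look like, here is a minimal sketch. The overall shape (self_link, events, job, run, eventTime, inputs, outputs) is modelled on the example later on this page; everything else is an assumption: the function name, the "runId" key, the reuse of the "blocksync" job namespace, the presence of a $self output, and the way the previous version is represented.

```python
from typing import Any, Dict


def generated_lineage(previous_version: Dict[str, str],
                      job_name: str,
                      run_id: str,
                      event_time: str,
                      object_namespace: str) -> Dict[str, Any]:
    """Build a single-event lineage (hypothetical reconstruction, not BMS code)."""
    event = {
        # Assumption: generated jobs use the same namespace as the update job.
        "job": {"name": job_name, "namespace": "blocksync"},
        # Assumption: the run object carries the run ID under "runId".
        "run": {"runId": run_id},
        "eventTime": event_time,
        # The previous version of the block model is the only input.
        "inputs": [previous_version],
        # Assumption: the output points at the new version via $self.
        "outputs": [{"name": "$self", "namespace": object_namespace}],
    }
    return {"self_link": "/events/0/outputs/0", "events": [event]}


if __name__ == "__main__":
    lineage = generated_lineage(
        previous_version={"name": "<previous_version>", "namespace": "<namespace>"},
        job_name="restore",
        run_id="<run_id>",
        event_time="<event_time>",
        object_namespace="evogeoscienceobject://<hub>/<org>/<workspace>",
    )
    print(lineage["events"][0]["job"])
```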
Partial updates
If you provide a lineage field in your request for a partial update, BMS modifies it before passing it through:
- Update the terminal event’s output:
  BMS updates the last event (referenced by self_link) so its output is:
  {
    "name": "<upload_blob_name>",
    "namespace": "evoblockmodelupload://<hub>/<org>/<workspace>"
  }
- Add a combining event:
  BMS adds a new event to the events array in lineage to represent the combining event:
  - Sets self_link to point to the new output.
  - Adds a job with name: "update" and namespace: "blocksync".
  - Adds run for the update job with its run ID.
  - Sets eventTime to the update start time.
  - Inputs:
    - The upload blob (output of previous event).
    - The previous block model version (from the original self_link).
  - Outputs:
    - Sets a single entry pointing at the new version, with a namespace of evogeoscienceobject and name of $self.
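The two steps above can be sketched as a pure function over a lineage dictionary. This is an illustrative reconstruction under stated assumptions, not the BMS implementation: the function name and parameters are hypothetical, self_link is assumed to always have the form /events/<i>/outputs/<j>, the "runId" key is assumed, and the previous block model version is passed in by the caller because this page does not spell out how BMS derives it.

```python
import copy
from typing import Any, Dict


def apply_partial_update(lineage: Dict[str, Any],
                         upload_output: Dict[str, str],
                         previous_version: Dict[str, str],
                         run_id: str,
                         start_time: str,
                         object_namespace: str) -> Dict[str, Any]:
    """Return a modified copy of a user-supplied lineage (hypothetical sketch)."""
    result = copy.deepcopy(lineage)  # leave the caller's lineage untouched

    # Assumption: self_link looks like "/events/<i>/outputs/<j>".
    parts = result["self_link"].strip("/").split("/")
    event_index, output_index = int(parts[1]), int(parts[3])

    # Step 1: the terminal event now outputs the uploaded blob.
    result["events"][event_index]["outputs"][output_index] = dict(upload_output)

    # Step 2: append the combining event for the update job.
    result["events"].append({
        "job": {"name": "update", "namespace": "blocksync"},
        "run": {"runId": run_id},  # assumed key name
        "eventTime": start_time,
        "inputs": [dict(upload_output), dict(previous_version)],
        "outputs": [{"name": "$self", "namespace": object_namespace}],
    })

    # self_link now points at the single output of the new event.
    result["self_link"] = f"/events/{len(result['events']) - 1}/outputs/0"
    return result
```

Working on a deep copy keeps the supplied lineage available for comparison with the modified lineage that ends up on the reference object.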
Example lineage modification
Before (lineage with a single event):
{
  "self_link": "/events/0/outputs/0",
  "events": [
    {
      "job": {...},
      "run": {...},
      "eventTime": {...},
      "inputs": [{...}],
      "outputs": [
        {
          "name": "$self",
          "namespace": "evogeoscienceobject://<hub>/<org>/<workspace>"
        }
      ]
    }
  ]
}
After (simplified lineage with 2 events on the reference object):
{
  "self_link": "/events/1/outputs/0",  # Updated to point at the second event
  "events": [
    {
      # The original event
      "job": {...},
      "run": {...},
      "eventTime": {...},
      "inputs": [{...}],
      "outputs": [
        {
          "name": "<upload_blob_name>",  # Modified to indicate the actual uploaded blob
          "namespace": "evoblockmodelupload://<hub>/<org>/<workspace>"
        }
      ]
    },
    {
      # A new event representing the combining of the upload with the previous model version
      "job": {...},  # The update job
      "run": {...},  # Run ID of the update job
      "eventTime": {...},  # Start time of the update
      "inputs": [
        {...},  # The uploaded blob
        {...}  # Previous version of the block model
      ],
      "outputs": [
        {
          "name": "$self",
          "namespace": "evogeoscienceobject://<hub>/<org>/<workspace>"
        }
      ]
    }
  ]
}
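The self_link values in these examples look like JSON Pointers (RFC 6901) into the lineage document. Assuming that reading is correct, which this page does not state explicitly, a small resolver like the following can be used to check which output a self_link refers to. The function name is hypothetical.

```python
from typing import Any


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer against nested dicts and lists."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = document
    for raw in pointer[1:].split("/"):
        # Unescape in the order required by RFC 6901: ~1 first, then ~0.
        token = raw.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


if __name__ == "__main__":
    lineage = {
        "self_link": "/events/1/outputs/0",
        "events": [
            {"outputs": [{"name": "<upload_blob_name>"}]},
            {"outputs": [{"name": "$self"}]},
        ],
    }
    print(resolve_pointer(lineage, lineage["self_link"]))
```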