Block Model Service (BMS) Lineage Support

What is lineage in BMS?

Lineage in the Block Model Service (BMS) describes how a block model was produced: its origins, the transformations applied, and the relationships to previous data. Lineage supports provenance (tracking data sources) and reproducibility (enabling the same results from the same inputs). It is a key part of data governance, letting you trace, audit, and automate workflows across the system.

The lineage property follows a defined schema and is stored in the block model’s reference object. The BMS endpoints described in the sections below accept an optional lineage property in their request body.

Reference object integration

  • If you provide a lineage field in your request, BMS validates it against the lineage schema and, if valid, passes it to the reference object.
    • Note: Lineage for block model objects can only be viewed via the Goose APIs.
  • If no reference object exists, BMS ignores the lineage property.
  • If you provide an invalid lineage field in your request, BMS responds with a 422 status and the block model operation is aborted.
  • If you do not provide a lineage field in your request, the reference object will also have no lineage.
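
Because an invalid lineage aborts the whole operation with a 422 response, a client may want a quick sanity check before sending a request. The sketch below is not the BMS lineage schema, which is not reproduced on this page. It only checks the fields that appear in this page's examples (self_link, events, outputs) and assumes self_link always has the form /events/<i>/outputs/<j>. The function name check_lineage_shape is hypothetical.

```python
import re
from typing import Any

# Assumption: self_link looks like "/events/<i>/outputs/<j>", as in this page's examples.
SELF_LINK_PATTERN = re.compile(r"^/events/(\d+)/outputs/(\d+)$")


def check_lineage_shape(lineage: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means the basic shape looks plausible.

    This is a client-side pre-check only. BMS still validates against the real schema.
    """
    events = lineage.get("events")
    if not isinstance(events, list) or not events:
        return ["'events' must be a non-empty list"]

    self_link = lineage.get("self_link")
    match = SELF_LINK_PATTERN.match(self_link) if isinstance(self_link, str) else None
    if match is None:
        return ["'self_link' must look like /events/<i>/outputs/<j>"]

    event_index, output_index = int(match.group(1)), int(match.group(2))
    if event_index >= len(events):
        return [f"'self_link' points at event {event_index}, which does not exist"]

    event = events[event_index]
    outputs = event.get("outputs") if isinstance(event, dict) else None
    if not isinstance(outputs, list) or output_index >= len(outputs):
        return [f"'self_link' points at output {output_index}, which does not exist"]

    return []
```

A non-empty result means the request would likely be rejected, so the client can fix the lineage before starting the block model operation.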

Block model creation

The Create a block model endpoint accepts an optional lineage property in the request body. The lineage property is passed through to the reference object's lineage property unmodified.

Block model updates

The Start a block model data update endpoint accepts an optional lineage property in the request body. For full updates, lineage is passed through to the reference object's lineage property unmodified. For partial updates, BMS modifies the lineage before passing it through, as described in the partial updates section below.

What is a metadata update?

A metadata update modifies only metadata, such as renaming columns or updating block model properties, without changing the underlying block model data.

  • Note: For lineage purposes, an update call that only deletes columns (and does not add/update any columns) is treated as a metadata update.

What is a partial update?

A partial update creates a new block model version by merging user-supplied data with existing data from the previous version.

What is a full update?

A full update creates a new block model version using only the user-supplied data. All columns in the new version are provided by the user, with no data merged from the previous version.
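
The three definitions above can be illustrated with a small classifier. This is a column-level simplification and not how BMS itself decides: it assumes an update is described by the set of columns in the previous version, the set of columns the user adds or updates, and the set of columns the user deletes. It does not model the case where a user supplies data for only some blocks of a column. The names UpdateKind and classify_update are hypothetical.

```python
from enum import Enum


class UpdateKind(Enum):
    METADATA = "metadata"
    PARTIAL = "partial"
    FULL = "full"


def classify_update(
    previous_columns: set[str],
    uploaded_columns: set[str],
    deleted_columns: set[str],
) -> UpdateKind:
    """Classify an update for lineage purposes (simplified, column-level view)."""
    # No columns added or updated: a rename, property change, or delete-only call.
    if not uploaded_columns:
        return UpdateKind.METADATA

    # Columns that would be carried over unchanged from the previous version.
    carried_over = previous_columns - deleted_columns - uploaded_columns

    # Any carried-over column means user data is merged with existing data.
    return UpdateKind.PARTIAL if carried_over else UpdateKind.FULL
```

For example, deleting column "b" from a model with columns "a" and "b" without uploading anything is classified as a metadata update, matching the note above.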

Metadata updates

The Update a block model's metadata endpoint accepts an optional lineage property.

  • If you provide lineage: BMS passes it through to the reference object's lineage property as provided.
  • If you do not provide lineage: BMS generates a single-event lineage that records:
    • The previous version of the block model as input
    • The BMS job (e.g., update or restore) as the event

When you restore a block model and it is renamed to avoid a conflict, BMS creates a new reference object version and generates a single-event lineage that references the previous version and the restore job.
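
As a rough illustration, a generated single-event lineage might be assembled as in the sketch below. Only the overall shape (one event, with the previous version as input and the BMS job as the event) comes from this page. The "runId" key, the use of the "blocksync" namespace for jobs other than update, the $self output entry, and the function name generated_lineage are all assumptions made for the example.

```python
from typing import Any


def generated_lineage(
    previous_version: dict[str, Any],
    job_name: str,
    run_id: str,
    event_time: str,
    workspace_path: str,
) -> dict[str, Any]:
    """Build a single-event lineage like the one BMS generates when none is provided."""
    event = {
        # Assumption: generated jobs use the same namespace as the partial-update job.
        "job": {"name": job_name, "namespace": "blocksync"},
        # Assumption: the run is identified by a "runId" key.
        "run": {"runId": run_id},
        "eventTime": event_time,
        # The previous version of the block model is the only input.
        "inputs": [previous_version],
        # Assumption: the output is the new version, written as $self.
        "outputs": [
            {"name": "$self", "namespace": f"evogeoscienceobject://{workspace_path}"}
        ],
    }
    return {"self_link": "/events/0/outputs/0", "events": [event]}
```

Here workspace_path stands in for the <hub>/<org>/<workspace> portion of the namespace used elsewhere on this page.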

Partial updates

If you provide a lineage field in your request for a partial update, BMS modifies it in two steps before passing it to the reference object:

  1. Update the terminal event’s output:
    BMS updates the terminal event (the event referenced by self_link) so that its output is:

    {
      "name": "<upload_blob_name>",
      "namespace": "evoblockmodelupload://<hub>/<org>/<workspace>"
    }
  2. Add a combining event:
    BMS appends a new event to the lineage's events array to represent combining the upload with the previous version, and:

    • Sets self_link to point to the new event's output.
    • Adds a job with:
      • name: "update"
      • namespace: "blocksync"
    • Adds a run for the update job with its run ID.
    • Sets eventTime to the update start time.
    • Sets inputs to:
      • The upload blob (the output of the previous event).
      • The previous block model version (from the original self_link).
    • Sets outputs to a single entry pointing at the new version, with an evogeoscienceobject namespace and a name of $self.

Example lineage modification

Before (lineage with a single event):

{
  "self_link": "/events/0/outputs/0",
  "events": [
    {
      "job": {...},
      "run": {...},
      "eventTime": {...},
      "inputs": [{...}],
      "outputs": [
        {
          "name": "$self",
          "namespace": "evogeoscienceobject://<hub>/<org>/<workspace>"
        }
      ]
    }
  ]
}

After (simplified lineage with 2 events on the reference object):

{
  "self_link": "/events/1/outputs/0", // Updated to point at the second event
  "events": [
    {
      // The original event
      "job": {...},
      "run": {...},
      "eventTime": {...},
      "inputs": [{...}],
      "outputs": [
        {
          "name": "<upload_blob_name>", // Modified to indicate the actual uploaded blob
          "namespace": "evoblockmodelupload://<hub>/<org>/<workspace>"
        }
      ]
    },
    {
      // A new event representing the combining of the upload with the previous model version
      "job": {/* Update job */},
      "run": {/* Run ID of the update job */},
      "eventTime": {/* Start time of update */},
      "inputs": [
        {/* The uploaded blob */},
        {/* Previous version of the block model */}
      ],
      "outputs": [
        {
          "name": "$self",
          "namespace": "evogeoscienceobject://<hub>/<org>/<workspace>"
        }
      ]
    }
  ]
}
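
The two-step modification described above can be sketched in Python as follows. This is an illustration of the documented behaviour, not BMS source code. It assumes self_link has the form /events/<i>/outputs/<j>, that the run is stored under a "runId" key, and that the caller supplies the previous-version input directly (this page says BMS derives it from the original self_link but does not show its exact representation). The function name apply_partial_update and the workspace_path parameter, which stands in for <hub>/<org>/<workspace>, are hypothetical.

```python
import copy
import re
from typing import Any

# Assumption: self_link looks like "/events/<i>/outputs/<j>", as in the examples above.
SELF_LINK_PATTERN = re.compile(r"^/events/(\d+)/outputs/(\d+)$")


def apply_partial_update(
    lineage: dict[str, Any],
    upload_blob_name: str,
    workspace_path: str,
    run_id: str,
    start_time: str,
    previous_version: dict[str, Any],
) -> dict[str, Any]:
    """Return a modified copy of a user-supplied lineage for a partial update."""
    result = copy.deepcopy(lineage)  # leave the caller's lineage untouched

    match = SELF_LINK_PATTERN.match(result["self_link"])
    if match is None:
        raise ValueError("self_link must look like /events/<i>/outputs/<j>")
    event_index, output_index = int(match.group(1)), int(match.group(2))

    # Step 1: the terminal event now outputs the uploaded blob, not the block model.
    upload_output = {
        "name": upload_blob_name,
        "namespace": f"evoblockmodelupload://{workspace_path}",
    }
    result["events"][event_index]["outputs"][output_index] = upload_output

    # Step 2: append the combining event that merges the upload with the previous version.
    result["events"].append(
        {
            "job": {"name": "update", "namespace": "blocksync"},
            "run": {"runId": run_id},  # assumption: key name is not given on this page
            "eventTime": start_time,
            "inputs": [copy.deepcopy(upload_output), previous_version],
            "outputs": [
                {"name": "$self", "namespace": f"evogeoscienceobject://{workspace_path}"}
            ],
        }
    )

    # self_link now points at the single output of the new, final event.
    result["self_link"] = f"/events/{len(result['events']) - 1}/outputs/0"
    return result
```

Applied to the single-event "Before" lineage, this produces a two-event lineage with self_link set to /events/1/outputs/0, matching the "After" example.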