DownloadedObject
evo.objects.client.object_client.DownloadedObject
A downloaded geoscience object.
schema property
schema: ObjectSchema
The schema of the object.
metadata property
metadata: ObjectMetadata
The metadata of the object.
__init__
__init__(
    object_: GeoscienceObject,
    metadata: ObjectMetadata,
    urls_by_name: dict[str, str],
    connector: APIConnector,
    cache: ICache | None = None,
) -> None
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object_ | GeoscienceObject | The raw geoscience object model. | required |
| metadata | ObjectMetadata | The parsed metadata for the object. | required |
| urls_by_name | dict[str, str] | A mapping of data names to their initial download URLs. | required |
| connector | APIConnector | The API connector to use for downloading data. | required |
| cache | ICache \| None | An optional cache to use for data downloads. | None |
from_reference async staticmethod
from_reference(
    connector: APIConnector,
    reference: ObjectReference | str,
    cache: ICache | None = None,
    request_timeout: int | float | tuple[int | float, int | float] | None = None,
) -> DownloadedObject
Download a geoscience object from the service, given an object reference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| connector | APIConnector | The API connector to use for downloading data. | required |
| reference | ObjectReference \| str | The reference to the object to download, or a URL as a string that can be parsed into a reference. | required |
| cache | ICache \| None | An optional cache to use for data downloads. | None |
| request_timeout | int \| float \| tuple[int \| float, int \| float] \| None | An optional timeout to use for API requests. See evo.common.APIConnector for details. | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If the reference is invalid, or if the connector base URL does not match the reference hub URL. |
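As a rough usage sketch, the call can be wrapped so that a rejected reference is logged instead of propagated. The helper name `try_download` is hypothetical, and the class is passed in as an argument only so the snippet stays free of SDK imports; in real code it would be `DownloadedObject` from `evo.objects.client.object_client`.

```python
import logging

logger = logging.getLogger(__name__)


async def try_download(downloaded_object_cls, connector, reference, cache=None):
    """Return the downloaded object, or None if the reference is rejected.

    `downloaded_object_cls` is expected to be DownloadedObject; it is a
    parameter here only to keep the sketch self-contained.
    """
    try:
        return await downloaded_object_cls.from_reference(connector, reference, cache=cache)
    except ValueError as exc:
        # Documented cases: invalid reference, or connector/hub URL mismatch.
        logger.warning("Could not download %r: %s", reference, exc)
        return None
```

Whether returning `None` is preferable to letting the `ValueError` surface depends on the caller; this is one possible choice, not a recommendation from the library.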
from_context async staticmethod
from_context(context: IContext, reference: ObjectReference | str) -> DownloadedObject
Download a geoscience object from the service using a context.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context | IContext | The context providing the connector and cache. | required |
| reference | ObjectReference \| str | The reference to the object to download, or a URL as a string that can be parsed into a reference. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the reference is invalid, or if the connector base URL does not match the reference hub URL. |
as_dict
as_dict() -> dict
Get this object as a dictionary.
search
search(expression: str) -> Any
Search the object metadata using a JMESPath expression.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expression | str | The JMESPath expression to use for the search. | required |
Returns:
| Type | Description |
|---|---|
| Any | The result of the search. |
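A JMESPath expression that matches nothing evaluates to null, which arrives in Python as `None`. The sketch below shows one way to substitute a fallback in that case. The helper `search_or_default` is hypothetical, and any object exposing the `search` method described above can be passed in.

```python
def search_or_default(obj, expression, default=None):
    """Run obj.search(expression), substituting `default` for a null result."""
    result = obj.search(expression)
    # JMESPath yields null (None) when the expression matches nothing.
    return default if result is None else result
```

For example, `search_or_default(obj, "description", default="")` would return an empty string when the object has no such key; the key name `description` is only an assumed example.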
prepare_data_download
prepare_data_download(data_identifiers: Sequence[str | UUID]) -> Iterator[ObjectDataDownload]
Prepare to download multiple data files from the geoscience object service, for this object.
Any data IDs that are not associated with the requested object will raise a DataNotFoundError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_identifiers | Sequence[str \| UUID] | A list of sha256 digests or UUIDs for the data to be downloaded. | required |
Returns:
| Type | Description |
|---|---|
| Iterator[ObjectDataDownload] | An iterator of data download contexts that can be used to download the data. |
Raises:
| Type | Description |
|---|---|
| DataNotFoundError | If any requested data ID is not associated with this object. |
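Because identifiers may be either sha256 digests or UUIDs, it can be convenient to check their shape before calling `prepare_data_download`. The following is a minimal sketch of such a check. The helper `normalize_data_identifier` is hypothetical, and it assumes digests are written as 64-character hexadecimal strings, which the text above does not state explicitly.

```python
import re
from uuid import UUID

# Assumption: sha256 digests are given as 64 hexadecimal characters.
_SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


def normalize_data_identifier(value):
    """Return a UUID or a sha256 hex digest string, else raise ValueError.

    `value` must be a str or a UUID instance.
    """
    if isinstance(value, UUID):
        return value
    if _SHA256_HEX.fullmatch(value):
        return value
    # UUID() raises ValueError for strings that are not valid UUIDs.
    return UUID(value)
```

This only validates the form of each identifier locally. Whether an identifier actually belongs to the object is still decided by `prepare_data_download`, which raises `DataNotFoundError` otherwise.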
update async
update(
    object_dict: dict,
    check_for_conflict: bool = True,
    request_timeout: int | float | tuple[int | float, int | float] | None = None,
) -> DownloadedObject
Update the geoscience object on the geoscience object service, returning a new DownloadedObject that represents the new version of the object.
This creates a new version of the object that fully replaces the existing properties of the object with those provided in object_dict.
Note that this does not update the DownloadedObject instance in place; that instance still represents the original version of the object. To work with the updated version, use the DownloadedObject returned by this method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object_dict | dict | The new properties of the object as a dictionary. | required |
| check_for_conflict | bool | If True, and a newer version of the object exists on the geoscience object service, the update fails with an ObjectModifiedError exception. If False, no check for a newer version is made and the update is performed regardless. | True |
| request_timeout | int \| float \| tuple[int \| float, int \| float] \| None | An optional timeout to use for API requests. See evo.common.APIConnector for details. | None |
Returns:
| Type | Description |
|---|---|
| DownloadedObject | The new version of the object as a DownloadedObject. |
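Since `object_dict` fully replaces the existing properties, a partial change usually has to start from the current properties. The sketch below shows one way to do that. The helper `update_with_changes` is hypothetical, and it assumes that the dictionary returned by `as_dict()` is acceptable as input to `update`.

```python
import copy


async def update_with_changes(obj, changes, *, check_for_conflict=True):
    """Merge top-level `changes` into a copy of the current properties, then update.

    Returns whatever obj.update() returns, i.e. the new version of the object.
    """
    # Copy first so the original object's dictionary is never mutated.
    new_properties = copy.deepcopy(obj.as_dict())
    new_properties.update(changes)
    return await obj.update(new_properties, check_for_conflict=check_for_conflict)
```

The object passed in is left untouched, so code that needs the new version should keep the return value, for example `new_obj = await update_with_changes(obj, {"description": "revised"})`, where the `description` key is only an assumed example.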
download_table async
download_table(
    table_info: TableInfo | str,
    fb: IFeedback = NoFeedback,
    *,
    nan_values: list[int] | list[float] | str | None = None,
    column_names: Sequence[str] | None = None,
) -> pa.Table
Download the data referenced by the given table info as a PyArrow Table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table_info | TableInfo \| str | The table info dict, or a JMESPath expression to the table info within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
| nan_values | list[int] \| list[float] \| str \| None | An optional list of values to treat as null. Can also be a JMESPath expression to the list of nan values, or the nan_description structure. | None |
| column_names | Sequence[str] \| None | An optional list of column names for the table, instead of those in the Parquet file. | None |
Returns:
| Type | Description |
|---|---|
| Table | A PyArrow Table containing the downloaded data. |
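To make the effect of `nan_values` concrete, the following plain-Python sketch shows what "treat as null" means for a single column. It is a conceptual illustration only, not the library's implementation, and the helper `mask_nan_values` is hypothetical.

```python
def mask_nan_values(column, nan_values):
    """Replace every entry equal to one of `nan_values` with None."""
    sentinels = set(nan_values)
    return [None if value in sentinels else value for value in column]
```

With a sentinel list such as `[-999]`, a column `[1.5, -999.0, 2.0]` would become `[1.5, None, 2.0]`, because `-999.0` compares equal to `-999` in Python.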
download_category_table async
download_category_table(
    category_info: CategoryInfo | str,
    *,
    nan_values: list[int] | list[float] | str | None = None,
    column_names: Sequence[str] | None = None,
    fb: IFeedback = NoFeedback,
) -> pa.Table
Download the data referenced by the given category info as a PyArrow Table.
The arrays in the table will be DictionaryArrays constructed from the values and lookup tables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| category_info | CategoryInfo \| str | The category info dict, or a JMESPath expression to the category info within the object. | required |
| nan_values | list[int] \| list[float] \| str \| None | An optional list of values to treat as null. Can also be a JMESPath expression to the nan_description structure. | None |
| column_names | Sequence[str] \| None | An optional list of column names for the table, instead of those in the Parquet file. | None |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| Table | A PyArrow Table containing the downloaded data. |
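A dictionary-encoded column pairs a list of integer codes with a lookup table that maps each code to a label. The plain-Python sketch below illustrates that relationship. It is a conceptual illustration rather than the library's implementation, and both the helper `decode_categories` and the `(key, label)` row layout of the lookup table are assumptions.

```python
def decode_categories(codes, lookup_rows):
    """Map integer codes to labels using (key, label) lookup rows.

    Codes with no matching key decode to None.
    """
    lookup = dict(lookup_rows)
    return [lookup.get(code) for code in codes]
```

For instance, codes `[1, 2, 1]` with lookup rows `[(1, "granite"), (2, "basalt")]` decode to `["granite", "basalt", "granite"]`. Storing the codes and the lookup separately avoids repeating each label once per row.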
download_attribute_table async
download_attribute_table(attribute: AttributeInfo | str, fb: IFeedback = NoFeedback) -> pa.Table
Download the data referenced by the given attribute as a PyArrow Table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| attribute | AttributeInfo \| str | The attribute info dict, or JMESPath to the attribute info within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| Table | A PyArrow Table containing the downloaded data. |
download_dataframe async
download_dataframe(
    table_info: TableInfo | str,
    fb: IFeedback = NoFeedback,
    *,
    nan_values: list[int] | list[float] | str | None = None,
    column_names: Sequence[str] | None = None,
) -> pd.DataFrame
Download the data referenced by the given table info as a Pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table_info | TableInfo \| str | The table info dict, or a JMESPath expression to the table info within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
| nan_values | list[int] \| list[float] \| str \| None | An optional list of values to treat as null. Can also be a JMESPath expression to the nan_description structure. | None |
| column_names | Sequence[str] \| None | An optional list of column names for the table, instead of those from the Parquet file. | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | A Pandas DataFrame containing the downloaded data. |
download_category_dataframe async
download_category_dataframe(
    category_info: CategoryInfo | str,
    fb: IFeedback = NoFeedback,
    *,
    nan_values: list[int] | list[float] | str | None = None,
    column_names: Sequence[str] | None = None,
) -> pd.DataFrame
Download the data referenced by the given category info as a Pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| category_info | CategoryInfo \| str | The category info dict, or a JMESPath expression to the category info within the object. | required |
| nan_values | list[int] \| list[float] \| str \| None | An optional list of values to treat as null. Can also be a JMESPath expression to the nan_description structure. | None |
| column_names | Sequence[str] \| None | An optional list of column names for the table, instead of those from the Parquet file. | None |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| DataFrame | A Pandas DataFrame containing the downloaded data. |
download_attribute_dataframe async
download_attribute_dataframe(attribute: AttributeInfo | str, fb: IFeedback = NoFeedback) -> pd.DataFrame
Download the data referenced by the given attribute as a Pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| attribute | AttributeInfo \| str | The attribute info dict, or JMESPath to the attribute within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| DataFrame | A Pandas DataFrame containing the downloaded data. |
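When the attribute is given as a JMESPath string, a filter expression can select it by name. The sketch below builds such an expression. The helper `attribute_expression` is hypothetical, and it assumes the object keeps its attributes in a list (by default under an `attributes` key) whose entries each have a `name` field; adjust the path to match the actual schema.

```python
def attribute_expression(name, attributes_path="attributes"):
    """Build a JMESPath expression selecting the first attribute with this name."""
    # JMESPath raw string literals are single-quoted; escape embedded quotes.
    escaped = name.replace("'", "\\'")
    # The filter yields a list, so `| [0]` picks its first element.
    return f"{attributes_path}[?name == '{escaped}'] | [0]"
```

The resulting string, for example `attributes[?name == 'grade'] | [0]`, could then be passed as the `attribute` argument.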
download_array async
download_array(table_info: TableInfo | str, fb: IFeedback = NoFeedback) -> np.ndarray
Download the data referenced by the given table info as a NumPy array.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table_info | TableInfo \| str | The table info dict, or a JMESPath expression to the table info within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| ndarray | A NumPy array containing the downloaded data. |
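Because the download methods are coroutines, several tables can in principle be requested concurrently with `asyncio.gather`. The following is a minimal sketch with a hypothetical helper `download_arrays`; it assumes the underlying connector tolerates concurrent requests, which this page does not confirm.

```python
import asyncio


async def download_arrays(obj, table_infos):
    """Download several tables concurrently, returning arrays in input order."""
    # gather() preserves the order of the awaitables it is given.
    return await asyncio.gather(*(obj.download_array(info) for info in table_infos))
```

If any single download fails, `asyncio.gather` propagates the first exception to the caller by default, so error handling belongs around the `await download_arrays(...)` call.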