hdx.data.resource

module hdx.data.resource

Resource class containing all logic for creating, checking, and updating resources.

Classes

  • Resource Resource class containing all logic for creating, checking, and updating resources.

class Resource(initial_data: dict | None = None, configuration: Configuration | None = None)

Bases : HDXObject

Resource class containing all logic for creating, checking, and updating resources.

Parameters

  • initial_data : dict | None Initial resource metadata dictionary. Defaults to None.

  • configuration : Configuration | None HDX configuration. Defaults to global configuration.

Methods

  • actions Dictionary of actions that can be performed on object

  • update_from_yaml Update resource metadata with static metadata from YAML file

  • update_from_json Update resource metadata with static metadata from JSON file

  • read_from_hdx Reads the resource given by identifier from HDX and returns Resource object

  • get_date_of_resource Get resource date as datetimes and strings in specified format. If no format is supplied, the ISO 8601 format is used. Returns a dictionary containing keys startdate (start date as datetime), enddate (end date as datetime), startdate_str (start date as string), enddate_str (end date as string) and ongoing (whether the end date rolls forward every day).

  • set_date_of_resource Set resource date from either datetime objects or strings. Any time and time zone information will be ignored by default (meaning that the time of the start date is set to 00:00:00, the time of any end date is set to 23:59:59 and the time zone is set to UTC). To have the time and time zone accounted for, set ignore_timeinfo to False. In this case, the time will be converted to UTC.

  • read_formats_mappings Read HDX formats list

  • set_formatsdict Set formats dictionary

  • get_mapped_format Given a file format, return an HDX format to which it maps

  • get_format Get the resource's format

  • set_format Set the resource's file type

  • clean_format Clean the resource's format, setting it to None if it is invalid and cannot be mapped

  • get_file_to_upload Get the file uploaded

  • set_file_to_upload Delete any existing url and set the file uploaded to the local path provided

  • check_both_url_filetoupload Check for error where both url and file to upload are provided for resource.

  • check_neither_url_filetoupload Check for error where neither url nor file to upload is provided for resource.

  • correct_format Correct the format of the file

  • set_types Add resource_type and url_type if not supplied based on url or file to upload.

  • check_required_fields Check that metadata for resource is complete. If required, set the parameter ignore_fields to any fields that should be ignored for the particular operation.

  • update_in_hdx Check if resource exists in HDX and if so, update it. To indicate that the data in an external resource (given by a URL) has been updated, set data_updated to True, which will result in the resource last_modified field being set to now. If the method set_file_to_upload is used to supply a file, the resource last_modified field is set to now automatically regardless of the value of data_updated.

  • create_in_hdx Check if resource exists in HDX and if so, update it, otherwise create it. To indicate that the data in an external resource (given by a URL) has been updated, set data_updated to True, which will result in the resource last_modified field being set to now. If the method set_file_to_upload is used to supply a file, the resource last_modified field is set to now automatically regardless of the value of data_updated.

  • delete_from_hdx Deletes a resource from HDX

  • get_dataset Return dataset containing this resource

  • search_in_hdx Searches for resources in HDX. NOTE: Does not search dataset metadata!

  • download Download the resource and store it in the provided folder, or in a temporary folder if no folder is supplied

  • get_all_resource_ids_in_datastore Get list of resources that have a datastore returning their ids.

  • has_datastore Check if the resource has a datastore.

  • create_datastore Create a datastore for the resource with the given schema.

  • update_datastore Update (upsert) records into the resource datastore.

  • delete_datastore Delete a resource from the HDX datastore

  • get_resource_views Get any resource views in the resource

  • add_update_resource_view Add new or update existing resource view in resource with new metadata

  • add_update_resource_views Add new or update existing resource views in resource with new metadata.

  • reorder_resource_views Order resource views in resource.

  • delete_resource_view Delete a resource view from the resource and HDX

  • enable_dataset_preview Enable dataset preview of resource

  • disable_dataset_preview Disable dataset preview of resource

  • is_broken Return whether resource is broken

  • mark_broken Mark resource as broken

  • is_marked_data_updated Return if the resource's data is marked to be updated

  • mark_data_updated Mark resource data as updated

  • get_date_data_updated Get date resource data was updated

  • set_date_data_updated Set date resource data was updated

  • get_hdx_url Get the url of the resource on HDX

  • get_api_url Get the API url of the resource on HDX

staticmethod Resource.actions() -> dict[str, str]

Dictionary of actions that can be performed on object

Returns

  • dict[str, str] Dictionary of actions that can be performed on object

method Resource.update_from_yaml(path: Path | str = Path('config', 'hdx_resource_static.yaml')) -> None

Update resource metadata with static metadata from YAML file

Parameters

  • path : Path | str Path to YAML resource metadata. Defaults to config/hdx_resource_static.yaml.

Returns

  • None None

method Resource.update_from_json(path: Path | str = Path('config', 'hdx_resource_static.json')) -> None

Update resource metadata with static metadata from JSON file

Parameters

  • path : Path | str Path to JSON resource metadata. Defaults to config/hdx_resource_static.json.

Returns

  • None None

classmethod Resource.read_from_hdx(identifier: str, configuration: Configuration | None = None) -> Optional['Resource']

Reads the resource given by identifier from HDX and returns Resource object

Parameters

  • identifier : str Identifier of resource

  • configuration : Configuration | None HDX configuration. Defaults to global configuration.

Returns

  • Optional['Resource'] Resource object if successful read, None if not

Raises

  • HDXError

method Resource.get_date_of_resource(date_format: str | None = None, today: datetime = now_utc()) -> dict

Get resource date as datetimes and strings in specified format. If no format is supplied, the ISO 8601 format is used. Returns a dictionary containing keys startdate (start date as datetime), enddate (end date as datetime), startdate_str (start date as string), enddate_str (end date as string) and ongoing (whether the end date rolls forward every day).

Parameters

  • date_format : str | None Date format. None is taken to be ISO 8601. Defaults to None.

  • today : datetime Date to use for today. Defaults to now_utc().

Returns

  • dict Dictionary of date information
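
The shape of the returned dictionary can be sketched with the standard library alone. The helper `describe_resource_date` below is hypothetical and not part of the library; it only mirrors the documented keys, and it takes `ongoing` as an input because how that flag is derived is not described here.

```python
from datetime import datetime, timezone


def describe_resource_date(startdate, enddate, date_format=None, ongoing=False):
    """Hypothetical helper mirroring the documented dictionary keys."""

    def to_str(value):
        # No format supplied: fall back to ISO 8601, as documented
        if date_format is None:
            return value.isoformat()
        return value.strftime(date_format)

    return {
        "startdate": startdate,
        "enddate": enddate,
        "startdate_str": to_str(startdate),
        "enddate_str": to_str(enddate),
        "ongoing": ongoing,
    }


start = datetime(2023, 1, 1, tzinfo=timezone.utc)
end = datetime(2023, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
info = describe_resource_date(start, end, date_format="%Y-%m-%d")
```

With an actual Resource object, the equivalent call would presumably be `resource.get_date_of_resource(date_format="%Y-%m-%d")`, with the string keys formatted in the same way.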

method Resource.set_date_of_resource(startdate: datetime | str, enddate: datetime | str, ignore_timeinfo: bool = True) -> None

Set resource date from either datetime objects or strings. Any time and time zone information will be ignored by default (meaning that the time of the start date is set to 00:00:00, the time of any end date is set to 23:59:59 and the time zone is set to UTC). To have the time and time zone accounted for, set ignore_timeinfo to False. In this case, the time will be converted to UTC.

Parameters

  • startdate : datetime | str Resource start date

  • enddate : datetime | str Resource end date

  • ignore_timeinfo : bool Ignore time and time zone of date. Defaults to True.

Returns

  • None None
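
The effect of ignore_timeinfo described above can be illustrated with a small standard-library sketch. The function `normalise_dates` is hypothetical, handles only timezone-aware datetime inputs (string parsing is left out), and is an approximation of the documented behaviour rather than the library's implementation.

```python
from datetime import datetime, time, timedelta, timezone


def normalise_dates(startdate, enddate, ignore_timeinfo=True):
    """Hypothetical helper illustrating the documented time handling."""
    if ignore_timeinfo:
        # Discard time and time zone: start of day and end of day, in UTC
        start = datetime.combine(startdate.date(), time(0, 0, 0), tzinfo=timezone.utc)
        end = datetime.combine(enddate.date(), time(23, 59, 59), tzinfo=timezone.utc)
    else:
        # Keep the time information but convert it to UTC
        start = startdate.astimezone(timezone.utc)
        end = enddate.astimezone(timezone.utc)
    return start, end


# Example inputs in a UTC+2 time zone (values invented for illustration)
tz = timezone(timedelta(hours=2))
given_start = datetime(2023, 5, 1, 1, 30, tzinfo=tz)
given_end = datetime(2023, 5, 31, 12, 0, tzinfo=tz)

ignored = normalise_dates(given_start, given_end)
converted = normalise_dates(given_start, given_end, ignore_timeinfo=False)
```

In this sketch the default keeps the calendar dates as given, while ignore_timeinfo=False shifts the start to 23:30 on the previous day once it is expressed in UTC.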

classmethod Resource.read_formats_mappings(configuration: Configuration | None = None, url: str | None = None) -> dict

Read HDX formats list

Parameters

  • configuration : Configuration | None HDX configuration. Defaults to global configuration.

  • url : str | None URL of the formats mappings. Defaults to None (internal configuration parameter).

Returns

  • dict Returns formats dictionary

classmethod Resource.set_formatsdict(formats_dict: dict) -> None

Set formats dictionary

Parameters

  • formats_dict : dict Formats dictionary

Returns

  • None None

classmethod Resource.get_mapped_format(format: str, configuration: Configuration | None = None) -> str | None

Given a file format, return an HDX format to which it maps

Parameters

  • format : str File type to map

  • configuration : Configuration | None HDX configuration. Defaults to global configuration.

Returns

  • str | None Mapped format or None if no mapping found
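
As a rough illustration of mapping a file format to an HDX format, the sketch below does a dictionary lookup. Both the function `map_format` and the contents of `example_formats` are invented for this example, and the normalisation (lower-casing, dropping a leading dot) is an assumption rather than the library's documented behaviour.

```python
def map_format(file_format, formats_dict):
    """Hypothetical lookup: return the mapped format, or None if unmapped."""
    # Assumption: keys are matched case-insensitively without a leading dot
    key = file_format.strip().lower().lstrip(".")
    return formats_dict.get(key)


# Invented mapping for illustration only; the real list is read from HDX
example_formats = {
    "xlsx": "xlsx",
    "xls": "xls",
    "geojson": "geojson",
    "tif": "geotiff",
}

mapped = map_format(".XLSX", example_formats)
unmapped = map_format("unknown", example_formats)
```

Returning None for an unmapped format matches the documented return type of `str | None`.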

method Resource.get_format() -> str | None

Get the resource's format

Returns

  • str | None Resource's format or None if it has not been set

method Resource.set_format(format: str) -> str

Set the resource's file type

Parameters

  • format : str Format to set on resource

Returns

  • str Format that was set

Raises

  • HDXError

method Resource.clean_format() -> str

Clean the resource's format, setting it to None if it is invalid and cannot be mapped

Returns

  • str Format that was set

method Resource.get_file_to_upload() -> str | None

Get the file uploaded

Returns

  • str | None The file that will be or has been uploaded or None if there isn't one

method Resource.set_file_to_upload(file_to_upload: Path | str, guess_format_from_suffix: bool = False) -> str

Delete any existing url and set the file uploaded to the local path provided

Parameters

  • file_to_upload : Path | str Local path to file to upload

  • guess_format_from_suffix : bool Set format from file suffix. Defaults to False.

Returns

  • str The format that was guessed or None if no format was set
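
The interaction between the url, the file to upload and the guessed format can be sketched on a plain dictionary. The function `attach_file` is hypothetical: storing the path under a `file_to_upload` key and using the raw lower-cased suffix as the format are simplifying assumptions made for this example.

```python
from pathlib import Path


def attach_file(resource_data, file_to_upload, guess_format_from_suffix=False):
    """Hypothetical stand-in operating on a plain dict of resource metadata."""
    # A resource has either a url or a file to upload, so drop any url
    resource_data.pop("url", None)
    resource_data["file_to_upload"] = str(file_to_upload)
    if not guess_format_from_suffix:
        return None
    # Assumption: the format is simply the lower-cased suffix without the dot
    suffix = Path(file_to_upload).suffix.lstrip(".").lower()
    if not suffix:
        return None
    resource_data["format"] = suffix
    return suffix


data = {"name": "example", "url": "https://example.com/data.csv"}
guessed = attach_file(data, Path("out") / "data.CSV", guess_format_from_suffix=True)
```

After the call, `data` no longer has a url, which reflects the documented rule that setting a file to upload deletes any existing url.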

method Resource.check_both_url_filetoupload() -> None

Check for error where both url and file to upload are provided for resource.

Returns

  • None None

Raises

  • HDXError

method Resource.check_neither_url_filetoupload() -> None

Check for error where neither url nor file to upload is provided for resource.

Raises

  • HDXError
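
The two url and file checks are complementary: a resource needs exactly one of the two. The sketch below expresses that rule with standalone functions; `ResourceCheckError` is a stand-in for HDXError so that the example has no dependencies, and the function names and messages are invented.

```python
class ResourceCheckError(Exception):
    """Stand-in for HDXError so that the sketch has no dependencies."""


def check_both(url, file_to_upload):
    # Error case: a url and a file to upload are both supplied
    if url and file_to_upload:
        raise ResourceCheckError("Supply either a url or a file to upload, not both")


def check_neither(url, file_to_upload):
    # Error case: nothing is supplied to back the resource
    if not url and not file_to_upload:
        raise ResourceCheckError("Supply either a url or a file to upload")


def describe(url, file_to_upload):
    """Run both checks and report the outcome as a string."""
    try:
        check_both(url, file_to_upload)
        check_neither(url, file_to_upload)
    except ResourceCheckError as error:
        return f"invalid: {error}"
    return "valid"
```

Only the combinations with exactly one of the two values set pass both checks.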

method Resource.correct_format(data: dict = None) -> None

Correct the format of the file

Parameters

  • data : dict Resource data.

Returns

  • None None

Raises

  • HDXError

method Resource.set_types() -> None

Add resource_type and url_type if not supplied based on url or file to upload.

Returns

  • None None

method Resource.check_required_fields(ignore_fields: Sequence[str] = ()) -> None

Check that metadata for resource is complete. If required, set the parameter ignore_fields to any fields that should be ignored for the particular operation.

Parameters

  • ignore_fields : Sequence[str] Fields to ignore. Default is ().

Returns

  • None None

method Resource.update_in_hdx(**kwargs: Any) -> int

Check if resource exists in HDX and if so, update it. To indicate that the data in an external resource (given by a URL) has been updated, set data_updated to True, which will result in the resource last_modified field being set to now. If the method set_file_to_upload is used to supply a file, the resource last_modified field is set to now automatically regardless of the value of data_updated.

Returns status code where:

  • 0 = no file to upload and last_modified set to now (data_updated flag is True)

  • 1 = no file to upload and data_updated flag is False

  • 2 = file uploaded to filestore (either hash or size of file has changed)

  • 3 = file not uploaded to filestore (hash and size of file are the same)

  • 4 = file not uploaded (hash, size unchanged), given last_modified ignored

Parameters

  • **kwargs : Any See below

  • operation : str Operation to perform, e.g. patch. Defaults to update.

  • data_updated : bool If True, set last_modified to now. Defaults to False.

  • date_data_updated : datetime Date to use for last_modified. Defaults to None.

  • force_update : bool Force file to be updated even if it hasn't changed. Defaults to False.

  • dataset : Dataset Existing dataset if available to obtain resource id

Returns

  • int Status code
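
Callers that branch on the returned status code may find named constants easier to read than bare integers. The enum below is a hypothetical convenience: the member names are invented, and only the numeric values and their meanings come from the status code description above.

```python
from enum import IntEnum


class UpdateStatus(IntEnum):
    """Hypothetical names for the documented integer status codes."""

    LAST_MODIFIED_SET_NO_FILE = 0  # no file to upload, last_modified set to now
    NO_CHANGE_NO_FILE = 1  # no file to upload, data_updated flag is False
    FILE_UPLOADED = 2  # file uploaded to filestore
    FILE_UNCHANGED = 3  # file not uploaded, hash and size are the same
    FILE_UNCHANGED_LAST_MODIFIED_IGNORED = 4  # as 3, given last_modified ignored


def file_was_uploaded(status_code):
    # Only code 2 means that a file was sent to the filestore
    return UpdateStatus(status_code) is UpdateStatus.FILE_UPLOADED


# A status code as it might be returned by update_in_hdx
status = UpdateStatus(3)
```

Because IntEnum members compare equal to plain integers, a check such as `status == 3` keeps working alongside the named form.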

method Resource.create_in_hdx(**kwargs: Any) -> int

Check if resource exists in HDX and if so, update it, otherwise create it. To indicate that the data in an external resource (given by a URL) has been updated, set data_updated to True, which will result in the resource last_modified field being set to now. If the method set_file_to_upload is used to supply a file, the resource last_modified field is set to now automatically regardless of the value of data_updated.

Returns status code where:

  • 0 = no file to upload and last_modified set to now (resource creation or data_updated flag is True)

  • 1 = no file to upload and data_updated flag is False

  • 2 = file uploaded to filestore (resource creation or either hash or size of file has changed)

  • 3 = file not uploaded to filestore (hash and size of file are the same)

  • 4 = file not uploaded (hash, size unchanged), given last_modified ignored

Parameters

  • **kwargs : Any See below

  • data_updated : bool If True, set last_modified to now. Defaults to False.

  • date_data_updated : datetime Date to use for last_modified. Defaults to None.

  • force_update : bool Force file to be updated even if it hasn't changed. Defaults to False.

  • dataset : Dataset Existing dataset if available to obtain resource id

Returns

  • int Status code

method Resource.delete_from_hdx() -> None

Deletes a resource from HDX

Returns

  • None None

method Resource.get_dataset() -> Dataset

Return dataset containing this resource

Returns

  • Dataset Dataset containing this resource

Raises

  • HDXError

staticmethod Resource.search_in_hdx(query: str, configuration: Configuration | None = None, **kwargs: Any) -> list['Resource']

Searches for resources in HDX. NOTE: Does not search dataset metadata!

Parameters

  • query : str Query

  • configuration : Configuration | None HDX configuration. Defaults to global configuration.

  • **kwargs : Any See below

  • order_by : str A field on the Resource model that orders the results

  • offset : int Apply an offset to the query

  • limit : int Apply a limit to the query

Returns

  • list['Resource'] List of resources resulting from query

method Resource.download(folder: Path | str | None = None, retriever: Retrieve | None = None) -> tuple[str, Path]

Download the resource and store it in the provided folder, or in a temporary folder if no folder is supplied

Parameters

  • folder : Path | str | None Folder to download resource to. Defaults to None.

  • retriever : Retrieve | None Retrieve object to use. Defaults to None.

Returns

  • tuple[str, Path] (URL downloaded, Path to downloaded file)

Raises

  • HDXError

staticmethod Resource.get_all_resource_ids_in_datastore(configuration: Configuration | None = None) -> list[str]

Get list of resources that have a datastore returning their ids.

Parameters

  • configuration : Configuration | None HDX configuration. Defaults to global configuration.

Returns

  • list[str] List of resource ids that are in the datastore

method Resource.has_datastore() -> bool

Check if the resource has a datastore.

Returns

  • bool Whether the resource has a datastore or not

method Resource.create_datastore(schema: Sequence[dict], primary_key: str | Sequence[str] | None = None) -> None

Create a datastore for the resource with the given schema.

Parameters

  • schema : Sequence[dict] Sequence of field definitions, each a dict with 'id' and 'type' keys.

  • primary_key : str | Sequence[str] | None Primary key field name(s). Defaults to None.

Returns

  • None None
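
The expected shape of schema and primary_key can be checked before calling the method. The validator below is a hypothetical pre-flight helper, not part of the library, and the field types in the example ('text', 'int') are assumptions chosen for illustration.

```python
def validate_schema(schema, primary_key=None):
    """Hypothetical pre-flight check of a datastore schema."""
    field_ids = []
    for field in schema:
        # Every field definition needs both an 'id' and a 'type' key
        missing = {"id", "type"} - field.keys()
        if missing:
            raise ValueError(f"Field {field!r} is missing keys: {sorted(missing)}")
        field_ids.append(field["id"])
    # primary_key may be None, a single name or a sequence of names
    if primary_key is None:
        keys = []
    elif isinstance(primary_key, str):
        keys = [primary_key]
    else:
        keys = list(primary_key)
    unknown = [key for key in keys if key not in field_ids]
    if unknown:
        raise ValueError(f"Primary key fields not in schema: {unknown}")
    return field_ids, keys


# Field types here are assumptions chosen for illustration
schema = [{"id": "code", "type": "text"}, {"id": "population", "type": "int"}]
field_ids, keys = validate_schema(schema, primary_key="code")
```

A schema that passes this check could then be handed to `create_datastore(schema, primary_key="code")` on a Resource object.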

method Resource.update_datastore(records: Sequence[dict], method: str = 'upsert') -> None

Update (upsert) records into the resource datastore.

Parameters

  • records : Sequence[dict] Sequence of record dicts to insert or update.

  • method : str Datastore update method ('upsert', 'insert', or 'update'). Defaults to 'upsert'.

Returns

  • None None
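
The difference between the three method values can be modelled in memory. The sketch below assumes that the names carry their usual meaning (as in CKAN's datastore upsert action): 'insert' fails on an existing key, 'update' fails on a missing key, and 'upsert' accepts both. The function `apply_records` and the sample data are invented, and the real datastore may behave differently in detail.

```python
def apply_records(table, records, key, method="upsert"):
    """Hypothetical in-memory model; table maps key value to record dict."""
    if method not in ("upsert", "insert", "update"):
        raise ValueError(f"Unknown method: {method}")
    for record in records:
        key_value = record[key]
        exists = key_value in table
        if method == "insert" and exists:
            raise KeyError(f"Record {key_value!r} already exists")
        if method == "update" and not exists:
            raise KeyError(f"Record {key_value!r} does not exist")
        # Merge the new values over any existing record
        table[key_value] = {**table.get(key_value, {}), **record}
    return table


table = {"AFG": {"code": "AFG", "population": 1}}
new_records = [
    {"code": "AFG", "population": 2},  # existing key: updated
    {"code": "KEN", "population": 3},  # new key: inserted
]
apply_records(table, new_records, key="code")
```

With the default 'upsert', the first record overwrites the existing row and the second adds a new one.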

method Resource.delete_datastore() -> None

Delete a resource from the HDX datastore

Returns

  • None None

method Resource.get_resource_views() -> list[ResourceView]

Get any resource views in the resource

Returns

  • list[ResourceView] List of resource views

method Resource.add_update_resource_view(resource_view: ResourceView | dict) -> None

Add new or update existing resource view in resource with new metadata

Parameters

  • resource_view : ResourceView | dict Resource view metadata either from a ResourceView object or a dictionary

Returns

  • None None

method Resource.add_update_resource_views(resource_views: Sequence[ResourceView | dict]) -> None

Add new or update existing resource views in resource with new metadata.

Parameters

  • resource_views : Sequence[ResourceView | dict] A list of resource views metadata from ResourceView objects or dictionaries

Returns

  • None None

Raises

  • HDXError

method Resource.reorder_resource_views(resource_views: Sequence[ResourceView | dict | str]) -> None

Order resource views in resource.

Parameters

  • resource_views : Sequence[ResourceView | dict | str] A list of either resource view ids or resource views metadata from ResourceView objects or dictionaries

Returns

  • None None

Raises

  • HDXError
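
Since resource_views may mix ids with view metadata, one plausible first step is to reduce every entry to an id. The helper `view_ids` below is hypothetical and assumes that metadata entries expose their id through dictionary-style access under the key 'id'.

```python
def view_ids(resource_views):
    """Hypothetical normalisation of mixed input to a list of view ids."""
    ids = []
    for view in resource_views:
        if isinstance(view, str):
            # Already a resource view id
            ids.append(view)
        else:
            # Assumption: dict-like metadata with an 'id' key
            ids.append(view["id"])
    return ids


# Ids and titles invented for illustration
ordered = view_ids(["view-1", {"id": "view-2", "title": "Map"}, "view-3"])
```

The order of the input sequence is preserved, which is what determines the resulting order of the views.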

method Resource.delete_resource_view(resource_view: ResourceView | dict | str) -> None

Delete a resource view from the resource and HDX

Parameters

  • resource_view : ResourceView | dict | str Either a resource view id or resource view metadata either from a ResourceView object or a dictionary

Returns

  • None None

Raises

  • HDXError

method Resource.enable_dataset_preview() -> None

Enable dataset preview of resource

Returns

  • None None

method Resource.disable_dataset_preview() -> None

Disable dataset preview of resource

Returns

  • None None

method Resource.is_broken() -> bool

Return whether resource is broken

Returns

  • bool Whether resource is broken

method Resource.mark_broken() -> None

Mark resource as broken

Returns

  • None None

method Resource.is_marked_data_updated() -> bool

Return if the resource's data is marked to be updated

Returns

  • bool Whether resource's data is marked to be updated

method Resource.mark_data_updated() -> None

Mark resource data as updated

Returns

  • None None

method Resource.get_date_data_updated() -> datetime

Get date resource data was updated

Returns

  • datetime Date resource data was updated

method Resource.set_date_data_updated(date: datetime | str, ignore_timeinfo: bool = False) -> None

Set date resource data was updated

Parameters

  • date : datetime | str Date resource data was updated

  • ignore_timeinfo : bool Ignore time and time zone of date. Defaults to False.

Returns

  • None None

method Resource.get_hdx_url() -> str | None

Get the url of the resource on HDX

Returns

  • str | None Url of the resource on HDX or None if the resource is missing the id field

method Resource.get_api_url() -> str | None

Get the API url of the resource on HDX

Returns

  • str | None API url of the resource on HDX or None if the resource is missing the id field