hdx.data.resource
module hdx.data.resource
Resource class containing all logic for creating, checking, and updating resources.
Classes
-
Resource — Resource class containing all logic for creating, checking, and updating resources.
class Resource(initial_data: dict | None = None, configuration: Configuration | None = None)
Bases : HDXObject
Resource class containing all logic for creating, checking, and updating resources.
Parameters
-
initial_data : dict | None — Initial resource metadata dictionary. Defaults to None.
-
configuration : Configuration | None — HDX configuration. Defaults to global configuration.
Methods
-
actions — Dictionary of actions that can be performed on object
-
update_from_yaml — Update resource metadata with static metadata from YAML file
-
update_from_json — Update resource metadata with static metadata from JSON file
-
read_from_hdx — Reads the resource given by identifier from HDX and returns Resource object
-
get_date_of_resource — Get resource date as datetimes and strings in specified format. If no format is supplied, the ISO 8601 format is used. Returns a dictionary containing keys startdate (start date as datetime), enddate (end date as datetime), startdate_str (start date as string), enddate_str (end date as string) and ongoing (whether the end date rolls forward every day).
-
set_date_of_resource — Set resource date from either datetime objects or strings. Any time and time zone information will be ignored by default (meaning that the time of the start date is set to 00:00:00, the time of any end date is set to 23:59:59 and the time zone is set to UTC). To have the time and time zone accounted for, set ignore_timeinfo to False. In this case, the time will be converted to UTC.
-
read_formats_mappings — Read HDX formats list
-
set_formatsdict — Set formats dictionary
-
get_mapped_format — Given a file format, return an HDX format to which it maps
-
get_format — Get the resource's format
-
set_format — Set the resource's file type
-
clean_format — Clean the resource's format, setting it to None if it is invalid and cannot be mapped
-
get_file_to_upload — Get the file uploaded
-
set_file_to_upload — Delete any existing url and set the file uploaded to the local path provided
-
check_both_url_filetoupload — Check for error where both url and file to upload are provided for resource.
-
correct_format — Correct the format of the file
-
set_types — Add resource_type and url_type if not supplied based on url or file to upload.
-
check_required_fields — Check that metadata for resource is complete. If required, set the parameter ignore_fields to any fields that should be ignored for the particular operation.
-
update_in_hdx — Check if resource exists in HDX and if so, update it. To indicate that the data in an external resource (given by a URL) has been updated, set data_updated to True, which will result in the resource last_modified field being set to now. If the method set_file_to_upload is used to supply a file, the resource last_modified field is set to now automatically regardless of the value of data_updated.
-
create_in_hdx — Check if resource exists in HDX and if so, update it, otherwise create it. To indicate that the data in an external resource (given by a URL) has been updated, set data_updated to True, which will result in the resource last_modified field being set to now. If the method set_file_to_upload is used to supply a file, the resource last_modified field is set to now automatically regardless of the value of data_updated.
-
delete_from_hdx — Deletes a resource from HDX
-
get_dataset — Return dataset containing this resource
-
search_in_hdx — Searches for resources in HDX. NOTE: Does not search dataset metadata!
-
download — Download resource and store it in the provided folder, or in a temporary folder if no folder is supplied
-
get_all_resource_ids_in_datastore — Get list of resources that have a datastore returning their ids.
-
has_datastore — Check if the resource has a datastore.
-
create_datastore — Create a datastore for the resource with the given schema.
-
update_datastore — Update (upsert) records into the resource datastore.
-
delete_datastore — Delete a resource from the HDX datastore
-
get_resource_views — Get any resource views in the resource
-
add_update_resource_view — Add new or update existing resource view in resource with new metadata
-
add_update_resource_views — Add new or update existing resource views in resource with new metadata.
-
reorder_resource_views — Order resource views in resource.
-
delete_resource_view — Delete a resource view from the resource and HDX
-
enable_dataset_preview — Enable dataset preview of resource
-
disable_dataset_preview — Disable dataset preview of resource
-
is_broken — Return if resource is broken
-
mark_broken — Mark resource as broken
-
is_marked_data_updated — Return if the resource's data is marked to be updated
-
mark_data_updated — Mark resource data as updated
-
get_date_data_updated — Get date resource data was updated
-
set_date_data_updated — Set date resource data was updated
-
get_hdx_url — Get the url of the resource on HDX
-
get_api_url — Get the API url of the resource on HDX
staticmethod Resource.actions() → dict[str, str]
Dictionary of actions that can be performed on object
Returns
-
dict[str, str] — Dictionary of actions that can be performed on object
method Resource.update_from_yaml(path: Path | str = Path('config', 'hdx_resource_static.yaml')) → None
Update resource metadata with static metadata from YAML file
Parameters
-
path : Path | str — Path to YAML resource metadata. Defaults to config/hdx_resource_static.yaml.
Returns
-
None — None
method Resource.update_from_json(path: Path | str = Path('config', 'hdx_resource_static.json')) → None
Update resource metadata with static metadata from JSON file
Parameters
-
path : Path | str — Path to JSON resource metadata. Defaults to config/hdx_resource_static.json.
Returns
-
None — None
classmethod Resource.read_from_hdx(identifier: str, configuration: Configuration | None = None) → Optional['Resource']
Reads the resource given by identifier from HDX and returns Resource object
Parameters
-
identifier : str — Identifier of resource
-
configuration : Configuration | None — HDX configuration. Defaults to global configuration.
Returns
-
Optional['Resource'] — Resource object if successful read, None if not
Raises
-
HDXError
method Resource.get_date_of_resource(date_format: str | None = None, today: datetime = now_utc()) → dict
Get resource date as datetimes and strings in specified format. If no format is supplied, the ISO 8601 format is used. Returns a dictionary containing keys startdate (start date as datetime), enddate (end date as datetime), startdate_str (start date as string), enddate_str (end date as string) and ongoing (whether the end date rolls forward every day).
Parameters
-
date_format : str | None — Date format. None is taken to be ISO 8601. Defaults to None.
-
today : datetime — Date to use for today. Defaults to now_utc().
Returns
-
dict — Dictionary of date information
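The shape of the returned dictionary can be illustrated with the standard library alone. This is a minimal sketch rather than the library's implementation: the helper name describe_date_range is hypothetical, and the treatment of ongoing (the end date is replaced by today) is an assumption based on the description above.

```python
from datetime import datetime, timezone


def describe_date_range(startdate, enddate, date_format=None, ongoing=False, today=None):
    """Hypothetical helper building a dictionary with the documented keys."""
    if ongoing:
        # Assumption: an ongoing resource reports "today" as its end date.
        enddate = today or datetime.now(timezone.utc)

    def as_text(value):
        # ISO 8601 when no format is supplied, strftime otherwise.
        return value.isoformat() if date_format is None else value.strftime(date_format)

    return {
        "startdate": startdate,
        "enddate": enddate,
        "startdate_str": as_text(startdate),
        "enddate_str": as_text(enddate),
        "ongoing": ongoing,
    }
```

With a real Resource object, the equivalent call would be resource.get_date_of_resource(date_format="%d/%m/%Y"), reading the same keys from the result.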
method Resource.set_date_of_resource(startdate: datetime | str, enddate: datetime | str, ignore_timeinfo: bool = True) → None
Set resource date from either datetime objects or strings. Any time and time zone information will be ignored by default (meaning that the time of the start date is set to 00:00:00, the time of any end date is set to 23:59:59 and the time zone is set to UTC). To have the time and time zone accounted for, set ignore_timeinfo to False. In this case, the time will be converted to UTC.
Parameters
-
startdate : datetime | str — Resource start date
-
enddate : datetime | str — Resource end date
-
ignore_timeinfo : bool — Ignore time and time zone of date. Defaults to True.
Returns
-
None — None
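The default handling of time information described above can be sketched as follows. It illustrates the documented rules for datetime inputs only (not strings) and is not the library's code; the helper name normalise_date_range and the treatment of naive datetimes as UTC are assumptions.

```python
from datetime import datetime, timezone


def normalise_date_range(startdate, enddate, ignore_timeinfo=True):
    """Hypothetical helper mirroring the documented rules for datetime inputs."""
    if ignore_timeinfo:
        # Drop time and time zone: start of day for the start date,
        # end of day for the end date, both labelled as UTC.
        start = datetime(startdate.year, startdate.month, startdate.day,
                         0, 0, 0, tzinfo=timezone.utc)
        end = datetime(enddate.year, enddate.month, enddate.day,
                       23, 59, 59, tzinfo=timezone.utc)
        return start, end

    def to_utc(value):
        # Assumption: naive datetimes are treated as already being in UTC.
        if value.tzinfo is None:
            return value.replace(tzinfo=timezone.utc)
        return value.astimezone(timezone.utc)

    return to_utc(startdate), to_utc(enddate)
```

Note that with ignore_timeinfo left at True the calendar date is taken as written, so a date given in another time zone is not shifted before the time is discarded.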
classmethod Resource.read_formats_mappings(configuration: Configuration | None = None, url: str | None = None) → dict
Read HDX formats list
Parameters
-
configuration : Configuration | None — HDX configuration. Defaults to global configuration.
-
url : str | None — Url of formats mappings. Defaults to None (internal configuration parameter).
Returns
-
dict — Returns formats dictionary
classmethod Resource.set_formatsdict(formats_dict: dict) → None
Set formats dictionary
Parameters
-
formats_dict : dict — Formats dictionary
Returns
-
None — None
classmethod Resource.get_mapped_format(format: str, configuration: Configuration | None = None) → str | None
Given a file format, return an HDX format to which it maps
Parameters
-
format : str — File type to map
-
configuration : Configuration | None — HDX configuration. Defaults to global configuration.
Returns
-
str | None — Mapped format or None if no mapping found
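To make the idea of mapping a file format onto an HDX format concrete, here is a small standalone sketch. The structure of the formats dictionary (canonical format mapped to a list of alternative names), the normalisation rules and the example entries are all assumptions for illustration; the real list is the one returned by read_formats_mappings.

```python
def map_format(file_format, formats_dict):
    """Hypothetical lookup: return the canonical format or None if unmapped."""
    # Assumption: matching ignores case, surrounding whitespace and a leading dot.
    key = file_format.strip().lower().lstrip(".")
    for canonical, variants in formats_dict.items():
        if key == canonical.lower() or key in variants:
            return canonical
    return None


# Illustrative mapping only, not the HDX formats list.
EXAMPLE_FORMATS = {
    "xlsx": ["excel", "xlsm"],
    "geojson": ["json-geo"],
    "csv": ["comma separated"],
}
```

As with get_mapped_format, a None result signals that no mapping was found, which lets the caller decide whether to reject the format or leave it unset.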
method Resource.get_format() → str | None
Get the resource's format
Returns
-
str | None — Resource's format or None if it has not been set
method Resource.set_format(format: str) → str
Set the resource's file type
Parameters
-
format : str — Format to set on resource
Returns
-
str — Format that was set
Raises
-
HDXError
method Resource.clean_format() → str
Clean the resource's format, setting it to None if it is invalid and cannot be mapped
Returns
-
str — Format that was set
method Resource.get_file_to_upload() → str | None
Get the file uploaded
Returns
-
str | None — The file that will be or has been uploaded or None if there isn't one
method Resource.set_file_to_upload(file_to_upload: Path | str, guess_format_from_suffix: bool = False) → str
Delete any existing url and set the file uploaded to the local path provided
Parameters
-
file_to_upload : Path | str — Local path to file to upload
-
guess_format_from_suffix : bool — Set format from file suffix. Defaults to False.
Returns
-
str — The format that was guessed or None if no format was set
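The two documented effects of this method (removing any existing url and optionally guessing the format from the file suffix) can be sketched on a plain dictionary. The helper attach_file is hypothetical, and the rule that the guessed format is the lower-cased suffix without its dot is an assumption rather than the library's exact logic.

```python
from pathlib import Path


def attach_file(resource_data, file_to_upload, guess_format_from_suffix=False):
    """Hypothetical helper applying the documented steps to a plain dict."""
    # A resource is backed by a url or an uploaded file, so drop any url.
    resource_data.pop("url", None)
    # Kept in the dict here for brevity; where the real class stores it is not shown.
    resource_data["file_to_upload"] = str(file_to_upload)
    if not guess_format_from_suffix:
        return None
    suffix = Path(file_to_upload).suffix  # e.g. ".csv"
    if not suffix:
        return None
    # Assumption: the guessed format is the lower-cased suffix without the dot.
    resource_data["format"] = suffix[1:].lower()
    return resource_data["format"]
```

On a real Resource the corresponding call would be resource.set_file_to_upload(path, guess_format_from_suffix=True).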
method Resource.check_both_url_filetoupload() → None
Check for error where both url and file to upload are provided for resource.
Returns
-
None — None
Raises
-
HDXError
method Resource.check_neither_url_filetoupload() → None
Check for error where neither url nor file to upload is provided for resource.
Raises
-
HDXError
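The two checks above are complementary: one rejects a resource that supplies both sources of data, the other (judging by its name) rejects a resource that supplies neither. A minimal sketch of that logic, using a stand-in exception instead of HDXError and hypothetical function names and messages:

```python
class ResourceError(Exception):
    """Stand-in for HDXError so the sketch has no external dependency."""


def check_both(url, file_to_upload):
    # Supplying both a url and a local file is ambiguous.
    if url and file_to_upload:
        raise ResourceError("Either url or file to upload must be supplied, not both")


def check_neither(url, file_to_upload):
    # A resource needs one source of data.
    if not url and not file_to_upload:
        raise ResourceError("Either a url or a file to upload must be supplied")
```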
method Resource.correct_format(data: dict = None) → None
Correct the format of the file
Parameters
-
data : dict — Resource data.
Returns
-
None — None
Raises
-
HDXError
method Resource.set_types() → None
Add resource_type and url_type if not supplied based on url or file to upload.
Returns
-
None — None
method Resource.check_required_fields(ignore_fields: Sequence[str] = ()) → None
Check that metadata for resource is complete. If required, set the parameter ignore_fields to any fields that should be ignored for the particular operation.
Parameters
-
ignore_fields : Sequence[str] — Fields to ignore. Defaults to ().
Returns
-
None — None
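The role of ignore_fields can be shown with a small standalone check. This is only a sketch of the idea: the helper name, the notion that empty strings count as missing, and any list of required fields passed to it are assumptions, since the actual required fields come from the HDX configuration.

```python
def missing_required_fields(data, required, ignore_fields=()):
    """Return required field names that are absent or empty in data."""
    ignored = set(ignore_fields)
    return [
        field for field in required
        if field not in ignored and data.get(field) in (None, "")
    ]
```

An operation that legitimately lacks a field, for example one that does not yet know the resource url, would pass that field name in ignore_fields so that it is not reported.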
method Resource.update_in_hdx(**kwargs: Any) → int
Check if resource exists in HDX and if so, update it. To indicate that the data in an external resource (given by a URL) has been updated, set data_updated to True, which will result in the resource last_modified field being set to now. If the method set_file_to_upload is used to supply a file, the resource last_modified field is set to now automatically regardless of the value of data_updated.
Returns status code where:
-
0 = no file to upload and last_modified set to now (data_updated flag is True)
-
1 = no file to upload and data_updated flag is False
-
2 = file uploaded to filestore (either hash or size of file has changed)
-
3 = file not uploaded to filestore (hash and size of file are the same)
-
4 = file not uploaded (hash, size unchanged), given last_modified ignored
Parameters
-
**kwargs : Any — See below
-
operation : str — Operation to perform, e.g. patch. Defaults to update.
-
data_updated : bool — If True, set last_modified to now. Defaults to False.
-
date_data_updated : datetime — Date to use for last_modified. Defaults to None.
-
force_update : bool — Force file to be updated even if it hasn't changed. Defaults to False.
-
dataset : Dataset — Existing dataset if available to obtain resource id
Returns
-
int — Status code
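Because the status code is a bare integer, calling code may be easier to read if the values are given names. The enum below is one possible way to do that; the member names and helper functions are hypothetical and not part of the library, while the numeric values and their meanings follow the list documented for this method.

```python
from enum import IntEnum


class UpdateStatus(IntEnum):
    """Hypothetical names for the documented integer status codes."""
    LAST_MODIFIED_SET_NO_FILE = 0
    NOT_UPDATED_NO_FILE = 1
    FILE_UPLOADED = 2
    FILE_UNCHANGED = 3
    FILE_UNCHANGED_LAST_MODIFIED_IGNORED = 4


def file_was_uploaded(status):
    """Return True only when the code says a file reached the filestore."""
    return UpdateStatus(status) is UpdateStatus.FILE_UPLOADED


def describe_status(status):
    """Turn a status code into a short human-readable label."""
    return UpdateStatus(status).name.replace("_", " ").lower()
```

A caller could then write status = resource.update_in_hdx(data_updated=True) and branch on file_was_uploaded(status) instead of comparing against literal numbers. Constructing UpdateStatus from an undocumented value raises ValueError, which surfaces unexpected codes early.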
method Resource.create_in_hdx(**kwargs: Any) → int
Check if resource exists in HDX and if so, update it, otherwise create it. To indicate that the data in an external resource (given by a URL) has been updated, set data_updated to True, which will result in the resource last_modified field being set to now. If the method set_file_to_upload is used to supply a file, the resource last_modified field is set to now automatically regardless of the value of data_updated.
Returns status code where:
-
0 = no file to upload and last_modified set to now (resource creation or data_updated flag is True)
-
1 = no file to upload and data_updated flag is False
-
2 = file uploaded to filestore (resource creation or either hash or size of file has changed)
-
3 = file not uploaded to filestore (hash and size of file are the same)
-
4 = file not uploaded (hash, size unchanged), given last_modified ignored
Parameters
-
**kwargs : Any — See below
-
data_updated : bool — If True, set last_modified to now. Defaults to False.
-
date_data_updated : datetime — Date to use for last_modified. Defaults to None.
-
force_update : bool — Force file to be updated even if it hasn't changed. Defaults to False.
-
dataset : Dataset — Existing dataset if available to obtain resource id
Returns
-
int — Status code
method Resource.delete_from_hdx() → None
Deletes a resource from HDX
Returns
-
None — None
method Resource.get_dataset() → Dataset
Return dataset containing this resource
Returns
-
Dataset — Dataset containing this resource
Raises
-
HDXError
staticmethod Resource.search_in_hdx(query: str, configuration: Configuration | None = None, **kwargs: Any) → list['Resource']
Searches for resources in HDX. NOTE: Does not search dataset metadata!
Parameters
-
query : str — Query
-
configuration : Configuration | None — HDX configuration. Defaults to global configuration.
-
**kwargs : Any — See below
-
order_by : str — A field on the Resource model that orders the results
-
offset : int — Apply an offset to the query
-
limit : int — Apply a limit to the query
Returns
-
list['Resource'] — List of resources resulting from query
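The offset and limit keyword arguments lend themselves to paging through a large result set. The generator below is a sketch of that pattern: iter_search_results and page_size are hypothetical names, the search callable is passed in so that the example runs without network access, and whether HDX caps limit on the server side is not covered here.

```python
def iter_search_results(search, query, page_size=100, **kwargs):
    """Yield results page by page from a search callable.

    `search` is any callable with the documented shape, for example
    Resource.search_in_hdx; it is passed in so the sketch runs offline.
    """
    offset = 0
    while True:
        page = search(query, offset=offset, limit=page_size, **kwargs)
        yield from page
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        offset += page_size
```

In real use this might be called as iter_search_results(Resource.search_in_hdx, query), with the query string written in whatever syntax the HDX search endpoint expects.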
method Resource.download(folder: Path | str | None = None, retriever: Retrieve | None = None) → tuple[str, Path]
Download resource and store it in the provided folder, or in a temporary folder if no folder is supplied
Parameters
-
folder : Path | str | None — Folder to download resource to. Defaults to None.
-
retriever : Retrieve | None — Retrieve object to use. Defaults to None.
Returns
-
tuple[str, Path] — (URL downloaded, Path to downloaded file)
Raises
-
HDXError
staticmethod Resource.get_all_resource_ids_in_datastore(configuration: Configuration | None = None) → list[str]
Get list of resources that have a datastore returning their ids.
Parameters
-
configuration : Configuration | None — HDX configuration. Defaults to global configuration.
Returns
-
list[str] — List of resource ids that are in the datastore
method Resource.has_datastore() → bool
Check if the resource has a datastore.
Returns
-
bool — Whether the resource has a datastore or not
method Resource.create_datastore(schema: Sequence[dict], primary_key: str | Sequence[str] | None = None) → None
Create a datastore for the resource with the given schema.
Parameters
-
schema : Sequence[dict] — Sequence of field definitions, each a dict with 'id' and 'type' keys.
-
primary_key : str | Sequence[str] | None — Primary key field name(s). Defaults to None.
Returns
-
None — None
method Resource.update_datastore(records: Sequence[dict], method: str = 'upsert') → None
Update (upsert) records into the resource datastore.
Parameters
-
records : Sequence[dict] — Sequence of record dicts to insert or update.
-
method : str — Datastore update method ('upsert', 'insert', or 'update'). Defaults to 'upsert'.
Returns
-
None — None
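The difference between the three update methods can be modelled on an in-memory table. This sketch assumes the usual datastore semantics (upsert updates a matching record or inserts a new one, insert refuses existing keys, update requires an existing key); it does not show what the library or the HDX server actually does, and apply_records is a hypothetical name.

```python
def apply_records(table, records, key, method="upsert"):
    """Apply records to an in-memory table (dict keyed by primary key value)."""
    if method not in ("upsert", "insert", "update"):
        raise ValueError(f"Unknown method: {method}")
    for record in records:
        record_key = record[key]
        exists = record_key in table
        if method == "insert" and exists:
            raise KeyError(f"Record {record_key!r} already exists")
        if method == "update" and not exists:
            raise KeyError(f"Record {record_key!r} not found")
        # Merge into the existing row if there is one, otherwise create it.
        table[record_key] = {**table.get(record_key, {}), **record}
    return table
```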
method Resource.delete_datastore() → None
Delete a resource from the HDX datastore
Returns
-
None — None
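method Resource.get_resource_views() → list[ResourceView]
Get any resource views in the resource
Returns
-
list[ResourceView] — List of resource views in the resource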
method Resource.add_update_resource_view(resource_view: ResourceView | dict) → None
Add new or update existing resource view in resource with new metadata
Parameters
-
resource_view : ResourceView | dict — Resource view metadata either from a ResourceView object or a dictionary
Returns
-
None — None
method Resource.add_update_resource_views(resource_views: Sequence[ResourceView | dict]) → None
Add new or update existing resource views in resource with new metadata.
Parameters
-
resource_views : Sequence[ResourceView | dict] — A list of resource views metadata from ResourceView objects or dictionaries
Returns
-
None — None
Raises
-
HDXError
method Resource.reorder_resource_views(resource_views: Sequence[ResourceView | dict | str]) → None
Order resource views in resource.
Parameters
-
resource_views : Sequence[ResourceView | dict | str] — A list of either resource view ids or resource views metadata from ResourceView objects or dictionaries
Returns
-
None — None
Raises
-
HDXError
method Resource.delete_resource_view(resource_view: ResourceView | dict | str) → None
Delete a resource view from the resource and HDX
Parameters
-
resource_view : ResourceView | dict | str — Either a resource view id or resource view metadata either from a ResourceView object or a dictionary
Returns
-
None — None
Raises
-
HDXError
method Resource.enable_dataset_preview() → None
Enable dataset preview of resource
Returns
-
None — None
method Resource.disable_dataset_preview() → None
Disable dataset preview of resource
Returns
-
None — None
method Resource.is_broken() → bool
Return if resource is broken
Returns
-
bool — Whether resource is broken
method Resource.mark_broken() → None
Mark resource as broken
Returns
-
None — None
method Resource.is_marked_data_updated() → bool
Return if the resource's data is marked to be updated
Returns
-
bool — Whether resource's data is marked to be updated
method Resource.mark_data_updated() → None
Mark resource data as updated
Returns
-
None — None
method Resource.get_date_data_updated() → datetime
Get date resource data was updated
Returns
-
datetime — Date resource data was updated
method Resource.set_date_data_updated(date: datetime | str, ignore_timeinfo: bool = False) → None
Set date resource data was updated
Parameters
-
date : datetime | str — Date resource data was updated
-
ignore_timeinfo : bool — Ignore time and time zone of date. Defaults to False.
Returns
-
None — None
method Resource.get_hdx_url() → str | None
Get the url of the resource on HDX
Returns
-
str | None — Url of the resource on HDX or None if the resource is missing the id field
method Resource.get_api_url() → str | None
Get the API url of the resource on HDX
Returns
-
str | None — API url of the resource on HDX or None if the resource is missing the id field