niquery.data.fetching module
- niquery.data.fetching.fetch_datalad_remote_files(df, out_dirname, dataset_name) → tuple
Fetch files from remote DataLad datasets.
Downloads only the files listed in the provided DataFrame instance. The DataFrame is expected to contain at least the following columns:
remote: Remote server name (e.g., ‘openneuro’)
datasetid: Dataset identifier (e.g., ‘ds000231’)
fullpath: Path of the file within the dataset (e.g., ‘sub-01/func/sub-01_task-flavor_run-02_bold.nii.gz’)
If the DataLad dataset already exists in the provided path, it is not cloned again.
A new DataLad dataset is created at the destination path, and each fetched dataset is added to it as a subdataset.
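The input table described above can be sketched as follows. This is a minimal illustration, not the library's own usage example: the column values are taken from the examples in the description, while `REQUIRED_COLUMNS`, the `fetch` wrapper, and the output directory and dataset names in the commented call are hypothetical. The actual fetch requires niquery, DataLad, and network access, so it is left commented out.

```python
from pathlib import Path

import pandas as pd

# Columns the description says the DataFrame must contain (name is illustrative).
REQUIRED_COLUMNS = ("remote", "datasetid", "fullpath")

# One row per file to fetch; values come from the examples in the description.
df = pd.DataFrame(
    [
        {
            "remote": "openneuro",
            "datasetid": "ds000231",
            "fullpath": "sub-01/func/sub-01_task-flavor_run-02_bold.nii.gz",
        },
    ]
)

# Check the table before handing it to the fetcher.
missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
if missing:
    raise ValueError(f"DataFrame is missing required columns: {missing}")


def fetch(df, out_dirname, dataset_name):
    """Hypothetical wrapper that imports niquery only when actually called."""
    from niquery.data.fetching import fetch_datalad_remote_files

    return fetch_datalad_remote_files(df, Path(out_dirname), dataset_name)


# Example call; "datalad_out" and "my_dataset" are placeholder values:
# fetched_files, failure_results = fetch(df, "datalad_out", "my_dataset")
```

Validating the columns up front is a design choice of this sketch; it surfaces a malformed table before any cloning starts.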
- Parameters:
df (DataFrame) – Table containing at least ‘remote’, ‘datasetid’, and ‘fullpath’ columns. Each row corresponds to a file to be fetched.
out_dirname (Path) – Output directory where the datasets will be cloned and files stored.
dataset_name (str) – Name of the dataset.
- Returns:
fetched_files, failure_results – Dictionaries mapping each dataset to the filenames that were fetched successfully and to those that failed, respectively.
- Return type: