Metadata-Version: 2.1
Name: idc_index
Version: 0.2.7
Summary: Package to query and download data from an index of ImagingDataCommons
Home-page: https://github.com/ImagingDataCommons/idc-index
Author: ImagingDataCommons
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 4 - Beta
Description-Content-Type: text/markdown
License-File: LICENSE

# IDC Index

The IDC Index is a Python library designed to query basic metadata and download data hosted on the NCI Imaging Data Commons (IDC).

## Installation

Install the IDC Index using pip:
```
pip install idc-index
```

## Description

The IDC Index lets users retrieve information about collections, patients, studies, series, and images. The library relies on an index of IDC data generated by the SQL query available in the [release notes](https://github.com/ImagingDataCommons/idc-index/releases).

## Usage

The library provides the following key functions, listed with their arguments:

- Initialization: Instantiating the IDCClient class reads the CSV index and downloads the s5cmd tool.
- IDC Version:
  - get_idc_version(): Get the release version of the IDC data.
- Data Retrieval:
  - get_collections(): Retrieve a list of unique collection IDs.
  - get_series_size(seriesInstanceUID): Obtain the size of a series in MB by providing the SeriesInstanceUID.
  - get_patients(collection_id=None, outputFormat="list" | "dict" | "df"): Retrieve information about the patients within a collection.
  - get_dicom_studies(patientId=None, outputFormat="list" | "dict" | "df"): Retrieve the studies for a patient.
  - get_dicom_series(studyInstanceUID=None, outputFormat="list" | "dict" | "df"): Retrieve the series within a study.
  - download_dicom_series(seriesInstanceUID, downloadDir, dry_run=False, quiet=True): Download the images associated with a SeriesInstanceUID to a specified directory.
  - download_from_selection(downloadDir=None, dry_run=True, collection_id=None, patientId=None, studyInstanceUID=None): Download the images matching the given filter(s) to a specified directory.
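
The filter arguments of download_from_selection can be previewed before any files are fetched. The following is a minimal sketch, not part of the library: dry_run_selection is a hypothetical helper, and the rule that at least one filter must be given is an assumption of this sketch rather than documented behavior. It takes an already created client object, such as the one built in the Example section.

```python
def dry_run_selection(client, download_dir, collection_id=None,
                      patient_id=None, study_uid=None):
    """Hypothetical helper: preview a filtered download without fetching files."""
    filters = {
        "collection_id": collection_id,
        "patientId": patient_id,
        "studyInstanceUID": study_uid,
    }
    # Assumption of this sketch: a call with no filter at all is a mistake,
    # so it is rejected instead of being passed on to the client.
    if all(value is None for value in filters.values()):
        raise ValueError("provide at least one filter")
    # dry_run=True is the listed default; it is passed explicitly for clarity.
    return client.download_from_selection(
        downloadDir=download_dir, dry_run=True, **filters
    )

# Usage, assuming idc-index is installed:
# from idc_index import index
# dry_run_selection(index.IDCClient(), "/content/test",
#                   collection_id="nsclc_radiomics")
```

What the dry run reports or returns is not described here, so the helper simply passes the result through.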

## Example

Here's an example demonstrating how to use the IDC Client:

### Initialize the IDC Client
```
from idc_index import index

# Reads the CSV index and downloads the s5cmd tool
idc_client = index.IDCClient()
```

### Check IDC Version
```
idc_client.get_idc_version()
```

### Query data
```
idc_client.get_collections()
```
```
idc_client.get_patients(collection_id='nsclc_radiomics', outputFormat="list")
```
```
idc_client.get_dicom_studies(patientId='D1-0975', outputFormat="dict")
```
```
idc_client.get_dicom_series(studyInstanceUID='1.3.6.1.4.1.32722.99.99.191411096482148278088383576909215626011', outputFormat="df")
```
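
The three query calls above can be chained to walk from a collection down to its series. The sketch below is illustrative only: list_series_for_collection is a hypothetical helper, and it assumes that outputFormat="list" returns a plain Python list of identifiers at each level (patient IDs, then StudyInstanceUIDs, then series entries), which should be checked against the installed version.

```python
def list_series_for_collection(client, collection_id, max_patients=1):
    """Hypothetical helper: gather series entries for the first few patients."""
    series = []
    # Assumption: outputFormat="list" yields a plain list at every level.
    patients = client.get_patients(collection_id=collection_id, outputFormat="list")
    for patient_id in patients[:max_patients]:
        studies = client.get_dicom_studies(patientId=patient_id, outputFormat="list")
        for study_uid in studies:
            series.extend(
                client.get_dicom_series(studyInstanceUID=study_uid, outputFormat="list")
            )
    return series

# Usage, assuming idc_client was created as shown earlier:
# list_series_for_collection(idc_client, "nsclc_radiomics", max_patients=1)
```

Limiting the walk with max_patients keeps the number of queries small while exploring a large collection.
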
### Download data
```
idc_client.download_dicom_series(seriesInstanceUID='1.3.6.1.4.1.32722.99.99.459644025247509819689655120845267405', downloadDir='/content/test')
```
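
Because get_series_size reports the size of a series in MB, it can serve as a guard before a download. This is a rough sketch under stated assumptions: download_if_small and the max_mb threshold are hypothetical names introduced here, and the return value of get_series_size is assumed to be a number that can be compared directly.

```python
def download_if_small(client, series_uid, download_dir, max_mb=500.0):
    """Hypothetical helper: download a series only if it is below a size limit."""
    # Assumption: get_series_size returns a numeric size in MB.
    size_mb = client.get_series_size(series_uid)
    if size_mb > max_mb:
        return False  # too large, nothing is downloaded
    client.download_dicom_series(seriesInstanceUID=series_uid, downloadDir=download_dir)
    return True

# Usage, assuming idc_client was created as shown earlier:
# download_if_small(
#     idc_client,
#     "1.3.6.1.4.1.32722.99.99.459644025247509819689655120845267405",
#     "/content/test",
#     max_mb=200.0,
# )
```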

## Resources

* [https://imaging.datacommons.cancer.gov/](https://imaging.datacommons.cancer.gov/)
* [https://github.com/peak/s5cmd](https://github.com/peak/s5cmd)
