Metadata-Version: 2.1
Name: daget
Version: 0.5
Summary: Download dataset via DOI or landing page url
Author-email: Olof Olsson <borsna@gmail.com>
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests
Requires-Dist: beautifulsoup4

# 🗃️ DaGet

Simple utility to download datasets from data respositories.

The goal of this project is to explore machine readable metadata and learn more about writing python packages.

⚠️ __script is in early development and needs testing__ ⚠️ 

## Installation

![PyPI](https://img.shields.io/pypi/v/daget)

to install daget using pip: 

```
pip install daget
```

## Usage

Download dataset via DOI or landing page url:

`daget https://doi.org/10.5878/331q-3p13 ./destination`

or short form doi:

`daget 10.5878/331q-3p13 ./destination`

### Example
```text
$ daget 10.7910/DVN/LFH4H8 ./test
destination:   /home/user/test
landing page:  https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/LFH4H8
 355.4KiB  Annex.pdf
 115.6KiB  List of items for measuring conceptions of democracy.csv
 81.1KiB  List of items for measuring conceptions of democracy.xlsx
 11.5KiB  Script - Part 1 Conceptions of democracy.R
 3.0KiB  Script - Part 2 Analyze items used to measure conceptions of democracy.R
 73.3KiB  Spreadsheet - Conceptions of democracy 10-2021.csv
 64.9KiB  Spreadsheet - Conceptions of democracy 10-2021.xlsx
 704.8KiB  downloaded 
```

## Supported data respositories with file metadata
* schema.org/Dataset
  * https://dataverse.harvard.edu
  * https://dataverse.no
  * https://snd.se/catalogue
* figshare
  * https://su.figshare.com
  * https://figshare.scilifelab.se
* zenodo 
  * https://zenodo.org

## Alternatives

* [datahugger](https://github.com/J535D165/datahugger/) - wider repository suport

## Improve the script

Adding suport for additional repositories requires test cases and investigation arround how to get file metadata from the landing page.

Please help by testing and reporting [issues](https://github.com/borsna/daget/issues)

## TODO

- [ ] Add error handling
- [x] Check empty destination directory
- [ ] Improve documentation
- [x] Package script for pip
