Metadata-Version: 2.4
Name: pyscrew
Version: 0.1.3
Summary: A Python package for accessing industrial research data from a screw driving system
Project-URL: Homepage, https://github.com/nikolaiwest/pyscrew
Project-URL: Bug Tracker, https://github.com/nikolaiwest/pyscrew/issues
Project-URL: Documentation, https://github.com/nikolaiwest/pyscrew#readme
Author-email: Nikolai West <nikolai.west@tu-dortmund.de>
License: MIT
License-File: LICENSE
Keywords: industrial data,manufacturing,open data,research data,screw driving
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.11
Requires-Dist: pyyaml>=6.0.0
Requires-Dist: rarfile>=4.2
Requires-Dist: requests>=2.28.0
Requires-Dist: tqdm>=4.65.0
Provides-Extra: test
Requires-Dist: pytest-cov>=4.0; extra == 'test'
Requires-Dist: pytest>=7.0; extra == 'test'
Description-Content-Type: text/markdown

# PyScrew

PyScrew is a Python package designed to simplify access to industrial research data from screw driving experiments. It provides a streamlined interface for downloading, validating, and preparing experimental datasets hosted on Zenodo.

More information on the data is available here: https://zenodo.org/records/14769379

## Features

- **Easy Data Access**: Simple interface to download and extract screw driving datasets
- **Data Integrity**: Automatic checksum verification and secure extraction
- **Caching System**: Smart caching to prevent redundant downloads
- **Cross-Platform**: Works on Windows, macOS, and Linux
- **Memory Efficient**: Handles large datasets through streaming operations
- **Secure**: Implements protection against common security vulnerabilities
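
The data-integrity and memory-efficiency points above can be illustrated with a chunked checksum check. This is a minimal sketch, not PyScrew's actual implementation: the helper names `sha256_of_file` and `verify_checksum` are hypothetical, and SHA-256 is chosen only for illustration, since the hash algorithm the package uses is not stated here.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so it never has to fit in memory.

    Hypothetical helper; SHA-256 is an assumption made for this sketch.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Return True if the file's digest matches the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use bounded by `chunk_size` regardless of archive size, which is the usual reason to stream a checksum rather than read the whole file at once.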

## Installation

Install PyScrew directly from PyPI:

```bash
pip install pyscrew
```

## Quick start

```python
import pyscrew

# List available datasets
scenarios = pyscrew.list_scenarios()
print("Available datasets:", scenarios)

# Download and extract a specific dataset
data_path = pyscrew.get_data("thread-degradation")
print(f"Data extracted to: {data_path}")
```

## Package structure

```text
PyScrew/
├── src/
│   └── pyscrew/
│       ├── __init__.py      # Package initialization and version
│       ├── main.py          # Main interface and high-level functions
│       ├── loading.py       # Data loading from Zenodo
│       ├── processing.py    # Data processing functionality (planned)
│       └── validation.py    # Data validation checks (planned)
└── tests/                   # Test suite
```

## API Reference

### Main Functions

`get_data(scenario_name: str, cache_dir: Optional[Path] = None, force: bool = False) -> Path`

Downloads and extracts a specific dataset.

* `scenario_name`: Name of the dataset to download
* `cache_dir`: Optional custom cache directory (default: `~/.cache/pyscrew`)
* `force`: Force re-download even if cached
* **Returns:** Path to extracted dataset
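
As a usage sketch, the optional parameters can be combined to force a fresh download into a project-local cache. The wrapper name `refresh_dataset` is hypothetical; the keyword names `cache_dir` and `force` are taken from the signature above.

```python
from pathlib import Path


def refresh_dataset(scenario_name: str, cache_dir: Path) -> Path:
    """Re-download a scenario into a custom cache directory.

    Hypothetical convenience wrapper around pyscrew.get_data.
    """
    # Imported lazily so this module can be imported without pyscrew installed.
    import pyscrew

    # force=True bypasses any cached copy, as described for the `force` parameter.
    return pyscrew.get_data(scenario_name, cache_dir=cache_dir, force=True)
```

Calling `refresh_dataset("thread-degradation", Path("data") / "pyscrew-cache")` would then return the path to the extracted dataset, assuming the package is installed and Zenodo is reachable.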

`list_scenarios() -> Dict[str, str]`

Lists all available datasets and their descriptions.

* **Returns:** Dictionary mapping scenario names to descriptions
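
Because the return value is a plain dictionary, it can be printed as an aligned listing. The helper `format_scenarios` below is a hypothetical illustration and not part of PyScrew; it only assumes the name-to-description mapping described above.

```python
def format_scenarios(scenarios: dict[str, str]) -> str:
    """Render a name -> description mapping as aligned, sorted lines.

    Hypothetical helper; works on any dict shaped like the result of
    list_scenarios().
    """
    if not scenarios:
        return "No scenarios available."
    # Pad every name to the longest one so the descriptions line up.
    width = max(len(name) for name in scenarios)
    lines = [
        f"{name.ljust(width)}  {description}"
        for name, description in sorted(scenarios.items())
    ]
    return "\n".join(lines)
```

With the package installed, `print(format_scenarios(pyscrew.list_scenarios()))` would print one scenario per line.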

## Cache Structure

Downloaded data is stored in:

```text
~/.cache/pyscrew/
├── archives/     # Compressed dataset archives
└── extracted/    # Extracted dataset files
```
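
The layout above can be expressed with `pathlib` for scripts that need to locate the cached files. This is a sketch under the assumption that a custom `cache_dir` uses the same `archives/` and `extracted/` subdirectories as the default location; the helper `cache_layout` is hypothetical and not part of the package.

```python
from pathlib import Path
from typing import Optional


def cache_layout(cache_dir: Optional[Path] = None) -> dict[str, Path]:
    """Return the directories of the cache tree shown above.

    Hypothetical helper; it only builds paths and does not touch the disk.
    """
    # Fall back to the documented default, ~/.cache/pyscrew.
    root = Path(cache_dir) if cache_dir is not None else Path.home() / ".cache" / "pyscrew"
    return {
        "root": root,
        "archives": root / "archives",    # compressed dataset archives
        "extracted": root / "extracted",  # extracted dataset files
    }
```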

## Development
The package is under active development. Planned additions include data processing utilities and data validation tools.

## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.

## License
This project is licensed under the MIT License - see the LICENSE file for details.

## Citation
If you use this package in your research, please cite one of the following publications:
* West, N., & Deuse, J. (2024). A Comparative Study of Machine Learning Approaches for Anomaly Detection in Industrial Screw Driving Data. Proceedings of the 57th Hawaii International Conference on System Sciences (HICSS), 1050-1059. https://hdl.handle.net/10125/106504
* West, N., Trianni, A., & Deuse, J. (2024). Data-driven analysis of bolted joints in plastic housings with surface-based anomalies using supervised and unsupervised machine learning. CIE51 Proceedings. _(DOI will follow after publication of the proceedings)_

**A dedicated paper for this library is currently in progress.**