Metadata-Version: 2.4
Name: pydftracer
Version: 2.0.0.dev0
Summary: Python libs for DFTracer
Author-email: "Hariharan Devarajan (Hari)" <hariharandev1@llnl.gov>, Ray Andrew Sinurat <raydreww@gmail.com>
Maintainer-email: "Hariharan Devarajan (Hari)" <hariharandev1@llnl.gov>, Ray Andrew Sinurat <raydreww@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/LLNL/pydftracer
Project-URL: Repository, https://github.com/LLNL/pydftracer
Project-URL: Bug Tracker, https://github.com/LLNL/pydftracer/issues
Project-URL: Documentation, https://dftracer.readthedocs.io/en/latest/
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: pytest-xdist; extra == "dev"
Requires-Dist: h5py; extra == "dev"
Requires-Dist: numpy; extra == "dev"
Requires-Dist: pillow; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Provides-Extra: dynamo
Requires-Dist: torch>=2.5.1; extra == "dynamo"
Dynamic: license-file

# pydftracer

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)

A no-operation (**No-Op**) Python binding for [DFTracer](https://github.com/hariharan-devarajan/dftracer) that exposes the DFTracer API without requiring a full DFTracer installation. It is suited to testing, development, and environments where you want to call DFTracer's API without the overhead of actual tracing.

## Purposes

Use pydftracer when you:

- Are prototyping applications that will eventually use the full DFTracer
- Need to maintain DFTracer API compatibility in environments where tracing is not needed (e.g. production apps)
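The general idea behind a no-op binding can be sketched as follows. This is a minimal illustration of the pattern, not pydftracer's actual API: the `NoOpTracer` class and its `log`, `region`, and `update` methods are hypothetical names chosen for this example.

```python
import functools
from contextlib import contextmanager


class NoOpTracer:
    """Hypothetical stand-in that accepts tracing calls and records nothing."""

    def log(self, func):
        # Wrap the function without adding any tracing work, so decorated
        # code behaves exactly as the undecorated function would.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)

        return wrapper

    @contextmanager
    def region(self, name):
        # Nothing is recorded on entry or exit of the region.
        yield self

    def update(self, **metadata):
        # Metadata is accepted and discarded.
        return None


tracer = NoOpTracer()


@tracer.log
def load_batch(n):
    return list(range(n))


with tracer.region("epoch"):
    batch = load_batch(3)
    tracer.update(step=1)
```

Because every entry point keeps its call shape, application code written against such an interface should not need changes when a real tracer with the same interface is substituted.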

## Installation

```bash
pip install dftracer
# with libdftracer and dftracer_utils
pip install dftracer[full]
```

## Development

```bash
# Install development dependencies
pip install -e .[dev]

# Using Make (recommended)
make test-parallel       # Run all tests with parallel execution
make test-subprocess     # Run only subprocess-based dftracer tests  
make test-ci             # Run comprehensive tests matching CI configuration
make test-ci-quick       # Run quick tests and checks (faster)
make check-all           # Run all quality checks (lint, format, type-check, test)

# Using pytest directly
pytest tests/ -v -n 4    # All tests with parallel execution
pytest tests/ --cov=dftracer --cov-report=term-missing -v -n 4  # Tests with coverage
```

## Documentation

* Building DFTracer: [https://dftracer.readthedocs.io/en/latest/build.html](https://dftracer.readthedocs.io/en/latest/build.html)
* Integrating DFTracer: [https://dftracer.readthedocs.io/en/latest/examples.html](https://dftracer.readthedocs.io/en/latest/examples.html)
* Visualizing DFTracer Traces: [https://dftracer.readthedocs.io/en/latest/perfetto.html](https://dftracer.readthedocs.io/en/latest/perfetto.html)
* Building DFAnalyzer: [https://dftracer.readthedocs.io/en/latest/dfanalyzer_build.html](https://dftracer.readthedocs.io/en/latest/dfanalyzer_build.html)

## Development

### Testing

The test suite runs dftracer tests in separate subprocesses, because dftracer state is per-process and must not leak between tests.

#### Running Tests

```bash
# Install development dependencies
pip install -e .[dev]

# Using Make (recommended)
make test-parallel    # Run all tests with parallel execution
make test-subprocess  # Run only subprocess-based dftracer tests  
make test-ci          # Run tests matching CI configuration
make check-all        # Run all quality checks (lint, format, type-check, test)

# Using pytest directly
pytest tests/ -v -n 2                                    # All tests with parallel execution
pytest tests/ --cov=dftracer --cov-report=term-missing -v -n 2  # Tests with coverage
pytest tests/ -m subprocess -v -n 2                      # Only subprocess tests

# Use the provided test script (matches CI)
./scripts/test.sh
```

#### Test Structure

- **Unit Tests**: General functionality tests in `tests/test_general.py`
- **Integration Tests**: Subprocess-based dftracer tests in `tests/test_dftracer.py`
- **Parallel Execution**: Tests run in parallel using `pytest-xdist` for faster execution
- **Process Isolation**: dftracer tests run in separate subprocesses to handle the per-process nature of dftracer
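The process-isolation approach described above can be sketched with the standard library alone. This is an illustrative example, not code from the project's test suite: the `run_isolated` helper and the `PYDFTRACER_DEMO_FLAG` variable are hypothetical names, and the sketch assumes that variable is not already set in the parent environment.

```python
import subprocess
import sys
import textwrap


def run_isolated(snippet: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a fresh interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", textwrap.dedent(snippet)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def test_child_has_clean_state():
    # Each call starts a new process, so state set in one child
    # is not visible to the next child.
    first = run_isolated(
        "import os; os.environ['PYDFTRACER_DEMO_FLAG'] = '1'; print('set')"
    )
    second = run_isolated(
        "import os; print(os.environ.get('PYDFTRACER_DEMO_FLAG', 'unset'))"
    )
    assert first.returncode == 0
    assert first.stdout.strip() == "set"
    assert second.stdout.strip() == "unset"
```

A test written this way can run alongside others under `pytest-xdist`, since each child interpreter starts with its own state.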

#### CI/CD

The project uses GitHub Actions for continuous integration with:
- Multi-Python version testing (3.9, 3.10, 3.11, 3.12)
- Parallel test execution with coverage reporting
- Code linting with `ruff`
- Type checking with `mypy`
- Package building and installation testing

## Citation and Reference

The original SC'24 paper describes the design and implementation of the DFTracer code. Please cite this paper and the code if you use DFTracer in your research. 

```
@inproceedings{devarajan_dftracer_2024,
    address = {Atlanta, GA},
    title = {{DFTracer}: {An} {Analysis}-{Friendly} {Data} {Flow} {Tracer} for {AI}-{Driven} {Workflows}},
    shorttitle = {{DFTracer}},
    urldate = {2024-07-31},
    booktitle = {{SC24}: {International} {Conference} for {High} {Performance} {Computing}, {Networking}, {Storage} and {Analysis}},
    publisher = {IEEE},
    author = {Devarajan, Hariharan and Pottier, Loic and Velusamy, Kaushik and Zheng, Huihuo and Yildirim, Izzet and Kogiou, Olga and Yu, Weikuan and Kougkas, Anthony and Sun, Xian-He and Yeom, Jae Seung and Mohror, Kathryn},
    month = nov,
    year = {2024},
}

@misc{devarajan_dftracer_code_2024,
    type = {GitHub},
    title = {{GitHub} {DFTracer}},
    shorttitle = {{DFTracer}},
    url = {https://github.com/LLNL/dftracer.git},
    urldate = {2024-07-31},
    journal = {DFTracer: A multi-level dataflow tracer for capturing I/O calls from workflows.},
    author = {Devarajan, Hariharan and Pottier, Loic and Velusamy, Kaushik and Zheng, Huihuo and Yildirim, Izzet and Kogiou, Olga and Yu, Weikuan and Kougkas, Anthony and Sun, Xian-He and Yeom, Jae Seung and Mohror, Kathryn},
    month = jun,
    year = {2024},
}
```

## Acknowledgments

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344; and under the auspices of the National Cancer Institute (NCI) by Frederick National Laboratory for Cancer Research (FNLCR) under Contract 75N91019D00024. This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory, and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program, under Contract No. DE-AC02-06CH11357, and by the Office of Advanced Scientific Computing Research under the DOE Early Career Research Program. This material is also based upon work partially supported by LLNL LDRD 23-ERD-045 and 24-SI-005. LLNL-CONF-857447.
