Metadata-Version: 2.4
Name: survey-kit
Version: 0.1.2
Summary: Tools for working with survey and other data
Author-email: Jon Rothbaum <jon.rothbaum@gmail.com>
License: CC0 1.0 Universal
License-File: LICENSE.md
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Requires-Dist: cffi
Requires-Dist: dill>=0.4.0
Requires-Dist: formulaic>=1.2.0
Requires-Dist: igraph
Requires-Dist: lightgbm
Requires-Dist: narwhals>=2.7.0
Requires-Dist: numba
Requires-Dist: numexpr>=2.10.2
Requires-Dist: optuna
Requires-Dist: polars>=1.25.2
Requires-Dist: psutil
Requires-Dist: pyarrow>=19.0.0
Requires-Dist: pypardiso>=0.4.6
Requires-Dist: scikit-learn
Requires-Dist: scipy>=1.13.1
Requires-Dist: sparse-dot-mkl>=0.9.9
Description-Content-Type: text/markdown

# Survey Kit

Tools for addressing missing data problems (nonresponse bias and item missingness), including extremely fast calibration weighting and machine learning-based imputation.

A furlough project inspired by the code used at the U.S. Census Bureau for the [National Experimental Wellbeing Statistics (NEWS)](https://www.census.gov/data/experimental-data-products/national-experimental-wellbeing-statistics.html) project.

## Installation

```bash
pip install survey-kit
```

## Features

- **Calibration Weighting** - Fast entropy balancing to adjust for nonresponse bias
- **SRMI Imputation** - ML-based multiple imputation with checkpointing
- **Statistics & Standard Errors** - Proper variance estimation for complex surveys

Works with Polars, Pandas, Arrow, and DuckDB. Optimized for large datasets (100K+ rows).
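
To give a sense of what calibration by entropy balancing does, here is a minimal pure-Python sketch of the general technique. It is not survey-kit's API or implementation (see the Calibration Guide for that). The function names `solve_linear` and `entropy_balance`, the toy sample, and the targets are all hypothetical and exist only for illustration. The sketch looks for weights of the form `w_i = d_i * exp(x_i . lam)` whose weighted totals match known population totals, using a damped Newton iteration on the convex dual problem.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b for a small dense system by Gaussian elimination."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def entropy_balance(x, base_weights, targets, tol=1e-6, max_iter=100):
    """Return weights d_i * exp(x_i . lam) whose weighted totals hit `targets`."""
    k = len(targets)
    lam = [0.0] * k

    def weights(lam_vec):
        return [d * math.exp(sum(xi[j] * lam_vec[j] for j in range(k)))
                for xi, d in zip(x, base_weights)]

    def dual(lam_vec):
        # Convex dual objective; its gradient is (weighted totals - targets)
        return sum(weights(lam_vec)) - sum(t * v for t, v in zip(targets, lam_vec))

    for _ in range(max_iter):
        w = weights(lam)
        grad = [sum(wi * xi[j] for wi, xi in zip(w, x)) - targets[j]
                for j in range(k)]
        if max(abs(g) for g in grad) < tol:
            break
        hess = [[sum(wi * xi[p] * xi[q] for wi, xi in zip(w, x))
                 for q in range(k)] for p in range(k)]
        step = solve_linear(hess, [-g for g in grad])
        # Backtracking: halve the Newton step until the dual objective decreases
        scale, current = 1.0, dual(lam)
        while scale > 1e-8 and dual(
            [v + scale * s for v, s in zip(lam, step)]
        ) > current:
            scale /= 2
        lam = [v + scale * s for v, s in zip(lam, step)]
    return weights(lam)


# Hypothetical sample: six respondents, columns are [intercept, female]
x = [[1, 1], [1, 1], [1, 0], [1, 0], [1, 0], [1, 0]]
base = [1.0] * 6
targets = [100.0, 50.0]  # assumed population total and female total

w = entropy_balance(x, base, targets)
print([round(wi, 4) for wi in w])
```

In this toy case the two female respondents are weighted up to represent half of the target population and the four male respondents share the other half. A production implementation would use vectorized or sparse linear algebra rather than Python lists.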

## Documentation

Full documentation: [https://jrothbaum.github.io/survey_kit/](https://jrothbaum.github.io/survey_kit/)

- [Calibration Guide](https://jrothbaum.github.io/survey_kit/user-guide/calibration/)
- [Imputation Guide](https://jrothbaum.github.io/survey_kit/user-guide/imputation/)
- [Statistics Guide](https://jrothbaum.github.io/survey_kit/user-guide/statistics/)

## Support

- [Issues](https://github.com/jrothbaum/survey_kit/issues)
- [Discussions](https://github.com/jrothbaum/survey_kit/discussions)

## License

This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.