Metadata-Version: 2.1
Name: PrivacySherlock
Version: 0.0.1
Summary: A Python package for PII detection and classification
Author: Tosif Ansari
Author-email: tosif.ansari@example.com
Keywords: python,pii detection,privacy,data privacy,pii,data classification
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: altair==5.4.1
Requires-Dist: annotated-types==0.7.0
Requires-Dist: anyio==4.4.0
Requires-Dist: attrs==24.2.0
Requires-Dist: azure-core==1.30.2
Requires-Dist: blinker==1.8.2
Requires-Dist: blis==0.7.11
Requires-Dist: boto3==1.35.13
Requires-Dist: botocore==1.35.13
Requires-Dist: cachetools==5.5.0
Requires-Dist: catalogue==2.0.10
Requires-Dist: certifi==2024.8.30
Requires-Dist: cffi==1.17.1
Requires-Dist: charset-normalizer==3.3.2
Requires-Dist: click==8.1.7
Requires-Dist: cloudpathlib==0.19.0
Requires-Dist: colorama==0.4.6
Requires-Dist: confection==0.1.5
Requires-Dist: cryptography==43.0.1
Requires-Dist: cymem==2.0.8
Requires-Dist: distro==1.9.0
Requires-Dist: dnspython==2.6.1
Requires-Dist: easyocr==1.7.1
Requires-Dist: exceptiongroup==1.2.2
Requires-Dist: filelock==3.15.4
Requires-Dist: fsspec==2024.9.0
Requires-Dist: gitdb==4.0.11
Requires-Dist: GitPython==3.1.43
Requires-Dist: groq==0.11.0
Requires-Dist: h11==0.14.0
Requires-Dist: htbuilder==0.6.2
Requires-Dist: httpcore==1.0.5
Requires-Dist: httpx==0.27.2
Requires-Dist: idna==3.8
Requires-Dist: imageio==2.35.1
Requires-Dist: Jinja2==3.1.4
Requires-Dist: jmespath==1.0.1
Requires-Dist: jsonschema==4.23.0
Requires-Dist: jsonschema-specifications==2023.12.1
Requires-Dist: langcodes==3.4.0
Requires-Dist: language_data==1.2.0
Requires-Dist: lazy_loader==0.4
Requires-Dist: linecache2==1.0.0
Requires-Dist: marisa-trie==1.2.0
Requires-Dist: markdown-it-py==3.0.0
Requires-Dist: MarkupSafe==2.1.5
Requires-Dist: mdurl==0.1.2
Requires-Dist: more-itertools==10.5.0
Requires-Dist: mpmath==1.3.0
Requires-Dist: murmurhash==1.0.10
Requires-Dist: mysql==0.0.3
Requires-Dist: mysql-connector-python==9.0.0
Requires-Dist: mysql-connector-python-rf==2.2.2
Requires-Dist: mysqlclient==2.2.4
Requires-Dist: narwhals==1.6.2
Requires-Dist: networkx==3.3
Requires-Dist: ninja==1.11.1.1
Requires-Dist: numpy==1.26.4
Requires-Dist: opencv-python-headless==4.10.0.84
Requires-Dist: packaging==24.1
Requires-Dist: pandas==2.2.2
Requires-Dist: pdfminer.six==20231228
Requires-Dist: pdfplumber==0.11.4
Requires-Dist: phonenumbers==8.13.45
Requires-Dist: pillow==10.4.0
Requires-Dist: plotly==5.24.0
Requires-Dist: preshed==3.0.9
Requires-Dist: presidio_analyzer==2.2.355
Requires-Dist: presidio_anonymizer==2.2.355
Requires-Dist: protobuf==5.28.0
Requires-Dist: pyarrow==17.0.0
Requires-Dist: pyclipper==1.3.0.post5
Requires-Dist: pycparser==2.22
Requires-Dist: pycryptodome==3.20.0
Requires-Dist: pydantic==2.9.1
Requires-Dist: pydantic_core==2.23.3
Requires-Dist: pydeck==0.9.1
Requires-Dist: Pygments==2.18.0
Requires-Dist: pymongo==4.8.0
Requires-Dist: pypdfium2==4.30.0
Requires-Dist: pytesseract==0.3.13
Requires-Dist: python-bidi==0.6.0
Requires-Dist: python-dateutil==2.9.0.post0
Requires-Dist: pytz==2024.1
Requires-Dist: PyYAML==6.0.2
Requires-Dist: referencing==0.35.1
Requires-Dist: regex==2024.7.24
Requires-Dist: requests==2.32.3
Requires-Dist: requests-file==2.1.0
Requires-Dist: rich==13.8.0
Requires-Dist: rpds-py==0.20.0
Requires-Dist: s3transfer==0.10.2
Requires-Dist: scikit-image==0.24.0
Requires-Dist: scipy==1.14.1
Requires-Dist: shapely==2.0.6
Requires-Dist: shellingham==1.5.4
Requires-Dist: six==1.16.0
Requires-Dist: smart-open==7.0.4
Requires-Dist: smmap==5.0.1
Requires-Dist: sniffio==1.3.1
Requires-Dist: spacy==3.7.6
Requires-Dist: spacy-legacy==3.0.12
Requires-Dist: spacy-loggers==1.0.5
Requires-Dist: srsly==2.4.8
Requires-Dist: st-annotated-text==4.0.1
Requires-Dist: streamlit==1.38.0
Requires-Dist: sympy==1.13.2
Requires-Dist: tenacity==8.5.0
Requires-Dist: thinc==8.2.5
Requires-Dist: tifffile==2024.8.30
Requires-Dist: tldextract==5.1.2
Requires-Dist: toml==0.10.2
Requires-Dist: torch==2.4.1
Requires-Dist: torchvision==0.19.1
Requires-Dist: tornado==6.4.1
Requires-Dist: tqdm==4.66.5
Requires-Dist: traceback2==1.4.0
Requires-Dist: typer==0.12.5
Requires-Dist: typing_extensions==4.12.2
Requires-Dist: tzdata==2024.1
Requires-Dist: unittest2==1.1.0
Requires-Dist: urllib3==2.2.2
Requires-Dist: wasabi==1.1.3
Requires-Dist: watchdog==4.0.2
Requires-Dist: weasel==0.4.1
Requires-Dist: wrapt==1.16.0


# Privacy Sherlock - PII Detection and Classification

This Python package detects Personally Identifiable Information (PII) in several data sources (for example MySQL, Amazon S3, and MongoDB) and classifies it into categories such as financial and personal. It also computes a risk score from the detected PII and visualizes how that PII is distributed.

## Features
- Detects PII using regular expressions.
- Classifies PII into categories like financial and personal.
- Supports data ingestion from MySQL, Amazon S3, and MongoDB.
- Provides a risk score based on the detected PII.
- Visualizes PII distribution using Plotly.
