Metadata-Version: 2.1
Name: data_governance_checkup
Version: 0.1.1
Summary: A library for data governance and compliance
Home-page: https://github.com/pratiklahudkar/data-governance
Author: Pratik
Author-email: pratik.lahudkar@gmail.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown

## Data Governance Library

The **Data Governance Library** is a Python package designed to help organizations maintain data compliance, security, and governance standards. With features tailored to address common regulatory frameworks like GDPR, HIPAA, ISO 27001, and more, this library provides tools for automated compliance checks, metadata management, data lineage tracking, and role-based access control auditing.

---

### Features

1. **Automated Compliance Checks**
   - Support for frameworks like GDPR, HIPAA, ISO 27001, and CCPA.
   - Customizable rules for new compliance standards.

2. **Data Lineage Tracking**
   - Monitor data flow and identify its origin, transformation, and destination.

3. **Role-Based Access Control (RBAC) Auditing**
   - Ensure that data access policies are adhered to.

4. **Metadata Management and Cataloging**
   - Store, manage, and query metadata associated with your datasets.

5. **Data Masking and Anonymization**
   - Protect sensitive data with masking and anonymization techniques.

---

### Installation

Install the library via pip:

```bash
pip install data_governance_checkup
```

---

### Usage

#### Example: Running Compliance Checks

```python
from data_governance_checkup.compliance.gdpr import GDPRCompliance
from data_governance_checkup.compliance.hipaa import HIPAACompliance
from data_governance_checkup.compliance.iso27001 import ISO27001Compliance
from data_governance_checkup.compliance.ccpa import CCPACompliance

# Sample data
data = {
    "personal_data": "John Doe",
    "PHI": "Medical Record",
    "PHI_encrypted": False,
    "backup_enabled": False,
    "sensitive_data_access": [{"user": "alice", "logged": False}],
    "data_sold": True,
    "consumer_consent": False,
}

# Run compliance checks
gdpr_violations = GDPRCompliance.check(data)
hipaa_violations = HIPAACompliance.check(data)
iso_violations = ISO27001Compliance.check(data)
ccpa_violations = CCPACompliance.check(data)

# Print results
print("GDPR Violations:", gdpr_violations)
print("HIPAA Violations:", hipaa_violations)
print("ISO 27001 Violations:", iso_violations)
print("CCPA Violations:", ccpa_violations)

```

---

#### Data Lineage Tracking

```python
from data_governance_checkup.lineage.lineage import DataLineageTracker

tracker = DataLineageTracker()
tracker.add_record("dataset1", "source1")
tracker.add_record("dataset2", "dataset1")

# Get lineage for a specific dataset
lineage = tracker.get_lineage("dataset2")
print("Lineage:", lineage)
```

---

#### Metadata Management

```python
from data_governance_checkup.metadata.metadata_manager import MetadataManager

manager = MetadataManager()
manager.add_metadata("dataset1", {"owner": "Alice", "description": "Sales data"})
metadata = manager.get_metadata("dataset1")
print("Metadata:", metadata)
```

---

#### Role-Based Access Control (RBAC) Auditing

```python
from data_governance_checkup.rbac.rbac_audit import RBACAuditor

auditor = RBACAuditor()
auditor.add_role("Alice", ["read", "write"])
violations = auditor.audit_access("Alice", "delete")
print("RBAC Violations:", violations)
```

---

#### Data Masking and Anonymization

```python
from data_governance_checkup.masking.anonymizer import DataAnonymizer

anonymizer = DataAnonymizer()
masked_data = anonymizer.mask("123-45-6789", "SSN")
print("Masked Data:", masked_data)
```

---

### Contributing

Contributions are welcome! Please submit pull requests or open issues for any enhancements, bugs, or additional compliance frameworks you'd like to see.

---

### License

This project is licensed under the MIT License. See the LICENSE file for more details.

---

### Roadmap

1. Add support for more compliance standards (e.g., SOC 2, PCI DSS).
2. Build visualization dashboards for compliance status.
3. Integrate with real-time data pipelines for live compliance checks.

---

### Contact

For questions or support, please contact pratik.lahudkar@gmail or open an issue on the GitHub repository.
