Metadata-Version: 2.4
Name: ml-workbench
Version: 0.1.3
Summary: Local ML workbench configured for Databricks using uv
Author: Pheno
License: MIT
Requires-Python: <3.13,>=3.12
Requires-Dist: matplotlib>=3.10.7
Requires-Dist: mlflow>=2.9.0
Requires-Dist: numpy<2.0.0
Requires-Dist: pandas>=2.3.3
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: scikit-learn>=1.7.2
Description-Content-Type: text/markdown

# ML Workbench

## Setup

### Environment Configuration for MLflow Databricks Integration

To point MLflow at your Databricks workspace (dev-internal), create a `.env` file in the project root with the following configuration:

```bash
# Set MLflow tracking URI to your Databricks workspace
MLFLOW_TRACKING_URI="databricks"

# Define the Databricks endpoint that matches your workspace (this one is for dev-internal)
DATABRICKS_HOST="https://dbc-787720e9-26e6.cloud.databricks.com"

# Getting Your Databricks Token
# - Go to your Databricks workspace: https://dbc-787720e9-26e6.cloud.databricks.com
# - Click on your profile icon (top-right)
# - Select "Settings"
# - In the "User" section, select "Developer"
# - Go to the "Access Tokens" tab
# - Click "Generate New Token"
# - Give it a name (e.g., "MLflow Local Development") and an expiry
# - Copy the token (you'll only see it once!)
DATABRICKS_TOKEN="dapi123456781234567890"   # <- replace with your own
```

**Steps to set up:**

1. Copy `.env.template` to `.env`:
   ```bash
   cp .env.template .env
   ```

2. Edit `.env` and replace the placeholder value of `DATABRICKS_TOKEN` with your personal access token (see the instructions in the comments above).

3. The `.env` file is already in `.gitignore`, so your token won't be committed to version control.

Once configured, MLflow automatically logs experiments to your Databricks workspace whenever you run them with the ML Workbench.
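
If you want to check that your `.env` file parses as expected, the following is a minimal sketch of a loader that uses only the standard library. It is an illustration, not the mechanism the ML Workbench itself uses (which is not described here); `load_env_file` and `REQUIRED_KEYS` are hypothetical names introduced for this example. It relies on MLflow and the Databricks client reading `MLFLOW_TRACKING_URI`, `DATABRICKS_HOST`, and `DATABRICKS_TOKEN` from the process environment.

```python
import os
from pathlib import Path

# The three variables set in the .env example above.
REQUIRED_KEYS = ("MLFLOW_TRACKING_URI", "DATABRICKS_HOST", "DATABRICKS_TOKEN")


def load_env_file(path=".env"):
    """Load KEY=VALUE pairs from a .env-style file into os.environ.

    Hypothetical helper: handles comments, blank lines, quoted values,
    and trailing inline comments, but not multi-line values or escapes.
    """
    loaded = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep the text between the quotes, which also
            # drops anything after the closing quote (e.g. "# comment").
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop a trailing inline comment.
            value = value.split(" #", 1)[0].strip()
        loaded[key.strip()] = value
    os.environ.update(loaded)
    return loaded


if __name__ == "__main__":
    if Path(".env").exists():
        load_env_file(".env")
    missing = [key for key in REQUIRED_KEYS if not os.environ.get(key)]
    print("Missing variables:", ", ".join(missing) if missing else "none")
```

Run it from the project root before starting an experiment; any variable reported as missing needs to be added to `.env`.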

### Git Pre-commit Hook for Automatic Version Increment

This project includes a pre-commit hook that automatically increments the patch version (last number) in `pyproject.toml` on each commit. For example, `0.0.2` → `0.0.3`.

**To set up the pre-commit hook:**

**Option 1: Use the setup script (recommended)**
```bash
./scripts/setup-pre-commit.sh
```

**Option 2: Manual installation**
```bash
cp scripts/pre-commit .git/hooks/pre-commit && chmod +x .git/hooks/pre-commit
```

**Verify the hook is set up correctly:**
```bash
ls -la .git/hooks/pre-commit
```
The output should show that the file is executable (`-rwxr-xr-x`).

**How it works:**

- On each commit, the hook automatically:
  - Reads the current version from `pyproject.toml`
  - Increments the patch version (e.g., `0.0.2` → `0.0.3`)
  - Updates `pyproject.toml` with the new version
  - Stages the updated file so it's included in your commit

**Note:** The hook only increments the patch version (last number). To bump minor or major versions, manually edit `pyproject.toml` before committing.
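
The version bump described above can be sketched in Python as follows. This is illustrative only: the actual logic lives in `scripts/pre-commit`, whose contents are not shown here, and `bump_patch` and `VERSION_RE` are hypothetical names. The sketch assumes `pyproject.toml` contains a line of the form `version = "MAJOR.MINOR.PATCH"`.

```python
import re

# Assumption: the version is written as  version = "MAJOR.MINOR.PATCH"
# at the start of a line in pyproject.toml.
VERSION_RE = re.compile(r'^(version\s*=\s*")(\d+)\.(\d+)\.(\d+)(")', re.MULTILINE)


def bump_patch(pyproject_text):
    """Return (new_text, new_version) with the patch number incremented."""
    match = VERSION_RE.search(pyproject_text)
    if match is None:
        raise ValueError('no version = "X.Y.Z" line found')
    major, minor, patch = (int(match.group(i)) for i in (2, 3, 4))
    new_version = f"{major}.{minor}.{patch + 1}"
    # Rebuild the text, replacing only the matched version line.
    new_text = (
        pyproject_text[: match.start()]
        + match.group(1) + new_version + match.group(5)
        + pyproject_text[match.end():]
    )
    return new_text, new_version


sample = '[project]\nname = "ml-workbench"\nversion = "0.0.2"\n'
updated, version = bump_patch(sample)
print(version)  # 0.0.3
```

The sketch covers only the read, increment, and update steps. Staging the modified file so it is included in the commit is a separate step that the hook performs through Git.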

