Metadata-Version: 2.4
Name: databricks-mcp-server-local
Version: 0.1.2
Summary: Model Context Protocol server for Databricks (local development)
Author: Databricks MCP Contributors
License: MIT
Project-URL: Homepage, https://github.com/yourusername/databricks-mcp-server-local
Project-URL: Documentation, https://github.com/yourusername/databricks-mcp-server-local#readme
Project-URL: Repository, https://github.com/yourusername/databricks-mcp-server-local
Project-URL: Issues, https://github.com/yourusername/databricks-mcp-server-local/issues
Keywords: databricks,mcp,model-context-protocol,ai,cursor
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastmcp>=2.14.0
Requires-Dist: databricks-sdk>=0.1.0
Requires-Dist: python-dotenv>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Dynamic: license-file

# Databricks MCP Server (Local)

This repository sets up and configures the Databricks MCP (Model Context Protocol) server for use with Cursor IDE. It includes the MCP server implementation, setup scripts, configuration templates, and documentation.

## Overview

The Databricks MCP server enables AI assistants in Cursor IDE to interact with your Databricks workspace using natural language. You can manage notebooks, clusters, jobs, SQL warehouses, and workspace files through conversational queries.

## Features

- **Notebook Management**: Run, list, get, create, and delete notebooks
- **Cluster Management**: List, create, start, stop, and restart clusters
- **Job Management**: Create, run, monitor, and manage Databricks jobs
- **SQL Warehouses**: Execute SQL queries and manage SQL warehouses
- **Workspace Operations**: List, read, create, and delete workspace files
- **Git Repositories**: List and manage Git repositories in the workspace

## Quick Start

### Prerequisites

- Python 3.9 or higher
- Cursor IDE
- A Databricks workspace account
- Basic familiarity with command-line operations

### Installation Steps

1. **Install Dependencies**
   ```bash
   ./scripts/install-dependencies.sh
   ```
   This will check for Python and install `uv` (the recommended package manager) if needed.

2. **Run Setup Script**
   ```bash
   ./scripts/setup.sh
   ```
   This interactive script will:
   - Guide you through creating a Databricks Personal Access Token
   - Create your `.env` file with credentials
   - Generate a Cursor MCP configuration file

3. **Install the MCP Server**
   ```bash
   pip install -e .
   ```
   Or run the published package with `uvx`, which downloads it on demand instead of installing it into your environment:
   ```bash
   uvx databricks-mcp-server-local
   ```

4. **Configure Cursor IDE**
   - Open Cursor Settings
   - Navigate to **Features > MCP Servers**
   - Click **"+ Add new global MCP server"**
   - Copy the configuration from `config/cursor-mcp-config.json` (created by the setup script)
   - Or manually add the configuration shown in the setup script output

5. **Restart Cursor IDE**
   - Close and reopen Cursor to load the MCP server

6. **Verify Connection**
   ```bash
   ./scripts/verify-connection.sh
   ```

7. **Test It Out**
   In Cursor Composer, try asking:
   - "List all clusters in my Databricks workspace"
   - "Show me all notebooks in the workspace"
   - "Create a new cluster named 'test-cluster'"
   - "Execute SQL query 'SELECT * FROM my_table' on warehouse abc123"

## What's Included

### Configuration Files

- **`.env.example`** - Template for environment variables
- **`config/cursor-mcp-config.json.example`** - Example Cursor MCP configuration

### Helper Scripts

- **`scripts/install-dependencies.sh`** - Checks Python and installs uv
- **`scripts/setup.sh`** - Interactive setup wizard
- **`scripts/verify-connection.sh`** - Tests your Databricks connection

### Documentation

- **`docs/authentication.md`** - Detailed authentication setup guide
- **`docs/troubleshooting.md`** - Common issues and solutions
- **`docs/usage-examples.md`** - Example queries and use cases
- **`docs/publishing.md`** - PyPI publishing guide and security checklist

## Authentication Setup

### Personal Access Token (Recommended)

1. Go to your Databricks workspace
2. Click your username in the top right
3. Select **User Settings**
4. Go to the **Access Tokens** tab
5. Click **Generate New Token**
6. Give it a name (e.g., "Databricks MCP Integration")
7. Set an expiration (leaving it blank creates a token with no expiration, which works against the rotation advice under Security Best Practices)
8. Copy the token immediately (it is shown only once)

Then add it to your `.env` file:
```bash
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=your_personal_access_token_here
```
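
Before starting the server, it can help to sanity-check these two values. The sketch below is a hypothetical helper (the name `validate_credentials` is not part of this package) that assumes Personal Access Token authentication and catches the most common mistakes: a missing variable, a host without `https://`, or a token with stray whitespace.

```python
from typing import Mapping
from urllib.parse import urlparse


def validate_credentials(env: Mapping[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: assumes PAT authentication, so a missing
    DATABRICKS_TOKEN is reported even though OAuth setups do not need one.
    """
    problems = []
    host = env.get("DATABRICKS_HOST", "")
    token = env.get("DATABRICKS_TOKEN", "")

    if not host:
        problems.append("DATABRICKS_HOST is not set")
    else:
        parsed = urlparse(host)
        # The workspace URL must include the scheme, e.g. https://...
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("DATABRICKS_HOST should look like https://<workspace-host>")

    if not token:
        problems.append("DATABRICKS_TOKEN is not set")
    elif token != token.strip():
        # Copy-paste often adds a trailing space or newline.
        problems.append("DATABRICKS_TOKEN has leading or trailing whitespace")

    return problems
```

Passing `os.environ` (after loading `.env`) as the `env` argument checks the values the server would actually see.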

### OAuth / Unified Authentication

The Databricks SDK automatically picks up credentials from any of these sources:
- Databricks CLI authentication (`~/.databrickscfg`)
- Environment variables
- Notebook-native authentication (if running in Databricks)

See [docs/authentication.md](docs/authentication.md) for detailed instructions.
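
To see which of these sources are present on a machine, a small diagnostic along the following lines may be useful. It is a sketch only: `available_auth_sources` is a hypothetical name, the use of `DATABRICKS_RUNTIME_VERSION` to detect a Databricks runtime is an assumption, and the function reports what exists without predicting which source the SDK will choose.

```python
import configparser
from pathlib import Path
from typing import Mapping


def available_auth_sources(env: Mapping[str, str], cfg_path: Path) -> list[str]:
    """List credential sources that appear to be configured.

    Hypothetical diagnostic: it does not reproduce the SDK's own
    resolution order, it only reports what is present.
    """
    sources = []

    if env.get("DATABRICKS_TOKEN"):
        sources.append("environment: DATABRICKS_TOKEN")

    if cfg_path.is_file():
        # ~/.databrickscfg is an INI file with one section per profile.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(cfg_path)
        profiles = parser.sections()
        if parser.defaults():
            # configparser keeps [DEFAULT] out of sections().
            profiles.insert(0, "DEFAULT")
        sources.extend(f"config profile: {name}" for name in profiles)

    # Assumption: this variable is set when running inside Databricks.
    if env.get("DATABRICKS_RUNTIME_VERSION"):
        sources.append("notebook-native (running inside Databricks)")

    return sources
```

A typical call would be `available_auth_sources(os.environ, Path.home() / ".databrickscfg")`.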

## Installation Methods

The server supports multiple installation methods:

### Option 1: Using uvx (Recommended)

```bash
uvx databricks-mcp-server-local
```

This automatically downloads and runs the package.

### Option 2: Using pip

```bash
pip install databricks-mcp-server-local
```

### Option 3: Local Development

```bash
pip install -e .
```

## Configuration Options

### Environment Variables

**Required:**
- `DATABRICKS_HOST` - Your Databricks workspace URL (e.g., `https://your-workspace.cloud.databricks.com`)

**Authentication (choose one):**
- `DATABRICKS_TOKEN` - Personal Access Token (recommended)
- OAuth/unified authentication (automatic via Databricks SDK)

**Optional:**
- `DATABRICKS_ACCOUNT_ID` - Account ID for account-level operations
- `DATABRICKS_CLUSTER_ID` - Default cluster ID
- `DATABRICKS_WAREHOUSE_ID` - Default SQL warehouse ID
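
As an illustration of how these variables might fit together, the sketch below reads them into a small settings object and falls back to the default warehouse when a request does not name one. The names `Settings`, `load_settings`, and `resolve_warehouse_id` are hypothetical, and the fallback behaviour is one plausible reading of "default", not a description of the server's internals.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    host: str
    token: Optional[str] = None
    account_id: Optional[str] = None
    cluster_id: Optional[str] = None
    warehouse_id: Optional[str] = None


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings from environment variables; only the host is required."""
    host = env.get("DATABRICKS_HOST")
    if not host:
        raise ValueError("DATABRICKS_HOST is required")
    return Settings(
        host=host.rstrip("/"),
        token=env.get("DATABRICKS_TOKEN") or None,
        account_id=env.get("DATABRICKS_ACCOUNT_ID") or None,
        cluster_id=env.get("DATABRICKS_CLUSTER_ID") or None,
        warehouse_id=env.get("DATABRICKS_WAREHOUSE_ID") or None,
    )


def resolve_warehouse_id(explicit: Optional[str], settings: Settings) -> str:
    """Prefer the warehouse named in the request, else the configured default."""
    warehouse_id = explicit or settings.warehouse_id
    if not warehouse_id:
        raise ValueError("No warehouse given and DATABRICKS_WAREHOUSE_ID is not set")
    return warehouse_id
```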

### Cursor MCP Configuration

The MCP server uses stdio transport by default. Example configuration:

```json
{
  "mcpServers": {
    "databricks-mcp-server-local": {
      "command": "python",
      "args": ["-m", "databricks_mcp.server"],
      "env": {
        "DATABRICKS_HOST": "https://your-workspace.cloud.databricks.com",
        "DATABRICKS_TOKEN": "your_personal_access_token_here"
      }
    }
  }
}
```
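
If you prefer to generate this file rather than edit it by hand, a few lines of Python are enough. This is a minimal sketch that mirrors the example above; `build_cursor_config` is a hypothetical helper, and the output of `scripts/setup.sh` may differ in detail.

```python
import json


def build_cursor_config(host: str, token: str,
                        server_name: str = "databricks-mcp-server-local") -> dict:
    """Return a Cursor MCP configuration matching the example in this README."""
    return {
        "mcpServers": {
            server_name: {
                "command": "python",
                "args": ["-m", "databricks_mcp.server"],
                "env": {
                    "DATABRICKS_HOST": host,
                    "DATABRICKS_TOKEN": token,
                },
            }
        }
    }


# Print the configuration with placeholder values.
print(json.dumps(
    build_cursor_config("https://your-workspace.cloud.databricks.com",
                        "your_personal_access_token_here"),
    indent=2,
))
```

Because the generated file contains the token in plain text, treat it like `.env` and keep it out of version control.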

## Usage Examples

Once configured, you can use natural language to interact with Databricks:

### Notebook Management
- "List all notebooks in my workspace"
- "Get the content of notebook /Users/user@example.com/my_notebook"
- "Create a new Python notebook at /Users/user@example.com/test"
- "Run notebook /Users/user@example.com/my_notebook with parameters {'param1': 'value1'}"

### Cluster Management
- "List all clusters"
- "Show me details of cluster abc-123-def"
- "Create a new cluster named 'production-cluster' with Databricks Runtime 13.3.x"
- "Start cluster abc-123-def"
- "Stop cluster abc-123-def"

### Job Management
- "List all jobs"
- "Show me details of job 12345"
- "Create a job to run notebook /path/to/notebook"
- "Run job 12345"
- "What's the status of job run 67890?"

### SQL Operations
- "List all SQL warehouses"
- "Execute SQL query 'SELECT * FROM my_table LIMIT 10' on warehouse abc123"
- "Show me recent query history"
- "Create a saved query named 'daily_report'"

### Workspace Operations
- "List files in /Users/user@example.com"
- "Read the content of /Users/user@example.com/script.py"
- "Create a new file at /Users/user@example.com/test.py"
- "List all Git repositories"

See [docs/usage-examples.md](docs/usage-examples.md) for more examples.

## Security Best Practices

1. **Never commit tokens** - The `.env` file is gitignored for a reason
2. **Use Personal Access Tokens** - More secure than password authentication
3. **Set token expiration** - Rotate tokens regularly according to your security policy
4. **Use least privilege** - Grant only necessary permissions to the token
5. **Store securely** - Use environment variables or secure credential storage

See [docs/authentication.md](docs/authentication.md#security-best-practices) for detailed security guidance.
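
To back up the first point, you can check that `.env` really is covered by your ignore rules. The sketch below is a simplified, hypothetical check (`is_ignored` is not part of this package): it handles only plain file-name patterns and `!` negation for a top-level file, so treat `git check-ignore .env` as the authoritative answer.

```python
from fnmatch import fnmatchcase


def is_ignored(name: str, gitignore_text: str) -> bool:
    """Approximate whether a top-level file name is ignored.

    Simplified subset of gitignore rules: comments and blank lines are
    skipped, a leading '/' is dropped, and the last matching pattern
    wins, with '!' patterns re-including the file.
    """
    ignored = False
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        negated = line.startswith("!")
        pattern = line[1:] if negated else line
        if fnmatchcase(name, pattern.lstrip("/")):
            ignored = not negated
    return ignored
```

For example, calling `is_ignored(".env", text)` with the contents of the repository's `.gitignore` as `text` should return `True` in a correctly configured checkout.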

## Troubleshooting

### Connection Issues

- Verify your `.env` file exists and has correct values
- Run `./scripts/verify-connection.sh` to test authentication
- Check that Cursor IDE has been restarted after configuration changes
- Ensure the MCP server shows as connected (green indicator) in Cursor
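
If the verification script is unavailable, a connection check can also be done with the Python standard library alone. The sketch below calls the REST endpoint `GET /api/2.0/clusters/list` with a bearer token; the function names are hypothetical, and this is not necessarily what `scripts/verify-connection.sh` does internally.

```python
import json
import urllib.request


def build_clusters_request(host: str, token: str) -> urllib.request.Request:
    """Build an authenticated request for the clusters list endpoint."""
    url = host.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def check_connection(host: str, token: str, timeout: float = 10.0) -> int:
    """Return the number of clusters visible to the token.

    urllib raises HTTPError for 401/403 responses (bad or expired token)
    and URLError when the host cannot be reached.
    """
    request = build_clusters_request(host, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # The "clusters" key is absent when the workspace has no clusters.
    return len(payload.get("clusters", []))
```

A successful call to `check_connection` with the values from `.env` confirms that both the workspace URL and the token are valid.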

### Authentication Problems

- Verify your Personal Access Token is correct (no extra spaces)
- Check that the token hasn't expired
- Ensure your workspace URL is correct and accessible
- For OAuth, verify Databricks CLI is configured (`databricks auth login`)

### MCP Server Not Working

- Check Python version (requires 3.9+)
- Verify the package is installed: `pip list | grep databricks-mcp-server-local`
- Review Cursor logs for error messages
- Try running the server directly: `python -m databricks_mcp.server`
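
The first two checks can be combined into one script. This is a minimal sketch with a hypothetical function name, `check_environment`; it must be run with the same interpreter that Cursor launches, otherwise it may report on a different environment.

```python
import sys
from importlib import metadata

PACKAGE = "databricks-mcp-server-local"


def check_environment(package: str = PACKAGE, minimum: tuple = (3, 9)) -> dict:
    """Report the Python version and whether the package is installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= minimum,
        "package_version": installed,  # None means not installed
    }


print(check_environment())
```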

See [docs/troubleshooting.md](docs/troubleshooting.md) for a comprehensive troubleshooting guide.

## Project Structure

```
databricks-mcp-server-local/
├── README.md                          # This file
├── .env.example                       # Environment variable template
├── .gitignore                         # Git ignore rules
├── LICENSE                            # MIT License
├── pyproject.toml                     # Python project configuration
├── requirements.txt                   # Python dependencies
├── src/
│   └── databricks_mcp/
│       ├── __init__.py
│       ├── server.py                  # Main MCP server implementation
│       ├── databricks_client.py       # Databricks SDK wrapper
│       └── tools/
│           ├── __init__.py
│           ├── notebooks.py           # Notebook operations
│           ├── clusters.py            # Cluster management
│           ├── jobs.py                # Job operations
│           ├── sql.py                 # SQL warehouses & queries
│           └── workspace.py           # Workspace operations
├── config/
│   └── cursor-mcp-config.json.example # Cursor MCP config template
├── scripts/
│   ├── install-dependencies.sh        # Check Python, install uv
│   ├── setup.sh                       # Interactive setup wizard
│   └── verify-connection.sh           # Test Databricks connection
└── docs/
    ├── authentication.md              # Auth setup guide
    ├── troubleshooting.md             # Common issues
    ├── usage-examples.md              # Usage examples
    └── publishing.md                  # PyPI publishing guide
```

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

MIT License - See [LICENSE](LICENSE) file for details.

## Additional Resources

- [Model Context Protocol Documentation](https://modelcontextprotocol.io)
- [Cursor IDE MCP Documentation](https://docs.cursor.com/en/guides/tutorials/building-mcp-server)
- [Databricks SDK for Python](https://docs.databricks.com/en/dev-tools/sdk-python.html)
- [Databricks REST API Reference](https://docs.databricks.com/api/)

## Acknowledgments

Built with [FastMCP](https://gofastmcp.com/) and the [Databricks SDK for Python](https://docs.databricks.com/en/dev-tools/sdk-python.html).
