Metadata-Version: 2.4
Name: dbini
Version: 1.0.1
Summary: Zero-config NoSQL backend database as a Python package
Author-email: Binidu Ranasinghe <rbinidu@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/Binidu01/dbini
Keywords: NoSQL,database,JSON,Python,local-first,file-based DB,FastAPI,REST API,WebSocket
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi
Requires-Dist: uvicorn
Dynamic: license-file

# dbini

**A complete, zero-configuration NoSQL database solution for Python applications**

[![Python](https://img.shields.io/badge/Python-3.9%2B-blue?style=for-the-badge&logo=python&logoColor=white)](https://python.org)
[![PyPI](https://img.shields.io/pypi/v/dbini?style=for-the-badge)](https://pypi.org/project/dbini)
[![GitHub Stars](https://img.shields.io/github/stars/Binidu01/dbini?style=for-the-badge&logo=github)](https://github.com/Binidu01/dbini/stargazers)
[![License](https://img.shields.io/github/license/Binidu01/dbini?style=for-the-badge)](https://github.com/Binidu01/dbini/blob/main/LICENSE)
[![Issues](https://img.shields.io/github/issues/Binidu01/dbini?style=for-the-badge&logo=github)](https://github.com/Binidu01/dbini/issues)

---

## Overview

dbini is a feature-complete NoSQL database solution designed for Python applications that need advanced document storage without the complexity of external database setup. It provides a comprehensive, file-based storage system with support for complex queries, array operations, full-text search, and REST API access - making it ideal for production applications, prototyping, and local-first development.

## Key Features

### Core Database Features
- **Zero Configuration**: Start using immediately without setup or external dependencies
- **Document Storage**: Store and query JSON documents with full CRUD operations
- **Array Operations**: Query arrays with `$size`, `$elemMatch`, and array indexing
- **Nested Queries**: Deep object querying with dot notation (`user.profile.preferences`)
- **Full-Text Search**: FTS5-powered search across all document content
- **Schema Validation**: Define and enforce document schemas with type checking
- **Upsert Operations**: Update if exists, insert if not exists
- **Bulk Operations**: Insert, update, or delete multiple documents efficiently

### Advanced Query Operators
- **Comparison**: `$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte`
- **Logical**: `$and`, `$or`, `$not`
- **Array**: `$in`, `$nin`, `$size`, `$elemMatch`
- **Text**: `$like`, `$regex`
- **Existence**: `$exists`

### Performance & Reliability
- **Atomic Operations**: Secure atomic writes ensuring data integrity
- **WAL with HMAC**: Write-Ahead Logging with tamper detection
- **Indexing**: SQLite-based indexes for query optimization
- **Thread Safety**: Multi-thread safe operations
- **Aggregations**: count, min, max, avg, sum, distinct with group by

### Interfaces
- **Embedded Library**: Use directly in Python applications
- **REST API Server**: Full-featured HTTP API with FastAPI
- **Real-time Updates**: WebSocket support for live data synchronization
- **File Management**: Integrated file storage and retrieval system

## Installation

### Requirements

- Python 3.9 or higher
- pip package manager

### Install from PyPI

```bash
pip install dbini
```

### Install from Source

```bash
git clone https://github.com/Binidu01/dbini.git
cd dbini
pip install .
```

## Quick Start

### Basic Document Operations

```python
from dbini import DBini

# Initialize database for your project
db = DBini("myproject")

# Add a new document
user_data = {
    "name": "Alice Johnson",
    "email": "alice@example.com",
    "age": 28,
    "skills": ["python", "javascript", "react"],
    "profile": {
        "bio": "Full-stack developer",
        "location": "San Francisco",
        "preferences": {"theme": "dark"}
    }
}
doc_id = db.add_document("users", user_data)

# Simple queries
users = db.find("users", filters={"age": 28})
user = db.get_document("users", doc_id)

# Complex queries with array operations
react_users = db.find("users", filters={
    "skills": {"$in": ["react"]}
})

# Nested object queries
dark_theme_users = db.find("users", filters={
    "profile.preferences.theme": "dark"
})

# Array size queries
skilled_users = db.find("users", filters={
    "skills": {"$size": {"$gte": 3}}
})

db.close()
```

### Advanced Query Examples

```python
from dbini import DBini
from dbini.core import Collection

db = DBini("advanced_project")
users = Collection(db, "users")

# Sample data with complex structures
sample_users = [
    {
        "name": "Alice",
        "age": 28,
        "skills": ["python", "react", "node.js"],
        "projects": [
            {"name": "E-commerce API", "status": "completed", "tech": ["python"]},
            {"name": "Dashboard", "status": "in_progress", "tech": ["react"]}
        ],
        "location": "San Francisco"
    },
    {
        "name": "Bob", 
        "age": 32,
        "skills": ["python", "django"],
        "projects": [
            {"name": "Blog Platform", "status": "completed", "tech": ["django"]}
        ],
        "location": "New York"
    }
]

# Bulk insert
user_ids = users.bulk_insert(sample_users)

# Complex array queries - find users with completed projects
completed_project_users = users.find({
    "projects": {
        "$elemMatch": {"status": "completed"}
    }
})

# Complex logical queries
experienced_python_devs = users.find({
    "$and": [
        {"age": {"$gte": 25}},
        {"skills": {"$in": ["python"]}},
        {"projects": {"$size": {"$gte": 1}}}
    ]
})

# Regular expression queries
sf_users = users.find({
    "location": {"$regex": "San.*"}
})

# Full-text search
react_related = users.find(search="react dashboard")

# Upsert operation
user_id, was_inserted = users.upsert(
    {"email": "charlie@example.com"},  # filter
    {"name": "Charlie", "age": 25, "skills": ["javascript"]}  # data
)

# Aggregations with grouping
avg_age_by_location = users.aggregate("avg", "age", group_by="location")
skill_counts = users.aggregate("count", group_by="skills")

db.close()
```

### Schema Validation

```python
from dbini import DBini

db = DBini("validated_project")

# Define schema
user_schema = {
    "fields": {
        "name": {"type": "string", "required": True, "minLength": 2},
        "email": {"type": "string", "required": True},
        "age": {"type": "integer", "min": 0, "max": 120},
        "skills": {"type": "array"},
        "profile.bio": {"type": "string", "maxLength": 500}
    }
}

# Set schema for collection
db.set_schema("users", user_schema)

# Documents are automatically validated
try:
    doc_id = db.add_document("users", {
        "name": "John",
        "email": "john@example.com", 
        "age": 25,
        "skills": ["python", "django"]
    })
    print("Document added successfully")
except ValueError as e:
    print(f"Validation failed: {e}")

# Manual validation
valid_doc = {"name": "Jane", "email": "jane@example.com", "age": 30}
is_valid, errors = db.validate_document("users", valid_doc)
print(f"Valid: {is_valid}, Errors: {errors}")

db.close()
```

### REST API Server

```python
from dbini.server import serve

# Start enhanced API server with all features
serve(project_root="myproject", host="localhost", port=8080)
```

The server provides comprehensive REST API endpoints:

#### Document Operations

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/v1/collections/{collection}/documents` | Create document with validation |
| `GET` | `/v1/collections/{collection}/documents/{id}` | Get document by ID |
| `PATCH` | `/v1/collections/{collection}/documents/{id}` | Update document (partial) |
| `DELETE` | `/v1/collections/{collection}/documents/{id}` | Delete document |
| `PUT` | `/v1/collections/{collection}/documents/upsert` | Upsert document |

#### Advanced Querying

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/v1/collections/{collection}:query` | Advanced query with filters, search, sorting |
| `GET` | `/v1/collections/{collection}/documents` | Get all documents with pagination |
| `POST` | `/v1/collections/{collection}:aggregate` | Aggregation operations |

#### Bulk Operations

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/v1/collections/{collection}/bulk:insert` | Bulk insert documents |
| `POST` | `/v1/collections/{collection}/bulk:update` | Bulk update documents |
| `POST` | `/v1/collections/{collection}/bulk:delete` | Bulk delete documents |

#### Schema Management

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/v1/collections/{collection}/schema` | Set collection schema |
| `GET` | `/v1/collections/{collection}/schema` | Get collection schema |
| `DELETE` | `/v1/collections/{collection}/schema` | Remove collection schema |
| `POST` | `/v1/collections/{collection}:validate` | Validate document against schema |

#### Collection Management

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/v1/collections` | List collections with stats |
| `GET` | `/v1/collections/{collection}/stats` | Get collection statistics |
| `POST` | `/v1/collections/{collection}/index` | Create field index |

#### File Operations

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/v1/files` | Upload file |
| `GET` | `/v1/files/{id}` | Download file |
| `GET` | `/v1/files/{id}/info` | Get file metadata |
| `GET` | `/v1/files` | List all files |
| `DELETE` | `/v1/files/{id}` | Delete file |

#### System Operations

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/v1/health` | System health check |
| `GET` | `/v1/info` | Database information and stats |
| `POST` | `/v1/export` | Export database snapshot |
| `POST` | `/v1/restore` | Restore from snapshot |

#### WebSocket Real-time

| Endpoint | Description |
|----------|-------------|
| `ws://host:port/v1/ws/{collection}` | Collection-specific updates |
| `ws://host:port/v1/ws` | All database updates |

## API Usage Examples

### Complex Query via REST API

```bash
# Advanced query with multiple conditions
curl -X POST "http://localhost:8080/v1/collections/users:query" \
  -H "Content-Type: application/json" \
  -d '{
    "filters": {
      "$and": [
        {"age": {"$gte": 25}},
        {"skills": {"$in": ["python"]}},
        {"projects": {"$elemMatch": {"status": "completed"}}}
      ]
    },
    "sort": [["age", -1], ["name", 1]],
    "limit": 10,
    "search": "developer"
  }'
```

### Bulk Operations via REST API

```bash
# Bulk insert multiple users
curl -X POST "http://localhost:8080/v1/collections/users/bulk:insert" \
  -H "Content-Type: application/json" \
  -d '{
    "docs": [
      {"name": "Alice", "age": 28, "skills": ["python", "react"]},
      {"name": "Bob", "age": 32, "skills": ["java", "spring"]}
    ]
  }'

# Bulk update with filters
curl -X POST "http://localhost:8080/v1/collections/users/bulk:update" \
  -H "Content-Type: application/json" \
  -d '{
    "filters": {"age": {"$gte": 30}},
    "updates": {"is_senior": true}
  }'
```

### Schema Validation via REST API

```bash
# Set schema for collection
curl -X POST "http://localhost:8080/v1/collections/users/schema" \
  -H "Content-Type: application/json" \
  -d '{
    "schema": {
      "fields": {
        "name": {"type": "string", "required": true},
        "email": {"type": "string", "required": true},
        "age": {"type": "integer", "min": 0, "max": 120}
      }
    }
  }'
```

## Complete API Reference

### Core Database Class

```python
from dbini import DBini

db = DBini("project_path")

# Document operations
doc_id = db.add_document(collection, document, doc_id=None)
doc = db.get_document(collection, doc_id)
success = db.update_document(collection, doc_id, updates)
doc_id, was_inserted = db.upsert_document(collection, filters, updates, doc_id=None)
success = db.delete_document(collection, doc_id)

# Advanced querying
docs = db.find(collection, filters=None, limit=None, sort=None, search=None)

# Bulk operations  
doc_ids = db.bulk_insert(collection, documents)
count = db.bulk_update(collection, filters, updates)
count = db.bulk_delete(collection, filters)

# Aggregations
result = db.aggregate(collection, operation, field=None, group_by=None)

# Schema operations
db.set_schema(collection, schema)
schema = db.get_schema(collection)
is_valid, errors = db.validate_document(collection, document)

# Indexing and optimization
db.ensure_index_on(collection, field_path)
stats = db.collection_stats(collection)

# File operations
file_id = db.save_file(file_path, dest_filename=None)
file_path = db.get_file_path(file_id)
files = db.list_files()

# Utilities
collections = db.list_collections()
snapshot_path = db.export_snapshot(snapshot_dir)
db.restore_snapshot(snapshot_path)
is_valid, bad_line, message = db.verify_wal()
db.close()
```

### Collection Wrapper Class

```python
from dbini.core import Collection

collection = Collection(db, "collection_name")

# Simplified operations
doc_id = collection.add(document, doc_id=None)
doc = collection.get(doc_id)
success = collection.update(doc_id, updates)
doc_id, was_inserted = collection.upsert(filters, updates, doc_id=None)
success = collection.delete(doc_id)

# Querying
docs = collection.find(filters=None, limit=None, sort=None, search=None)
doc = collection.find_one(filters)

# Bulk operations
doc_ids = collection.bulk_insert(documents)
count = collection.bulk_update(filters, updates)
count = collection.bulk_delete(filters)

# Aggregations and utilities
result = collection.aggregate(operation, field=None, group_by=None)
collection.create_index(field)
collection.set_schema(schema)
schema = collection.get_schema()
stats = collection.stats()
```

## Project Structure

When you initialize a dbini project, the following directory structure is created:

```
myproject/
├── data/
│   ├── collections/
│   │   └── users/
│   │       ├── 550e8400-e29b-41d4-a716-446655440000.json
│   │       └── 6ba7b810-9dad-11d1-80b4-00c04fd430c8.json
│   └── files/
│       ├── 123e4567-e89b-12d3-a456-426614174000_image.jpg
│       └── 987fcdeb-51a2-43d1-9f12-345678901234_document.pdf
├── index/
│   ├── users.sqlite
│   └── files.index.sqlite
├── wal/
│   └── append.log
└── meta/
    ├── project.json
    ├── keys.json
    └── users_schema.json
```

- **data/collections/**: JSON documents organized by collection
- **data/files/**: Uploaded files referenced by unique IDs
- **index/**: SQLite databases for query optimization and FTS
- **wal/**: Write-Ahead Log with HMAC chain for integrity
- **meta/**: Project metadata, keys, and schemas

## Architecture

dbini is built with modern Python technologies and best practices:

- **Core Engine**: Pure Python with minimal dependencies
- **Query Engine**: Hybrid SQLite indexes + in-memory processing
- **Full-Text Search**: SQLite FTS5 for advanced text queries
- **API Server**: FastAPI framework with automatic OpenAPI docs
- **ASGI Server**: Uvicorn for high-performance async operations
- **Storage**: Atomic file operations with fsync and os.replace
- **Security**: HMAC-based WAL chain for tamper detection
- **Real-time**: WebSocket support for live data synchronization

## Use Cases

### Production Applications
- **E-commerce**: Product catalogs, user profiles, order management
- **Content Management**: Articles, media, user-generated content
- **Analytics**: Event logging, user behavior tracking
- **IoT Applications**: Sensor data, device management

### Development & Prototyping
- **Rapid Prototyping**: Get started with persistent storage immediately
- **Local Development**: Test applications without external database dependencies
- **Desktop Applications**: Ideal for tkinter, PyQt, or other desktop GUI frameworks
- **Microservices**: Lightweight storage for individual services

### Specialized Scenarios
- **Edge Computing**: Lightweight storage for resource-constrained environments
- **Offline-first Apps**: Applications that need to work without network connectivity
- **Embedded Systems**: Local data storage in IoT and embedded devices
- **Data Analysis**: Store and query datasets for research and analysis

## Performance Characteristics

- **Document Size**: Optimized for documents up to 16MB
- **Collection Size**: Efficient for collections up to 1M documents
- **Query Performance**: Sub-millisecond simple queries, <100ms complex queries
- **Concurrent Users**: Supports 10-100 concurrent operations
- **Storage Efficiency**: Compact JSON storage with optional compression
- **Memory Usage**: Configurable memory usage based on query complexity

## Migration Guide

### From Basic DBini (v0.1) to Enhanced DBini (v0.2)

The enhanced version is fully backward compatible. Existing code will work unchanged, but you can now use additional features:

```python
# Old code (still works)
db = DBini("myproject")
docs = db.find("users", filters={"age": 25})

# New enhanced features
# Array operations
users_with_skills = db.find("users", filters={"skills": {"$size": {"$gte": 3}}})

# Full-text search
search_results = db.find("users", search="python developer")

# Upsert operations
user_id, inserted = db.upsert_document("users", {"email": "user@example.com"}, {"name": "Updated Name"})

# Bulk operations
inserted_ids = db.bulk_insert("users", [{"name": "User1"}, {"name": "User2"}])
```

## Contributing

We welcome contributions to dbini! Here's how you can help:

1. **Fork** the repository on GitHub
2. **Create** a feature branch (`git checkout -b feature/amazing-feature`)
3. **Add tests** for your changes
4. **Commit** your changes (`git commit -m 'Add amazing feature'`)
5. **Push** to the branch (`git push origin feature/amazing-feature`)
6. **Open** a Pull Request

### Development Setup

```bash
git clone https://github.com/Binidu01/dbini.git
cd dbini
pip install -e ".[dev]"
pytest  # Run tests
```

Please ensure your code follows Python best practices and includes appropriate tests.

## License

This project is licensed under the MIT License. See the [LICENSE](https://github.com/Binidu01/dbini/blob/main/LICENSE) file for details.

## Support

- **Documentation**: [GitHub Repository](https://github.com/Binidu01/dbini)
- **Package**: [PyPI](https://pypi.org/project/dbini)
- **Issues**: [GitHub Issues](https://github.com/Binidu01/dbini/issues)
- **Discussions**: [GitHub Discussions](https://github.com/Binidu01/dbini/discussions)
- **API Documentation**: Available at `/docs` when running the server

## Changelog

### v0.2.0 - Enhanced NoSQL Features
- Added array operations (`$size`, `$elemMatch`)
- Added full-text search with FTS5
- Added schema validation system
- Added upsert operations
- Added bulk operations (insert, update, delete)
- Added regex and advanced query operators
- Added nested object querying with dot notation
- Added enhanced aggregations with group by
- Enhanced REST API with 20+ new endpoints
- Added WebSocket improvements
- Added comprehensive documentation

### v0.1.0 - Initial Release
- Basic document CRUD operations
- Simple query filtering
- File storage system
- REST API server
- WebSocket support

## Acknowledgments

dbini is inspired by modern NoSQL databases like MongoDB and local-first software principles. Special thanks to the Python community and all contributors who help improve this project.

---

**Made with ❤️ by [Binidu01](https://github.com/Binidu01)**

*If you find dbini useful, please consider giving it a ⭐ on GitHub!*
