Metadata-Version: 2.4
Name: astchunk-extended
Version: 0.2.0
Summary: AST-based code chunking with dynamic multi-language support (fork of astchunk)
Project-URL: Homepage, https://github.com/liker0704/astchunk-extended
Project-URL: Repository, https://github.com/liker0704/astchunk-extended
Project-URL: Upstream, https://github.com/yilinjz/astchunk
Author-email: "Yilin (Jason) Zhang" <jasonzh3@andrew.cmu.edu>, Xinran Zhao <xinranz3@andrew.cmu.edu>, Zora Zhiruo Wang <zhiruow@andrew.cmu.edu>, Chenyang Yang <cyang3@andrew.cmu.edu>, Jiayi Wei <jiayi@augmentcode.com>, Sherry Tongshuang Wu <sherryw@andrew.cmu.edu>
License-Expression: MIT
License-File: LICENSE
Keywords: ast,chunking,code analysis,code indexing,code retrieval,parsing,tree-sitter
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Requires-Dist: numpy>=1.20.0
Requires-Dist: pyrsistent>=0.18.0
Requires-Dist: tree-sitter>=0.23.0
Provides-Extra: all
Requires-Dist: tree-sitter-bash; extra == 'all'
Requires-Dist: tree-sitter-c; extra == 'all'
Requires-Dist: tree-sitter-c-sharp; extra == 'all'
Requires-Dist: tree-sitter-cpp; extra == 'all'
Requires-Dist: tree-sitter-go; extra == 'all'
Requires-Dist: tree-sitter-java; extra == 'all'
Requires-Dist: tree-sitter-kotlin; extra == 'all'
Requires-Dist: tree-sitter-lua; extra == 'all'
Requires-Dist: tree-sitter-php; extra == 'all'
Requires-Dist: tree-sitter-python; extra == 'all'
Requires-Dist: tree-sitter-ruby; extra == 'all'
Requires-Dist: tree-sitter-rust; extra == 'all'
Requires-Dist: tree-sitter-scala; extra == 'all'
Requires-Dist: tree-sitter-typescript; extra == 'all'
Requires-Dist: tree-sitter-zig; extra == 'all'
Provides-Extra: core
Requires-Dist: tree-sitter-c-sharp; extra == 'core'
Requires-Dist: tree-sitter-java; extra == 'core'
Requires-Dist: tree-sitter-python; extra == 'core'
Requires-Dist: tree-sitter-typescript; extra == 'core'
Provides-Extra: jvm
Requires-Dist: tree-sitter-java; extra == 'jvm'
Requires-Dist: tree-sitter-kotlin; extra == 'jvm'
Requires-Dist: tree-sitter-scala; extra == 'jvm'
Provides-Extra: scripting
Requires-Dist: tree-sitter-bash; extra == 'scripting'
Requires-Dist: tree-sitter-lua; extra == 'scripting'
Requires-Dist: tree-sitter-php; extra == 'scripting'
Requires-Dist: tree-sitter-ruby; extra == 'scripting'
Provides-Extra: systems
Requires-Dist: tree-sitter-c; extra == 'systems'
Requires-Dist: tree-sitter-cpp; extra == 'systems'
Requires-Dist: tree-sitter-go; extra == 'systems'
Requires-Dist: tree-sitter-rust; extra == 'systems'
Requires-Dist: tree-sitter-zig; extra == 'systems'
Provides-Extra: test
Requires-Dist: pytest-cov>=4.0.0; extra == 'test'
Requires-Dist: pytest>=7.0.0; extra == 'test'
Description-Content-Type: text/markdown

# astchunk-extended

Drop-in replacement for [astchunk](https://github.com/yilinjz/astchunk) with dynamic multi-language support.

## What's different

- **15 languages** instead of 4 (C, C++, Go, Rust, Ruby, Bash, Kotlin, Scala, Lua, PHP, Zig + original Python, Java, C#, TypeScript)
- **Dynamic parser discovery**: a language becomes available as soon as its tree-sitter parser package is installed
- **No hardcoded imports**: parsers are loaded via `importlib` on demand
- **Optional extras**: install only the languages you need
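
The discovery behaviour in the list above can be illustrated with a minimal sketch. It is not the package's actual implementation: the `CANDIDATE_PARSERS` mapping and the `discover` and `load_parser_module` helpers are hypothetical names used only for illustration, and the sketch assumes each parser ships as a top-level module such as `tree_sitter_python`.

```python
import importlib
import importlib.util

# Hypothetical mapping from language name to parser module name.
# The real package may use different names or a different lookup scheme.
CANDIDATE_PARSERS = {
    "python": "tree_sitter_python",
    "cpp": "tree_sitter_cpp",
    "rust": "tree_sitter_rust",
}


def discover(candidates):
    """Return the sorted language names whose parser module can be found.

    find_spec only locates the module; nothing is imported at this point.
    """
    return sorted(
        name
        for name, module in candidates.items()
        if importlib.util.find_spec(module) is not None
    )


def load_parser_module(candidates, language):
    """Import the parser module for `language` only when it is requested."""
    if language not in candidates:
        raise ValueError(f"unknown language: {language}")
    return importlib.import_module(candidates[language])


# Lists whichever of the candidate parsers happen to be installed.
print(discover(CANDIDATE_PARSERS))
```

Checking with `importlib.util.find_spec` before importing keeps a missing parser from raising at start-up; only the languages actually requested are ever imported.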

## Installation

```bash
# All languages (quotes keep shells such as zsh from expanding the brackets)
pip install "astchunk-extended[all]"

# Or pick what you need
pip install "astchunk-extended[core]"        # Python, Java, C#, TypeScript
pip install "astchunk-extended[systems]"     # C, C++, Go, Rust, Zig
pip install "astchunk-extended[scripting]"   # Ruby, Bash, Lua, PHP
pip install "astchunk-extended[jvm]"         # Java, Kotlin, Scala
```

With LEANN:
```bash
uv tool install leann-core --with leann --with "astchunk-extended[all]"
python -c "from astchunk.patch_leann import apply; apply()"
```

## Usage

```python
from astchunk import ASTChunkBuilder, get_available_languages

# See what's installed
print(get_available_languages())
# ['bash', 'c', 'cpp', 'go', 'java', 'kotlin', 'lua', 'php', 'python', 'rust', ...]

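# Chunk C++ code (requires tree-sitter-cpp, e.g. via the `systems` extra)
code = "int add(int a, int b) { return a + b; }"
builder = ASTChunkBuilder(max_chunk_size=300, language="cpp", metadata_template="default")
chunks = builder.chunkify(code)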
```

### Custom languages

```python
from astchunk import ASTChunkBuilder, register_language

# Map a language name to the module name of its tree-sitter parser
register_language("haskell", "tree_sitter_haskell")
builder = ASTChunkBuilder(max_chunk_size=300, language="haskell", metadata_template="default")
```

## License

MIT (same as upstream astchunk)
