Metadata-Version: 2.1
Name: zimscan
Version: 0.2.0
Summary: ZIM file iterator
Home-page: https://github.com/jojolebarjos/zimscan
License: MIT
Keywords: zim,iterator,wikipedia,gutenberg,kiwix
Author: Johan Berdat
Author-email: jojolebarjos@gmail.com
Requires-Python: >=3.6,<4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing
Classifier: Topic :: Utilities
Requires-Dist: numpy (>=1.18,<2.0)
Requires-Dist: zstandard (>=0.16.0)
Project-URL: Repository, https://github.com/jojolebarjos/zimscan
Description-Content-Type: text/markdown

# ZIM Scan

Minimal ZIM file reader, designed for article streaming.


## Getting Started

Install using pip:

```
pip install zimscan
```

Or install the latest version from the Git repository:

```
pip install -U git+https://github.com/jojolebarjos/zimscan.git
```

Iterate over records, which are binary file-like objects:

```python
from zimscan import Reader

path = "wikipedia_en_all_nopic_2019-10.zim"
with Reader(open(path, "rb"), skip_metadata=True) as reader:
    for record in reader:
        data = record.read()
        ...
```
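
Since records are file-like, they can presumably be consumed in chunks rather than loaded whole. The following is a minimal sketch, not part of the library: `total_size` and `scan_zim` are hypothetical helper names, and it assumes that `record.read(size)` accepts a size argument like other binary file objects.

```python
def total_size(records, chunk_size=1 << 16):
    """Count records and sum their sizes, reading each one in chunks.

    `records` is any iterable of binary file-like objects.
    """
    count = 0
    total = 0
    for record in records:
        while True:
            # Assumption: read(size) returns at most `size` bytes,
            # and an empty bytes object at end of record.
            chunk = record.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
        count += 1
    return count, total


def scan_zim(path):
    """Hypothetical wrapper: apply total_size to every record of a ZIM file."""
    # Imported lazily so that total_size stays usable without zimscan.
    from zimscan import Reader

    with Reader(open(path, "rb"), skip_metadata=True) as reader:
        return total_size(reader)
```

Because `total_size` only relies on iteration and `read`, it can be tried on in-memory `io.BytesIO` objects before pointing `scan_zim` at a large dump.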


## Links

 * [ZIM file format](https://openzim.org/wiki/ZIM_file_format), official documentation
 * [Kiwix ZIM repository](http://download.kiwix.org/zim/), to download official ZIM files
 * [Wikipedia ZIM dumps](https://dumps.wikimedia.org/other/kiwix/zim/wikipedia/), to download Wikipedia ZIM files
 * [ZIMply](https://github.com/kimbauters/ZIMply), a ZIM file reader written in Python that serves content to the browser
 * [libzim](https://github.com/openzim/libzim), the reference implementation, in C++
 * [pyzim](https://github.com/pediapress/pyzim), Python wrapper for libzim
 * [pyzim](https://framagit.org/mgautierfr/pyzim), another Python wrapper for libzim
 * [Internet In A Box](https://github.com/iiab/internet-in-a-box), a project to bundle open knowledge locally

