Metadata-Version: 2.4
Name: zuele
Version: 0.1.0
Summary: Jieba in economy
Author: ikun
Author-email: 2206490823@qq.com
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: License
Dynamic: author
Dynamic: author-email
Dynamic: description
Dynamic: description-content-type
Dynamic: license-file
Dynamic: requires-python
Dynamic: summary

# zuele
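A zero-configuration, ready-to-use word segmentation library for Chinese economic text.  
It ships with a dictionary distilled from 1,125 annual reports and supports part-of-speech-tagged output.
The dictionary is currently being refined.
# Testing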
```python
import os

import fitz  # PyMuPDF

from zuele import Tokenizer

tok = Tokenizer()


# Extract the text of every page in a PDF
def extract_text_from_pdf(pdf_path):
    with fitz.open(pdf_path) as doc:
        return "".join(page.get_text() for page in doc)


# Main program
if __name__ == "__main__":
    pdf_dir = "pdf"  # directory containing the PDF files
    for filename in os.listdir(pdf_dir):
        if not filename.lower().endswith(".pdf"):
            continue  # skip anything that is not a PDF
        pdf_path = os.path.join(pdf_dir, filename)
        text = extract_text_from_pdf(pdf_path)
        print(list(tok.cut(text)))
```



Run from the project root:

    pip install --upgrade pip
    python -m build
    twine check dist/*
    twine upload dist/*

When publishing an updated release:

    # 1. After editing the README, also bump the version in setup.py / pyproject.toml
    # 2. Remove old build artifacts
    rm -rf build/ dist/ *.egg-info
    # 3. Rebuild the package
    python -m build
    # 4. Upload
    twine upload dist/*
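
Step 1 of the release checklist (bumping the version) is easy to forget, so it may be worth scripting. The following is a minimal sketch, not part of zuele itself: it assumes the version is declared in `pyproject.toml` on a line of the form `version = "X.Y.Z"`, and the helper name `bump_patch` is hypothetical. It works on text only and leaves reading and writing the file to the caller.

```python
import re

# Assumed layout: a line such as  version = "0.1.0"  at the start of a line.
VERSION_RE = re.compile(r'^(version\s*=\s*")(\d+)\.(\d+)\.(\d+)(")', re.MULTILINE)


def bump_patch(pyproject_text: str) -> str:
    """Return the text with the patch component of the first version line increased by one."""

    def _replace(match):
        major, minor = match.group(2), match.group(3)
        patch = int(match.group(4)) + 1
        return f"{match.group(1)}{major}.{minor}.{patch}{match.group(5)}"

    new_text, count = VERSION_RE.subn(_replace, pyproject_text, count=1)
    if count == 0:
        raise ValueError('no version = "X.Y.Z" line found')
    return new_text


# Illustrative input only; a real pyproject.toml will contain more fields.
sample = '[project]\nname = "zuele"\nversion = "0.1.0"\n'
print(bump_patch(sample))
```

To apply it to a real file, read `pyproject.toml` as text, pass the contents to `bump_patch`, and write the result back before running `python -m build`. If the version lives in `setup.py` instead, the pattern would need adjusting.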
