Metadata-Version: 2.4
Name: xiaothink
Version: 1.4.1
Summary: An AI toolkit that helps users quickly call interfaces related to the Xiaothink framework.
Author: Shi Jingqi
Author-email: xiaothink@foxmail.com
License: Apache License 2.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: numpy>=1.21.0
Requires-Dist: requests>=2.25.0
Provides-Extra: tensorflow
Requires-Dist: tensorflow>=2.10.0; extra == "tensorflow"
Provides-Extra: paddle
Requires-Dist: paddlepaddle>=2.5.0; extra == "paddle"
Requires-Dist: jieba>=0.42.1; extra == "paddle"
Provides-Extra: paddle-gpu
Requires-Dist: paddlepaddle-gpu>=2.5.0; extra == "paddle-gpu"
Requires-Dist: jieba>=0.42.1; extra == "paddle-gpu"
Provides-Extra: vision
Requires-Dist: Pillow>=9.0.0; extra == "vision"
Provides-Extra: all
Requires-Dist: tensorflow>=2.10.0; extra == "all"
Requires-Dist: paddlepaddle>=2.5.0; extra == "all"
Requires-Dist: jieba>=0.42.1; extra == "all"
Requires-Dist: Pillow>=9.0.0; extra == "all"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: license
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Xiaothink Python Module Usage Documentation

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

Xiaothink is an AI research organization focused on Natural Language Processing (NLP), dedicated to training advanced on-device models with limited data and computing resources. The Xiaothink Python module is our core toolkit, covering text-based Q&A, multimodal Q&A, image compression, sentiment classification, and more. This guide describes how to use it, with code examples.

## Table of Contents
1. [Installation](#installation)
2. [Local Text-only Dialogue Models](#local-text-only-dialogue-models)
3. [PaddlePaddle-based Models](#paddlepaddle-based-models-new-in-v140)
4. [Image Feature Extraction and Multimodal Dialogue](#image-feature-extraction-and-multimodal-dialogue)
5. [Image Compression to Feature Technology (img_zip)](#image-compression-to-feature-technology-img_zip)
6. [Sentiment Classification Tool](#sentiment-classification-tool)
7. [AI Rate Detection Tool](#ai-rate-detection-tool)
8. [Changelog](#changelog)

---

## Installation

First, you need to install the Xiaothink module via pip:

```bash
pip install xiaothink
```
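
The package metadata also declares optional extras (`tensorflow`, `paddle`, `paddle-gpu`, `vision`, `all`), so a command such as `pip install "xiaothink[paddle]"` should pull in the matching backend. The sketch below is one way to check which optional dependencies are importable in the current environment. The helper name `available_backends` is an illustrative assumption and not part of the Xiaothink API.

```python
import importlib.util

# Import names of the optional dependencies declared as extras.
OPTIONAL_MODULES = {
    "tensorflow": "tensorflow",   # extra: tensorflow
    "paddlepaddle": "paddle",     # extras: paddle / paddle-gpu
    "jieba": "jieba",             # extras: paddle / paddle-gpu
    "Pillow": "PIL",              # extra: vision
}

def available_backends():
    """Return {distribution name: True/False} for each optional dependency."""
    return {
        dist: importlib.util.find_spec(module) is not None
        for dist, module in OPTIONAL_MODULES.items()
    }

if __name__ == "__main__":
    for dist, found in available_backends().items():
        print(f"{dist:<14}{'installed' if found else 'missing'}")
```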

---

## License

This project is licensed under the Apache License, Version 2.0 - see the [LICENSE](LICENSE) file for details.

The [NOTICE](NOTICE) file contains additional attribution information for the proprietary technologies included in this module.

---

## Local Text-only Dialogue Models

For locally loaded dialogue models, you should call the corresponding function according to the model type.

### Single-turn Dialogue (to be removed in future versions)

Suitable for single-turn dialogue scenarios.

### Example Code

```python
import xiaothink.llm.inference.test_formal as tf

model = tf.QianyanModel(
    ckpt_dir=r'path/to/your/t6_model',
    MT='t6_beta_dense',
    vocab=r'path/to/your/vocab'  # The vocab file is provided in the model repository
)

while True:
    inp = input('[Q]: ')
    if inp == '[CLEAN]':
        print('[Context Cleared]\n\n')
        model.clean_his()
        continue
    re = model.chat_SingleTurn(inp, temp=0.32)  # Use chat_SingleTurn for single-turn dialogue
    print('\n[A]:', re, '\n')
```

### Multi-turn Dialogue

Suitable for multi-turn dialogue scenarios.

### Example Code

```python
import xiaothink.llm.inference.test_formal as tf

model = tf.QianyanModel(
    ckpt_dir=r'path/to/your/t6_model',
    MT='t6_beta_dense',
    vocab=r'path/to/your/vocab'  # The vocab file is provided in the model repository
)

while True:
    inp = input('[Q]: ')
    if inp == '[CLEAN]':
        print('[Context Cleared]\n\n')
        model.clean_his()
        continue
    re = model.chat(inp, temp=0.32)  # Use chat for multi-turn dialogue
    print('\n[A]:', re, '\n')
```

### Text Continuation

Suitable for more flexible text continuation scenarios.

### Example Code

```python
import xiaothink.llm.inference.test as test

MT = 't6_beta_dense'
m, d = test.load(
    ckpt_dir=r'path/to/your/t6_model',
    MT=MT,
    vocab=r'path/to/your/vocab'  # The vocab file is provided in the model repository
)

inp = 'Hello!'
# Instruct format supported by instruction-tuned models in the T6 series
belle_chat = '{"conversations": [{"role": "user", "content": {inp}}, {"role": "assistant", "content": "'.replace('{inp}', inp)
inp_m = belle_chat

ret = test.generate_texts_loop(m, d, inp_m,    
                               num_generate=100,
                               every=lambda a: print(a, end='', flush=True),
                               temperature=0.32,
                               pass_char=['▩'])    # ▩ is the <unk> token for T6 series models
```

**Important Note**: For local models, it is recommended to use the `model.chat` function for multi-turn dialogue. For pre-trained models without instruction tuning, it is recommended to use the `test.generate_texts_loop` function. **The single-turn dialogue function `model.chat_SingleTurn` will be removed in future versions.**

---

## PaddlePaddle-based Models (New in v1.4.0)

Xiaothink now supports PaddlePaddle-based models with RWKV architecture. These models offer efficient inference and are suitable for resource-constrained environments.

### Installation

```bash
pip install xiaothink
pip install paddlepaddle  # or paddlepaddle-gpu for GPU support
```

### Multi-turn Dialogue (PaddlePaddle)

```python
from xiaothink.llm.inference_paddle import QianyanModel

model = QianyanModel(
    ckpt_dir=r'path/to/your/t7.5_model',
    MT='t7.5_paddle_small_instruct_pro'
)

while True:
    inp = input('[Q]: ')
    if inp == '[CLEAN]':
        print('[Context Cleared]\n\n')
        model.clean_his()
        continue
    re = model.chat(inp, temp=0.34, form=2)  # form=2 for simplified format
    print('\n[A]:', re, '\n')
```

### Text Generation (PaddlePaddle)

```python
from xiaothink.llm.inference_paddle import TextGenerator

generator = TextGenerator(
    checkpoint_path=r'path/to/your/t7.5_model',
    MT='t7.5_paddle_small_instruct_pro'
)
generator.load_model()

generated_text = generator.generate_text(
    prompt='<|U|>Hello, how are you?<|A|>',
    max_length=100,
    temperature=0.8,
    top_p=0.9,
    repetition_penalty=1.2
)
print(generated_text)
```

### Supported Model Architectures (PaddlePaddle)

| Model Name | MT Parameter | Description | form Parameter |
|------------|--------------|-------------|----------------|
| Xiaothink-T7.5-0.1B | 't7.5_paddle_small_instruct' | Base instruction model | form=2 |

### Automatic Device Selection

The PaddlePaddle module automatically selects the optimal device based on GPU memory usage:

```python
# Automatic device selection (default)
# If GPU memory usage > 80%, switches to CPU
AUTO_DEVICE = True
GPU_MEMORY_THRESHOLD = 80.0

# Or manually set device
import paddle
paddle.set_device('cpu')  # or 'gpu:0'
```
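
The selection rule can be illustrated in a few lines. This is a minimal sketch of the threshold logic only, not the module's actual implementation. The function name `choose_device` and its byte-count arguments are assumptions made for the example.

```python
GPU_MEMORY_THRESHOLD = 80.0  # percent, mirroring the default shown above

def choose_device(used_bytes, total_bytes, threshold=GPU_MEMORY_THRESHOLD):
    """Return 'cpu' when GPU memory usage exceeds the threshold, else 'gpu:0'."""
    if total_bytes <= 0:
        return "cpu"  # no usable GPU memory reported
    usage_percent = 100.0 * used_bytes / total_bytes
    return "cpu" if usage_percent > threshold else "gpu:0"

print(choose_device(used_bytes=7 * 1024**3, total_bytes=8 * 1024**3))  # 87.5% used
print(choose_device(used_bytes=2 * 1024**3, total_bytes=8 * 1024**3))  # 25% used
```

The returned string has the same form as the argument that `paddle.set_device` accepts.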

### Train-On-Time (TOT) Dynamic Learning

Xiaothink provides a Train-On-Time (TOT) feature that enables dynamic learning during inference. Whereas a conventional model relies only on its pre-trained knowledge, a TOT model fine-tunes itself on similar examples from your training data repository before it answers, which can significantly strengthen its basic capabilities. There are two trade-offs. Inference is slower, because similarity matching and fine-tuning run before each conversation. The technique also does not significantly improve mathematical or reasoning ability; it only enhances basic model capabilities.

**How TOT Works:**
1. **Similarity Matching**: Automatically finds similar instructions from your training data
2. **Dynamic Fine-tuning**: Fine-tunes the model in memory using these similar examples
3. **Enhanced Response**: Generates more accurate answers based on the newly learned knowledge
4. **Memory Management**: Optimizes GPU memory usage and cleans up resources

**Key Features:**
- **Real-time Learning**: Every conversation triggers learning from relevant examples
- **Similarity-based Matching**: Uses `difflib` to find textually similar instructions (character-sequence similarity, not embedding-based semantic search)
- **Parallel Processing**: Multi-core similarity calculation for faster matching
- **Multi-format Support**: Loads models from various checkpoint formats
- **GPU Optimization**: Automatic GPU memory management
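
The similarity-matching step can be approximated with the standard library. The sketch below ranks stored instructions against a query with `difflib.SequenceMatcher`, which measures character-level overlap. The function `top_matches` and the sample instructions are illustrative assumptions, and the actual TOT implementation may score and parallelize differently.

```python
from difflib import SequenceMatcher

def top_matches(query, instructions, k=2):
    """Return the k instructions most similar to query as (text, score) pairs."""
    scored = [
        (text, SequenceMatcher(None, query, text).ratio())
        for text in instructions
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

instructions = [
    "How do I bake bread at home?",
    "Translate this sentence into French.",
    "Write a short poem about autumn.",
]

for text, score in top_matches("How can I bake bread?", instructions):
    print(f"{score:.2f}  {text}")
```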

```python
from xiaothink.llm.inference_paddle import TOTModel, TOT_AVAILABLE

if TOT_AVAILABLE:
    # Custom training data paths
    custom_data_paths = [
        r'path/to/your/belle_train.jsonl',
        r'path/to/your/coig_minimind.jsonl',
        r'path/to/your/firefly_data.jsonl'
    ]
    
    model = TOTModel(
        ckpt_dir=r'path/to/your/t7.5_model',
        MT='t7.5_paddle_small_instruct_pro',
        data_paths=custom_data_paths  # Custom training data paths
    )
    
    # The model will automatically learn from similar examples before answering
    while True:
        inp = input('[Q]: ')
        if inp == '[CLEAN]':
            model.clean_his()
            continue
        re = model.chat(inp, temp=0.68)
        print('\n[A]:', re, '\n')
else:
    print("TOT feature requires PaddlePaddle support")
```

**Custom Data Paths:**
You can now specify your own training data paths when initializing TOTModel:

```python
# Default behavior (uses built-in paths)
model = TOTModel(ckpt_dir='path/to/model')

# Custom data paths
model = TOTModel(
    ckpt_dir='path/to/model',
    data_paths=[
        'path/to/data1.jsonl',
        'path/to/data2.jsonl',
        'path/to/data3.txt'
    ]
)
```

**Training Data Requirements:**
The TOT system allows users to customize training datasets by passing them through the `data_paths` parameter. The system will load and process training data based on the paths provided by the user.

#### Supported File Formats

The TOT system supports the following file formats:

- .jsonl files (JSON Lines format)
- .txt files (text files)

#### Supported Data Structures

The system can identify and process two main data structures:

##### 1. Conversations Format
```json
{
  "conversations": [
    {"content": "User question"},
    {"content": "Assistant answer"}
  ]
}
```

##### 2. Instruction-Output Format
```json
{
  "instruction": "Instruction content",
  "output": "Output content"
}
```
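
As an illustration, the following sketch writes one record of each structure to a `.jsonl` file and reads them back as (question, answer) pairs. The file name `tot_samples.jsonl` and the helper `read_pairs` are hypothetical, and how TOT itself parses these files is not specified here.

```python
import json
import os
import tempfile

records = [
    {"conversations": [{"content": "User question"},
                       {"content": "Assistant answer"}]},
    {"instruction": "Instruction content", "output": "Output content"},
]

def read_pairs(path):
    """Yield (question, answer) pairs from a JSON Lines file in either structure."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            item = json.loads(line)
            if "conversations" in item:
                turns = item["conversations"]
                yield turns[0]["content"], turns[1]["content"]
            else:
                yield item["instruction"], item["output"]

path = os.path.join(tempfile.mkdtemp(), "tot_samples.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        # One JSON object per line; keep non-ASCII text readable
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

pairs = list(read_pairs(path))
print(pairs)
```

A file written this way could then be listed in `data_paths`.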

**Memory Optimization:**
- Automatic GPU memory cleanup
- In-memory fine-tuning without disk writes
- Batch processing for efficient training

---

## Image Feature Extraction and Multimodal Dialogue

### Dual-vision Solution

In version 1.2.0, we introduced an innovative dual-vision solution:
1. **Image Compression to Feature (img_zip)**: Convert images to text tokens that can be inserted anywhere in the dialogue.
2. **Native Vision Encoder**: Pass the latest image to the native vision model's vision encoder (standard approach).

This solution achieves:
- Detailed analysis of the latest single image based on the native vision encoder
- Understanding of multiple images in the context based on img_zip technology
- Significant reduction in computing resource requirements

### Vision Model Usage Guidelines

For vision-enabled models, regardless of whether there is image input, you should use the following code:

```python
from xiaothink.llm.inference.test_formal import QianyanModel

if __name__ == '__main__':
    model = QianyanModel(
        ckpt_dir=r'path/to/your/vision_model',
        MT='t6_standard_vision',  # Note: model type is vision model
        vocab=r'path/to/your/vocab.txt',
        imgzip_model_path='path/to/img_zip/model.keras'  # Specify img_zip model path
    )

    temp = 0.28  # Temperature parameter
    
    while True:
        inp = input('[Q]: ')
        if inp == '[CLEAN]':
            print('[Context Cleared]\n\n')
            model.clean_his()
            continue
        # Use chat_vision for dialogue
        ret = model.chat_vision(inp, temp=temp, pre_text='', pass_start_char=[])
        print('\n[A]:', ret, '\n')
```

**Important Notes**:
- Vision models must use the `chat_vision` method; do not use `chat` (which is only for text-only models)
- You must prepare an img_zip image compression encoder model that matches the vision model
- Mismatched models will cause the model to fail to understand the meaning of encoded tokens

### Image Processing Interfaces

Two new image processing interfaces have been added:

1. **img2ms** (for non-native vision models):
   ```python
   description = model.img2ms('path/to/image.jpg', temp=0.28)
   print(description)
   ```

2. **img2ms_vision** (for native vision models):
   ```python
   description = model.img2ms_vision('path/to/image.jpg', temp=0.28, max_shape=224)
   print(description)
   ```

### Image Reference Syntax

In dialogue, use the following syntax to reference images:
```
<img>image path or URL</img>Please describe this image
```

The model will automatically parse the image path, extract features, and answer based on the image content.

**Notes**:
1. Image paths should use absolute paths to ensure correct parsing
2. Native vision models only support analyzing the most recent image
3. img_zip technology supports referencing multiple images in the context
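
Since absolute paths are recommended, a small helper can build the prompt. This is only a sketch: `image_prompt` is a hypothetical convenience function, not part of the Xiaothink API, and it handles local paths only (a URL would be placed between the tags unchanged).

```python
from pathlib import Path

def image_prompt(image_path, question):
    """Wrap a local image path in <img> tags, resolved to an absolute path."""
    absolute = Path(image_path).expanduser().resolve()
    return f"<img>{absolute}</img>{question}"

prompt = image_prompt("photos/cat.jpg", "Please describe this image")
print(prompt)
```

The resulting string can be passed to `model.chat_vision` in place of plain text.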

---

## Image Compression to Feature Technology (img_zip)

The `img_zip` module provides advanced image and video compression/decompression functions based on deep learning feature extraction technology. Below are the detailed usage methods:

### 1. Command-line Interactive Mode

```bash
python -m xiaothink.llm.img_zip.img_zip
```

After running, you will enter an interactive command-line interface:

```
===== img_zip Image Video Compression Tool =====
Please enter .keras model path: path/to/your/imgzip_model.keras
Model loaded successfully!

Please select a function:
1. Compress image
2. Decompress image
3. Compress video
4. Decompress video
0. Exit

Please select (0-6):
```

### 2. Python Code Invocation

```python
from xiaothink.llm.img_zip.img_zip import ImgZip

# Initialize instance
img_zip = ImgZip(model_path='path/to/your/imgzip_model.keras')

# Compress image
compressed_path = img_zip.compress_image(
    img_path='input.jpg',
    patch=True,  # Whether to use patch processing
    save_path='compressed_img',  # Save path prefix
    # New in 1.2.5: custom compression rate (0 means no custom rate is used).
    # The algorithm compresses to a size close to this rate; the actual size
    # may differ from the theoretical estimate.
    ability=0.02,
)

# Generates two files: compressed_img.npy and compressed_img.shape

# Decompress image
img_zip.decompress_image(
    compressed_input='compressed_img',  # Compressed file prefix
    patch=True,  # Whether to use patch processing
    save_path='decompressed.jpg'  # Output path
)

# Compress video
compressed_paths, metadata_path = img_zip.compress_video(
    video_path='input.mp4',
    output_dir='compressed_video',  # Output directory
    patch=True  # Whether to use patch processing
)

# Decompress video
img_zip.decompress_video(
    compressed_dir='compressed_video',  # Compressed file directory
    output_path='decompressed.mp4'  # Output path
)

# Convert image to array and save
img_array = img_zip.image_to_array('input.jpg')
img_zip.save_image_array(img_array, 'image_array.npy')

# Load image from array
loaded_array = img_zip.load_image_array('image_array.npy')
img = img_zip.array_to_image(loaded_array)
img.save('restored.jpg')
```

### 3. Key Function Descriptions

1. **Compress Image** (`compress_image`)
   - `patch=True`: Split large images into 80x80 patches for separate processing
   - Outputs two files: `.npy` (feature vectors) and `.shape` (original size information)

2. **Decompress Image** (`decompress_image`)
   - Requires both `.npy` and `.shape` files
   - Automatically restores original dimensions

3. **Video Processing** (`compress_video`/`decompress_video`)
   - Automatically extracts video frames and processes them in batches
   - Preserves original video frame rate and resolution information
   - Uses temporary directories for intermediate file processing

### 4. Parameter Descriptions

| Parameter | Type | Description |
|-----------|------|-------------|
| `model_path` | str | Path to img_zip model (.keras file) |
| `patch` | bool | Whether to use patch processing (default: True) |
| `save_path` | str | Output file path prefix |
| `img_path` | str | Input image path |
| `video_path` | str | Input video path |
| `output_dir` | str | Output directory path |
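| `output_path` | str | Output file path |
| `compressed_input` | str | Compressed file prefix (for `decompress_image`) |
| `compressed_dir` | str | Compressed file directory (for `decompress_video`) |
| `ability` | float | Custom compression rate, added in 1.2.5 (`0` means no custom rate is used) |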

### 5. Processing Flow Features

1. **Patch Processing**:
   - Automatically splits large images into 80x80 patches
   - Each patch is independently encoded into feature vectors
   - Preserves original size information

2. **Video Processing**:
   - Automatically extracts frames and processes them in batches
   - Preserves original video parameters (fps, resolution)
   - Uses temporary directories for intermediate file processing

3. **Progress Display**:
   - All operations come with detailed progress bars
   - Displays current processing step and remaining time

4. **Error Handling**:
   - Comprehensive exception catching mechanism
   - Detailed error information prompts
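
To make the patch step concrete, the sketch below computes the boxes that an 80x80 tiling produces for a given image size. It illustrates the idea only: the function `patch_boxes` is hypothetical, and how `ImgZip` actually pads or handles edge patches is not documented here.

```python
def patch_boxes(height, width, patch=80):
    """Return (top, left, bottom, right) boxes tiling an image in row-major order."""
    boxes = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Edge patches are clipped to the image bounds
            boxes.append((top, left,
                          min(top + patch, height),
                          min(left + patch, width)))
    return boxes

boxes = patch_boxes(height=100, width=170)
print(len(boxes))   # 2 rows x 3 columns = 6 patches
print(boxes[-1])    # (80, 160, 100, 170)
```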

### 6. Usage Recommendations

1. For images larger than 80x80, it is recommended to use patch processing (`patch=True`)
2. Video processing requires sufficient disk space for temporary frame files
3. Ensure the input model matches the processing task
4. Use absolute paths to avoid file location issues

This module is the core component of Xiaothink vision models (especially non-native ones). Based on efficient image feature representation and compression, it can enable any text-only AI model to have basic vision capabilities through fine-tuning.

---

## Sentiment Classification Tool

The sentiment classification tool runs on an already loaded dialogue model and analyzes the sentiment of input text, quickly assigning it to a category such as positive, negative, or neutral.

### Feature Description
- This tool is a customized interface built on Xiaothink framework models (the Xiaothink T6 series and others)
- It performs sentiment classification with the language model itself, so no additional classification model needs to be loaded
- Supports ultra-long input text and returns sentiment analysis results
- Single-turn dialogue enhanced models, such as Xiaothink-T6-0.15B-ST, are recommended

### Usage Example

```python
from xiaothink.llm.inference.test_formal import QianyanModel
from xiaothink.llm.tools.classify import ClassifyModel

if __name__ == '__main__':
    # Initialize basic dialogue model
    model = QianyanModel(
        ckpt_dir=r'path/to/your/t6_model',  # Model weight directory; _ST version models are recommended
        MT='t6_standard',  # Model type (must match weights)
        vocab=r'path/to/your/vocab.txt',  # Vocabulary path
        use_patch=0  # Do not use patch processing (text-only model)
    )
    
    # Initialize sentiment classification model (depends on basic dialogue model)
    cmodel = ClassifyModel(model)
    
    # Loop input text for sentiment classification
    while True:
        inp = input('Enter text: ')
        res = cmodel.emotion(inp)  # Call sentiment classification interface
        print(res)  # Output sentiment analysis results
```

### Notes
1. The sentiment classification model depends on an initialized `QianyanModel`; ensure the base model is loaded successfully
2. It is recommended to use instruction-tuned models (e.g., `t6_standard`); non-tuned models may affect classification accuracy
3. The output result format is: {'Positive': 0.6667, 'Negative': 0.1667, 'Neutral': 0.1667}
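
Because the result is a dictionary of label probabilities, picking the predicted label takes one call to `max`. The sketch below uses the sample output from the note above rather than a live model call, and `top_label` is an illustrative helper name.

```python
def top_label(scores):
    """Return the (label, probability) pair with the highest probability."""
    label = max(scores, key=scores.get)
    return label, scores[label]

res = {'Positive': 0.6667, 'Negative': 0.1667, 'Neutral': 0.1667}
label, prob = top_label(res)
print(f"{label} ({prob:.2%})")  # Positive (66.67%)
```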

---
## AI Rate Detection Tool

The AI rate detection tool runs on a loaded detection model and estimates how likely a text is to be AI-generated. It scores each character of the text, reports the average as the overall AI rate, and returns character-level details so that the result can be traced back to specific positions in the text.

### Feature Description
- This tool is a customized interface built on Xiaothink framework models (the Xiaothink T series and others)
- It performs AI rate analysis with a Xiaothink framework model, so no separate third-party detection model needs to be loaded
- Supports ultra-long text detection and batch text detection, returning complete multi-dimensional results
- Outputs **four levels of results: overall AI rate average, detection conclusion, probability statistics, and character-level details**

### Usage Example
```python
# Note: import AIDetector before running; its module path is not given in this document.
if __name__ == "__main__":
    # 1. Initialize detector
    detector = AIDetector(
        ckpt_dir=r'path/to/your/t7_model',
        model_type='t7',
        print_load_info=True
    )

    # 2. Texts to detect (some models may not support Chinese text detection)
    test_texts = [
        "This is a sentence that a car repair blogger active on mobile internet used to start many of his videos before being sued by BYD. Finally, this 'most miserable repairman in history' has received the first-instance judgment of being sued by BYD.",
        "\"Isn't it,\" Grandma looked up at the osmanthus tree, her eyes filled with gentle memories, \"This was planted by your grandfather back then, almost thirty years ago. At that time, he said, planting an osmanthus tree, it will bloom in autumn, fragrant and beautiful, and when we have children, we can make osmanthus cake to eat.\"",
        "These days, my heart has been quite unsettled. Sitting in the yard enjoying the cool air tonight, I suddenly thought of the lotus pond I pass by every day. In this moonlight of the full moon, it should have a different appearance. The moon gradually rose higher, and the laughter of children on the road outside the wall could no longer be heard; my wife was patting Run'er inside the house, humming a lullaby drowsily. I quietly put on my large shirt and went out the door."
    ]

    # 3. Execute detection
    for text in test_texts:
        print(f"\n{'='*60}")
        print(f"Detected Text: {text}")
        result = detector.detect_ai_rate(text)
        
        print(f"AI Rate (Probability Average): {result['AI Rate (Probability Average)']}")
        print(f"Detection Conclusion: {result['Detection Conclusion']}")
        print(f"Probability Statistics: Min={result['Probability Statistics']['Minimum Probability']} | Max={result['Probability Statistics']['Maximum Probability']}")
        
        # Optional: Print character-level details
        print("\nCharacter-level Details:")
        for detail in result['Character-level Details']:
            print(f"  Position {detail['Character Position']}: Previous Text 「{detail['Complete Previous Text']}」→ Character 「{detail['Target Character']}」→ Probability {detail['Prediction Probability']}")

    # 4. Release resources
    detector.close()
```

### Notes
1. When initializing the AI rate detector, ensure `ckpt_dir` points to the correct T7 series model weight directory; otherwise, model loading will fail
2. **Core Accuracy Note**: Detection results are **relatively accurate** for **text generated by small models**, which is enough to trace small-model content. Results for **text generated by large models** are poor and have little reference value, so do not use this tool to judge whether content was produced by a large model
3. After detection is complete, you must call `detector.close()` to release resources such as GPU memory and hardware handles; otherwise long-running processes may leak memory or hold excessive GPU memory
4. Character-level details are optional output items. For ultra-long text of ten thousand characters, printing these details will significantly increase output time and can be selectively printed according to actual needs
5. When detecting a large number of texts in batches, it is recommended to process them in batches according to text length to avoid detection lag caused by passing too many ultra-long texts in a single batch
6. Enabling `print_load_info=True` when loading the model allows you to view loading progress and hardware adaptation information, which is convenient for troubleshooting model loading exceptions
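
One way to guarantee that `close()` runs even when detection raises an exception is `contextlib.closing` from the standard library. The sketch below uses a stand-in class, `FakeDetector`, so that it runs without model weights. The stand-in is an assumption for illustration; only the `detect_ai_rate` and `close` method names come from the example above.

```python
from contextlib import closing

class FakeDetector:
    """Stand-in for AIDetector, used only to demonstrate the cleanup pattern."""
    def __init__(self):
        self.closed = False

    def detect_ai_rate(self, text):
        if not text:
            raise ValueError("empty text")
        return {"AI Rate (Probability Average)": 0.0}  # placeholder value

    def close(self):
        self.closed = True

detector = FakeDetector()
try:
    with closing(detector) as d:
        d.detect_ai_rate("some text")
        d.detect_ai_rate("")  # raises, but close() still runs
except ValueError as exc:
    print("detection failed:", exc)

print(detector.closed)  # True
```

With the real detector, wrapping the constructed object in `closing(...)` would presumably work the same way, since the pattern only requires a `close()` method.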

---
Xiaothink framework models, with their corresponding `MT` (model architecture version) and `form` (model prompt input format) values:

| Model Name (by release time) | MT parameter | form parameter |
|------------------------------|--------------|----------------|
| Xiaothink-T7.5-0.1B | MT='t7.5_paddle_small_instruct' | form=2 |
| Xiaothink-T7-ART (0.07B) | MT='t7_cpu_standard' | form=1 |
| Xiaothink-T6-0.08B | MT='t6_beta_dense' | form=1 |
| Xiaothink-T6-0.15B | MT='t6_standard' | form=1 |
| Xiaothink-T6-0.02B | MT='t6_fast' | form=1 |
| Xiaothink-T6-0.5B | MT='t6_large' | form=1 |
| Xiaothink-T6-0.5B-pretrain | MT='t6_large' | form='pretrain' |

---

## Changelog
### Version 1.4.1 (2026-02-16)
- **New Module**:
  - Added `xiaothink.llm.inference_paddle` module for PaddlePaddle-based inference
  - Supports Xiaothink-T7.5 series models with RWKV architecture
  - Provides `TextGenerator` and `QianyanModel` classes for PaddlePaddle
- **Updated Dependencies**:
  - Removed TensorFlow as a required dependency (now optional)
  - Added PaddlePaddle as the primary deep learning framework
  - Added jieba for Chinese word segmentation
- **Model Support**:
  - Added support for MT architectures: 't7.5_paddle_small_instruct', 't7.5_paddle_small_instruct_pro', etc.
  - Supports automatic device selection (CPU/GPU) based on memory usage

### Version 1.4.0 (2026-02-16) [Yanked]
- **Note**: This version has been yanked due to issues with the README.md file content. Please use version 1.4.1 instead.

### Version 1.3.2 (2025-12-27)
- **Updated Interfaces**:
  - Added "AI Rate Detection" interface based on Xiaothink-T series models.
- **New Models**:
  - Added support for MT architectures "t7" and "t7_cpu_standard" in the Xiaothink-T7 series models.

### Version 1.3.1 (2025-10-31)
- **Updated Interfaces**:
  - Vision-related interfaces now accept a custom input shape (which must be supported by the corresponding model) instead of the fixed 80x80x3 of previous versions
  - The ImgZIP command-line interface likewise accepts a custom input shape (which must be supported by the corresponding model) instead of the fixed 80x80x3, and now reports comprehensive quality scores based on SNR, PSNR, and SSIM.

### Version 1.3.0 (2025-10-17)[Yanked]
- **New Models**:
  - Added support for the Xiaothink-T7 series model architecture.

### Version 1.2.5 (2025-09-02)
- **Updated Interfaces**:
  - Added "custom compression rate" function to the ImgZIP command-line interface, supporting other compression rates beyond the model's native compression rate (implemented based on calculating and scaling the original image).

### Version 1.2.4 (2025-08-30)
- **Updated Interfaces**:
  - Updated the import method of ImgZIP-related interfaces in the documentation to: `from xiaothink.llm.img_zip.img_zip import ImgZip`

### Version 1.2.3 (2025-08-30)
- **New Features**:
  - Added Xiaothink-T6-0.02B series models (MT='t6_fast')
  - Added Xiaothink-T6-0.5B series models (MT='t6_large')
  - Added support for form='pretrain' in the model.chat method. For instruction-tuned models in the T6 series, form=1 should be used; for pre-trained models, form='pretrain' should be used

### Version 1.2.2 (2025-08-18)
- **New Features**:
  - Added sentiment classification tool to implement text sentiment tendency analysis through `ClassifyModel`
  - Added `xiaothink.llm.tools.classify` module to support sentiment classification based on basic dialogue models
  - Provided `cmodel.emotion(inp)` interface to return real-time text sentiment results

### Version 1.2.1 (2025-08-16)
- **New Models**:
  - Added Xiaothink-T6-0.15B series models (MT='t6_standard')

### Version 1.2.0 (2025-08-08)
- **Breakthrough Innovation**:
  - Added support for native vision models using an innovative dual-vision solution
  - Dual-path processing of image compression to feature tokens (img_zip) + native vision encoder
  - Retains multi-image context understanding capability while achieving single-image detail analysis

- **New Interfaces**:
  - `model.chat_vision`: Specialized dialogue interface for vision models
  - `model.img2ms`: Image description interface for non-native vision models
  - `model.img2ms_vision`: Image description interface for native vision models (supports max_shape parameter)
  
- **Module Expansion**:
  - Added `xiaothink.llm.img_zip.img_zip` command-line tool
  - Supports compression and decompression of images and videos
  - Provides rich parameters to adjust compression quality

- **Usage Guidelines**:
  - Vision models must use the `chat_vision` method
  - Must use a matching img_zip encoder model
  - Image paths should use absolute paths

### Version 1.1.0 (2025-08-02)
- **New Features**:
  - Added `img2ms` and `ms2img` interfaces to achieve high compression ratio lossy compression of images
  - Supports converting images into AI-readable feature tokens
  - Extended dialogue models to support multimodal input (image + text)
  - In test_formal, it supports converting feature tokens generated by multimodal AI into images and saving them to the system temporary folder by default.
  
- **Technical Upgrades**:
  - Based on Xiaothink framework's self-developed img_zip technology
  - Supports intelligent compression of 80x80x3 image patches
  - When outputting 96 feature values, combined with .7z algorithm, it can achieve an ultra-high compression ratio of 10%
  
- **Usage Method**:
  - Insert images using the `<img>{image_path}</img>` tag in dialogue
  - Need to specify the img_zip model path when initializing the model
  - Supports multimodal dialogue (image description, image Q&A, and other scenarios)

---

The above covers the main functions and usage methods of the Xiaothink Python module.

If you have any questions or suggestions, please feel free to contact us: xiaothink@foxmail.com.

---

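# Xiaothink Python Module Usage Documentation

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

Xiaothink is an AI research organization centered on Natural Language Processing (NLP), dedicated to training advanced on-device models with little data and low compute. The Xiaothink Python module is the core toolkit we provide, covering text Q&A, image-text Q&A, image compression, sentiment classification, and more. Below is the detailed usage guide with code examples.

## Table of Contents
1. [Installation](#installation)
2. [Local Dialogue Models](#local-text-only-dialogue-models)
3. [Image Feature Extraction and Multimodal Dialogue](#image-feature-extraction-and-multimodal-dialogue)
4. [Image Compression to Feature Technology (img_zip)](#image-compression-to-feature-technology-img_zip)
5. [Sentiment Classification Tool](#sentiment-classification-tool)
6. [AI Rate Detection Tool](#ai-rate-detection-tool)
7. [Changelog](#changelog)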

---

## Installation

First, install the Xiaothink module via pip:

```bash
pip install xiaothink
```

---

## License

This project is licensed under the Apache License, Version 2.0 - see the [LICENSE](LICENSE) file for details.

The [NOTICE](NOTICE) file contains additional attribution information for the proprietary technologies included in this module.

---

## Local Text-Only Dialogue Models

For locally loaded dialogue models, call the function that matches the model type.

### Single-Turn Dialogue (to be removed in a future version)

Suitable for single-turn dialogue scenarios.

#### Example Code

```python
import xiaothink.llm.inference.test_formal as tf

model = tf.QianyanModel(
    ckpt_dir=r'path/to/your/t6_model',
    MT='t6_beta_dense',
    vocab=r'path/to/your/vocab'  # the vocab file is provided in the model repository
)

while True:
    inp = input('[Q]: ')
    if inp == '[CLEAN]':
        print('[Context cleared]\n\n')
        model.clean_his()
        continue
    reply = model.chat_SingleTurn(inp, temp=0.32)  # single-turn dialogue via chat_SingleTurn
    print('\n[A]:', reply, '\n')
```

### Multi-Turn Dialogue

Suitable for multi-turn dialogue scenarios.

#### Example Code

```python
import xiaothink.llm.inference.test_formal as tf

model = tf.QianyanModel(
    ckpt_dir=r'path/to/your/t6_model',
    MT='t6_beta_dense',
    vocab=r'path/to/your/vocab'  # the vocab file is provided in the model repository
)

while True:
    inp = input('[Q]: ')
    if inp == '[CLEAN]':
        print('[Context cleared]\n\n')
        model.clean_his()
        continue
    reply = model.chat(inp, temp=0.32)  # multi-turn dialogue via chat
    print('\n[A]:', reply, '\n')
```

### Text Continuation

Suitable for more flexible text continuation scenarios.

#### Example Code

```python
import xiaothink.llm.inference.test as test

MT = 't6_beta_dense'
m, d = test.load(
    ckpt_dir=r'path/to/your/t6_model',
    MT=MT,
    vocab=r'path/to/your/vocab'  # the vocab file is provided in the model repository
)

inp = '你好！'
# Instruct format supported by the instruction-tuned T6 models
belle_chat = '{"conversations": [{"role": "user", "content": {inp}}, {"role": "assistant", "content": "'.replace('{inp}', inp)
inp_m = belle_chat

ret = test.generate_texts_loop(m, d, inp_m,
                               num_generate=100,
                               every=lambda a: print(a, end='', flush=True),
                               temperature=0.32,
                               pass_char=['▩'])  # ▩ is the <unk> marker of the T6 models
```

**Important**: For local models, call `model.chat` for multi-turn dialogue. For pretrained models without instruction tuning, call `test.generate_texts_loop`. **The single-turn `model.chat_SingleTurn` function will be removed in a future version.**

---

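## PaddlePaddle-Based Models (new in v1.4.0)

Xiaothink now supports hybrid RWKV architecture models built on PaddlePaddle. These models offer efficient inference and suit resource-constrained environments.

### Installation

```bash
pip install xiaothink
pip install paddlepaddle  # or paddlepaddle-gpu for GPU support
```

### Multi-Turn Dialogue (PaddlePaddle)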

```python
from xiaothink.llm.inference_paddle import QianyanModel

model = QianyanModel(
    ckpt_dir=r'path/to/your/t7.5_model',
    MT='t7.5_paddle_small_instruct_pro'
)

while True:
    inp = input('[Q]: ')
    if inp == '[CLEAN]':
        print('[Context cleared]\n\n')
        model.clean_his()
        continue
    reply = model.chat(inp, temp=0.34, form=2)  # form=2 selects the simplified prompt format
    print('\n[A]:', reply, '\n')
```

### Text Generation (PaddlePaddle)

```python
from xiaothink.llm.inference_paddle import TextGenerator

generator = TextGenerator(
    checkpoint_path=r'path/to/your/t7.5_model',
    MT='t7.5_paddle_small_instruct_pro'
)
generator.load_model()

generated_text = generator.generate_text(
    prompt='<|U|>你好，最近怎么样？<|A|>',
    max_length=100,
    temperature=0.8,
    top_p=0.9,
    repetition_penalty=1.2
)
print(generated_text)
```

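### Supported Model Architectures (PaddlePaddle)

| Model name | MT parameter | Description |
|----------|---------|------|
| Xiaothink-T7.5-0.1B | `'t7.5_paddle_small_instruct'` or `'t7.5_paddle_small_instruct_pro'` | Base instruction model |

### Automatic Device Selection

The PaddlePaddle module automatically selects the best device based on GPU memory usage:

```python
# Automatic device selection (default)
# If GPU memory usage exceeds 80%, switch to the CPU
AUTO_DEVICE = True
GPU_MEMORY_THRESHOLD = 80.0

# Or set the device manually
import paddle
paddle.set_device('cpu')  # or 'gpu:0'
```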
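
The threshold rule described above can be sketched in plain Python. This only illustrates the decision logic and is not the module's implementation: `choose_device` and its `gpu_memory_percent` argument are hypothetical names, and how the module actually measures GPU memory usage is not stated here.

```python
def choose_device(gpu_memory_percent, threshold=80.0, gpu_available=True):
    """Return 'gpu:0' unless the GPU is missing or too busy (hypothetical helper)."""
    if not gpu_available:
        return 'cpu'
    # Fall back to the CPU once usage rises above the threshold (80% by default).
    return 'cpu' if gpu_memory_percent > threshold else 'gpu:0'


print(choose_device(35.0))   # lightly loaded GPU
print(choose_device(92.5))   # heavily loaded GPU
```

The returned string could then be passed to `paddle.set_device`.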

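### Train-On-Time (TOT) Dynamic Learning

Xiaothink provides a Train-On-Time (TOT) feature that supports dynamic learning at inference time. Unlike a conventional model, a TOT model keeps learning from similar examples in your training data store, so its knowledge is updated in real time. This can substantially improve the model's basic capabilities. Two caveats apply: inference is slower, because similarity matching and fine-tuning run before every reply, and TOT does not substantially improve mathematical or reasoning ability.

**How TOT works:**
1. **Similarity matching**: automatically finds similar instructions in the training data
2. **Dynamic fine-tuning**: fine-tunes the model in memory on those similar examples
3. **Enhanced response**: generates a more accurate answer from the newly learned knowledge
4. **Memory management**: optimizes GPU memory usage and cleans up resources

**Core features:**
- **Real-time learning**: every dialogue turn learns from related examples
- **Similarity-based matching**: uses difflib to find similar instructions (difflib compares character sequences, so the match is textual rather than semantic)
- **Parallel processing**: multi-core similarity computation for faster matching
- **Multi-format support**: loads models from various checkpoint formats
- **GPU optimization**: automatic GPU memory management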
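
The similarity-matching step can be illustrated with the standard-library `difflib` module. This is a minimal sketch, not the TOT implementation: `top_k_similar` is a hypothetical helper, and the scoring and ranking that TOT actually uses are not documented here.

```python
import difflib


def top_k_similar(query, examples, k=2):
    """Rank (instruction, output) pairs by textual similarity to the query."""
    scored = [
        (difflib.SequenceMatcher(None, query, instruction).ratio(), instruction, output)
        for instruction, output in examples
    ]
    # Highest ratio first; ratio() is 1.0 for identical strings.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


examples = [
    ("How do I install the package?", "Run pip install."),
    ("What is the capital of France?", "Paris."),
    ("How do I uninstall the package?", "Run pip uninstall."),
]
for score, instruction, _ in top_k_similar("How can I install this package?", examples):
    print(f"{score:.2f}  {instruction}")
```

Because the comparison is character-based, near-duplicates with different meaning (such as "install" and "uninstall") also score highly.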

```python
from xiaothink.llm.inference_paddle import TOTModel, TOT_AVAILABLE

if TOT_AVAILABLE:
    # Custom training data paths
    custom_data_paths = [
        r'path/to/your/belle_train.jsonl',
        r'path/to/your/coig_minimind.jsonl',
        r'path/to/your/firefly_data.jsonl'
    ]

    model = TOTModel(
        ckpt_dir=r'path/to/your/t7.5_model',
        MT='t7.5_paddle_small_instruct_pro',
        data_paths=custom_data_paths  # custom training data paths
    )

    # The model learns from similar examples before each answer
    while True:
        inp = input('[Q]: ')
        if inp == '[CLEAN]':
            model.clean_his()
            continue
        reply = model.chat(inp, temp=0.68)
        print('\n[A]:', reply, '\n')
else:
    print("TOT requires PaddlePaddle support")
```

**Custom data paths:**
You can specify your own training data paths when initializing `TOTModel`:

```python
# Default behavior (uses the built-in paths)
model = TOTModel(ckpt_dir='path/to/model')

# Custom data paths
model = TOTModel(
    ckpt_dir='path/to/model',
    data_paths=[
        'path/to/data1.jsonl',
        'path/to/data2.jsonl',
        'path/to/data3.txt'
    ]
)
```

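**Training data requirements:**
TOT lets users supply their own training datasets through the `data_paths` parameter. The system loads and processes training data from the paths provided.

#### Supported File Formats

TOT supports the following file formats:

- `.jsonl` files (JSON Lines format)
- `.txt` files (plain text)

#### Supported Data Structures

The system recognizes and processes two main data structures:

##### 1. Conversation format (Conversations)
```json
{
  "conversations": [
    {"content": "user question"},
    {"content": "assistant answer"}
  ]
}
```

##### 2. Instruction-output format (Instruction-Output)
```json
{
  "instruction": "instruction text",
  "output": "output text"
}
```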
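
To make the two record shapes concrete, the sketch below reads JSON Lines records and normalizes both into `(instruction, output)` pairs. It is an illustration only: `extract_pair` is a hypothetical helper, and treating the first two conversation turns as question and answer is an assumption, not documented TOT behavior.

```python
import json


def extract_pair(line):
    """Return (instruction, output) from one JSONL line, or None if unrecognised."""
    record = json.loads(line)
    if "conversations" in record:
        turns = record["conversations"]
        # Assumption: the first turn is the user question, the second the reply.
        if len(turns) >= 2:
            return turns[0]["content"], turns[1]["content"]
        return None
    if "instruction" in record and "output" in record:
        return record["instruction"], record["output"]
    return None


lines = [
    '{"conversations": [{"content": "Hi"}, {"content": "Hello!"}]}',
    '{"instruction": "Add 1 and 2", "output": "3"}',
    '{"text": "unsupported"}',
]
pairs = [extract_pair(line) for line in lines]
print(pairs)
```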

**Memory optimization:**
- Automatic GPU memory cleanup
- In-memory fine-tuning with no disk writes
- Batched processing for efficient training

---

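## Image Feature Extraction and Multimodal Dialogue

### Dual-Vision Scheme

Version 1.2.0 introduced a dual-vision scheme:
1. **Image compression to features (img_zip)**: converts an image into text tokens that can be inserted anywhere in the dialogue
2. **Native vision encoder**: passes the most recent image to the vision encoder of a native vision model (the standard approach)

This scheme provides:
- Detailed analysis of the most recent single image through the native vision encoder
- Understanding of multiple images in context through img_zip
- Substantially lower compute requirements

### Usage Rules for Vision Models

For vision-capable models, use the following code whether or not there is image input: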

```python
from xiaothink.llm.inference.test_formal import QianyanModel

if __name__ == '__main__':
    model = QianyanModel(
        ckpt_dir=r'path/to/your/vision_model',
        MT='t6_standard_vision',  # note: the model type is a vision model
        vocab=r'path/to/your/vocab.txt',
        imgzip_model_path='path/to/img_zip/model.keras'  # path to the img_zip model
    )

    temp = 0.28  # sampling temperature

    while True:
        inp = input('[Q]: ')
        if inp == '[CLEAN]':
            print('[Context cleared]\n\n')
            model.clean_his()
            continue
        # Use chat_vision for the dialogue
        ret = model.chat_vision(inp, temp=temp, pre_text='', pass_start_char=[])
        print('\n[A]:', ret, '\n')
```

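**Important**:
- Vision models must use the `chat_vision` method, not `chat` (which applies only to text-only models)
- An img_zip image compression encoder model that matches the vision model must be prepared in advance
- With a mismatched encoder, the model cannot interpret the encoded tokens

### Image Processing Interfaces

Two image processing interfaces are available:

1. **img2ms** (for non-native vision models):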
   ```python
   description = model.img2ms('path/to/image.jpg', temp=0.28)
   print(description)
   ```

2. **img2ms_vision** (for native vision models):
   ```python
   description = model.img2ms_vision('path/to/image.jpg', temp=0.28, max_shape=224)
   print(description)
   ```

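### Image Reference Syntax

In a dialogue, reference an image with the following syntax:
```
<img>image path or URL</img>Please describe this picture
```

The model parses the image path, extracts features, and answers based on the image content.

**Notes**:
1. Use absolute image paths to ensure correct parsing
2. Native vision models can analyze only the most recent image
3. img_zip supports referencing multiple images in context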
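
As an illustration of the tag syntax, the sketch below builds a message with an absolute image path and then pulls the image references back out. Both helpers (`with_image`, `split_message`) are hypothetical, and the regular expression is an assumption about the tag shape; how the library itself parses the tags is not documented here.

```python
import os
import re

IMG_TAG = re.compile(r'<img>(.*?)</img>')


def with_image(image_path, question):
    """Prefix a question with an <img> tag that holds an absolute path."""
    return f'<img>{os.path.abspath(image_path)}</img>{question}'


def split_message(message):
    """Return (image_references, remaining_text) for a tagged message."""
    images = IMG_TAG.findall(message)
    text = IMG_TAG.sub('', message)
    return images, text


message = with_image('photos/cat.jpg', 'Please describe this picture')
images, text = split_message(message)
print(images, text)
```

The resulting `message` string is what would be passed to `model.chat_vision`.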

---

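## Image Compression to Features (img_zip)

The `img_zip` module provides image and video compression and decompression based on deep-learning feature extraction. Usage details follow:

### 1. Interactive Command-Line Mode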

```bash
python -m xiaothink.llm.img_zip.img_zip
```

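Running it opens an interactive command-line interface. The tool prints its prompts in Chinese; they are shown translated here:

```
===== img_zip image and video compression tool =====
Enter the .keras model path: path/to/your/imgzip_model.keras
Model loaded!

Select a function:
1. Compress image
2. Decompress image
3. Compress video
4. Decompress video
0. Exit

Select (0-6): 
```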

### 2. Calling from Python

```python
from xiaothink.llm.img_zip.img_zip import ImgZip

# Create an instance
img_zip = ImgZip(model_path='path/to/your/imgzip_model.keras')

# Compress an image
compressed_path = img_zip.compress_image(
    img_path='input.jpg',
    patch=True,  # whether to use patch processing
    save_path='compressed_img',  # output path prefix
    # New in 1.2.5: custom compression ratio of 0.02 (0 disables it). The
    # algorithm compresses to roughly this size; the actual size may differ
    # from the theoretical estimate.
    ability=0.02,
)

# Two files are produced: compressed_img.npy and compressed_img.shape

# Decompress an image
img_zip.decompress_image(
    compressed_input='compressed_img',  # compressed file prefix
    patch=True,  # whether to use patch processing
    save_path='decompressed.jpg'  # output path
)

# Compress a video
compressed_paths, metadata_path = img_zip.compress_video(
    video_path='input.mp4',
    output_dir='compressed_video',  # output directory
    patch=True  # whether to use patch processing
)

# Decompress a video
img_zip.decompress_video(
    compressed_dir='compressed_video',  # directory of compressed files
    output_path='decompressed.mp4'  # output path
)

# Convert an image to an array and save it
img_array = img_zip.image_to_array('input.jpg')
img_zip.save_image_array(img_array, 'image_array.npy')

# Load an image from an array
loaded_array = img_zip.load_image_array('image_array.npy')
img = img_zip.array_to_image(loaded_array)
img.save('restored.jpg')
```

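### 3. Key Functions

1. **Compress image** (`compress_image`)
   - `patch=True`: splits a large image into 80x80 blocks and processes each one
   - Outputs two files: `.npy` (feature vectors) and `.shape` (original size information)

2. **Decompress image** (`decompress_image`)
   - Requires both the `.npy` and `.shape` files
   - Restores the original size automatically

3. **Video processing** (`compress_video`/`decompress_video`)
   - Extracts video frames automatically and processes them in batches
   - Preserves the frame rate and resolution of the original video
   - Uses a temporary directory for intermediate files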



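### 4. Parameters

| Parameter | Type | Description |
|------|------|------|
| `model_path` | str | Path to the img_zip model (.keras file) |
| `patch` | bool | Whether to use patch processing (default True) |
| `save_path` | str | Output file path prefix |
| `img_path` | str | Input image path |
| `video_path` | str | Input video path |
| `output_dir` | str | Output directory path |
| `output_path` | str | Output file path |
| `ability` | float | Custom compression ratio for `compress_image` (new in 1.2.5); `0` disables it |

### 5. Processing Characteristics

1. **Patch processing**:
   - Large images are split into 80x80 blocks automatically
   - Each block is encoded independently into a feature vector
   - The original size information is preserved

2. **Video processing**:
   - Frames are extracted automatically and processed in batches
   - The original video parameters (fps, resolution) are preserved
   - A temporary directory holds intermediate files

3. **Progress display**:
   - Every operation shows a detailed progress bar
   - The current step and remaining time are displayed

4. **Error handling**:
   - Exceptions are caught throughout
   - Error messages are detailed

### 6. Usage Tips

1. For images larger than 80x80, patch processing (`patch=True`) is recommended
2. Video processing needs enough disk space for temporary frame files
3. Make sure the input model matches the processing task
4. Use absolute paths to avoid file location problems

This module is a core component of Xiaothink vision models, especially the non-native ones. Because it provides an efficient compressed feature representation of images, any text-only AI model can gain basic vision capability through fine-tuning.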

---

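## Sentiment Classification Tool

The sentiment classification tool builds on an already loaded dialogue model to analyze the sentiment of a text, quickly assigning it to a category such as positive, negative, or neutral.

### Features
- A customized interface built on Xiaothink framework models (such as the Xiaothink T6 series)
- Performs sentiment classification with the Xiaothink language model itself, so no separate classification model needs to be loaded
- Accepts very long input text and returns a sentiment analysis result
- A single-turn-enhanced model is recommended, for example Xiaothink-T6-0.15B-ST

### Usage Example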

```python
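from xiaothink.llm.inference.test_formal import QianyanModel
from xiaothink.llm.tools.classify import ClassifyModel

if __name__ == '__main__':
    # Initialize the base dialogue model
    model = QianyanModel(
        ckpt_dir=r'path/to/your/t6_model',  # model weights directory; an _ST model is recommended
        MT='t6_standard',  # model type (must match the weights)
        vocab=r'path/to/your/vocab.txt',  # vocabulary path
        use_patch=0  # no patch processing (text-only model)
    )

    # Initialize the sentiment classifier (depends on the base dialogue model)
    cmodel = ClassifyModel(model)

    # Classify the sentiment of each input text in a loop
    while True:
        inp = input('Enter text: ')
        res = cmodel.emotion(inp)  # call the sentiment classification interface
        print(res)  # print the sentiment analysis result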
```

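### Notes
1. The sentiment classifier depends on an initialized `QianyanModel`; make sure the base model loads successfully
2. An instruction-tuned model (such as `t6_standard`) is recommended; a model without instruction tuning may reduce classification accuracy
3. The output has the form `{'积极': 0.6667, '消极': 0.1667, '中性': 0.1667}`, where the keys mean positive, negative, and neutral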
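
Given a result dictionary of that shape, the dominant label can be picked with a one-line `max`. This is a small sketch on the sample values from the note above; `top_emotion` is a hypothetical helper and not part of the library.

```python
def top_emotion(scores):
    """Return the (label, score) pair with the highest score."""
    label = max(scores, key=scores.get)
    return label, scores[label]


# Sample values copied from the note above, not fresh model output.
res = {'积极': 0.6667, '消极': 0.1667, '中性': 0.1667}
print(top_emotion(res))
```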

---
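
## AI Rate Detection Tool

The AI rate detection tool builds on a loaded detection model to estimate the probability that a text was AI-generated. It scores each character, reports the mean AI rate for the whole text, and returns character-level details.

### Features
- A customized interface built on Xiaothink framework models (such as the Xiaothink T series)
- Performs AI rate analysis with the Xiaothink detection model, so no separate standalone detection model needs to be loaded
- Supports very long texts and batches of texts, and returns a multi-dimensional result
- Outputs four levels of results: **overall mean AI rate, detection conclusion, probability statistics, and character-level details**

### Usage Example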
```python
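# AIDetector must be imported from the xiaothink package first;
# its import path is not given in this document.
if __name__ == "__main__":
    # 1. Initialize the detector
    detector = AIDetector(
        ckpt_dir=r'path/to/your/t7_model',
        model_type='t7',
        print_load_info=True
    )

    # 2. Texts to check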
    test_texts = [
        "这是一位活跃在移动互联网上的修车博主在被比亚迪起诉之前，很多期视频开头的一句话，而这位\"史上最惨修理工\"，终于迎来了被比亚迪起诉的一审判决。",
        "\"可不是嘛，\"奶奶抬眼望了望桂树，眼神里满是温柔的回忆，\"这是你爷爷当年栽的，算下来都快三十年了。那时候他说，栽棵桂树，以后秋天开花，又香又好看，等咱们有了孩子，还能做桂花糕吃。\"",
        "这几天心里颇不宁静。今晚在院子里坐着乘凉，忽然想起日日走过的荷塘，在这满月的光里，总该另有一番样子吧。月亮渐渐地升高了，墙外马路上孩子们的欢笑，已经听不见了；妻在屋里拍着闰儿，迷迷糊糊地哼着眠歌。我悄悄地披了大衫，带上门出去。"
    ]

    # 3. Run detection (the result keys are Chinese strings and must be used verbatim)
    for text in test_texts:
        print(f"\n{'='*60}")
        print(f"Text: {text}")
        result = detector.detect_ai_rate(text)

        print(f"AI rate (mean probability): {result['AI率（概率平均值）']}")
        print(f"Conclusion: {result['检测结论']}")
        print(f"Probability stats: min={result['概率统计信息']['最小概率']} | max={result['概率统计信息']['最大概率']}")

        # Optional: print character-level details
        print("\nCharacter-level details:")
        for detail in result['字符级详情']:
            print(f"  Position {detail['字符位置']}: context「{detail['完整前文']}」→ char「{detail['目标字符']}」→ probability {detail['预测概率']}")

    # 4. Release resources
    detector.close()
```

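### Notes
1. When initializing the AI rate detector, make sure `ckpt_dir` points to the correct T7-series model weights directory; otherwise the model fails to load
2. **Accuracy**: the tool is **relatively accurate** on **text generated by small models** and can serve to trace such content. It performs poorly on **text generated by large models**, where its results have little reference value; do not use it to judge whether content was generated by a large model
3. Always call `detector.close()` after detection to release GPU memory, hardware handles, and other resources, which avoids memory leaks and excessive GPU memory use in long-running processes
4. Character-level details are optional output; for texts on the order of ten thousand characters, printing them noticeably increases output time, so print them only when needed
5. When checking a large number of texts, process them in batches grouped by length to avoid stalls caused by too many very long texts in one batch
6. Setting `print_load_info=True` at load time shows loading progress and hardware adaptation information, which helps diagnose loading problems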
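
The relationship between the character-level details and the summary figures can be sketched as follows. This is an assumption-laden illustration: `summarise` is a hypothetical helper, the sample list is hand-made rather than real detector output, and the `'预测概率'` values are assumed to be numbers (or numeric strings), which this document does not confirm.

```python
def summarise(details, key='预测概率'):
    """Compute mean, min and max over character-level probabilities."""
    probs = [float(item[key]) for item in details]
    if not probs:
        return {'mean': 0.0, 'min': 0.0, 'max': 0.0}
    return {
        'mean': sum(probs) / len(probs),
        'min': min(probs),
        'max': max(probs),
    }


# Hand-made stand-in for result['字符级详情']; not real detector output.
sample = [{'预测概率': 0.25}, {'预测概率': 0.5}, {'预测概率': 0.75}]
print(summarise(sample))
```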

---
Xiaothink framework models with their MT (model architecture version) and form (prompt format passed to the model):

| Model name (by release date) | MT parameter | form parameter |
|-----------------------|------------------|-------------|
| Xiaothink-T7.5-0.1B | MT='t7.5_paddle_small_instruct' | form=2 |
| Xiaothink-T7-ART(0.07B) | MT='t7_cpu_standard' | form=1 |
| Xiaothink-T6-0.08B | MT='t6_beta_dense' | form=1 |
| Xiaothink-T6-0.15B | MT='t6_standard' | form=1 |
| Xiaothink-T6-0.02B | MT='t6_fast' | form=1 |
| Xiaothink-T6-0.5B | MT='t6_large' | form=1 |
| Xiaothink-T6-0.5B-pretrain | MT='t6_large' | form='pretrain' |
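
For scripts that switch between models, the table can be kept as a small lookup. The values below are copied from the table; `MODEL_CONFIG` and `config_for` are hypothetical names for illustration and are not part of the library.

```python
MODEL_CONFIG = {
    'Xiaothink-T7.5-0.1B': ('t7.5_paddle_small_instruct', 2),
    'Xiaothink-T7-ART(0.07B)': ('t7_cpu_standard', 1),
    'Xiaothink-T6-0.08B': ('t6_beta_dense', 1),
    'Xiaothink-T6-0.15B': ('t6_standard', 1),
    'Xiaothink-T6-0.02B': ('t6_fast', 1),
    'Xiaothink-T6-0.5B': ('t6_large', 1),
    'Xiaothink-T6-0.5B-pretrain': ('t6_large', 'pretrain'),
}


def config_for(model_name):
    """Return {'MT': ..., 'form': ...} for a model name from the table."""
    mt, form = MODEL_CONFIG[model_name]
    return {'MT': mt, 'form': form}


print(config_for('Xiaothink-T6-0.15B'))
```

`MT` would go to the model constructor and `form` to `model.chat`, as in the examples in this document.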

---

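## Changelog
### Version 1.4.1 (2026-02-16)
- **New Modules**:
  - Added the `xiaothink.llm.inference_paddle` module for PaddlePaddle-based inference
  - Supports the Xiaothink-T7.5 series models (RWKV architecture)
  - Provides `TextGenerator` and `QianyanModel` classes for PaddlePaddle
- **Dependency Updates**:
  - TensorFlow is no longer a required dependency (now optional)
  - Added PaddlePaddle as the main deep learning framework
  - Added jieba for Chinese word segmentation
- **Model Support**:
  - Added MT architectures such as 't7.5_paddle_small_instruct' and 't7.5_paddle_small_instruct_pro'
  - Supports automatic device selection (CPU/GPU) based on GPU memory usage

### Version 1.4.0 (2026-02-16) [Yanked]
- **Note**: This version was marked as not recommended (yanked) because its README.md contained errors; please use a newer version.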

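### Version 1.3.2 (2025-12-27)
- **Interface Updates**:
  - Added an "AI rate detection" interface based on the Xiaothink-T series models.
- **New Models**:
  - Added support for the Xiaothink-T7 series architectures with MT "t7" and "t7_cpu_standard".

### Version 1.3.1 (2025-10-31)
- **Interface Updates**:
  - Vision-related interfaces now accept a custom input shape (which the model must support) instead of the fixed 80*80*3 of earlier versions
  - The ImgZIP command-line interface likewise accepts a custom input shape (which the model must support) instead of the fixed 80*80*3, and adds a combined quality score based on SNR, PSNR, and SSIM.

### Version 1.3.0 (2025-10-17) [Yanked]
- **New Models**:
  - Added support for the Xiaothink-T7 series model architecture.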

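### Version 1.2.5 (2025-09-02)
- **Interface Updates**:
  - The ImgZIP command-line interface adds a "custom compression ratio" feature, which supports ratios other than the model's native one (implemented by computing a scale factor and resizing the source image).

### Version 1.2.4 (2025-08-30)
- **Interface Updates**:
  - Updated the documented import for ImgZIP interfaces to: `from xiaothink.llm.img_zip.img_zip import ImgZip`

### Version 1.2.3 (2025-08-30)
- **New Features**:
  - Added the Xiaothink-T6-0.02B series models (MT='t6_fast')
  - Added the Xiaothink-T6-0.5B series models (MT='t6_large')
  - Added `form='pretrain'` support to `model.chat`; instruction-tuned T6 models should use `form=1`, and pretrained models should use `form='pretrain'`

### Version 1.2.2 (2025-08-18)
- **New Features**:
  - Added the sentiment classification tool, which analyzes text sentiment through `ClassifyModel`
  - Added the `xiaothink.llm.tools.classify` module, which supports sentiment classification on top of a base dialogue model
  - Provides the `cmodel.emotion(inp)` interface, which returns the sentiment result for a text in real time

### Version 1.2.1 (2025-08-16)
- **New Models**:
  - Added the Xiaothink-T6-0.15B series models (MT='t6_standard')
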

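### Version 1.2.0 (2025-08-08)
- **Breakthrough Innovations**:
  - Added support for native vision models through a dual-vision scheme
  - Two processing paths: image compression to feature tokens (img_zip) plus a native vision encoder
  - Keeps multi-image context understanding while enabling detailed single-image analysis

- **New Interfaces**:
  - `model.chat_vision`: dialogue interface dedicated to vision models
  - `model.img2ms`: image description interface for non-native vision models
  - `model.img2ms_vision`: image description interface for native vision models (supports the `max_shape` parameter)

- **Module Extensions**:
  - Added the `xiaothink.llm.img_zip.img_zip` command-line tool
  - Supports compression and decompression of images and videos
  - Provides a range of parameters for tuning compression quality

- **Usage Rules**:
  - Vision models must use the `chat_vision` method
  - A matching img_zip encoder model must be used
  - Image paths should use absolute paths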

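### Version 1.1.0 (2025-08-02)
- **New Features**:
  - Added `img2ms` and `ms2img` interfaces for lossy image compression at a high compression ratio
  - Supports converting images into AI-readable feature tokens
  - Extended dialogue models to support multimodal input (image + text)
  - `test_formal` can convert feature tokens generated by a multimodal model back into images, which are saved to the system temporary folder by default

- **Technical Upgrades**:
  - Based on the Xiaothink framework's self-developed img_zip technology
  - Supports intelligent compression of 80x80x3 image patches
  - With 96 output feature values per patch, combined with 7z compression, the compression ratio can reach 10%

- **Usage Method**:
  - Insert images using the `<img>{image_path}</img>` tag in dialogue
  - The img_zip model path must be specified when initializing the model
  - Supports multimodal dialogue (image description, image Q&A, and other scenarios)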


---

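The above covers the main functions and usage methods of the Xiaothink Python module.

If you have any questions or suggestions, please feel free to contact us: xiaothink@foxmail.com.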
