Metadata-Version: 2.4
Name: rica-server
Version: 0.0.1.dev1
Summary: Multi-threaded Reasoning for Large Language Models
Author-email: kaokao221 <kaokao221@outlook.com>
License-Expression: GPL-3.0-only
Project-URL: Repository, https://github.com/rica-team/rica-server.git
Keywords: llms,pytorch,nlp
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: pt
Requires-Dist: transformers; extra == "pt"
Requires-Dist: torch; extra == "pt"
Requires-Dist: timm; extra == "pt"
Requires-Dist: accelerate; extra == "pt"
Requires-Dist: hf_xet; extra == "pt"
Dynamic: license-file

# RiCA: Multi-threaded Reasoning for Large Language Models

> Since OpenAI released ChatGPT, Large Language Models (LLMs) have become an integral part of our lives. From GPT to
> Gemini, from Claude to Grok, with the development of technologies like Function Calling and MCP, we are gradually
> moving towards AGI. However, even today, large models are still confined to a "single-threaded thinking" mode.
>
> Looking at human thought processes, searching for information and acquiring knowledge doesn't interrupt our thinking.
> Essentially, querying large models is also a form of "information lookup". Based on this insight, we developed
> Reasoning in Comprehensive Area (RiCA), aiming to introduce "multi-threading" capabilities to large models.

~~Today, we bring you RiCA's first demonstration set, including an Example that gives you a preliminary understanding of
RiCA's coding principles. We will release our first usable Beta as early as this week. Additionally, we will also bring
you RiCA's demo video. Thank you for your support!~~

~~The project is now close to completion. We have provided a demo in demo/example.py. Thanks for your support! It seems
that only the connections (adapters) are missing, and we will release a usable version that works with transformers and
torch in the next few days.~~

We have now released a beta version with a Transformers adapter that works with PyTorch. The adapter was generated by
Junie, and we are still working on modifications toward a stable version. Stay tuned 🤗🤗🤗

## Getting Started (Using the Transformers-Based PyTorch Adapter as an Example)

First, in your application, create a `ReasoningThread` (`rt`) object and a RiCA application. Note that the
Transformers (PyTorch) RT additionally takes a model name (default: `google/gemma-3-1n`):

```python
from rica import RiCA
from rica.connector import transformer_adapter as tf


app = RiCA()

rt = tf.ReasoningThread(model_name="google/gemma-3-1n")
```

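You now have a "thread". At this point the thread is frozen, and it has to be activated by passing in a request.
Unlike most common interaction patterns, all two-way communication in RiCA happens through tools. RT provides a
built-in method for passing input to the model, and its `@trigger` decorator registers a callback function through
which the model sends messages to the outside: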

```python
@rt.trigger
def callback(message):
    print(message)
```

For demonstration purposes, let's create a simple Python exec package (`package`) to test with:

```python
@app.register("sys.python.exec", True, 1000)
async def _sys_python_exec(input_, *args, **kwargs):
    """
    A tool to execute Python code.
    input:{"code": "1+1"}
    output:{"result": "2"}
    """
    try:
        code = input_.get("code", "")
        # NOTE: eval() runs arbitrary expressions; suitable for demos with trusted input only.
        result = eval(code)
        return {"result": str(result)}
    except Exception as e:
        return {"error": str(e)}
```

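Here, `register` takes at least one and at most three arguments: the package name (`package`), whether to run in the
background (`background`), and the timeout (`timeout`, in milliseconds). For `background`, we generally recommend
`True` or the default value, except for things like response messages. With everything in place, we can now send a
request to the model: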

```python
rt.insert("Please calculate 123*456 using `sys.python.exec` package.")
rt.wait()
print(rt.context)
rt.destroy()
```

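Here, `wait` blocks until the model stops (that is, until it generates the `EOS` token). While the model is
generating, you can modify the RiCA class, insert new instructions, print the context, or even forcibly change the
context at any time. It is all up to you.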

More detailed documentation is on the way. Thank you for your support.
