Metadata-Version: 2.1
Name: nanochain
Version: 0.3
Summary: NanoChain is a Python library designed to provide a suite of useful tools for text processing, vector indexing, and utilizing OpenAI's GPT API to perform various tasks. This is a minimalistic implementation of LangChain because it's too complicated and difficult to customize.
Home-page: https://github.com/Troyanovsky/nano_chain
Author: Troy (Guodong) Zhao
Author-email: gzhao1997@gmail.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown

# NanoChain Library Documentation
NanoChain is a Python library designed to provide a suite of useful tools for text processing, vector indexing, and utilizing OpenAI's GPT-3.5-turbo model to perform various tasks.

## Features
- GPT API Completion
- Summary buffer memory for conversation summarization
- Text file and PDF file loader
- Text splitting for large texts
- Vector indexing using the ChromaDB library
- ReActAgent for performing tasks using available tools

## Installation
To install:
```
pip install nanochain
```

## Usage
In order to use the NanoChain library, import it into your Python script as shown below:
```
import nanochain

# Set up OpenAI Key
export OPENAI_API_KEY='sk-...'
# or
import openai
openai.api_key = "sk-..."
```

### GPT API Completion
The get_API_Response function is used to interact with the GPT-3.5-turbo API. It sends a prompt and returns the response from the model.

Input:
- prompt (string): The message for which you want to generate a response.  
- system_prompt (string, optional): The prompt to use as the system text in the conversation history. Default is "You are a helpful assistant".  
- stop (string, optional): A sequence of characters at which generation should stop.  

Output:
- response (string): The generated response from GPT-3.5-Turbo.  

Example usage:
```
response = nanochain.get_API_Response("What is the capital of France?")
print(response)
```

### SummaryBufferMemory
The SummaryBufferMemory class is used to store and summarize a conversation. It can be initialized with an optional word_limit parameter.

When using with GPT, you mainly pass the buffer_messages_string() and summary to GPT as the prompt.

Methods/Attributes:
- 'buffer_messages_string()' returns a string representation of all messages currently in the buffer.  
- 'summarize_messages()' generates a summary of all messages in the buffer and returns it as a string. The summary is generated using an API call that summarizes conversation history in 150 words or less.  
- 'add_message(role, message)' adds a new message to the buffer. The 'role' parameter is a string representing who sent the message (either "AI" or "human"), and 'message' is a string representing the content of the message.  
- 'delete_last_message()' removes the last message added to the buffer.  
- 'count_words(text)' counts the number of words in a given text string. 
- all_messages: A list of all messages.
- summary: A string containing the summary of the conversation so far.

Example usage:
```
# Create a new SummaryBufferMemory object with a word limit of 500
sbm = SummaryBufferMemory(word_limit=500)

# Add some messages to the buffer
sbm.add_message("AI", "Hello, how can I help you today?")
sbm.add_message("human", "I'm having trouble with my email.")
sbm.add_message("AI", "What seems to be the problem?")

# Generate a summary of the conversation so far
summary = sbm.summarize_messages()
print(summary)

# Delete the last message added to the buffer
sbm.delete_last_message()

# Add another message
sbm.add_message("human", "I keep getting an error message when I try to send an email.")

# Check the current contents of the buffer
buffer_str = sbm.buffer_messages_string()
print(buffer_str)
```

### File loader
The load_files_from_path function is used to load all the pdf and txt file in a directory or a single pdf/txt file. The function returns a list of tuples (file_name, content_string).

Example usage:
```
pdf_text = nanochain.load_files_from_path("document.pdf")
print(pdf_text[0][1])

txt_text = nanochain.load_files_from_path("document.txt")
print(txt_text[0][1])
```

### Text splitting
The text_splitter function is used to split a large text into smaller sections, optionally with a specified overlap between sections.

Example usage:
```
sections = nanochain.text_splitter(text, n_words=500, overlap=50)
print(sections)
```

### Vector Database
The vectorIndex class is used for indexing and querying text documents using the ChromaDB library.

Methods/Attributes:
- vectorIndex(documents): Initializes a new vector index object with the given documents. The documents parameter is a list of strings, where each string represents a document to be added to the index.
- add_documents(documents): Adds new documents to the index. The documents parameter is a list of strings, where each string represents a document to be added.
- query_documents(query_string, n_results=3): Executes a query against the index and returns the top n matching results as a dictionary. The query_string parameter is a string representing the query to execute, and n_results is an optional integer that specifies the maximum number of results to return (default value is 3).
- delete_documents(ids): Deletes the documents with the specified ids from the index. The ids parameter is a list of strings, where each string represents the id of a document to be deleted.
- persist(): saves the database to a local directory.

Example usage:
```
vector_index = nanochain.vectorIndex(documents=["Document 1", "Document 2"])
vector_index.add_documents(documents=["Document 3", "Document 4"])
results = vector_index.query_documents(query_string="Search query")
print(results)
```

### ReActAgent
The ReActAgent class is designed to perform tasks using available tools. You can add or remove tools, and run the agent with a given task string.

Methods/Attributes:
- ReActAgent(verbose=False, max_steps=10): Initializes a new ReActAgent object. The verbose parameter is an optional boolean that determines whether to print out the thought/action/action input/observation for each step (default is False). The max_steps parameter is an optional integer that specifies the maximum number of steps allowed before the agent stops (default is 10).
- add_tool(tool): Adds a new tool to the list of available tools. The tool parameter is a tuple with three elements: a string representing the name of the tool, a string representing the description of the tool, and a callable function that represents the implementation of the tool.
- remove_tool(tool_name): Removes a tool from the list of available tools by its name.
- get_action(task_string): Generates a string representation of the next action to take based on the current task. The task_string parameter is a string representing the input task/question.
- parse_and_execute(action_string): Parses an action string into its corresponding thought, action, action input, and observation, and executes the appropriate tool function. The action_string parameter is a string representing the action to take.
- run(task_string): Executes the entire ReAct process, taking steps until a final answer is generated or the maximum number of steps is reached. The task_string parameter is a string representing the input task/question.

Example usage:
```
agent = nanochain.ReActAgent(verbose=True, max_steps=10)
agent.add_tool(("New tool", "New tool description", function_name))
agent.remove_tool("New tool")

task_string = "Find the capital of France."
result = agent.run(task_string)
print(result)
```

### Recursive Summary
The get_recursive_summary function is used to generate a summary of a text using a recursive summarization algorithm.

Example usage:
```
summary = nanochain.get_recursive_summary(text)
```
