Metadata-Version: 2.1
Name: moatless-tree-search
Version: 0.0.2
Summary: 
Author: Albert Örwall
Author-email: albert@moatless.ai
Requires-Python: >=3.10,<=3.13
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: anthropic (>=0.34.1,<0.35.0)
Requires-Dist: faiss-cpu (>=1.8.0.post1,<2.0.0)
Requires-Dist: gitpython (>=3.1.43,<4.0.0)
Requires-Dist: instructor (>=1.3.7,<2.0.0)
Requires-Dist: jsonref (>=1.1.0,<2.0.0)
Requires-Dist: litellm (>=1.44.22,<2.0.0)
Requires-Dist: llama-index (>=0.10.65,<0.11.0)
Requires-Dist: llama-index-embeddings-openai (>=0.1.11,<0.2.0)
Requires-Dist: llama-index-embeddings-voyageai (>=0.1.4,<0.2.0)
Requires-Dist: llama-index-readers-file (>=0.1.33,<0.2.0)
Requires-Dist: matplotlib (>=3.9.2,<4.0.0)
Requires-Dist: moatless-testbeds (>=0.0.5,<0.0.6)
Requires-Dist: networkx (>=3.3,<4.0)
Requires-Dist: numpy (>=1.0,<2.0)
Requires-Dist: openai (>=1.41.0,<2.0.0)
Requires-Dist: pandas (>=2.2.2,<3.0.0)
Requires-Dist: plotly (>=5.24.0,<6.0.0)
Requires-Dist: pyarrow (>=17.0.0,<18.0.0)
Requires-Dist: pydantic (>=2.8.2,<3.0.0)
Requires-Dist: pygraphviz (>=1.14,<2.0)
Requires-Dist: pylint (>=3.2.6,<4.0.0)
Requires-Dist: rapidfuzz (>=3.9.5,<4.0.0)
Requires-Dist: scipy (>=1.14.0,<2.0.0)
Requires-Dist: streamlit (>=1.38.0,<2.0.0)
Requires-Dist: tiktoken (>=0.7.0,<0.8.0)
Requires-Dist: tree-sitter (==0.22.3)
Requires-Dist: tree-sitter-java (==0.21.0)
Requires-Dist: tree-sitter-python (==0.21.0)
Requires-Dist: unidiff (>=0.7.5,<0.8.0)
Description-Content-Type: text/markdown

# Moatless Tree Search 

### Code for paper [SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement](https://arxiv.org/abs/2410.20285)

Note: The original development code can be found at [github.com/a-antoniades/swe-search](https://github.com/a-antoniades/swe-search). It is only intended for reproducing the results in the paper. This is a clean refactor with a modular design, which will be maintained and extended.

##

<div align="center">

[![License](https://img.shields.io/badge/License-Apache_2.0-4D9885.svg?style=flat&logo=apache)](./LICENSE)
[![arXiv](https://img.shields.io/badge/arXiv-2410.20285-B31B1B.svg?style=flat&logo=arxiv)](https://arxiv.org/abs/2410.20285)
[![Streamlit](https://img.shields.io/badge/Demo-Streamlit-FF4B4B.svg?style=flat&logo=streamlit)](https://streamlit.moatless.ai/)
[![YouTube](https://img.shields.io/badge/Video-YouTube-FF0000.svg?style=flat&logo=youtube)](https://www.youtube.com/watch?v=VcEHX_TNDgQ)
[![Twitter](https://img.shields.io/badge/Tweet-X-1DA1F2.svg?style=flat&logo=twitter)](https://x.com/anton_iades/status/1852022811113697307)
[![Discord](https://img.shields.io/badge/👾-Discord-7289DA?style=flat-square)](https://discord.gg/74VX8ppBEg)
</div>

<div align="center">
  <a href="assets/method.pdf" target="_blank">
    <img src="./assets/method.png" alt="Method Diagram" width="100%">
  </a>

  <p>Overview of SWE-Search showing the tree search process, where states (nodes) and actions (edges) are evaluated using contextual information and value function feedback to guide expansion.</p>
</div>

## Installation

Install the package:

```bash
pip install moatless-tree-search
```

### Environment Setup

Before running the evaluation, you'll need:
1. At least one LLM provider API key (e.g., OpenAI, Anthropic, etc.)
2. A Voyage AI API key from [voyageai.com](https://voyageai.com) to use the pre-embedded vector stores for SWE-Bench instances.
3. (Optional) Access to a testbed environment; see [moatless-testbeds](https://github.com/aorwall/moatless-testbeds) for setup instructions.

You can configure these settings in either of two ways:

1. Create a `.env` file in the project root (copy from `.env.example`):

   ```bash
   cp .env.example .env
   # Edit .env with your values
   ```

2. Or export the variables directly:
   
   ```bash
   # Directory for storing vector index store files  
   export INDEX_STORE_DIR="/tmp/index_store"    

   # Directory for storing cloned repositories
   export REPO_DIR="/tmp/repos"

   # Required: At least one LLM provider API key
   export OPENAI_API_KEY="<your-key>"
   export ANTHROPIC_API_KEY="<your-key>"
   export HUGGINGFACE_API_KEY="<your-key>"
   export DEEPSEEK_API_KEY="<your-key>"

   # ...or Base URL for custom LLM API service (optional)
   export CUSTOM_LLM_API_BASE="<your-base-url>"
   export CUSTOM_LLM_API_KEY="<your-key>"

   # Required: API Key for Voyage Embeddings
   export VOYAGE_API_KEY="<your-key>"

   # Optional: Configuration for testbed environment (https://github.com/aorwall/moatless-testbeds)
   export TESTBED_API_KEY="<your-key>"
   export TESTBED_BASE_URL="<your-base-url>"
   ```
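
Before starting a long run, it can help to check that the variables above are actually set. The following is a minimal sketch, not part of the package: the variable names come from the export block above, while the function name `check_environment`, the grouping of the keys, and the rule that the two testbed variables should be set together are assumptions made for illustration.

```python
import os

# Names taken from the export block above. Treating CUSTOM_LLM_API_KEY as one
# of the "at least one" LLM keys is this sketch's reading of the comments.
LLM_KEY_VARS = (
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "HUGGINGFACE_API_KEY",
    "DEEPSEEK_API_KEY",
    "CUSTOM_LLM_API_KEY",
)
REQUIRED_VARS = ("VOYAGE_API_KEY",)
TESTBED_VARS = ("TESTBED_API_KEY", "TESTBED_BASE_URL")


def check_environment(env):
    """Return a list of problems; an empty list means the setup looks complete."""
    problems = []
    if not any(env.get(name) for name in LLM_KEY_VARS):
        problems.append("no LLM provider API key is set")
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    # Assumption: the optional testbed settings are only useful as a pair.
    configured = [name for name in TESTBED_VARS if env.get(name)]
    if 0 < len(configured) < len(TESTBED_VARS):
        problems.append("testbed is only partially configured")
    return problems


if __name__ == "__main__":
    for problem in check_environment(os.environ):
        print(f"warning: {problem}")
```

Passing the environment in as a mapping keeps the check easy to try with a plain dictionary instead of the real process environment.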


## Streamlit
To launch the Streamlit app, run:

```bash
# Launch with direct file loading
moatless-streamlit path/to/trajectory.json

# Launch interactive UI (file can be selected in browser)
moatless-streamlit
```

The following badges are used to indicate the status of a node:

| Badge | Shape | Color | Description |
|-------|-------|-------|-------------|
| ⭐ | Star | Green | Node is marked as resolved |
| ❌ | X | Red | Invalid edits or failed tests |
| 🟢 | Circle | Green | Correct code spans present in the context |
| 🟡 | Circle | Yellow | Either:<br>• Found files but not spans<br>• Found spans but in wrong files |
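
The table can be read as a small decision rule. The sketch below is only an illustration of that reading, not the Streamlit app's code: the function `node_badge`, its boolean parameters, and the precedence (resolved first, then failures, then context quality) are all hypothetical.

```python
from typing import Optional


def node_badge(resolved: bool, failed: bool, found_files: bool, found_spans: bool) -> Optional[str]:
    """Map hypothetical node flags to the badge described in the table.

    Assumed precedence: resolved > failed > context quality. The real app
    may combine or order these differently.
    """
    if resolved:
        return "⭐"  # node is marked as resolved
    if failed:
        return "❌"  # invalid edits or failed tests
    if found_files and found_spans:
        return "🟢"  # correct code spans present in the context
    if found_files or found_spans:
        return "🟡"  # files without spans, or spans in the wrong files
    return None  # no badge applies


if __name__ == "__main__":
    print(node_badge(resolved=False, failed=False, found_files=True, found_spans=False))
```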

## Evaluation

To run the evaluation script:

```bash
moatless-evaluate \
    --model "gpt-4o-mini" \
    --repo_base_dir /tmp/repos \
    --eval_dir "./evaluations" \
    --eval_name mts \
    --temp 0.7 \
    --num_workers 1 \
    --use_testbed \
    --feedback \
    --max_iterations 100 \
    --max_expansions 5
```

To evaluate a specific instance or a list of instances, set `--instance_ids`.

Use `--use_testbed` only if you have access to a testbed environment. Without it, tests will not be run.

## Examples

### Example: Basic Flow
Basic setup similar to the moatless-tools agent.

```python
from moatless.agent import CodingAgent
from moatless.agent.code_prompts import SIMPLE_CODE_PROMPT
from moatless.benchmark.swebench import create_repository
from moatless.benchmark.utils import get_moatless_instance
from moatless.completion import CompletionModel
from moatless.file_context import FileContext
from moatless.index import CodeIndex
from moatless.search_tree import SearchTree
from moatless.actions import FindClass, FindFunction, FindCodeSnippet, SemanticSearch, RequestMoreContext, RequestCodeChange, Finish, Reject

index_store_dir = "/tmp/index_store"
repo_base_dir = "/tmp/repos"
persist_path = "trajectory.json"

instance = get_moatless_instance("django__django-16379")

completion_model = CompletionModel(model="gpt-4o", temperature=0.0)

repository = create_repository(instance, repo_base_dir=repo_base_dir)

code_index = CodeIndex.from_index_name(
    instance["instance_id"], index_store_dir=index_store_dir, file_repo=repository
)

actions = [
    FindClass(code_index=code_index, repository=repository),
    FindFunction(code_index=code_index, repository=repository),
    FindCodeSnippet(code_index=code_index, repository=repository),
    SemanticSearch(code_index=code_index, repository=repository),
    RequestMoreContext(repository=repository),
    RequestCodeChange(repository=repository, completion_model=completion_model),
    Finish(),
    Reject()
]

file_context = FileContext(repo=repository)
agent = CodingAgent(actions=actions, completion=completion_model, system_prompt=SIMPLE_CODE_PROMPT)

search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    max_expansions=1,
    max_iterations=50
)

node = search_tree.run_search()
print(node.observation.message)
```

### Example: MCTS Flow

How to set up the evaluation flow with MCTS and testbeds.

```python
from moatless.agent import CodingAgent
from moatless.benchmark.swebench import create_repository
from moatless.benchmark.utils import get_moatless_instance
from moatless.completion import CompletionModel
from moatless.discriminator import AgentDiscriminator
from moatless.feedback import FeedbackGenerator
from moatless.file_context import FileContext
from moatless.index import CodeIndex
from moatless.search_tree import SearchTree
from moatless.selector import BestFirstSelector
from moatless.actions import FindClass, FindFunction, FindCodeSnippet, SemanticSearch, RequestMoreContext, RequestCodeChange, Finish, Reject, RunTests
from moatless.value_function import ValueFunction
from testbeds.sdk import TestbedSDK
from moatless.runtime.testbed import TestbedEnvironment

index_store_dir = "/tmp/index_store"
repo_base_dir = "/tmp/repos"
persist_path = "trajectory.json"

instance = get_moatless_instance("django__django-16379")

completion_model = CompletionModel(model="gpt-4o-mini", temperature=0.7)

repository = create_repository(instance, repo_base_dir=repo_base_dir)

code_index = CodeIndex.from_index_name(
    instance["instance_id"], index_store_dir=index_store_dir, file_repo=repository
)

file_context = FileContext(repo=repository)

selector = BestFirstSelector()

value_function = ValueFunction(completion=completion_model)

discriminator = AgentDiscriminator(
    completion=completion_model,
    n_agents=5,
    n_rounds=3,
)

feedback = FeedbackGenerator()

runtime = TestbedEnvironment(
    testbed_sdk=TestbedSDK(),
    repository=repository,
    instance=instance
)

actions = [
    FindClass(code_index=code_index, repository=repository),
    FindFunction(code_index=code_index, repository=repository),
    FindCodeSnippet(code_index=code_index, repository=repository),
    SemanticSearch(code_index=code_index, repository=repository),
    RequestMoreContext(repository=repository),
    RequestCodeChange(repository=repository, completion_model=completion_model),
    RunTests(code_index=code_index, repository=repository, runtime=runtime),
    Finish(),
    Reject()
]

agent = CodingAgent(actions=actions, completion=completion_model)

search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    selector=selector,
    value_function=value_function,
    discriminator=discriminator,
    feedback_generator=feedback,
    max_iterations=100,
    max_expansions=3,
    max_depth=25,
    persist_path=persist_path,
)

node = search_tree.run_search()
print(node.observation.message)
```
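
To give a feel for how limits such as `max_iterations`, `max_expansions`, and `max_depth` can bound a search, here is a toy best-first loop in plain Python. It is a sketch under stated assumptions, not the `SearchTree` implementation: `ToyNode`, `toy_best_first_search`, and the callbacks `expand` and `evaluate` are hypothetical, and the interpretation of each limit (one new node per iteration, a cap on children per node, a cap on distance from the root) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyNode:
    state: int
    depth: int = 0
    value: float = 0.0
    children: List["ToyNode"] = field(default_factory=list)


def toy_best_first_search(
    root_state: int,
    expand: Callable[[int, int], int],
    evaluate: Callable[[int], float],
    max_iterations: int,
    max_expansions: int,
    max_depth: int,
) -> ToyNode:
    """Grow a tree one node per iteration and return the highest-valued node."""
    root = ToyNode(state=root_state, value=evaluate(root_state))
    nodes = [root]
    for _ in range(max_iterations):
        # Selection: only nodes below the child and depth limits may be expanded.
        candidates = [
            n for n in nodes
            if len(n.children) < max_expansions and n.depth < max_depth
        ]
        if not candidates:
            break
        parent = max(candidates, key=lambda n: n.value)
        # Expansion: create one child and score it with the value function.
        child_state = expand(parent.state, len(parent.children))
        child = ToyNode(
            state=child_state,
            depth=parent.depth + 1,
            value=evaluate(child_state),
        )
        parent.children.append(child)
        nodes.append(child)
    return max(nodes, key=lambda n: n.value)


if __name__ == "__main__":
    # Toy problem: reach 10 from 0; the i-th child of a state adds i + 1.
    best = toy_best_first_search(
        root_state=0,
        expand=lambda state, child_index: state + child_index + 1,
        evaluate=lambda state: -abs(10 - state),
        max_iterations=20,
        max_expansions=2,
        max_depth=5,
    )
    print(best.state, best.depth, best.value)
```

With `max_expansions=1` the loop degenerates into a single linear trajectory, which mirrors the role of `max_expansions=1` in the basic flow example.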

### Citation
```bibtex
@misc{antoniades2024swesearchenhancingsoftwareagents,
      title={SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement}, 
      author={Antonis Antoniades and Albert Örwall and Kexun Zhang and Yuxi Xie and Anirudh Goyal and William Wang},
      year={2024},
      eprint={2410.20285},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.20285}, 
}
```
