{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generating Synthetic Annotations\n",
    "\n",
    "This short guide shows how to use SynDisco's LLM annotator-agents to generate annotations for synthetic discussions. Annotator-agents let you evaluate the discussions generated in the previous guide quickly and cheaply."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, let's create a small, fake discussion and write it to a temporary file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-04-04T13:19:16.254335Z",
     "iopub.status.busy": "2025-04-04T13:19:16.254007Z",
     "iopub.status.idle": "2025-04-04T13:19:16.261081Z",
     "shell.execute_reply": "2025-04-04T13:19:16.260241Z"
    }
   },
   "outputs": [],
   "source": [
    "import tempfile\n",
    "\n",
    "discussion_str = \"\"\"\n",
    "{\n",
    "  \"id\": \"789f7c2f-7291-457b-888a-7d2b1520454a\",\n",
    "  \"timestamp\": \"25-03-26-11-14\",\n",
    "  \"users\": [\n",
    "    \"Emma35\",\n",
    "    \"Giannis\",\n",
    "    \"Moderator\"\n",
    "  ],\n",
    "  \"moderator\": \"Moderator\",\n",
    "  \"user_prompts\": [\n",
    "    \"You are taking part in an online conversation Your name is Emma35. Your traits: username: Emma35, age: 38, sex: female, sexual_orientation: Heterosexual, demographic_group: Latino, current_employment: Registered Nurse, education_level: Bachelor's, special_instructions: , personality_characteristics: ['compassionate', 'patient', 'diligent', 'overwhelmed'] Your instructions: Act like a human would\",\n",
    "    \"You are taking part in an online conversation Your name is Giannis. Your traits: username: Giannis, age: 21, sex: male, sexual_orientation: Pansexual, demographic_group: White, current_employment: Game Developer, education_level: College, special_instructions: , personality_characteristics: ['strategic', 'meticulous', 'nerdy', 'hyper-focused'] Your instructions: Act like a human would\",\n",
    "    \"You are taking part in an online conversation Your name is Moderator. Your traits: username: Moderator, age: 41, sex: male, sexual_orientation: Pansexual, demographic_group: White, current_employment: Moderator, education_level: PhD, special_instructions: , personality_characteristics: ['strict', 'neutral', 'just'] Your instructions: You are a moderator. Oversee the conversation\"\n",
    "  ],\n",
    "  \"moderator_prompt\": \"You are taking part in an online conversation Your name is Moderator. Your traits: username: Moderator, age: 41, sex: male, sexual_orientation: Pansexual, demographic_group: White, current_employment: Moderator, education_level: PhD, special_instructions: , personality_characteristics: ['strict', 'neutral', 'just'] Your instructions: You are a moderator. Oversee the conversation\",\n",
    "  \"ctx_length\": 5,\n",
    "  \"logs\": [\n",
    "    {\n",
    "      \"name\": \"Emma35\",\n",
    "      \"text\": \"Immigrants have played a significant role in our society. Their contributions are valuable and should be celebrated.\",\n",
    "      \"model\": \"test_model\"\n",
    "    },\n",
    "    {\n",
    "      \"name\": \"Giannis\",\n",
    "      \"text\": \"That's such an ignorant comment about immigrants. She doesn't know what she's talking about, let alone appreciate the hard work and dedication of immigrants who have contributed to our country.\",\n",
    "      \"model\": \"test_model\"\n",
    "    },\n",
    "    {\n",
    "      \"name\": \"Moderator\",\n",
    "      \"text\": \"I understand both perspectives. It's important to approach such discussions with respect and understanding. Let's ensure this conversation remains constructive.\",\n",
    "      \"model\": \"test_model\"\n",
    "    }\n",
    "  ]\n",
    "}\n",
    "\"\"\"\n",
    "\n",
    "discussion_file = tempfile.NamedTemporaryFile(delete=True)\n",
    "with open(discussion_file.name, mode=\"w\") as f:\n",
    "    f.write(discussion_str)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating an annotator\n",
    "\n",
    "An annotator-agent is very similar to the user-agents from the [previous guide](creating_discussion.ipynb). The only real difference is that we pass `ActorType.ANNOTATOR` as the `actor_type` argument of the `Actor` constructor."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-04-04T13:19:16.263381Z",
     "iopub.status.busy": "2025-04-04T13:19:16.263086Z",
     "iopub.status.idle": "2025-04-04T13:20:41.248417Z",
     "shell.execute_reply": "2025-04-04T13:20:41.247532Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Device set to use cuda:1\n"
     ]
    }
   ],
   "source": [
    "from syndisco.model import TransformersModel\n",
    "from syndisco.actors import Actor, ActorType, Persona\n",
    "\n",
    "\n",
    "llm = TransformersModel(\n",
    "    model_path=\"unsloth/Llama-3.2-3B-Instruct-bnb-4bit\",\n",
    "    name=\"test_model\",\n",
    "    max_out_tokens=100,\n",
    ")\n",
    "\n",
    "\n",
    "# Annotator agents are not given a username\n",
    "# in order to avoid accidental name conflicts with participants\n",
    "annotator_persona = Persona(\n",
    "    username=\"\",\n",
    "    age=38,\n",
    "    sex=\"female\",\n",
    "    education_level=\"Bachelor's\",\n",
    "    sexual_orientation=\"Heterosexual\",\n",
    "    demographic_group=\"White\",\n",
    "    current_employment=\"Annotator\",\n",
    "    special_instructions=\"\",\n",
    "    personality_characteristics=[\"competent\"],\n",
    ")\n",
    "\n",
    "annotator = Actor(\n",
    "    model=llm,\n",
    "    persona=annotator_persona,\n",
    "    context=\"You are annotating an online discussion\",\n",
    "    instructions=(\n",
    "        \"From a scale of 1 (not toxic) to 5 (very toxic) \"\n",
    "        \"how toxic is  the following comment? Reply only with a number?\"\n",
    "    ),\n",
    "    actor_type=ActorType.ANNOTATOR,\n",
    ")"
   ]
  },
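  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Executing the annotation\n",
    "\n",
    "An annotation job is treated as a special kind of discussion: a stand-in user posts the comments of the saved discussion one by one, and the annotator-agent replies to each with its rating. Since we set `include_moderator_comments=True`, the moderator's comments are rated as well."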
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-04-04T13:20:41.251523Z",
     "iopub.status.busy": "2025-04-04T13:20:41.251092Z",
     "iopub.status.idle": "2025-04-04T13:20:41.973520Z",
     "shell.execute_reply": "2025-04-04T13:20:41.972704Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      " 33%|███▎      | 1/3 [00:01<00:02,  1.18s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "User Emma35 posted: Immigrants have played a significant role in our\n",
      "society. Their contributions are valuable and should be celebrated.\n",
      "1\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      " 67%|██████▋   | 2/3 [00:01<00:00,  1.44it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "User Giannis posted: That's such an ignorant comment about immigrants.\n",
      "She doesn't know what she's talking about, let alone appreciate the\n",
      "hard work and dedication of immigrants who have contributed to our\n",
      "country.\n",
      "3\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 3/3 [00:01<00:00,  1.60it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "User Moderator posted: I understand both perspectives. It's important\n",
      "to approach such discussions with respect and understanding. Let's\n",
      "ensure this conversation remains constructive.\n",
      "1\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "from syndisco.jobs import Annotation\n",
    "\n",
    "ann_conv = Annotation(\n",
    "    annotator=annotator,\n",
    "    conv_logs_path=discussion_file.name,\n",
    "    include_moderator_comments=True,\n",
    ")\n",
    "ann_conv.begin()"
   ]
  },
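  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The annotator is instructed to reply only with a number, but an LLM may still wrap its answer in extra text. The cell below is a minimal sketch of how a raw reply could be turned into a validated score. Note that `parse_toxicity_score` is a hypothetical helper written for this guide and is not part of SynDisco."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "\n",
    "# hypothetical helper (not part of SynDisco)\n",
    "def parse_toxicity_score(reply: str, low: int = 1, high: int = 5) -> int | None:\n",
    "    \"\"\"Return the first standalone integer in `reply` if it lies in [low, high], else None.\"\"\"\n",
    "    match = re.search(r\"\\b\\d+\\b\", reply)\n",
    "    if match is None:\n",
    "        return None\n",
    "    score = int(match.group())\n",
    "    return score if low <= score <= high else None\n",
    "\n",
    "\n",
    "print(parse_toxicity_score(\"3\"))  # 3\n",
    "print(parse_toxicity_score(\"Toxicity: 4 out of 5\"))  # 4\n",
    "print(parse_toxicity_score(\"I cannot rate this comment.\"))  # None"
   ]
  },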
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As with regular discussions, we recommend saving the annotations to disk."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-04-04T13:20:41.975983Z",
     "iopub.status.busy": "2025-04-04T13:20:41.975837Z",
     "iopub.status.idle": "2025-04-04T13:20:41.980682Z",
     "shell.execute_reply": "2025-04-04T13:20:41.980004Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"conv_id\": \"789f7c2f-7291-457b-888a-7d2b1520454a\",\n",
      "  \"timestamp\": \"25-11-17-15-55\",\n",
      "  \"annotator_model\": \"test_model\",\n",
      "  \"annotator_prompt\": {\n",
      "    \"context\": \"You are annotating an online discussion\",\n",
      "    \"instructions\": \"From a scale of 1 (not toxic) to 5 (very toxic) how toxic is  the following comment? Reply only with a number?\",\n",
      "    \"type\": \"2\",\n",
      "    \"persona\": {\n",
      "      \"username\": \"\",\n",
      "      \"age\": 38,\n",
      "      \"sex\": \"female\",\n",
      "      \"sexual_orientation\": \"Heterosexual\",\n",
      "      \"demographic_group\": \"White\",\n",
      "      \"current_employment\": \"Annotator\",\n",
      "      \"education_level\": \"Bachelor's\",\n",
      "      \"special_instructions\": \"\",\n",
      "      \"personality_characteristics\": [\n",
      "        \"competent\"\n",
      "      ]\n",
      "    }\n",
      "  },\n",
      "  \"ctx_length\": 2,\n",
      "  \"logs\": [\n",
      "    [\n",
      "      \"Immigrants have played a significant role in our society. Their contributions are valuable and should be celebrated.\",\n",
      "      \"1\"\n",
      "    ],\n",
      "    [\n",
      "      \"That's such an ignorant comment about immigrants. She doesn't know what she's talking about, let alone appreciate the hard work and dedication of immigrants who have contributed to our country.\",\n",
      "      \"3\"\n",
      "    ],\n",
      "    [\n",
      "      \"I understand both perspectives. It's important to approach such discussions with respect and understanding. Let's ensure this conversation remains constructive.\",\n",
      "      \"1\"\n",
      "    ]\n",
      "  ]\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "\n",
    "tp = tempfile.NamedTemporaryFile(delete=True)\n",
    "\n",
    "ann_conv.to_json_file(tp.name)\n",
    "\n",
    "# NOTE: reopening an open NamedTemporaryFile by name works on POSIX systems;\n",
    "# on Windows, create the file with delete=False instead\n",
    "with open(tp.name, mode=\"rb\") as f:\n",
    "    print(json.dumps(json.load(f), indent=2))"
   ]
  }
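  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each entry of the saved `logs` list is a `[comment, annotation]` pair, so the ratings can be summarized with a few lines of standard-library Python. The cell below is a minimal sketch that assumes every usable annotation is a bare number, as in the output above. Here `saved_logs` is a shortened, hand-copied stand-in for `json.load(f)[\"logs\"]`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from statistics import mean\n",
    "\n",
    "# shortened stand-in for the \"logs\" field of the saved annotation file\n",
    "saved_logs = [\n",
    "    [\"Immigrants have played a significant role in our society.\", \"1\"],\n",
    "    [\"That's such an ignorant comment about immigrants.\", \"3\"],\n",
    "    [\"I understand both perspectives.\", \"1\"],\n",
    "]\n",
    "\n",
    "# keep only the annotations that are plain integers\n",
    "scores = [int(a) for _, a in saved_logs if a.strip().isdigit()]\n",
    "\n",
    "print(f\"annotated comments: {len(scores)}\")\n",
    "print(f\"mean toxicity: {mean(scores):.2f}\")"
   ]
  }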
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "syndisco",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
