Metadata-Version: 2.1
Name: tempsentpolicy
Version: 0.0.1
Summary: Designed to ensure robust generalization across time stamps for sentiment analysis in policy-related texts
Author: Anonymous ACL SRW
Author-email: anonaclsrw@proton.me
Keywords: sentiment analysis,temporal,policy text,online text
Classifier: Development Status :: 1 - Planning
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Description-Content-Type: text/markdown
License-File: license.txt
Requires-Dist: matplotlib
Requires-Dist: ruptures
Requires-Dist: pandas
Requires-Dist: tqdm
Requires-Dist: torch
Requires-Dist: transformers
Requires-Dist: datasets
Requires-Dist: scikit-learn
Requires-Dist: numpy

# tempsentpolicy



**Temp**orally-sensitive **Sent**iment analysis in **Policy** (**tempsentpolicy**) is a Python package designed to ensure robust generalization across time stamps for sentiment analysis in policy-related texts.



It accounts for the **temporal dynamics** of evolving policy discourse and evaluates sentiment models under realistic settings that mimic actual policy analysis scenarios.  



Key features include:

- Continuous time-series clustering to prioritize data points for annotation based on temporal trends.  

- Advanced model merging techniques to unify multiple models fine-tuned on distinct temporal slices.  



- Continuous time-series clustering is performed using `ruptures`.

- It supports the following merging techniques:  

    1. Uniform Souping

    2. Greedy Souping

    3. Task Arithmetic

    4. TIES

    5. DARE

    6. Fisher Merging

    7. RegMean Merging  
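To give an intuition for the simplest of these techniques, here is a minimal sketch of uniform souping and task arithmetic. It is not the package's implementation: plain Python lists stand in for model tensors, and the names `uniform_soup`, `task_arithmetic`, and `scale` are hypothetical, chosen for this illustration only.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def uniform_soup(models: List[Weights]) -> Weights:
    """Average each parameter element-wise across all models."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(values))]
        for name, values in models[0].items()
    }


def task_arithmetic(base: Weights, models: List[Weights], scale: float = 1.0) -> Weights:
    """Add the scaled sum of task vectors (fine-tuned minus base) to the base weights."""
    merged = {}
    for name, base_values in base.items():
        merged[name] = [
            b + scale * sum(m[name][i] - b for m in models)
            for i, b in enumerate(base_values)
        ]
    return merged


# Hypothetical weights: one shared base model and two fine-tuned variants.
base = {"w": [0.0, 1.0]}
model_a = {"w": [1.0, 1.0]}
model_b = {"w": [3.0, 2.0]}

soup = uniform_soup([model_a, model_b])
merged = task_arithmetic(base, [model_a, model_b], scale=0.5)
print(soup)    # {'w': [2.0, 1.5]}
print(merged)  # {'w': [2.0, 1.5]}
```

In this toy case the two results coincide because `scale` equals one over the number of models; other scaling factors make task arithmetic differ from a plain average.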







## Requirements





### Required Packages 



The following Python packages are required:  



- matplotlib

- ruptures

- pandas

- tqdm

- torch

- transformers

- datasets

- scikit-learn

- numpy



These are installed automatically when you run `pip install tempsentpolicy`.

Alternatively, you can install them directly:



```bash

pip install matplotlib ruptures pandas tqdm torch transformers datasets scikit-learn numpy

```



### Computational requirements



You require at least one GPU to use `tempsentpolicy`.



VRAM requirements depend on factors such as batch size and model size, but at least 12 GB of VRAM is recommended.
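To check what hardware is available before running anything, a small helper along these lines may be useful. This is a sketch that only relies on standard `torch.cuda` calls; the helper name `describe_gpus` is hypothetical and not part of `tempsentpolicy`.

```python
import importlib.util


def describe_gpus() -> list:
    """Return (name, total VRAM in GB) for each visible CUDA device, or [] if none."""
    if importlib.util.find_spec("torch") is None:
        return []  # PyTorch is not installed
    import torch

    if not torch.cuda.is_available():
        return []  # no CUDA-capable GPU is visible
    return [
        (
            torch.cuda.get_device_name(i),
            torch.cuda.get_device_properties(i).total_memory / 1024**3,
        )
        for i in range(torch.cuda.device_count())
    ]


for name, vram_gb in describe_gpus():
    print(f"{name}: {vram_gb:.1f} GB VRAM")
```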



## Installation 



To install the package, run:



```bash

pip install tempsentpolicy

```



## Quick start



To see how we used the package in our experiments, open `demo.ipynb`. Alternatively, follow the quick guide below:



1. Let's start by performing continuous time-series clustering to select the data to annotate.



```python
from temporally_sensitive_sentiment_analysis_for_policy.merging import greedy_souping, ties
from temporally_sensitive_sentiment_analysis_for_policy.fine_tuning import finetune
from temporally_sensitive_sentiment_analysis_for_policy.continous_time_series_clustering import cluster_sampling, continous_time_series_clustering

# `df` is assumed to be a pandas DataFrame with a timestamp column named "timestamp_col"

# perform continuous time-series clustering (level "M" stands for month);
# the result is assumed to carry the "cluster" column used in the next step
df = continous_time_series_clustering(df=df, time_col="timestamp_col", level="M", plot=False, penalty=0.1)

# sample data points for annotation, stratified by cluster
df = cluster_sampling(df=df, sample_size=10000, stratified_col="cluster")

# keep only the rows selected for annotation
df_annotated = df[df["selected"] == 1].reset_index(drop=True)
```
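To illustrate what a `penalty` typically controls in change-point detection, here is a minimal, self-contained sketch of penalised optimal partitioning with a squared-error cost. It is not how the package or `ruptures` implements it (`ruptures` provides optimised algorithms such as PELT), and it assumes that `penalty` plays the usual role of a fixed cost paid per segment. The function names and the toy monthly counts are hypothetical.

```python
from typing import List


def segment_cost(prefix: List[float], prefix_sq: List[float], start: int, end: int) -> float:
    """Sum of squared deviations from the mean over series[start:end]."""
    n = end - start
    total = prefix[end] - prefix[start]
    total_sq = prefix_sq[end] - prefix_sq[start]
    return total_sq - total * total / n


def detect_change_points(series: List[float], penalty: float) -> List[int]:
    """Return the indices where a new segment starts, minimising cost plus penalty per segment."""
    n = len(series)
    prefix, prefix_sq = [0.0], [0.0]
    for x in series:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    best = [0.0] + [float("inf")] * n  # best[t]: minimal penalised cost of series[:t]
    last_start = [0] * (n + 1)         # start index of the final segment in that optimum
    for end in range(1, n + 1):
        for start in range(end):
            cost = best[start] + segment_cost(prefix, prefix_sq, start, end) + penalty
            if cost < best[end]:
                best[end], last_start[end] = cost, start

    # Walk back through the stored segment starts to recover the boundaries.
    boundaries, end = [], n
    while end > 0:
        start = last_start[end]
        if start > 0:
            boundaries.append(start)
        end = start
    return sorted(boundaries)


def assign_clusters(n: int, boundaries: List[int]) -> List[int]:
    """Label each time step with the index of the segment it falls in."""
    labels, cluster = [], 0
    for i in range(n):
        if i in boundaries:
            cluster += 1
        labels.append(cluster)
    return labels


# Hypothetical number of posts per month, with a clear shift after the third month.
monthly_counts = [10.0, 11.0, 10.0, 30.0, 31.0, 29.0]
change_points = detect_change_points(monthly_counts, penalty=5.0)
clusters = assign_clusters(len(monthly_counts), change_points)
print(change_points)  # [3]
print(clusters)       # [0, 0, 0, 1, 1, 1]
```

A larger penalty makes each additional segment more expensive and so yields fewer, longer clusters; a smaller one yields more, shorter clusters.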



2. Next, let's fine-tune a single unified model.



```python
# fine-tuning
finetune(df=df_annotated, model_name="microsoft/deberta-v3-large", cluster_col_name=None,
         folder_name="naive_finetuning/continous_clustering_by_month/climate_change",  # folder in which the model is saved
         text_col='message', label_col="sentiment", label2id={"Anti": 0, "Neutral": 1, "Pro": 2, "News": 3},
         learning_rate=1e-5, warmup_ratio=0.05, weight_decay=0.001,
         epochs=3, batch_size=6, early_stopping_patience=2, return_val_data=False)
```



3. Now let's fine-tune individual models (one per cluster) and merge them with greedy souping.



```python
# fine-tuning
df_val = finetune(df=df_annotated, model_name="microsoft/deberta-v3-large",
                  cluster_col_name="cluster",  # the difference from step 2: "cluster" instead of None
                  folder_name="batch_finetuning/continous_clustering_by_month/climate_change",  # folder in which the models are saved
                  text_col='message',
                  label_col="sentiment",
                  label2id={"Anti": 0, "Neutral": 1, "Pro": 2, "News": 3},
                  learning_rate=1e-5,
                  warmup_ratio=0.05,
                  weight_decay=0.001,
                  epochs=8,
                  batch_size=6,
                  early_stopping_patience=2,
                  return_val_data=True)  # set to True to return the validation data

# now we perform greedy souping
greedy_souping(
    models_folder="models/batch_finetuning/continous_clustering_by_month/climate_change/deberta-v3-large",  # where the individual models are saved
    save_path="models/merged_models/continous_clustering_by_month/climate_change/greedy_soup/deberta-v3-large",  # the merged model is saved here
    df_val=df_val,  # validation data
    col_label="sentiment",
    text_col='message',
    num_labels=4,
    label2id={"Anti": 0, "Neutral": 1, "Pro": 2, "News": 3})
```
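For readers unfamiliar with greedy souping, the general idea can be sketched as follows: rank the candidate models by validation score, then add each one to the running average only if doing so does not lower the score. This is an illustration of the technique, not the package's `greedy_souping` implementation. Plain lists stand in for tensors, and `greedy_soup_sketch`, `toy_score`, and the toy weights are hypothetical.

```python
from typing import Callable, Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def average(models: List[Weights]) -> Weights:
    """Element-wise average of the given models' parameters."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(values))]
        for name, values in models[0].items()
    }


def greedy_soup_sketch(models: List[Weights], evaluate: Callable[[Weights], float]) -> Weights:
    """Add models in order of validation score; keep each only if the soup does not get worse."""
    ranked = sorted(models, key=evaluate, reverse=True)
    ingredients = [ranked[0]]
    best_score = evaluate(ranked[0])
    for candidate in ranked[1:]:
        score = evaluate(average(ingredients + [candidate]))
        if score >= best_score:
            ingredients.append(candidate)
            best_score = score
    return average(ingredients)


# Toy "validation score": negative squared distance to a hypothetical optimum.
target = [1.0, 1.0]


def toy_score(weights: Weights) -> float:
    return -sum((w - t) ** 2 for w, t in zip(weights["w"], target))


candidates = [{"w": [0.0, 1.0]}, {"w": [2.0, 1.0]}, {"w": [5.0, 5.0]}]
soup = greedy_soup_sketch(candidates, toy_score)
print(soup)  # {'w': [1.0, 1.0]}
```

Here the first two candidates average to the optimum and are kept, while the third would pull the average away from it and is rejected. In practice the score would be a validation metric computed on held-out data such as `df_val`.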



## NOTE



This is a temporary repository for the purposes of anonymity in the ACL SRW submission. Upon acceptance, it will be converted to a package under the main author's PyPI account. Stay tuned for updates!







