---
license: apache-2.0
---
|
### DreamBank Custom Architecture

This repository contains the weights for the custom architecture presented in the paper [Automatic Annotation of Dream Report’s Emotional Content with Large Language Models](https://aclanthology.org/2024.clpsych-1.7/).

A working example of how to load and use the model is given below. Please refer to the [Git repo](https://github.com/lorenzoscottb/Dream_Reports_Annotation/tree/main/Experiments/Supervised_Learning) for more details.
|
|
|
#### Use |
|
|
|
```py
import torch, os
import numpy as np
import pandas as pd
from tqdm import tqdm

from torch.utils.data import DataLoader
from transformers import AutoConfig
from transformers import BertTokenizerFast

# CustomDataset, BERT_PTM, validation and decode_clean are defined in SL_utils (see the Git repo)
from SL_utils import *

# Emotion codes used in the annotation scheme
Coding_emotions = {
    "AN": "Anger",
    "AP": "Apprehension",
    "SD": "Sadness",
    "CO": "Confusion",
    "HA": "Happiness",
}

emotions_list = list(Coding_emotions.keys())

test_sentences = [
    "In my dream I was followed by the scary monster.",
    "I was walking in a forest, surrounded by singing birds. I was in calm and peace.",
]

# Placeholder all-zero labels: only needed so the DataFrame matches the expected format
test_sentences_target = len(test_sentences) * [[0, 0, 0, 0, 0]]
test_sentences_df = pd.DataFrame.from_dict(
    {
        "report": test_sentences,
        "Report_as_Multilabel": test_sentences_target,
    }
)
```
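The `CustomDataset` class used below ships with `SL_utils` in the linked repository, so it does not need to be re-implemented. As a point of reference only, the sketch that follows shows what such a DataFrame-backed multi-label dataset typically looks like (tokenise the `report` column, expose the `Report_as_Multilabel` targets); the class name, dictionary keys, and tokenizer options here are illustrative assumptions, not the actual `SL_utils` implementation.

```py
# Hypothetical sketch of a DataFrame-backed multi-label dataset;
# the real CustomDataset is defined in SL_utils (see the Git repo).
import torch
from torch.utils.data import Dataset

class SketchDreamDataset(Dataset):
    def __init__(self, dataframe, tokenizer, max_length=512):
        self.reports = dataframe["report"].tolist()
        self.targets = dataframe["Report_as_Multilabel"].tolist()
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.reports)

    def __getitem__(self, index):
        # Tokenise one report and pair it with its multi-label target vector
        encoding = self.tokenizer(
            self.reports[index],
            truncation=True,
            padding="max_length",
            max_length=self.max_length,
            return_tensors="pt",
        )
        return {
            "ids": encoding["input_ids"].squeeze(0),
            "mask": encoding["attention_mask"].squeeze(0),
            "targets": torch.tensor(self.targets[index], dtype=torch.float),
        }
```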
|
|
|
```py
model_name = "bert-large-cased"
model_config = AutoConfig.from_pretrained(model_name)
tokenizer = BertTokenizerFast.from_pretrained(model_name, do_lower_case=False)

# Tokenise the reports and wrap them in a DataLoader
testing_set = CustomDataset(test_sentences_df, tokenizer, max_length=512)
test_params = {
    'batch_size': 2,
    'shuffle': True,
    'num_workers': 0
}
testing_loader = DataLoader(testing_set, **test_params)

# Instantiate the custom architecture on top of the chosen BERT encoder
model = BERT_PTM(
    model_config,
    model_name=model_name,
    n_classes=len(emotions_list),
    freeze_BERT=False,
)

# Load the weights of the pre-trained model
model.load_state_dict(torch.load("path/to/pytorch_model.bin"))
model.to("cuda")
```
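The path passed to `torch.load` above is a placeholder for a local copy of the weight file. If the weights are fetched from the Hugging Face Hub instead, one option is `huggingface_hub.hf_hub_download`; the repo id below is a placeholder to be replaced with this repository's id.

```py
# Optional: download the weight file from the Hub rather than using a local path.
from huggingface_hub import hf_hub_download

weights_path = hf_hub_download(
    repo_id="user/dreambank-custom-architecture",  # placeholder, use this repo's id
    filename="pytorch_model.bin",
)
model.load_state_dict(torch.load(weights_path, map_location="cpu"))
model.to("cuda")
```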
|
|
|
```py
# Run inference: outputs are per-emotion scores, ids are the tokenised inputs
outputs, targets, ids = validation(model, testing_loader, device="cuda", return_inputs=True)

# Binarise the scores with a 0.5 threshold, one column per emotion
corr_outputs = np.array(outputs) >= 0.5
corr_outputs_df = pd.DataFrame(corr_outputs, columns=emotions_list)
corr_outputs_df = corr_outputs_df.astype(int)

# Decode the token ids back into (cleaned) report text
corr_outputs_df["report"] = [decode_clean(x, tokenizer) for x in tqdm(ids)]
```
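Each row of `corr_outputs_df` now holds one report and a binary flag per emotion code. The flags can be mapped back to readable emotion names with the `Coding_emotions` dictionary defined earlier, for example:

```py
# Print, for every report, the emotions whose score passed the 0.5 threshold
for _, row in corr_outputs_df.iterrows():
    detected = [Coding_emotions[code] for code in emotions_list if row[code] == 1]
    print(row["report"])
    print("  ->", detected if detected else "no emotion above threshold")
```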
|
|
|
### Cite |
|
If you use this model in your work or research, please cite:
|
```bibtex
@inproceedings{bertolini-etal-2024-automatic,
    title = "Automatic Annotation of Dream Report{'}s Emotional Content with Large Language Models",
    author = "Bertolini, Lorenzo  and
      Elce, Valentina  and
      Michalak, Adriana  and
      Widhoelzl, Hanna-Sophia  and
      Bernardi, Giulio  and
      Weeds, Julie",
    booktitle = "Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)",
    month = mar,
    year = "2024",
    address = "St. Julians, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.clpsych-1.7",
    pages = "92--107",
}
```