Cesar42 committed on
Commit
71ac511
·
1 Parent(s): d54b533

Create README.md

  1. README.md +102 -0
README.md CHANGED
@@ -0,0 +1,102 @@
---
license: apache-2.0
datasets:
- dair-ai/emotion
language:
- en
metrics:
- accuracy
tags:
- emotion
---
# Model

A `bert-base-uncased` model fine-tuned on the `dair-ai/emotion` dataset for emotion classification.

## Model Details

Base model: bert-base-uncased

Dataset: dair-ai/emotion

Training configuration:

* num_train_epochs: 8
* learning_rate: 2e-5
* weight_decay: 0.01
* batch_size: 64

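The hyperparameters above correspond to keyword arguments of `transformers.TrainingArguments`. The following is a minimal sketch of that mapping, not the author's training script. It assumes that `batch_size` means `per_device_train_batch_size`, that training ran on one device without gradient accumulation, and that the standard 16,000-example train split of `dair-ai/emotion` was used.

```python
import math

# Hyperparameters as listed in this card. Mapping `batch_size` to
# `per_device_train_batch_size` is an assumption about how the run was set up.
training_kwargs = {
    "num_train_epochs": 8,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 64,
}

# Assumption: the standard train split of dair-ai/emotion (16,000 examples),
# one device, no gradient accumulation.
TRAIN_EXAMPLES = 16_000


def total_optimizer_steps(num_examples, batch_size, epochs):
    """Number of optimizer steps when the last partial batch is kept."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


steps = total_optimizer_steps(
    TRAIN_EXAMPLES,
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["num_train_epochs"],
)
print(steps)
```

Under those assumptions, the dictionary could be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `"out"` is a placeholder directory name.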
## Evaluation Results

```json
{
  "test_loss": 0.14830373227596283,
  "test_accuracy": 0.9415,
  "test_f1": 0.9411005763302622,
  "test_runtime": 8.372,
  "test_samples_per_second": 238.892,
  "test_steps_per_second": 3.822
}
```

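The runtime and throughput figures allow a quick sanity check on the evaluation run. The sketch below only does arithmetic on the reported numbers, assuming they all come from the same evaluation pass. It suggests roughly 2,000 evaluated examples processed in 32 batches, which is consistent with a batch size of 64.

```python
import math

# Figures copied from the evaluation block in this card.
metrics = {
    "test_accuracy": 0.9415,
    "test_runtime": 8.372,
    "test_samples_per_second": 238.892,
    "test_steps_per_second": 3.822,
}

# samples = sample throughput * runtime; steps = step rate * runtime
n_samples = round(metrics["test_samples_per_second"] * metrics["test_runtime"])
n_steps = round(metrics["test_steps_per_second"] * metrics["test_runtime"])

# Number of correctly classified examples implied by the accuracy.
n_correct = round(metrics["test_accuracy"] * n_samples)

# With a batch size of 64, the last batch is partial, hence the ceiling.
steps_at_batch_64 = math.ceil(n_samples / 64)

print(n_samples, n_steps, n_correct, steps_at_batch_64)
```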
## How to use the model

```python
from transformers import pipeline

# Load this card's fine-tuned checkpoint from the Hub
model_path = "Cesar42/bert-base-uncased-emotion_v2"
emotion_analysis = pipeline("text-classification", framework="pt", model=model_path, tokenizer=model_path)
result = emotion_analysis("Einstein dijo: Solo hay dos cosas infinitas, el universo y los pinches anuncios de bitcoin en Twitter. Paren ya carajo aaaaaaghhgggghhh me quiero murir")
print(result)
```

Sample output (the label and score depend on the model and the input text):

```
[{'label': 'anger', 'score': 0.48307016491889954}]
```

## Full classification example

```python
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer, AutoConfig
import numpy as np
from scipy.special import softmax


# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)


model_path = "Cesar42/bert-base-uncased-emotion_v2"
tokenizer = AutoTokenizer.from_pretrained(model_path)
config = AutoConfig.from_pretrained(model_path)

# PT
model = AutoModelForSequenceClassification.from_pretrained(model_path)

text = "Se ha quedao bonito día para publicar vídeo, ¿no? Hoy del tema más diferente que hemos tocado en el canal."
text = preprocess(text)
print(text)

encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
# Logits for the single input sentence, converted to probabilities
scores = output[0][0].detach().numpy()
scores = softmax(scores)

# Print labels and scores, most likely label first
ranking = np.argsort(scores)
ranking = ranking[::-1]
for i in range(scores.shape[0]):
    l = config.id2label[int(ranking[i])]
    s = scores[ranking[i]]
    print(f"{i+1}) {l} {np.round(float(s), 4)}")
```

Sample output (label names come from the `id2label` mapping in the model config, so they may differ for this checkpoint):

```
Se ha quedao bonito día para publicar vídeo, ¿no? Hoy del tema más diferente que hemos tocado en el canal.
1) joy 0.7887
2) others 0.1679
3) surprise 0.0152
4) sadness 0.0145
5) anger 0.0077
6) disgust 0.0033
7) fear 0.0027
```
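The post-processing step in the example above (softmax over the logits, then sorting labels by probability) can be illustrated without loading the model. This is a dependency-free sketch, not the library's implementation. The logits are hypothetical, and the label names are assumed to be the six classes of `dair-ai/emotion` in dataset order; the checkpoint's own `config.id2label` may use different names.

```python
import math

# Assumption: the six class names of dair-ai/emotion in dataset order.
# The model's own config.id2label may use different names (e.g. LABEL_0).
ID2LABEL = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in order]


# Hypothetical logits for one sentence, not produced by the model.
example_logits = [0.2, 3.1, 1.0, -0.5, -1.2, 0.0]
ranked = rank_labels(example_logits, ID2LABEL)
for position, (label, prob) in enumerate(ranked, start=1):
    print(f"{position}) {label} {prob:.4f}")
```

Subtracting the maximum logit before exponentiating leaves the probabilities unchanged and avoids overflow for large logits.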

### References

* bhadresh-savani/bert-base-uncased-emotion
* [Colab Notebook](https://github.com/bhadreshpsavani/ExploringSentimentalAnalysis/blob/main/SentimentalAnalysisWithDistilbert.ipynb) by bhadresh-savani