Commit 82b248b (parent: f2f8351) · Update README.md

README.md
Model Description

This model is based on the BERT (Bidirectional Encoder Representations from Transformers) model, specifically bert-base-uncased.

Training Procedure

The model was trained on the TripAdvisor hotel reviews dataset. Each review in the dataset is associated with a rating from 1 to 5. The ratings were converted to sentiment labels as follows:

Ratings of 1 and 2 were labelled as 'Negative'
Rating of 3 was labelled as 'Neutral'
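
A minimal sketch of this mapping, assuming (the source only lists ratings 1–3) that the remaining ratings of 4 and 5 were labelled 'Positive':

```python
def rating_to_label(rating: int) -> str:
    """Map a 1-5 review rating to a sentiment label."""
    if rating <= 2:
        return "Negative"
    if rating == 3:
        return "Neutral"
    return "Positive"  # assumed mapping for ratings 4 and 5

print([rating_to_label(r) for r in range(1, 6)])
# → ['Negative', 'Negative', 'Neutral', 'Positive', 'Positive']
```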

The text of each review was preprocessed by lowercasing, removing punctuation, e…

The model was trained with a learning rate of 2e-5, an epsilon of 1e-8, and a batch size of 6 for 5 epochs.
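
As a rough sketch, those hyperparameters would translate into an optimizer setup along these lines. The optimizer itself is an assumption (it is not named in the card; AdamW is the usual choice for fine-tuning BERT), and `model` here is a stand-in for the actual classifier:

```python
import torch

# lr, eps, batch size, and epochs come from the model card; AdamW itself
# is an assumption. The linear layer stands in for the BERT classifier
# (hidden size 768 -> 3 sentiment labels) to keep the sketch self-contained.
model = torch.nn.Linear(768, 3)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, eps=1e-8)

BATCH_SIZE = 6
NUM_EPOCHS = 5
```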

Evaluation

The model was evaluated using a weighted F1 score.
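
For reference, a weighted F1 score averages the per-class F1 scores with weights proportional to each class's support. A self-contained sketch of the metric, using made-up labels (not the model's actual results):

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for cls in support:
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += f1 * support[cls] / len(y_true)
    return total

# Illustrative labels only.
y_true = ["Negative", "Neutral", "Positive", "Positive", "Negative"]
y_pred = ["Negative", "Neutral", "Negative", "Positive", "Negative"]
print(round(weighted_f1(y_true, y_pred), 4))  # → 0.7867
```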

Usage

To use the model, load it and use it to classify a review. For example:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("<Group209>")
model = AutoModelForSequenceClassification.from_pretrained("<Group209>")

text = "The hotel was great and the staff were very friendly."

# Tokenize the review and run it through the model.
encoded_input = tokenizer(text, return_tensors="pt")
output = model(**encoded_input)

predictions = output.logits.argmax(dim=1)

print(predictions)
```
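
`predictions` holds the predicted class index. To map an index back to a sentiment label, something like the following can be used; the index order shown is an assumption, so verify it against the model's `config.id2label`:

```python
# Assumed index order -- verify against the model's config.id2label.
id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}

predicted_class = 1  # e.g. predictions.item() from the snippet above
print(id2label[predicted_class])  # → Neutral
```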

Limitations and Bias

The model is trained on English data, so it might not perform well on reviews in other languages.
Furthermore, it might be biased towards certain phrases or words that are commonly used in the training dataset.