---
language: en
license: apache-2.0
base_model: bert-base-uncased
tags:
- generated_from_trainer
- paraphrase-identification
- bert
- glue
- mrpc
metrics:
- accuracy
- f1
datasets:
- glue
model-index:
- name: bert-base-uncased-finetuned-mrpc
  results:
  - task:
      type: text-classification
      name: Paraphrase Identification
    dataset:
      name: GLUE MRPC
      type: glue
      args: mrpc
    metrics:
      - name: Accuracy
        type: accuracy
        value: 0.8652
      - name: F1
        type: f1
        value: 0.9057
---

# BERT Fine-tuned on MRPC

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the MRPC (Microsoft Research Paraphrase Corpus) dataset from the GLUE benchmark. It is designed to determine whether two given sentences are semantically equivalent.

## Model description

The model uses the BERT base architecture (12 layers, 768 hidden dimensions, 12 attention heads) and has been fine-tuned specifically for the paraphrase identification task. The output layer predicts whether the input sentence pair expresses the same meaning.

Key specifications:
- Base model: bert-base-uncased
- Task type: Binary classification (paraphrase/not paraphrase)
- Training method: Fine-tuning all layers
- Language: English

## Intended uses & limitations

### Intended uses
- Paraphrase detection
- Semantic similarity assessment
- Question duplicate detection
- Content matching
- Automated text comparison

### Limitations
- Only works with English text
- Performance may degrade on out-of-domain text
- May struggle with complex or nuanced semantic relationships
- Limited to comparing pairs of sentences (not longer texts)

## Training and evaluation data

The model was trained on the Microsoft Research Paraphrase Corpus (MRPC) from the GLUE benchmark:
- Training set: 3,667 sentence pairs
- Validation set: 408 sentence pairs
- Each pair is labeled as either paraphrase (1) or non-paraphrase (0)
- Class distribution: approximately 67.4% positive (paraphrase) and 32.6% negative (non-paraphrase)
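Because the positive class dominates, a trivial baseline that always predicts "paraphrase" already scores about 67% accuracy, so the reported accuracy should be read against that floor. A quick back-of-envelope check using the figures above:

```python
# Majority-class baseline on MRPC: always predict "paraphrase" (label 1).
# The positive rate (~67.4%) and model accuracy (0.8652) come from this card.
baseline_accuracy = 0.674   # accuracy of always predicting the majority class
model_accuracy = 0.8652     # reported validation accuracy

print(f"Baseline accuracy:    {baseline_accuracy:.1%}")
print(f"Model accuracy:       {model_accuracy:.1%}")
print(f"Absolute improvement: {model_accuracy - baseline_accuracy:.1%}")
```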

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- Learning rate: 3e-05
- Batch size: 8 (train and eval)
- Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08)
- LR scheduler: Linear decay
- Number of epochs: 3
- Max sequence length: 512
- Weight decay: 0.01
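For reference, these settings map onto `transformers.TrainingArguments` roughly as sketched below. This is an illustrative reconstruction, not the exact training script; `output_dir` and the per-epoch evaluation setting are assumptions, and the AdamW betas/epsilon listed above are the library defaults.

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters above; output_dir and eval_strategy
# are illustrative assumptions, not taken from the original run.
training_args = TrainingArguments(
    output_dir="./bert-base-uncased-finetuned-mrpc",
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    lr_scheduler_type="linear",  # linear decay
    eval_strategy="epoch",       # assumed: evaluate once per epoch
)
```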

### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
| No log        | 1.0   | 459  | 0.3905          | 0.8382   | 0.8878 |
| 0.5385        | 2.0   | 918  | 0.4275          | 0.8505   | 0.8961 |
| 0.3054        | 3.0   | 1377 | 0.5471          | 0.8652   | 0.9057 |

### Framework versions
- Transformers 4.46.2
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3

## Performance analysis

The model achieves strong performance on the MRPC validation set:
- Accuracy: 86.52%
- F1 Score: 90.57%

These metrics indicate that the model reliably separates paraphrases from non-paraphrases. Note that F1 is computed on the positive (paraphrase) class, which makes up roughly two thirds of MRPC, so it naturally sits above accuracy; reporting both gives a fuller picture than either metric alone.
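Metrics like these can be reproduced from raw predictions with standard tooling. The sketch below uses scikit-learn on a toy label set (an assumption for illustration; the original training script may have computed them with the `evaluate` library instead):

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy example: 1 = paraphrase, 0 = not paraphrase.
labels      = [1, 1, 0, 1, 0, 1]
predictions = [1, 0, 0, 1, 0, 1]

accuracy = accuracy_score(labels, predictions)
f1 = f1_score(labels, predictions)  # F1 on the positive (paraphrase) class

print(f"Accuracy: {accuracy:.4f}")  # 5 of 6 correct
print(f"F1:       {f1:.4f}")
```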

## Example usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
model_name = "real-jiakai/bert-base-uncased-finetuned-mrpc"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for inference

def check_paraphrase(sentence1: str, sentence2: str) -> str:
    """Classify a sentence pair as paraphrase / not paraphrase."""
    inputs = tokenizer(sentence1, sentence2, return_tensors="pt",
                       padding=True, truncation=True, max_length=512)
    with torch.no_grad():  # inference only; no gradients needed
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    prediction = probs.argmax(dim=-1).item()
    confidence = probs[0, prediction].item()
    label = "Paraphrase" if prediction == 1 else "Not paraphrase"
    return f"{label} (confidence: {confidence:.2%})"

# Example usage
sentence1 = "The cat sat on the mat."
sentence2 = "A cat was sitting on the mat."
print(f"Result: {check_paraphrase(sentence1, sentence2)}")
```