tomaarsen HF Staff committed on
Commit a0b85cc · verified · 1 Parent(s): 2a69251

Add new CrossEncoder model

Files changed (5)
  1. README.md +71 -71
  2. config.json +49 -45
  3. onnx/model.onnx +3 -0
  4. special_tokens_map.json +42 -6
  5. tokenizer.json +0 -0
README.md CHANGED
@@ -1,72 +1,72 @@
- ---
- language: en
- pipeline_tag: zero-shot-classification
- tags:
- - transformers
- datasets:
- - nyu-mll/multi_nli
- - stanfordnlp/snli
- metrics:
- - accuracy
- license: apache-2.0
- base_model:
- - microsoft/deberta-v3-large
- library_name: sentence-transformers
- ---
-
- # Cross-Encoder for Natural Language Inference
- This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class. This model is based on [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large)
-
- ## Training Data
- The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.
-
- ## Performance
- - Accuracy on SNLI-test dataset: 92.20
- - Accuracy on MNLI mismatched set: 90.49
-
- For futher evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).
-
- ## Usage
-
- Pre-trained models can be used like this:
- ```python
- from sentence_transformers import CrossEncoder
- model = CrossEncoder('cross-encoder/nli-deberta-v3-large')
- scores = model.predict([('A man is eating pizza', 'A man eats something'), ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')])
-
- #Convert scores to labels
- label_mapping = ['contradiction', 'entailment', 'neutral']
- labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
- ```
-
- ## Usage with Transformers AutoModel
- You can use the model also directly with Transformers library (without SentenceTransformers library):
- ```python
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
- import torch
-
- model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-deberta-v3-large')
- tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-v3-large')
-
- features = tokenizer(['A man is eating pizza', 'A black race car starts up in front of a crowd of people.'], ['A man eats something', 'A man is driving down a lonely road.'], padding=True, truncation=True, return_tensors="pt")
-
- model.eval()
- with torch.no_grad():
-     scores = model(**features).logits
-     label_mapping = ['contradiction', 'entailment', 'neutral']
-     labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)]
-     print(labels)
- ```
-
- ## Zero-Shot Classification
- This model can also be used for zero-shot-classification:
- ```python
- from transformers import pipeline
-
- classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-deberta-v3-large')
-
- sent = "Apple just announced the newest iPhone X"
- candidate_labels = ["technology", "sports", "politics"]
- res = classifier(sent, candidate_labels)
- print(res)
+ ---
+ language: en
+ pipeline_tag: zero-shot-classification
+ tags:
+ - transformers
+ datasets:
+ - nyu-mll/multi_nli
+ - stanfordnlp/snli
+ metrics:
+ - accuracy
+ license: apache-2.0
+ base_model:
+ - microsoft/deberta-v3-large
+ library_name: sentence-transformers
+ ---
+
+ # Cross-Encoder for Natural Language Inference
+ This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class. This model is based on [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large)
+
+ ## Training Data
+ The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.
+
+ ## Performance
+ - Accuracy on SNLI-test dataset: 92.20
+ - Accuracy on MNLI mismatched set: 90.49
+
+ For futher evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).
+
+ ## Usage
+
+ Pre-trained models can be used like this:
+ ```python
+ from sentence_transformers import CrossEncoder
+ model = CrossEncoder('cross-encoder/nli-deberta-v3-large')
+ scores = model.predict([('A man is eating pizza', 'A man eats something'), ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')])
+
+ #Convert scores to labels
+ label_mapping = ['contradiction', 'entailment', 'neutral']
+ labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
+ ```
+
+ ## Usage with Transformers AutoModel
+ You can use the model also directly with Transformers library (without SentenceTransformers library):
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-deberta-v3-large')
+ tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-v3-large')
+
+ features = tokenizer(['A man is eating pizza', 'A black race car starts up in front of a crowd of people.'], ['A man eats something', 'A man is driving down a lonely road.'], padding=True, truncation=True, return_tensors="pt")
+
+ model.eval()
+ with torch.no_grad():
+     scores = model(**features).logits
+     label_mapping = ['contradiction', 'entailment', 'neutral']
+     labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)]
+     print(labels)
+ ```
+
+ ## Zero-Shot Classification
+ This model can also be used for zero-shot-classification:
+ ```python
+ from transformers import pipeline
+
+ classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-deberta-v3-large')
+
+ sent = "Apple just announced the newest iPhone X"
+ candidate_labels = ["technology", "sports", "politics"]
+ res = classifier(sent, candidate_labels)
+ print(res)
  ```
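The README examples stop at an `argmax` over the three scores. As a rough sketch in plain Python, the same scores could also be turned into a probability per label. This assumes the scores are unnormalized logits, which the README does not state. The numbers below are made up, and `softmax` and `to_label` are hypothetical helper names, not part of any library.

```python
import math

# Label order as given in the README's label_mapping.
LABELS = ["contradiction", "entailment", "neutral"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_label(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Illustrative scores only; real values come from model.predict(...).
example_logits = [-2.0, 4.0, 0.5]
label, prob = to_label(example_logits)
print(label, round(prob, 3))
```

The label chosen this way matches the `argmax` in the README, since softmax preserves ordering; only the confidence value is new.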
config.json CHANGED
@@ -1,45 +1,49 @@
- {
-   "_name_or_path": "microsoft/deberta-v3-large",
-   "architectures": [
-     "DebertaV2ForSequenceClassification"
-   ],
-   "attention_probs_dropout_prob": 0.1,
-   "hidden_act": "gelu",
-   "hidden_dropout_prob": 0.1,
-   "hidden_size": 1024,
-   "id2label": {
-     "0": "contradiction",
-     "1": "entailment",
-     "2": "neutral"
-   },
-   "initializer_range": 0.02,
-   "intermediate_size": 4096,
-   "label2id": {
-     "contradiction": 0,
-     "entailment": 1,
-     "neutral": 2
-   },
-   "layer_norm_eps": 1e-07,
-   "max_position_embeddings": 512,
-   "max_relative_positions": -1,
-   "model_type": "deberta-v2",
-   "norm_rel_ebd": "layer_norm",
-   "num_attention_heads": 16,
-   "num_hidden_layers": 24,
-   "pad_token_id": 0,
-   "pooler_dropout": 0,
-   "pooler_hidden_act": "gelu",
-   "pooler_hidden_size": 1024,
-   "pos_att_type": [
-     "p2c",
-     "c2p"
-   ],
-   "position_biased_input": false,
-   "position_buckets": 256,
-   "relative_attention": true,
-   "share_att_key": true,
-   "torch_dtype": "float32",
-   "transformers_version": "4.11.3",
-   "type_vocab_size": 0,
-   "vocab_size": 128100
- }
+ {
+   "architectures": [
+     "DebertaV2ForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "contradiction",
+     "1": "entailment",
+     "2": "neutral"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "contradiction": 0,
+     "entailment": 1,
+     "neutral": 2
+   },
+   "layer_norm_eps": 1e-07,
+   "legacy": true,
+   "max_position_embeddings": 512,
+   "max_relative_positions": -1,
+   "model_type": "deberta-v2",
+   "norm_rel_ebd": "layer_norm",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "pooler_dropout": 0,
+   "pooler_hidden_act": "gelu",
+   "pooler_hidden_size": 1024,
+   "pos_att_type": [
+     "p2c",
+     "c2p"
+   ],
+   "position_biased_input": false,
+   "position_buckets": 256,
+   "relative_attention": true,
+   "sentence_transformers": {
+     "activation_fn": "torch.nn.modules.linear.Identity",
+     "version": "4.1.0.dev0"
+   },
+   "share_att_key": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.52.0.dev0",
+   "type_vocab_size": 0,
+   "vocab_size": 128100
+ }
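Because the whole file is shown as rewritten, the substantive changes to config.json are easy to miss. The sketch below compares the two versions key by key. It is an illustration only: `diff_config` is a hypothetical helper, and the two dictionaries contain just a subset of the keys shown above (one unchanged key plus the ones that differ).

```python
def diff_config(old, new):
    """Summarize top-level key differences between two config dicts."""
    removed = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"removed": removed, "added": added, "changed": changed}


# Subset of the old config.json, copied from the removed lines.
old_cfg = {
    "_name_or_path": "microsoft/deberta-v3-large",
    "torch_dtype": "float32",
    "transformers_version": "4.11.3",
}

# Subset of the new config.json, copied from the added lines.
new_cfg = {
    "legacy": True,
    "sentence_transformers": {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "version": "4.1.0.dev0",
    },
    "torch_dtype": "float32",
    "transformers_version": "4.52.0.dev0",
}

summary = diff_config(old_cfg, new_cfg)
print(summary)
```

Applied to the full files, the same comparison would report `_name_or_path` as removed, `legacy` and `sentence_transformers` as added, and `transformers_version` as changed, which is what the hunk above shows.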
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:522031148bd5696cf67ca4a682ee56090cd5219558f29b62b311ea345e9f0fb7
+ size 1742010231
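The three added lines are a Git LFS pointer rather than the ONNX weights themselves. A minimal sketch of reading such a pointer is shown below, assuming the simple `key value` line layout seen here; `parse_lfs_pointer` is a hypothetical helper, not part of Git LFS or any Hugging Face library.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:522031148bd5696cf67ca4a682ee56090cd5219558f29b62b311ea345e9f0fb7
size 1742010231
"""

info = parse_lfs_pointer(pointer)
size_gib = info["size_bytes"] / 2**30
print(info["hash_algo"], f"{size_gib:.2f} GiB")
```

The `size` field gives the byte count of the real file, so the ONNX export referenced here is roughly 1.6 GiB.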
special_tokens_map.json CHANGED
@@ -1,10 +1,46 @@
  {
-   "bos_token": "[CLS]",
-   "cls_token": "[CLS]",
-   "eos_token": "[SEP]",
-   "mask_token": "[MASK]",
-   "pad_token": "[PAD]",
-   "sep_token": "[SEP]",
+   "bos_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
    "unk_token": {
      "content": "[UNK]",
      "lstrip": false,
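The change to special_tokens_map.json replaces each plain string with an object carrying the same `content` plus four boolean flags. The sketch below reproduces that transformation in plain Python. It is illustrative only: `expand_token` is a hypothetical helper, and the flag values simply mirror the ones visible in this diff rather than any library default.

```python
def expand_token(value):
    """Turn a plain token string into the object form seen in the new file."""
    if isinstance(value, dict):
        # Already in object form (as unk_token was before this commit).
        return value
    return {
        "content": value,
        "lstrip": False,
        "normalized": False,
        "rstrip": False,
        "single_word": False,
    }


# String entries copied from the removed lines above.
old_map = {
    "bos_token": "[CLS]",
    "cls_token": "[CLS]",
    "eos_token": "[SEP]",
    "mask_token": "[MASK]",
    "pad_token": "[PAD]",
    "sep_token": "[SEP]",
}

new_map = {name: expand_token(value) for name, value in old_map.items()}
print(new_map["sep_token"])
```

The token strings themselves are unchanged by the commit; only their serialized form differs.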
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff