Add new CrossEncoder model

Files changed (5)

README.md +71 -71
config.json +49 -45
onnx/model.onnx +3 -0
special_tokens_map.json +42 -6
tokenizer.json +0 -0

README.md CHANGED

@@ -1,72 +1,72 @@
----
-language: en
-pipeline_tag: zero-shot-classification
-tags:
-- transformers
-datasets:
-- nyu-mll/multi_nli
-- stanfordnlp/snli
-metrics:
-- accuracy
-license: apache-2.0
-base_model:
-- microsoft/deberta-v3-large
-library_name: sentence-transformers
----
-# Cross-Encoder for Natural Language Inference
-This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class. This model is based on [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large)
-## Training Data
-The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.
-## Performance
-- Accuracy on SNLI-test dataset: 92.20
-- Accuracy on  MNLI mismatched set: 90.49
-For futher evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).
-## Usage
-Pre-trained models can be used like this:
-```python
-from sentence_transformers import CrossEncoder
-model = CrossEncoder('cross-encoder/nli-deberta-v3-large')
-scores = model.predict([('A man is eating pizza', 'A man eats something'), ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')])
-#Convert scores to labels
-label_mapping = ['contradiction', 'entailment', 'neutral']
-labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
-```
-## Usage with Transformers AutoModel
-You can use the model also directly with Transformers library (without SentenceTransformers library):
-```python
-from transformers import AutoTokenizer, AutoModelForSequenceClassification
-import torch
-model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-deberta-v3-large')
-tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-v3-large')
-features = tokenizer(['A man is eating pizza', 'A black race car starts up in front of a crowd of people.'], ['A man eats something', 'A man is driving down a lonely road.'],  padding=True, truncation=True, return_tensors="pt")
-model.eval()
-with torch.no_grad():
-    scores = model(**features).logits
-    label_mapping = ['contradiction', 'entailment', 'neutral']
-    labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)]
-    print(labels)
-```
-## Zero-Shot Classification
-This model can also be used for zero-shot-classification:
-```python
-from transformers import pipeline
-classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-deberta-v3-large')
-sent = "Apple just announced the newest iPhone X"
-candidate_labels = ["technology", "sports", "politics"]
-res = classifier(sent, candidate_labels)
-print(res)
 ```

+---
+language: en
+pipeline_tag: zero-shot-classification
+tags:
+- transformers
+datasets:
+- nyu-mll/multi_nli
+- stanfordnlp/snli
+metrics:
+- accuracy
+license: apache-2.0
+base_model:
+- microsoft/deberta-v3-large
+library_name: sentence-transformers
+---
+# Cross-Encoder for Natural Language Inference
+This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class. This model is based on [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large)
+## Training Data
+The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.
+## Performance
+- Accuracy on SNLI-test dataset: 92.20
+- Accuracy on  MNLI mismatched set: 90.49
+For futher evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).
+## Usage
+Pre-trained models can be used like this:
+```python
+from sentence_transformers import CrossEncoder
+model = CrossEncoder('cross-encoder/nli-deberta-v3-large')
+scores = model.predict([('A man is eating pizza', 'A man eats something'), ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')])
+#Convert scores to labels
+label_mapping = ['contradiction', 'entailment', 'neutral']
+labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
+```
+## Usage with Transformers AutoModel
+You can use the model also directly with Transformers library (without SentenceTransformers library):
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-deberta-v3-large')
+tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-v3-large')
+features = tokenizer(['A man is eating pizza', 'A black race car starts up in front of a crowd of people.'], ['A man eats something', 'A man is driving down a lonely road.'],  padding=True, truncation=True, return_tensors="pt")
+model.eval()
+with torch.no_grad():
+    scores = model(**features).logits
+    label_mapping = ['contradiction', 'entailment', 'neutral']
+    labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)]
+    print(labels)
+```
+## Zero-Shot Classification
+This model can also be used for zero-shot-classification:
+```python
+from transformers import pipeline
+classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-deberta-v3-large')
+sent = "Apple just announced the newest iPhone X"
+candidate_labels = ["technology", "sports", "politics"]
+res = classifier(sent, candidate_labels)
+print(res)
 ```

config.json CHANGED

@@ -1,45 +1,49 @@
-{
-  "_name_or_path": "microsoft/deberta-v3-large",
-  "architectures": [
-    "DebertaV2ForSequenceClassification"
-  ],
-  "attention_probs_dropout_prob": 0.1,
-  "hidden_act": "gelu",
-  "hidden_dropout_prob": 0.1,
-  "hidden_size": 1024,
-  "id2label": {
-    "0": "contradiction",
-    "1": "entailment",
-    "2": "neutral"
-  },
-  "initializer_range": 0.02,
-  "intermediate_size": 4096,
-  "label2id": {
-    "contradiction": 0,
-    "entailment": 1,
-    "neutral": 2
-  },
-  "layer_norm_eps": 1e-07,
-  "max_position_embeddings": 512,
-  "max_relative_positions": -1,
-  "model_type": "deberta-v2",
-  "norm_rel_ebd": "layer_norm",
-  "num_attention_heads": 16,
-  "num_hidden_layers": 24,
-  "pad_token_id": 0,
-  "pooler_dropout": 0,
-  "pooler_hidden_act": "gelu",
-  "pooler_hidden_size": 1024,
-  "pos_att_type": [
-    "p2c",
-    "c2p"
-  ],
-  "position_biased_input": false,
-  "position_buckets": 256,
-  "relative_attention": true,
-  "share_att_key": true,
-  "torch_dtype": "float32",
-  "transformers_version": "4.11.3",
-  "type_vocab_size": 0,
-  "vocab_size": 128100
-}

+{
+  "architectures": [
+    "DebertaV2ForSequenceClassification"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 1024,
+  "id2label": {
+    "0": "contradiction",
+    "1": "entailment",
+    "2": "neutral"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 4096,
+  "label2id": {
+    "contradiction": 0,
+    "entailment": 1,
+    "neutral": 2
+  },
+  "layer_norm_eps": 1e-07,
+  "legacy": true,
+  "max_position_embeddings": 512,
+  "max_relative_positions": -1,
+  "model_type": "deberta-v2",
+  "norm_rel_ebd": "layer_norm",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "pad_token_id": 0,
+  "pooler_dropout": 0,
+  "pooler_hidden_act": "gelu",
+  "pooler_hidden_size": 1024,
+  "pos_att_type": [
+    "p2c",
+    "c2p"
+  ],
+  "position_biased_input": false,
+  "position_buckets": 256,
+  "relative_attention": true,
+  "sentence_transformers": {
+    "activation_fn": "torch.nn.modules.linear.Identity",
+    "version": "4.1.0.dev0"
+  },
+  "share_att_key": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.52.0.dev0",
+  "type_vocab_size": 0,
+  "vocab_size": 128100
+}
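
Note on the `sentence_transformers` block added above: `activation_fn` is set to `Identity`, which suggests the scores are returned as raw logits rather than probabilities. A minimal sketch, assuming that reading, of how three logits map onto the `id2label` entries. The logit values and the `to_label_probs` helper are made up for illustration and are not part of this commit.

```python
import math

# Label order copied from the id2label block in config.json.
ID2LABEL = {0: "contradiction", 1: "entailment", 2: "neutral"}


def to_label_probs(logits):
    """Softmax over raw logits, keyed by label name (hypothetical helper)."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return {ID2LABEL[i]: e / total for i, e in enumerate(exps)}


# Made-up logits for one sentence pair, not real model output.
probs = to_label_probs([-2.0, 4.0, 0.5])
best = max(probs, key=probs.get)
print(best, probs)
```

Taking the argmax of the raw logits gives the same label as taking it after the softmax; the softmax only matters when calibrated-looking scores are wanted.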

onnx/model.onnx ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:522031148bd5696cf67ca4a682ee56090cd5219558f29b62b311ea345e9f0fb7
+size 1742010231
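
The three added lines are a Git LFS pointer, not the ONNX weights themselves: they record the spec version, a SHA-256 object ID, and the size in bytes. A short sketch of reading those fields follows. `parse_lfs_pointer` is a hypothetical helper written for this note, not a Git LFS API.

```python
# Pointer text copied from the onnx/model.onnx entry in this diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:522031148bd5696cf67ca4a682ee56090cd5219558f29b62b311ea345e9f0fb7
size 1742010231
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into fields (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
size_gb = info["size"] / 1e9  # decimal gigabytes
print(info["algo"], round(size_gb, 2))
```

By this arithmetic the exported model is roughly 1.74 GB, so anyone pulling the branch without LFS enabled would only get the pointer text.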

special_tokens_map.json CHANGED

@@ -1,10 +1,46 @@
 {
-  "bos_token": "[CLS]",
-  "cls_token": "[CLS]",
-  "eos_token": "[SEP]",
-  "mask_token": "[MASK]",
-  "pad_token": "[PAD]",
-  "sep_token": "[SEP]",
   "unk_token": {
     "content": "[UNK]",
     "lstrip": false,

 {
+  "bos_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
   "unk_token": {
     "content": "[UNK]",
     "lstrip": false,
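
The hunk above expands each special token from a plain string into a dict with the same `content` plus explicit `lstrip`, `normalized`, `rstrip`, and `single_word` flags, so the token text itself appears unchanged. A small sketch, assuming a consumer that must read both forms; `token_text` is a hypothetical helper, not a Transformers function.

```python
# Old and new forms of one entry, copied from the special_tokens_map.json diff.
old_map = {"cls_token": "[CLS]"}
new_map = {
    "cls_token": {
        "content": "[CLS]",
        "lstrip": False,
        "normalized": False,
        "rstrip": False,
        "single_word": False,
    }
}


def token_text(entry):
    """Return the token string for either the plain-string or the dict form."""
    return entry["content"] if isinstance(entry, dict) else entry


same = token_text(old_map["cls_token"]) == token_text(new_map["cls_token"])
print(same)
```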

tokenizer.json CHANGED

The diff for this file is too large to render.