Add new CrossEncoder model

Files changed:
- README.md (+68, -68)
- config.json (+46, -42)
- merges.txt (+1, -1)
- onnx/model.onnx (+3, -0)
- special_tokens_map.json (+51, -1)
- tokenizer.json (changed)
- tokenizer_config.json (+66, -1)
README.md
CHANGED
---
language: en
pipeline_tag: zero-shot-classification
tags:
- transformers
datasets:
- nyu-mll/multi_nli
- stanfordnlp/snli
metrics:
- accuracy
license: apache-2.0
base_model:
- microsoft/deberta-base
library_name: sentence-transformers
---

# Cross-Encoder for Natural Language Inference

This model was trained using the [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.

## Training Data

The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it outputs three scores corresponding to the labels: contradiction, entailment, neutral.

## Performance

For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).

## Usage

Pre-trained models can be used like this:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('cross-encoder/nli-deberta-base')
scores = model.predict([
    ('A man is eating pizza', 'A man eats something'),
    ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.'),
])

# Convert scores to labels
label_mapping = ['contradiction', 'entailment', 'neutral']
labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
```

## Usage with Transformers AutoModel

You can also use the model directly with the Transformers library (without the SentenceTransformers library):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-deberta-base')
tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-base')

features = tokenizer(
    ['A man is eating pizza', 'A black race car starts up in front of a crowd of people.'],
    ['A man eats something', 'A man is driving down a lonely road.'],
    padding=True, truncation=True, return_tensors="pt",
)

model.eval()
with torch.no_grad():
    scores = model(**features).logits
    label_mapping = ['contradiction', 'entailment', 'neutral']
    labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)]
    print(labels)
```
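Note that the scores returned by both snippets above are raw logits, not probabilities. A minimal sketch of normalizing them with a softmax (the tensor values here are illustrative, not actual model output):

```python
import torch

# Illustrative logits in [contradiction, entailment, neutral] order
# (made-up values, not real model output).
scores = torch.tensor([[-3.2, 4.1, -0.6]])
probs = torch.softmax(scores, dim=1)  # each row now sums to 1
print(probs)
```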

## Zero-Shot Classification

This model can also be used for zero-shot classification:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-deberta-base')

sent = "Apple just announced the newest iPhone X"
candidate_labels = ["technology", "sports", "politics"]
res = classifier(sent, candidate_labels)
print(res)
```
config.json
CHANGED
```json
{
  "architectures": [
    "DebertaForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "contradiction",
    "1": "entailment",
    "2": "neutral"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "contradiction": 0,
    "entailment": 1,
    "neutral": 2
  },
  "layer_norm_eps": 1e-07,
  "legacy": true,
  "max_position_embeddings": 512,
  "max_relative_positions": -1,
  "model_type": "deberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_dropout": 0,
  "pooler_hidden_act": "gelu",
  "pooler_hidden_size": 768,
  "pos_att_type": [
    "c2p",
    "p2c"
  ],
  "position_biased_input": false,
  "relative_attention": true,
  "sentence_transformers": {
    "activation_fn": "torch.nn.modules.linear.Identity",
    "version": "4.1.0.dev0"
  },
  "tokenizer_class": "DebertaTokenizerFast",
  "transformers_version": "4.52.0.dev0",
  "type_vocab_size": 0,
  "vocab_size": 50265
}
```
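The `id2label` block is what ties the model's three output logits to the NLI labels. A minimal sketch of reading it programmatically (assumes the `transformers` library and access to the hub):

```python
from transformers import AutoConfig

# Load the config above via its hub id and inspect the label mapping.
config = AutoConfig.from_pretrained('cross-encoder/nli-deberta-base')
print(config.model_type)  # deberta
print(config.id2label)    # {0: 'contradiction', 1: 'entailment', 2: 'neutral'}
```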
merges.txt
CHANGED
Only the version header line was touched; the first lines of the file read:

```
#version: 0.2
Ġ t
Ġ a
h e
```
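`merges.txt` lists byte-pair-encoding merge rules in priority order; the `Ġ` symbol is the GPT-2 byte-level convention for a leading space. A toy sketch of applying a single merge rule (real tokenizers repeatedly apply the highest-priority applicable rule, so this is a simplification):

```python
def apply_merge(tokens, pair):
    """Apply one BPE merge rule left-to-right over a token sequence."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])  # fuse the matched pair
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

# 'Ġ' stands for a leading space, so the rule ('Ġ', 't') fuses ' t'.
tokens = ['Ġ', 't', 'h', 'e']
tokens = apply_merge(tokens, ('Ġ', 't'))  # ['Ġt', 'h', 'e']
tokens = apply_merge(tokens, ('h', 'e'))  # ['Ġt', 'he']
print(tokens)
```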
onnx/model.onnx
ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:e21552123e1329ef20edc8b64d02c0dca67396496cbcc86391dea9ae5d13c9b1
size 557350444
```
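The checked-in file is a Git LFS pointer; the actual ~557 MB graph is fetched on checkout. A hedged sketch of running it directly with `onnxruntime` (the local path and the graph's input names are assumptions here; the filter below feeds only the inputs the export actually declares):

```python
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-base')
session = ort.InferenceSession('onnx/model.onnx')  # local path after LFS checkout

features = tokenizer(['A man is eating pizza'], ['A man eats something'],
                     padding=True, truncation=True, return_tensors='np')

# Feed only the inputs the exported graph declares (names/dtypes can
# vary between exports).
graph_inputs = {i.name for i in session.get_inputs()}
inputs = {k: v for k, v in features.items() if k in graph_inputs}
logits = session.run(None, inputs)[0]
print(logits)  # one row of [contradiction, entailment, neutral] scores
```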
special_tokens_map.json
CHANGED
```json
{
  "bos_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": true,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
```
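DeBERTa reuses BERT-style marker strings ([CLS], [SEP], [MASK]) on top of a GPT-2-style byte-level vocabulary. A quick sketch to confirm what the loaded tokenizer reports (assumes `transformers`):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-base')
print(tokenizer.special_tokens_map)            # the mapping shown above
print(tokenizer.mask_token, tokenizer.mask_token_id)  # [MASK] and its id
```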
tokenizer.json
CHANGED
The diff for this file is too large to render.
tokenizer_config.json
CHANGED
```json
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "50264": {
      "content": "[MASK]",
      "lstrip": true,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "[CLS]",
  "clean_up_tokenization_spaces": false,
  "cls_token": "[CLS]",
  "do_lower_case": false,
  "eos_token": "[SEP]",
  "errors": "replace",
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "max_length": 512,
  "model_max_length": 512,
  "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "[SEP]",
  "stride": 0,
  "tokenizer_class": "DebertaTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]",
  "vocab_type": "gpt2"
}
```
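Taken together, these settings determine how a premise/hypothesis pair is packed for the cross-encoder: `[CLS] premise [SEP] hypothesis [SEP]`, truncated with the `longest_first` strategy against `model_max_length` (512). A hedged sketch (the token boundaries in the comment are indicative, not verified output):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-base')

# truncation=True applies the configured longest_first strategy.
enc = tokenizer('A man is eating pizza', 'A man eats something', truncation=True)
print(tokenizer.convert_ids_to_tokens(enc['input_ids']))
# ['[CLS]', 'A', 'Ġman', ..., '[SEP]', 'A', 'Ġman', ..., '[SEP]']
```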