yasirdemircan committed
Commit a359158 · verified · Parent: 750fe15

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
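This pooling configuration enables only mean-token pooling over the 768-dimensional MPNet token embeddings. For reference, a minimal illustrative sketch of what mask-aware mean pooling computes (function and variable names here are assumptions, not code from this repository):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(dim=1)   # sum over real (non-pad) tokens
    counts = mask.sum(dim=1).clamp(min=1e-9)        # token counts; avoid division by zero
    return summed / counts                          # (batch, 768) sentence embedding
```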
README.md ADDED
@@ -0,0 +1,184 @@
+ ---
+ base_model: sentence-transformers/paraphrase-mpnet-base-v2
+ library_name: setfit
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: X9.31 PRNG seed keys Triple-DES (112 bit) Generated by gathering entropy.
+ - text: PRNG seed key Pre-loaded during the manufacturing process, compiled in the
+     binary.
+ - text: ANSI X9.31 PRNG key Triple DES key Generated internally by non-approved RNG
+     Volatile memory only (plaintext) Zeroized when the module reboots.
+ - text: All CSPs are injected during manufacture.
+ - text: The internal DRBG state value of the RNG is stored in NVRAM for persistent
+     use.
+ inference: true
+ ---
+
+ # SetFit with sentence-transformers/paraphrase-mpnet-base-v2
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model for text classification. It uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model and a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance as the classification head.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer (see the sketch below).
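+
+ Both steps run inside the `setfit` `Trainer`. The following is a minimal sketch, assuming the default `text`/`label` column names; the two training rows are borrowed from the label table below, not the full 45-sentence training set:
+
+ ```python
+ from datasets import Dataset
+ from setfit import SetFitModel, Trainer, TrainingArguments
+
+ # Tiny illustrative dataset (two of the labeled sentences shown later in this card)
+ train_dataset = Dataset.from_dict({
+     "text": [
+         "ANSI X9.31 PRNG key AES-128 Generated internally by the Kernel.",
+         "PRNG seed key is static during the lifetime of the module.",
+     ],
+     "label": ["negative", "positive"],
+ })
+
+ model = SetFitModel.from_pretrained(
+     "sentence-transformers/paraphrase-mpnet-base-v2",
+     labels=["negative", "positive"],
+ )
+ args = TrainingArguments(batch_size=16, num_epochs=4)
+
+ trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
+ trainer.train()  # step 1: contrastive fine-tuning; step 2: fit the LogisticRegression head
+ ```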
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 2 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:---------|:---------|
+ | negative | <ul><li>'PRNG ANSI X9.31 Key K1, K2 Internal 3DES Key Automatically Generated per seeding This is an internal key used for ANSI X9.31 192 bits Internal Key generate from the seed and seed key'</li><li>'ANSI X9.31 PRNG key AES-128 Generated internally by the Kernel.'</li><li>'The seed key is used as an input to the X9.31 RNG, a deterministic random number generator, and is generally not stored long term.'</li></ul> |
+ | positive | <ul><li>'The ANSI X9.31 RNG is seeded using a 128-bit AES seed key generated external to the module.'</li><li>'An AES-256 seed key generated during manufacturing is used to initialize the RNG in the encryption algorithm.'</li><li>'PRNG seed key is static during the lifetime of the module.'</li></ul> |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference:
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("yasirdemircan/setfit_rng_v5")
+ # Run inference
+ preds = model("All CSPs are injected during manufacture.")
+ ```
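+
+ If class probabilities are needed instead of hard labels, `SetFitModel` also exposes `predict_proba`; a brief sketch (the printed values would be illustrative only):
+
+ ```python
+ from setfit import SetFitModel
+
+ model = SetFitModel.from_pretrained("yasirdemircan/setfit_rng_v5")
+ probs = model.predict_proba(["All CSPs are injected during manufacture."])
+ # Shape (1, 2); columns are assumed to follow the label order
+ # ["negative", "positive"] from config_setfit.json.
+ print(probs)
+ ```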
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 6 | 20.8222 | 59 |
+
+ | Label | Training Sample Count |
+ |:---------|:----------------------|
+ | negative | 21 |
+ | positive | 24 |
+
+ ### Training Hyperparameters
+ Paired values apply to the two training phases in turn (embedding fine-tuning, classifier head); see the sketch after the list.
+ - batch_size: (16, 16)
+ - num_epochs: (4, 4)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: True
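+
+ Expressed through the `setfit` `TrainingArguments` API, the values above correspond roughly to the sketch below (a reconstruction from this list, not the original training script):
+
+ ```python
+ from sentence_transformers.losses import CosineSimilarityLoss
+ from setfit import TrainingArguments
+
+ args = TrainingArguments(
+     batch_size=(16, 16),                # (embedding phase, classifier phase)
+     num_epochs=(4, 4),
+     body_learning_rate=(2e-05, 1e-05),  # body LR during each phase
+     head_learning_rate=0.01,
+     loss=CosineSimilarityLoss,
+     sampling_strategy="oversampling",
+     warmup_proportion=0.1,
+     l2_weight=0.01,
+     seed=42,
+     use_amp=False,
+     end_to_end=False,
+     load_best_model_at_end=True,
+ )
+ ```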
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0294 | 1 | 0.2472 | - |
+ | 1.0 | 34 | - | 0.2296 |
+ | 1.4706 | 50 | 0.0969 | - |
+ | 2.0 | 68 | - | 0.3144 |
+ | 2.9412 | 100 | 0.0006 | - |
+ | 3.0 | 102 | - | 0.3090 |
+ | 4.0 | 136 | - | 0.3083 |
+
+ ### Framework Versions
+ - Python: 3.10.15
+ - SetFit: 1.2.0.dev0
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.45.2
+ - PyTorch: 2.5.1+cu124
+ - Datasets: 2.19.1
+ - Tokenizers: 0.20.1
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+   doi = {10.48550/ARXIV.2209.11055},
+   url = {https://arxiv.org/abs/2209.11055},
+   author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+   keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
+   title = {Efficient Few-Shot Learning Without Prompts},
+   publisher = {arXiv},
+   year = {2022},
+   copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/paraphrase-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.45.2",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.45.2",
+     "pytorch": "2.5.1+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
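`"similarity_fn_name": "cosine"` tells Sentence Transformers 3.x which scoring function `similarity()` should use when comparing embeddings. An illustrative usage, assuming the repository is loaded directly as a `SentenceTransformer` (the two sentences come from this card's widget examples):

```python
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("yasirdemircan/setfit_rng_v5")
emb = encoder.encode([
    "PRNG seed key is static during the lifetime of the module.",
    "All CSPs are injected during manufacture.",
])
# similarity() applies the configured similarity function ("cosine")
print(encoder.similarity(emb, emb))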
config_setfit.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "normalize_embeddings": false,
+   "labels": [
+     "negative",
+     "positive"
+   ]
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:aba1a3febe1d337cb9daa6b774111b0bdfb5c64ff158bee9a9a1d05cd2cdf5ed
+ size 437967672
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:63c7aa63a7a0904f0ca3803bd3508b835b6af58f19c5937cf14934400f78b83d
+ size 7055
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
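The two entries describe the serialized encoder pipeline: module 0 is the MPNet transformer stored at the repository root, and module 1 is the mean-pooling layer configured in `1_Pooling/`. A sketch of assembling an equivalent pipeline by hand (this loads the base checkpoint, whereas the commit ships fine-tuned weights in model.safetensors):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Pooling, Transformer

# Module 0: the transformer body (base checkpoint substituted here)
body = Transformer("sentence-transformers/paraphrase-mpnet-base-v2", max_seq_length=512)
# Module 1: mean pooling, mirroring 1_Pooling/config.json
pooling = Pooling(word_embedding_dimension=768, pooling_mode="mean")
encoder = SentenceTransformer(modules=[body, pooling])
```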
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff