---
base_model: mini1013/master_domain
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 땀흡수 스포츠 운동  머리 헤어밴드 여자아이 고등학생 여성 검정 드렉온미
- text: 니켄 접이형 다리털제거기 1p 숱제거기 다리털면도 옵션없음 제이에이치코리아
- text: 관리 눈썹면도기 면도 미용 니켄 일자형 눈썹칼 옵션없음 프렌드리빙
- text: 천사의 웨딩드레스는 빠르게 승인받을  있는 로즈 레드 신부  10001N548703 중_로즈 레드 선배
- text: 립브러쉬 실리콘 립스머지 휴대용 투명 미리
inference: true
model-index:
- name: SetFit with mini1013/master_domain
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.6375838926174496
      name: Accuracy
---

# SetFit with mini1013/master_domain

This is a [SetFit](https://github.com/huggingface/setfit) model for text classification. It uses [mini1013/master_domain](https://huggingface.co/mini1013/master_domain) as the Sentence Transformer embedding model and a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
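
To make the contrastive step above more concrete, here is a minimal sketch of how labelled texts can be turned into sentence pairs with a similarity target (1.0 for the same label, 0.0 for different labels). The function name `make_contrastive_pairs` and the toy data are hypothetical, and the sketch enumerates every pair exhaustively; SetFit's own sampler is more involved (for example, it applies the configured sampling strategy), so treat this only as an illustration of the idea.

```python
import random
from itertools import combinations


def make_contrastive_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples from labelled texts.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    This is an illustrative helper, not part of the SetFit API.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    # Shuffle deterministically so batches mix positive and negative pairs.
    random.Random(seed).shuffle(pairs)
    return pairs


# Hypothetical toy data, not taken from the actual training set.
texts = ["lip brush", "eyebrow brush", "shower cap", "hair band"]
labels = [1, 1, 6, 6]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Pairs like these are what a cosine-similarity loss consumes during step 1; the embeddings produced by the fine-tuned body are then the features for the classification head in step 2.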

## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [mini1013/master_domain](https://huggingface.co/mini1013/master_domain)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 8 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples                                                                                                                                                                                                 |
|:------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 2.0   | <ul><li>'이니스프리 샤워 볼 1ea 이니스프리'</li><li>'3컬러 양면 귀이개 택1 TW51DC3F0 블랙 블루 블루 트리플도매'</li><li>'프리미엄 실리콘 니플 가리개 여성 니플 패치 원형 알스상회'</li></ul>                                                                     |
| 0.0   | <ul><li>'카페인 커피 샴푸바만들기 교육용 수제비누 키트 DIY 자원순환 업사이클링 옵션없음 처음(CHOEUM)'</li><li>'플로럴워터 - 로즈마리워터 1 리터 옵션없음 주식회사 월터엔터프라이즈'</li><li>'봄봄솝 바다 비누 만들기 DIY 조개 집콕 미술 (6개 완성, 조개몰드포함) 옵션없음 봄상회'</li></ul>              |
| 6.0   | <ul><li>'고양이귀 세안 헤어밴드 5p세트 KD-8679 목욕용 세면 샤워용 극세사 옵션없음 초이스리테일 5'</li><li>'긴머리 샤워캡 PEVA 방수 도트 헤어캡 핑크도트 허승호'</li><li>'편리한 찍찍이타입 머리밴드 스카이 옵션없음 와이엠테크(YM tech)'</li></ul>                                    |
| 5.0   | <ul><li>'에이브 면분첩 - 중형 옵션없음 하민하이'</li><li>'마스크 2 TYPE NEW갸름마스크턱볼살용 얼굴 턱볼살 옵션없음 유남상사'</li><li>'보정웨어 TYPE 턱볼살땡 몸매관리 2 마스크 얼굴 옵션없음 최상용'</li></ul>                                                            |
| 7.0   | <ul><li>'가루 파우더 케이스 30g 노세범 땀띠 파우더 소분 공병 (스푼 ) 30g 선데이베리베스트'</li><li>'면봉보관함 화장솜 케이스 디스펜서 통 옵션없음 홍스지니몰'</li><li>'실리콘공병 보틀 고리형 4종세트 추가금X 그루비스윔 수영장 여행 헬스장 캠핑용 소분용기 옵션없음 스퀘어오브에이치'</li></ul>                |
| 3.0   | <ul><li>'눈썹 족집게 오렌지 C 1p 청결용품 눈관리 핀셋 옵션없음 비즈파크'</li><li>'텐웨이브 쌍꺼풀테이프 레이스 티안나는 누드쌍테 단면쌍테 쌍커풀테이프 옵션없음 텐웨이브'</li><li>'1+1+1 할인 일자형 눈썹정리 눈썹칼 3P 옵션없음 버닝365마켓'</li></ul>                                      |
| 4.0   | <ul><li>'이레즈미 타투스티커 초대형 (여성용) 긴팔 옵션없음 알렉산더(ALEXANDER)'</li><li>'미니 타투 스티커 헤나 도안 형광 야광 HC-016 컬러타투 CC시리즈_CC-028 블루밍마켓'</li><li>'2주지속 리얼 문신 팔손가락 타투스티커 티안나는 반영구 방수 헤나 문신 나비 세트 A6 ( 2장세트 ) 에테르넬'</li></ul> |
| 1.0   | <ul><li>'립펜슬 실버 고급립솔 화장브러쉬 옵션없음 엔에이티글로벌'</li><li>'아이라인붓 애교살브러쉬 눈썹브러쉬 1100-5 아이라인브러시 옵션없음 동묘야시장'</li><li>'아이브로우브러쉬 8pcs Cardcaptor 세트 파운데이션 섀도우 브로우 Pincel 8pcs_CHINA 드림비정선'</li></ul>                    |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.6376   |
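
For context, the short sketch below relates the reported accuracy to a uniform-guess baseline and recovers the simplest fraction consistent with the full-precision value from the card metadata. The test set size is not stated in this card, so the recovered denominator is only an inference, and the uniform baseline assumes balanced classes, which is an assumption as well.

```python
from fractions import Fraction

accuracy = 0.6375838926174496  # full-precision value from the card metadata
num_classes = 8                # from the Model Description section

# Simplest fraction (denominator <= 1000) that matches the reported accuracy.
ratio = Fraction(accuracy).limit_denominator(1000)

# Accuracy of guessing uniformly at random among the classes (assumes balance).
chance = 1 / num_classes

print(ratio, chance, round(accuracy / chance, 2))
```

The fraction suggests the score is consistent with 95 correct predictions out of 149 test examples (or a multiple of those counts).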

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_bt5_test")
# Run inference
preds = model("립브러쉬 실리콘 립스머지 휴대용 투명 미리")
```

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Mean    | Max |
|:-------------|:----|:--------|:----|
| Word count   | 3   | 10.0538 | 20  |

| Label | Training Sample Count |
|:------|:----------------------|
| 0.0   | 12                    |
| 1.0   | 12                    |
| 2.0   | 12                    |
| 3.0   | 19                    |
| 4.0   | 20                    |
| 5.0   | 27                    |
| 6.0   | 13                    |
| 7.0   | 15                    |
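
The per-label counts above can be summarised with a few lines of Python. This is a small sketch over the numbers in the table; the variable names are illustrative.

```python
# Training sample counts per label, copied from the table above.
counts = {0.0: 12, 1.0: 12, 2.0: 12, 3.0: 19, 4.0: 20, 5.0: 27, 6.0: 13, 7.0: 15}

total = sum(counts.values())
largest_label = max(counts, key=counts.get)
smallest_count = min(counts.values())

# Ratio between the most and least represented labels.
imbalance = counts[largest_label] / smallest_count

# Share of the training set held by each label.
shares = {label: n / total for label, n in counts.items()}

print(total, largest_label, imbalance)
```

The training set holds 130 examples in total, and the largest class (label 5.0) has 2.25 times as many examples as the smallest ones.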

### Training Hyperparameters
- batch_size: (512, 512)
- num_epochs: (50, 50)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 60
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
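
Several of the values above are tuples. In SetFit's training arguments, a tuple is read as (embedding fine-tuning phase, classifier training phase). The sketch below splits the listed settings per phase; `split_phases` is a hypothetical helper written for this illustration, not a SetFit function, and only a subset of the hyperparameters is included.

```python
# A subset of the hyperparameters listed above.
hyperparameters = {
    "batch_size": (512, 512),
    "num_epochs": (50, 50),
    "body_learning_rate": (2e-05, 1e-05),
    "seed": 42,
}


def split_phases(params):
    """Split settings into (embedding phase, classifier phase) dicts.

    Tuple values are unpacked per phase; scalar values apply to both.
    """
    embedding, classifier = {}, {}
    for name, value in params.items():
        first, second = value if isinstance(value, tuple) else (value, value)
        embedding[name] = first
        classifier[name] = second
    return embedding, classifier


embedding_phase, classifier_phase = split_phases(hyperparameters)
print(embedding_phase)
print(classifier_phase)
```

Because this model uses a scikit-learn LogisticRegression head and `end_to_end` is False, the classifier-phase body learning rate and `head_learning_rate` are, to the best of our understanding of SetFit, only relevant for a differentiable PyTorch head and likely had no effect here.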

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0625 | 1    | 0.4921        | -               |
| 3.125  | 50   | 0.2813        | -               |
| 6.25   | 100  | 0.0272        | -               |
| 9.375  | 150  | 0.0167        | -               |
| 12.5   | 200  | 0.002         | -               |
| 15.625 | 250  | 0.0001        | -               |
| 18.75  | 300  | 0.0001        | -               |
| 21.875 | 350  | 0.0001        | -               |
| 25.0   | 400  | 0.0001        | -               |
| 28.125 | 450  | 0.0001        | -               |
| 31.25  | 500  | 0.0001        | -               |
| 34.375 | 550  | 0.0001        | -               |
| 37.5   | 600  | 0.0001        | -               |
| 40.625 | 650  | 0.0001        | -               |
| 43.75  | 700  | 0.0001        | -               |
| 46.875 | 750  | 0.0001        | -               |
| 50.0   | 800  | 0.0001        | -               |
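
The step and epoch columns above are enough to work out how many optimisation steps made up one epoch and, together with the batch size of 512, a rough range for the number of sentence pairs per epoch. The sketch assumes the epoch column is simply `step / steps_per_epoch` and that the last batch of an epoch may be partial; neither assumption is confirmed by this card.

```python
first_logged_epoch = 0.0625  # epoch value logged at step 1
num_epochs = 50              # embedding-phase epochs from the hyperparameters
batch_size = 512             # embedding-phase batch size

# Step 1 covers 1/steps_per_epoch of an epoch.
steps_per_epoch = round(1 / first_logged_epoch)
total_steps = steps_per_epoch * num_epochs

# Range of pair counts that would need exactly this many batches per epoch.
min_pairs = (steps_per_epoch - 1) * batch_size + 1
max_pairs = steps_per_epoch * batch_size

print(steps_per_epoch, total_steps, min_pairs, max_pairs)
```

The result agrees with the table: 16 steps per epoch gives 800 steps after 50 epochs, and step 50 falls at epoch 50 / 16 = 3.125.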

### Framework Versions
- Python: 3.10.12
- SetFit: 1.1.0
- Sentence Transformers: 3.3.1
- Transformers: 4.44.2
- PyTorch: 2.2.0a0+81ea7a4
- Datasets: 3.2.0
- Tokenizers: 0.19.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->