metadata

base_model: mini1013/master_domain
library_name: setfit
metrics:
  - accuracy
pipeline_tag: text-classification
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
widget:
  - text: 듀크레이 덱시안 메드 아이리드 크림 15ml 피부과 옵션없음 비타콕
  - text: 라벤더 일회용 여성로션 3ML 옵션없음 동양유통
  - text: KAHI 멀티밤 리필키트 x 2개 옵션없음 에프엔지트렌드
  - text: 토니어 유기농 호호바 오일 30ml 옵션없음 주식회사 아람케이
  - text: 치카이치코 누드 판타지 화이트닝 크림 55ml 옵션없음 다물다선
inference: true
model-index:
  - name: SetFit with mini1013/master_domain
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.821590909090909
            name: Accuracy

SetFit with mini1013/master_domain

This is a SetFit model for text classification. It uses mini1013/master_domain as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
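
The contrastive step can be illustrated with a minimal, standard-library sketch. It assumes the simplest possible pairing scheme (every pair of examples, with target 1.0 for the same label and 0.0 otherwise) and a squared-error loss on cosine similarity, which is how CosineSimilarityLoss scores a pair. SetFit's own pair sampler works differently (it is controlled by num_iterations and sampling_strategy), and the toy texts, labels, and function names below are hypothetical.

```python
import itertools
import math


def build_pairs(texts, labels):
    """Return (text_a, text_b, target) triples: 1.0 for same label, else 0.0."""
    pairs = []
    examples = zip(texts, labels)
    for (text_a, label_a), (text_b, label_b) in itertools.combinations(examples, 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(u, v, target):
    """Squared error between a pair's cosine similarity and its target."""
    return (cosine_similarity(u, v) - target) ** 2


# Hypothetical toy data: three texts, two classes.
texts = ["toner A", "toner B", "eye cream C"]
labels = [3, 3, 4]
pairs = build_pairs(texts, labels)
```

Minimising this loss pulls embeddings of same-label texts together and pushes different-label texts apart, which is what makes the embeddings useful features for the classification head in the second step.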

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: mini1013/master_domain
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 11 classes

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
6.0	'가히 멀티 밤 리필형 9g x 1개(본품) + 9g x 3개(리필) 옵션없음 주식회사 제이제이몰' '김정문알로에 큐어플러스 인텐시브 2x 크림 50g 3개 옵션없음 리틀리아' 'Good Molecules 젠틀 레티놀 크림 레티놀과 바쿠치올이 함유된 나이트 과색소 침 옵션없음 비포유'
1.0	'크리니크 드라마티컬리 디퍼런트 모이스처라이징 젤 125ml(건성, 중복합) 옵션없음 옐로우로켓' '크리니크 드라마티컬리 디퍼런트 모이스처라이징 젤 125ml(건성, 중복합성) 옵션없음 샹양무역 유한회사' '[케이스훼손] 더 후 공진향 인양 로션 110ml (케이스훼손) 인양 로션. 주식회사 포러스'
10.0	'한율 송담 탄력 기초 2종 세트 (스킨+에멀젼) 기초 스킨 로션 여성 부모님화장품 스킨+에멀젼+아이크림+크림 홈뷰티샵' '오휘 더 퍼스트 제너츄어 3종 스페셜 세트 옵션없음 브라우니박스2' '쟝블랑 그린티 밸런싱 여성 3종세트 옵션없음 아 이리스'
7.0	'[SKINFOOD] 캐롯 카로틴 카밍 워터패드 30매 (NEW 집게+패드케이스 ) 당근 (주)더블유컨셉코리아' '메디힐 티트리 트러블 패드 100매 + 리필 100매 옵션없음 미뇨네' '프리업 원더 포어 클리어 패드 휴대용 키트 10개입 옵션없음 주식회사 브랜드커머스'
4.0	'에뛰드 모이스트풀 콜라겐 아이 크림 28ml Moistfull Collagen Eye Cream 옵션없음 월드세븐' '마티나겝하르트 아보카도 아이크림 15ml 옵션없음 포비티엘' '가히 아이밤 옵션없음 남영오'
9.0	'안나홀츠 호호바오일 에코서트인증 유기농 압착 비정제 천연 호호바오일 60ml 2병 옵션없음 (주)안나홀츠' '스킨아이 유기농 티트리 오일 옵션없음 폴슨 주식회사(FOLSN Inc.)' '[3개세트] 유기농 티트리 오일 10ml 옵션없음 주식회사 보나쥬르'
0.0	'멀티밤스틱 주름지우개 보툴레닌 기가스틱 넥스젠바이오' '벨라수 데콜테 넥크림 50ml 벨라수' '종근당 CKD 레티노 콜라겐 저분자 300 괄사 목주름 크림 50ml 동의함 일랑팩토리'
8.0	'AHC 누드톤업크림 내추럴글로우 40ml 옵션없음 가온' 'AHC 아우라 시크릿 톤업크림 50g 옵션없음 마리공주' 'AHC 톤업크림 아우라 시크릿 50g 옵션없음 쇼핑사거리'
2.0	'자트인사이트 울트라 셋팅 진짜 픽서 50ml 2개 옵션없음 솔마켓' 'ECLADO (1+1) NK-CX 프로틴 포텐 부스터 100ml 뿌리는 단백질 [1+1]NK-CX 포텐부스터 하이그래' 'CNP 차앤박 프로폴리스 에너지 앰플 미스트 250ml 1개 옵션없음 주식회사 아이지비'
3.0	'네이처리퍼블릭 리얼 스퀴즈 알로에 베라 토너 150ml(신형) 옵션없음 마켓유' '허브 솔루션 위치하젤 토너 500ml / 1개 허브 솔루션 알로에 베라 토너 500ml 듀얼샵' '르네셀 멀티 펩타이드 토너(재고정리) 옵션없음 숙이네 잡화'
5.0	'브링그린 알로에 99% 수딩 젤 300ml(민감성)/JL 옵션없음 주식회사 제이엘' '브링그린 알로에 99% 수딩젤 300ml 옵션없음 모현' '350211 포어 슈링커 바쿠치올 세럼 50ml 옵션없음 제이에프무역'

Evaluation

Metrics

Label	Accuracy
all	0.8216
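
Accuracy here is the fraction of test examples whose predicted label equals the reference label. A minimal sketch of that computation follows; the prediction and reference lists are made up for illustration and are not taken from this model's test split.

```python
def accuracy(predictions, references):
    """Fraction of positions where the prediction equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, r in zip(predictions, references) if p == r)
    return correct / len(references)


# Hypothetical labels: three of the four predictions match.
score = accuracy([6.0, 1.0, 4.0, 9.0], [6.0, 1.0, 4.0, 3.0])
```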

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_bt8_test")
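# Run inference on a Korean product title
# (roughly: "Lavender disposable women's lotion 3ML, no option, Dongyang Yutong")
preds = model("라벤더 일회용 여성로션 3ML 옵션없음 동양유통")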
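
To classify several titles at once, a list of strings can be passed instead of a single string. The sketch below tallies predicted class ids for a batch. It assumes the loaded model is callable on a list of texts and returns one numeric prediction per text. StubModel is a hypothetical stand-in so the example runs offline, and in practice the model loaded above would be passed in its place.

```python
from collections import Counter


class StubModel:
    """Hypothetical stand-in for a loaded SetFitModel (no download needed)."""

    def __call__(self, texts):
        # Fake prediction: a class id in 0..10 derived from the text length.
        return [float(len(text) % 11) for text in texts]


def tally_predictions(model, texts):
    """Call the model on a batch and count how often each class id occurs."""
    predictions = model(texts)
    return Counter(int(float(p)) for p in predictions)


counts = tally_predictions(StubModel(), ["aa", "bbb", "cc"])
```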

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	4	9.2179	23
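
A summary like this can be reproduced by counting whitespace-separated tokens per training text. The sketch below assumes plain whitespace splitting, which may differ from how the figures above were produced, and uses three of the widget example titles rather than the actual training set. It reports both mean and median: a fractional value such as 9.2179 is what an average yields, whereas the median of integer counts is always a whole or half number.

```python
import statistics


def word_count_summary(texts):
    """Min, mean, median, and max of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "max": max(counts),
    }


# Three example titles from this card (6, 7, and 8 words respectively).
summary = word_count_summary([
    "라벤더 일회용 여성로션 3ML 옵션없음 동양유통",
    "KAHI 멀티밤 리필키트 x 2개 옵션없음 에프엔지트렌드",
    "토니어 유기농 호호바 오일 30ml 옵션없음 주식회사 아람케이",
])
```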

Label	Training Sample Count
0.0	18
1.0	18
2.0	22
3.0	20
4.0	32
5.0	30
6.0	40
7.0	23
8.0	17
9.0	14
10.0	23
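
The per-label counts above can be totalled to see how large and how balanced the training set is. The short sketch below only re-enters the numbers from the table; the variable names are illustrative.

```python
# Training sample counts per label, copied from the table above.
training_counts = {
    0: 18, 1: 18, 2: 22, 3: 20, 4: 32, 5: 30,
    6: 40, 7: 23, 8: 17, 9: 14, 10: 23,
}

total = sum(training_counts.values())
largest = max(training_counts, key=training_counts.get)
smallest = min(training_counts, key=training_counts.get)

# Ratio between the most and least represented labels.
imbalance_ratio = training_counts[largest] / training_counts[smallest]

# Share of the training set held by each label.
shares = {label: count / total for label, count in training_counts.items()}
```

This gives 257 training samples in total, with label 6 the most represented (40 samples) and label 9 the least (14 samples), a ratio of roughly 2.9 to 1.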

Training Hyperparameters

batch_size: (512, 512)
num_epochs: (40, 40)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 50
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
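
Several of these values are pairs. Assuming the convention documented for SetFit's TrainingArguments, the first element applies to the embedding fine-tuning phase and the second to the classifier training phase, while a single value applies to both. The helper below is a hypothetical illustration of that convention, not part of the SetFit API.

```python
def per_phase(value):
    """Expand a scalar or a (embedding, classifier) pair into named phases."""
    if isinstance(value, tuple):
        embedding_phase, classifier_phase = value
    else:
        embedding_phase = classifier_phase = value
    return {"embedding": embedding_phase, "classifier": classifier_phase}


# Pair-valued hyperparameters copied from the list above.
hyperparameters = {
    "batch_size": (512, 512),
    "num_epochs": (40, 40),
    "body_learning_rate": (2e-05, 1e-05),
}

resolved = {name: per_phase(value) for name, value in hyperparameters.items()}
```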

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0385	1	0.4822	-
1.9231	50	0.3286	-
3.8462	100	0.0503	-
5.7692	150	0.028	-
7.6923	200	0.0213	-
9.6154	250	0.0084	-
11.5385	300	0.0002	-
13.4615	350	0.0001	-
15.3846	400	0.0001	-
17.3077	450	0.0001	-
19.2308	500	0.0001	-
21.1538	550	0.0001	-
23.0769	600	0.0001	-
25.0	650	0.0001	-
26.9231	700	0.0	-
28.8462	750	0.0	-
30.7692	800	0.0	-
32.6923	850	0.0	-
34.6154	900	0.0	-
36.5385	950	0.0	-
38.4615	1000	0.0	-
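
The Epoch column is consistent with 26 optimizer steps per epoch: step 50 is logged at epoch 1.9231, and 50 / 26 ≈ 1.9231. The sketch below reproduces the column from the step number under that inferred value. With num_epochs set to 40, this implies 1040 steps in total, so the last logged row (step 1000, logged every 50 steps) falls shortly before the end of training.

```python
# Inferred from the table, not stated in the card: 50 / 1.9231 is about 26.
STEPS_PER_EPOCH = 26
NUM_EPOCHS = 40


def epoch_at(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


total_steps = NUM_EPOCHS * STEPS_PER_EPOCH
logged_epochs = [epoch_at(step) for step in (1, 50, 650, 1000)]
```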

Framework Versions

Python: 3.10.12
SetFit: 1.1.0
Sentence Transformers: 3.3.1
Transformers: 4.44.2
PyTorch: 2.2.0a0+81ea7a4
Datasets: 3.2.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}