SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model that can be used for Text Classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves two steps (sketched in code after this list):

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
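
As a rough illustration, these two steps correspond to the standard SetFit training loop. The snippet below is a minimal sketch rather than the exact script used for this checkpoint: the training data file and its "text"/"label" columns are placeholders.

from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Placeholder few-shot dataset with "text" and "label" columns
train_dataset = load_dataset("csv", data_files={"train": "train.csv"})["train"]

# Step 1 happens inside trainer.train(): the BAAI/bge-base-en-v1.5 body is
# fine-tuned with contrastive pairs. Step 2 then fits the LogisticRegression
# head on embeddings produced by the fine-tuned body.
model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")
args = TrainingArguments(batch_size=8, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()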

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: BAAI/bge-base-en-v1.5
  • Classification head: LogisticRegression
  • Number of Classes: 5
  • Model size: ~109M parameters (F32)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label 1
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q16350309: Adrianu (Q16350309) : Wikimedia disambiguation page : ( [MASK]
Label 4
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q10135731: Template:Rfd links Merged with Q7150841 . Kittenono ( talk ) 16:36, 8 August 2013 (UTC)'
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q13164625: Guodian Chu Slips (Q13164625) : archaeological discovery in 1993 in Hubei, China : ( [MASK]
Label 0
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q60781646: Yellow Vests movement in France (Q60781646) : Spontaneous social movement in France : ( [MASK]
Label 3
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q16017531: Template:Rfd links Merged into Q13135852 . -- DracoRoboter ([[User talk:DracoRoboter
Label 2
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q118210393: Malgorzata (Q118210393) : female given name : ( [MASK]

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("research-dump/bge-base-en-v1.5_wikidata_ent_masked_wikidata_ent_masked")
# Run inference
preds = model("###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input:  Q11843502: Template:Rfd links Merged with Q4470435 . Succu ([[User talk:Succu| int:Talkpagelinktext ]]) 19:36, 12 February 2014 (UTC)")
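
The returned prediction is the numeric class id listed under Model Labels. For several inputs at once, or to inspect per-class probabilities from the LogisticRegression head, the same model object can be used; the input string below is a placeholder following the format above.

# Batch inference and class probabilities (illustrative only)
texts = [
    "###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: ...",
]
preds = model.predict(texts)        # numeric class ids, one per input
probs = model.predict_proba(texts)  # per-class probabilities from the head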

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     28    53.7838   2279

Label   Training Sample Count
0       2
1       733
2       18
3       56
4       190

Training Hyperparameters

  • batch_size: (8, 2)
  • num_epochs: (5, 5)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 10
  • body_learning_rate: (1e-05, 1e-05)
  • head_learning_rate: 5e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: True
  • use_amp: True
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
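
For reference, these values map directly onto SetFit's TrainingArguments. The sketch below shows how they would be passed; distance_metric is left at its default (cosine distance), which matches the value listed above.

from setfit import TrainingArguments
from sentence_transformers.losses import CosineSimilarityLoss

args = TrainingArguments(
    batch_size=(8, 2),            # (embedding phase, classifier phase)
    num_epochs=(5, 5),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=10,
    body_learning_rate=(1e-05, 1e-05),
    head_learning_rate=5e-05,
    loss=CosineSimilarityLoss,
    margin=0.25,                  # only used by triplet-style losses
    end_to_end=True,
    use_amp=True,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)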

Training Results

Epoch Step Training Loss Validation Loss
0.0004 1 0.041 -
0.2002 500 0.1861 0.1338
0.4003 1000 0.0927 0.1352
0.6005 1500 0.0539 0.1385
0.8006 2000 0.0414 0.1415
1.0008 2500 0.0284 0.1429
1.2010 3000 0.0218 0.1359
1.4011 3500 0.0204 0.1388
1.6013 4000 0.0184 0.1486
1.8014 4500 0.0157 0.1465
2.0016 5000 0.0116 0.1530
2.2018 5500 0.0088 0.1492
2.4019 6000 0.0078 0.1582
2.6021 6500 0.0081 0.1680
2.8022 7000 0.0062 0.1487
3.0024 7500 0.0053 0.1466
3.2026 8000 0.004 0.1462
3.4027 8500 0.0039 0.1489
3.6029 9000 0.0025 0.1507
3.8030 9500 0.0014 0.1487
4.0032 10000 0.0015 0.1471
4.2034 10500 0.0017 0.1433
4.4035 11000 0.001 0.1434
4.6037 11500 0.0013 0.1425
4.8038 12000 0.0007 0.1436

Framework Versions

  • Python: 3.12.7
  • SetFit: 1.1.1
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.2
  • PyTorch: 2.6.0+cu124
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
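
To approximate this environment, the same versions can be pinned at install time (a suggestion, not a hard requirement; PyTorch 2.6.0 with CUDA 12.4 is usually installed separately from the PyTorch index):

pip install setfit==1.1.1 sentence-transformers==3.4.1 transformers==4.48.2 datasets==3.2.0 tokenizers==0.21.0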

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}