SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model that can be used for Text Classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves two steps (sketched in code after this list):

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
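
As a rough illustration, these two steps correspond to the standard SetFit training loop. The snippet below is a minimal sketch rather than the exact script used for this checkpoint: the training data file and its "text"/"label" columns are placeholders.

from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Placeholder few-shot dataset with "text" and "label" columns
train_dataset = load_dataset("csv", data_files={"train": "train.csv"})["train"]

# Step 1 happens inside trainer.train(): the BAAI/bge-base-en-v1.5 body is
# fine-tuned with contrastive pairs. Step 2 then fits the LogisticRegression
# head on embeddings produced by the fine-tuned body.
model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")
args = TrainingArguments(batch_size=8, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()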

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: BAAI/bge-base-en-v1.5
  • Classification head: LogisticRegression
  • Number of Classes: 5
  • Model size: ~109M parameters (F32)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label 1
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q16350309: Adrianu (Q16350309) : Wikimedia disambiguation page : ( [MASK]
Label 4
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q10135731: Template:Rfd links Merged with Q7150841 . Kittenono ( talk ) 16:36, 8 August 2013 (UTC)'
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q13164625: Guodian Chu Slips (Q13164625) : archaeological discovery in 1993 in Hubei, China : ( [MASK]
Label 0
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q60781646: Yellow Vests movement in France (Q60781646) : Spontaneous social movement in France : ( [MASK]
Label 3
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q16017531: Template:Rfd links Merged into Q13135852 . -- DracoRoboter ([[User talk:DracoRoboter
Label 2
  • '###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: Q118210393: Malgorzata (Q118210393) : female given name : ( [MASK]

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("research-dump/bge-base-en-v1.5_wikidata_ent_masked_wikidata_ent_masked")
# Run inference
preds = model("###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input:  Q11843502: Template:Rfd links Merged with Q4470435 . Succu ([[User talk:Succu| int:Talkpagelinktext ]]) 19:36, 12 February 2014 (UTC)")
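
The returned prediction is the numeric class id listed under Model Labels. For several inputs at once, or to inspect per-class probabilities from the LogisticRegression head, the same model object can be used; the input string below is a placeholder following the format above.

# Batch inference and class probabilities (illustrative only)
texts = [
    "###Instruction: Multi-class classification, answer with one of the labels: [delete, keep, speedy delete, comment] : ###Input: ...",
]
preds = model.predict(texts)        # numeric class ids, one per input
probs = model.predict_proba(texts)  # per-class probabilities from the head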

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     28    53.7838   2279

Label   Training Sample Count
0       2
1       733
2       18
3       56
4       190

Training Hyperparameters

  • batch_size: (8, 2)
  • num_epochs: (5, 5)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 10
  • body_learning_rate: (1e-05, 1e-05)
  • head_learning_rate: 5e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: True
  • use_amp: True
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
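
For reference, these values map directly onto SetFit's TrainingArguments. The sketch below shows how they would be passed; distance_metric is left at its default (cosine distance), which matches the value listed above.

from setfit import TrainingArguments
from sentence_transformers.losses import CosineSimilarityLoss

args = TrainingArguments(
    batch_size=(8, 2),            # (embedding phase, classifier phase)
    num_epochs=(5, 5),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=10,
    body_learning_rate=(1e-05, 1e-05),
    head_learning_rate=5e-05,
    loss=CosineSimilarityLoss,
    margin=0.25,                  # only used by triplet-style losses
    end_to_end=True,
    use_amp=True,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)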

Training Results

Epoch Step Training Loss Validation Loss
0.0004 1 0.041 -
0.2002 500 0.1861 0.1338
0.4003 1000 0.0927 0.1352
0.6005 1500 0.0539 0.1385
0.8006 2000 0.0414 0.1415
1.0008 2500 0.0284 0.1429
1.2010 3000 0.0218 0.1359
1.4011 3500 0.0204 0.1388
1.6013 4000 0.0184 0.1486
1.8014 4500 0.0157 0.1465
2.0016 5000 0.0116 0.1530
2.2018 5500 0.0088 0.1492
2.4019 6000 0.0078 0.1582
2.6021 6500 0.0081 0.1680
2.8022 7000 0.0062 0.1487
3.0024 7500 0.0053 0.1466
3.2026 8000 0.004 0.1462
3.4027 8500 0.0039 0.1489
3.6029 9000 0.0025 0.1507
3.8030 9500 0.0014 0.1487
4.0032 10000 0.0015 0.1471
4.2034 10500 0.0017 0.1433
4.4035 11000 0.001 0.1434
4.6037 11500 0.0013 0.1425
4.8038 12000 0.0007 0.1436

Framework Versions

  • Python: 3.12.7
  • SetFit: 1.1.1
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.2
  • PyTorch: 2.6.0+cu124
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
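
To approximate this environment, the same versions can be pinned at install time (a suggestion, not a hard requirement; PyTorch 2.6.0 with CUDA 12.4 is usually installed separately from the PyTorch index):

pip install setfit==1.1.1 sentence-transformers==3.4.1 transformers==4.48.2 datasets==3.2.0 tokenizers==0.21.0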

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}