Unifeed / README.md
metadata
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
metrics:
  - f1
  - precision
  - recall
  - accuracy
widget:
  - text: >-
      Since the start of the coronavirus outbreak, Trump has hampered efforts to
      slow the virus’s spread and encouraged Americans’ restlessness under
      quarantine.
  - text: ' It has to be particularly described what he is looking for said Asha Rangappa who was a counter intelligence agent for the FBI and now a Yale Law School professor A judge isn t going to sign off some sort of blanket warrant that tells Facebook to turn over everything '
  - text: >-
      Now in response to these very serious crises it seems to me that we have
      two choices First we can throw up our hands in despair We can say I am not
      going to get involved 
  - text: "Over the past week, activists, some of who are believed to be affiliated with Black Lives Matter have\_rioted\_across the country following the death of George Floyd in police custody, wreaking havoc and destruction against America’s towns, cities, and local communities.\_"
  - text: >-
      Working-class Americans, like those who make up the majority of South Bend
      residents, have secured the largest wage hikes in the nation compared to
      all other economic demographic groups — a direct result of Trump
      tightening the labor market.
pipeline_tag: text-classification
inference: true
base_model: BAAI/bge-small-en-v1.5
model-index:
  - name: SetFit with BAAI/bge-small-en-v1.5
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: f1
            value: 0.6952861952861953
            name: F1
          - type: precision
            value: 0.6952861952861953
            name: Precision
          - type: recall
            value: 0.6952861952861953
            name: Recall
          - type: accuracy
            value: 0.6952861952861953
            name: Accuracy

SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head, and it assigns one of three labels: center, left, or right.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
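
The first step depends on turning labelled sentences into sentence pairs with a similarity target. A minimal sketch of that pairing idea follows, assuming same-label pairs get a target of 1.0 and different-label pairs 0.0. The function name `make_contrastive_pairs` and the placeholder texts are hypothetical, and the sketch enumerates every pair for clarity, whereas SetFit samples pairs during training.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Pair every two sentences; target is 1.0 if their labels match, else 0.0."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Placeholder data (hypothetical, not from the training set)
texts = ["headline A", "headline B", "headline C", "headline D"]
labels = ["left", "left", "right", "center"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")
```

Four sentences give six pairs, of which only the two "left" headlines form a positive pair. The fine-tuned embeddings from these pairs are then the features for the classification head in step 2.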

Model Details

Model Description

Model Sources

Model Labels

Label Examples
center
  • 'A leading economist who vouched for Democratic presidential candidate Elizabeth Warren’s healthcare reform plan told Reuters on Thursday he doubts its staggering cost can be fully covered alongside her other government programs.'
  • 'U.S. President Donald Trump is doing well and is very healthy, White House adviser Kellyanne Conway told Fox News on Thursday, after a U.S. military official who worked at the White House was found to have been infected with the novel coronavirus.'
  • 'Alabama has the most restrictive abortion law in the U.S., banning abortion at any stage of pregnancy and for any reason, including in cases of rape and incest.'
left
  • 'Meet the shadowy accountants who do Trump’s taxes and help him seem richer than he is'
  • 'When did vaccines become politicized? Amid a measles outbreak, suddenly Republicans support anti-vaxxers.'
  • 'Last summer, the Republican White House announced plans to roll back the tougher standards, making it easier for the automotive industry to sell less efficient vehicles that pollute more.'
right
  • 'Joe Biden told Wall Street donors to his campaign that he planned to reverse most of President Donald Trump’s tax cuts.'
  • 'For far too many on the left, chaos is the point. Destruction is the goal. They prefer the unknown madness that lies ahead to whatever is still managing to (barely) hold us together in the present.'
  • 'Cuba’s health ministry initially vowed an investigation into Paloma Dominguez Caballero’s death; last week, state media published a report essentially absolving the government of any wrongdoing, categorically stating that nothing was wrong with the vaccine Dominguez received.'

Evaluation

Metrics

| Label | F1 | Precision | Recall | Accuracy |
|:------|:-------|:----------|:-------|:---------|
| all | 0.6953 | 0.6953 | 0.6953 | 0.6953 |
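
All four reported values are identical. That is what micro-averaging produces for single-label multiclass predictions: every wrong prediction counts as one false positive and one false negative, so precision, recall, and F1 collapse to accuracy. The card does not state which averaging was used, so treat this as a likely explanation rather than a confirmed one. The sketch below uses made-up labels and a hypothetical helper, `micro_scores`, to show the effect.

```python
def micro_scores(y_true, y_pred):
    """Micro-averaged precision, recall, F1, plus accuracy (assumes >= 1 correct prediction)."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1  # predicted this label, but the truth was another
            elif truth == label:
                fn += 1  # truth was this label, but another was predicted
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return precision, recall, f1, accuracy


# Made-up predictions for illustration only
y_true = ["left", "left", "right", "center", "center", "right"]
y_pred = ["left", "right", "right", "center", "left", "right"]

precision, recall, f1, accuracy = micro_scores(y_true, y_pred)
print(precision, recall, f1, accuracy)  # all four are 4/6
```

Macro-averaged or per-label scores would generally differ from accuracy, so they would be more informative about how each of the three classes performs.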

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("JordanTallon/Unifeed")
# Run inference
preds = model("Since the start of the coronavirus outbreak, Trump has hampered efforts to slow the virus’s spread and encouraged Americans’ restlessness under quarantine.")

Training Details

Training Set Metrics

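| Training set | Min | Median | Max |
|:-------------|:----|:--------|:----|
| Word count | 6 | 33.1655 | 86 |

| Label | Training Sample Count |
|:-------|:----------------------|
| center | 802 |
| left | 784 |
| right | 788 |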

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
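
With `loss: CosineSimilarityLoss`, the embedding body is trained so that the cosine similarity of each sentence pair approaches the pair's target. The sketch below is a plain-Python illustration of that objective, assuming the mean-squared-error form that Sentence Transformers uses by default. The toy 2-D vectors and the function names are hypothetical and are not taken from the training run.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cos(u, v) and each pair's target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# (embedding_a, embedding_b, target): target 1.0 = same label, 0.0 = different
pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same direction, same label: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal, different label: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # same direction, different label: error 1
]

loss = cosine_similarity_loss(pairs)
print(loss)  # (0 + 0 + 1) / 3
```

As far as the SetFit documentation describes them, `distance_metric` and `margin` only apply to triplet-style losses, so they likely had no effect in this run.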

Training Results

| Epoch | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0003 | 1 | 0.2552 | - |
| 0.0168 | 50 | 0.2613 | - |
| 0.0337 | 100 | 0.2653 | - |
| 0.0505 | 150 | 0.2574 | - |
| 0.0674 | 200 | 0.2455 | - |
| 0.0842 | 250 | 0.2583 | - |
| 0.1011 | 300 | 0.2736 | - |
| 0.1179 | 350 | 0.2341 | - |
| 0.1348 | 400 | 0.2524 | - |
| 0.1516 | 450 | 0.2429 | - |
| 0.1685 | 500 | 0.2579 | - |
| 0.1853 | 550 | 0.2363 | - |
| 0.2022 | 600 | 0.2789 | - |
| 0.2190 | 650 | 0.186 | - |
| 0.2358 | 700 | 0.2425 | - |
| 0.2527 | 750 | 0.1963 | - |
| 0.2695 | 800 | 0.1858 | - |
| 0.2864 | 850 | 0.1499 | - |
| 0.3032 | 900 | 0.2219 | - |
| 0.3201 | 950 | 0.1376 | - |
| 0.3369 | 1000 | 0.1115 | - |
| 0.3538 | 1050 | 0.1205 | - |
| 0.3706 | 1100 | 0.1398 | - |
| 0.3875 | 1150 | 0.1585 | - |
| 0.4043 | 1200 | 0.1328 | - |
| 0.4212 | 1250 | 0.0954 | - |
| 0.4380 | 1300 | 0.0707 | - |
| 0.4549 | 1350 | 0.2214 | - |
| 0.4717 | 1400 | 0.1351 | - |
| 0.4885 | 1450 | 0.1249 | - |
| 0.5054 | 1500 | 0.1656 | - |
| 0.5222 | 1550 | 0.1573 | - |
| 0.5391 | 1600 | 0.1103 | - |
| 0.5559 | 1650 | 0.0787 | - |
| 0.5728 | 1700 | 0.126 | - |
| 0.5896 | 1750 | 0.0876 | - |
| 0.6065 | 1800 | 0.1687 | - |
| 0.6233 | 1850 | 0.1319 | - |
| 0.6402 | 1900 | 0.0815 | - |
| 0.6570 | 1950 | 0.09 | - |
| 0.6739 | 2000 | 0.0471 | - |
| 0.6907 | 2050 | 0.1032 | - |
| 0.7075 | 2100 | 0.0858 | - |
| 0.7244 | 2150 | 0.0859 | - |
| 0.7412 | 2200 | 0.0946 | - |
| 0.7581 | 2250 | 0.0618 | - |
| 0.7749 | 2300 | 0.0233 | - |
| 0.7918 | 2350 | 0.0148 | - |
| 0.8086 | 2400 | 0.0367 | - |
| 0.8255 | 2450 | 0.0111 | - |
| 0.8423 | 2500 | 0.0034 | - |
| 0.8592 | 2550 | 0.0174 | - |
| 0.8760 | 2600 | 0.0304 | - |
| 0.8929 | 2650 | 0.0303 | - |
| 0.9097 | 2700 | 0.0031 | - |
| 0.9265 | 2750 | 0.0058 | - |
| 0.9434 | 2800 | 0.0034 | - |
| 0.9602 | 2850 | 0.0011 | - |
| 0.9771 | 2900 | 0.0013 | - |
| 0.9939 | 2950 | 0.0296 | - |
| 1.0108 | 3000 | 0.0008 | - |
| 1.0276 | 3050 | 0.0189 | - |
| 1.0445 | 3100 | 0.0295 | - |
| 1.0613 | 3150 | 0.0276 | - |
| 1.0782 | 3200 | 0.0008 | - |
| 1.0950 | 3250 | 0.0008 | - |
| 1.1119 | 3300 | 0.0009 | - |
| 1.1287 | 3350 | 0.0009 | - |
| 1.1456 | 3400 | 0.0008 | - |
| 1.1624 | 3450 | 0.0099 | - |
| 1.1792 | 3500 | 0.0009 | - |
| 1.1961 | 3550 | 0.0299 | - |
| 1.2129 | 3600 | 0.0007 | - |
| 1.2298 | 3650 | 0.001 | - |
| 1.2466 | 3700 | 0.0009 | - |
| 1.2635 | 3750 | 0.0008 | - |
| 1.2803 | 3800 | 0.001 | - |
| 1.2972 | 3850 | 0.0009 | - |
| 1.3140 | 3900 | 0.0008 | - |
| 1.3309 | 3950 | 0.0007 | - |
| 1.3477 | 4000 | 0.0007 | - |
| 1.3646 | 4050 | 0.03 | - |
| 1.3814 | 4100 | 0.0008 | - |
| 1.3982 | 4150 | 0.0012 | - |
| 1.4151 | 4200 | 0.0292 | - |
| 1.4319 | 4250 | 0.0006 | - |
| 1.4488 | 4300 | 0.0007 | - |
| 1.4656 | 4350 | 0.0006 | - |
| 1.4825 | 4400 | 0.0007 | - |
| 1.4993 | 4450 | 0.0008 | - |
| 1.5162 | 4500 | 0.0008 | - |
| 1.5330 | 4550 | 0.0015 | - |
| 1.5499 | 4600 | 0.0032 | - |
| 1.5667 | 4650 | 0.0015 | - |
| 1.5836 | 4700 | 0.0006 | - |
| 1.6004 | 4750 | 0.0006 | - |
| 1.6173 | 4800 | 0.0021 | - |
| 1.6341 | 4850 | 0.0013 | - |
| 1.6509 | 4900 | 0.0006 | - |
| 1.6678 | 4950 | 0.0006 | - |
| 1.6846 | 5000 | 0.0013 | - |
| 1.7015 | 5050 | 0.0006 | - |
| 1.7183 | 5100 | 0.0007 | - |
| 1.7352 | 5150 | 0.0005 | - |
| 1.7520 | 5200 | 0.0005 | - |
| 1.7689 | 5250 | 0.0006 | - |
| 1.7857 | 5300 | 0.0005 | - |
| 1.8026 | 5350 | 0.0005 | - |
| 1.8194 | 5400 | 0.0005 | - |
| 1.8363 | 5450 | 0.0004 | - |
| 1.8531 | 5500 | 0.0066 | - |
| 1.8699 | 5550 | 0.0005 | - |
| 1.8868 | 5600 | 0.0006 | - |
| 1.9036 | 5650 | 0.0005 | - |
| 1.9205 | 5700 | 0.0005 | - |
| 1.9373 | 5750 | 0.0014 | - |
| 1.9542 | 5800 | 0.0006 | - |
| 1.9710 | 5850 | 0.0004 | - |
| 1.9879 | 5900 | 0.0006 | - |
| 2.0047 | 5950 | 0.0005 | - |
| 2.0216 | 6000 | 0.0006 | - |
| 2.0384 | 6050 | 0.0005 | - |
| 2.0553 | 6100 | 0.0004 | - |
| 2.0721 | 6150 | 0.0012 | - |
| 2.0889 | 6200 | 0.0004 | - |
| 2.1058 | 6250 | 0.0005 | - |
| 2.1226 | 6300 | 0.0004 | - |
| 2.1395 | 6350 | 0.0005 | - |
| 2.1563 | 6400 | 0.0005 | - |
| 2.1732 | 6450 | 0.0005 | - |
| 2.1900 | 6500 | 0.0004 | - |
| 2.2069 | 6550 | 0.0004 | - |
| 2.2237 | 6600 | 0.0005 | - |
| 2.2406 | 6650 | 0.0004 | - |
| 2.2574 | 6700 | 0.0005 | - |
| 2.2743 | 6750 | 0.0004 | - |
| 2.2911 | 6800 | 0.0005 | - |
| 2.3080 | 6850 | 0.0007 | - |
| 2.3248 | 6900 | 0.0004 | - |
| 2.3416 | 6950 | 0.0018 | - |
| 2.3585 | 7000 | 0.0004 | - |
| 2.3753 | 7050 | 0.0004 | - |
| 2.3922 | 7100 | 0.0004 | - |
| 2.4090 | 7150 | 0.0004 | - |
| 2.4259 | 7200 | 0.0004 | - |
| 2.4427 | 7250 | 0.0005 | - |
| 2.4596 | 7300 | 0.0004 | - |
| 2.4764 | 7350 | 0.0005 | - |
| 2.4933 | 7400 | 0.0012 | - |
| 2.5101 | 7450 | 0.0026 | - |
| 2.5270 | 7500 | 0.0004 | - |
| 2.5438 | 7550 | 0.0003 | - |
| 2.5606 | 7600 | 0.0004 | - |
| 2.5775 | 7650 | 0.0004 | - |
| 2.5943 | 7700 | 0.0004 | - |
| 2.6112 | 7750 | 0.0004 | - |
| 2.6280 | 7800 | 0.0004 | - |
| 2.6449 | 7850 | 0.0004 | - |
| 2.6617 | 7900 | 0.0004 | - |
| 2.6786 | 7950 | 0.0003 | - |
| 2.6954 | 8000 | 0.0004 | - |
| 2.7123 | 8050 | 0.0004 | - |
| 2.7291 | 8100 | 0.0004 | - |
| 2.7460 | 8150 | 0.0004 | - |
| 2.7628 | 8200 | 0.0004 | - |
| 2.7796 | 8250 | 0.0004 | - |
| 2.7965 | 8300 | 0.0005 | - |
| 2.8133 | 8350 | 0.0004 | - |
| 2.8302 | 8400 | 0.0004 | - |
| 2.8470 | 8450 | 0.0004 | - |
| 2.8639 | 8500 | 0.0004 | - |
| 2.8807 | 8550 | 0.0004 | - |
| 2.8976 | 8600 | 0.0004 | - |
| 2.9144 | 8650 | 0.0004 | - |
| 2.9313 | 8700 | 0.0004 | - |
| 2.9481 | 8750 | 0.0004 | - |
| 2.9650 | 8800 | 0.0004 | - |
| 2.9818 | 8850 | 0.0004 | - |
| 2.9987 | 8900 | 0.0003 | - |
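
The Epoch column is consistent with the sample counts and hyperparameters reported above, under the assumption that each training sample yields one positive and one negative pair per iteration (`2 * num_iterations` pairs per sample). The card does not state this, so the arithmetic below is a cross-check rather than a description of the actual trainer internals.

```python
import math

# Values reported in this card
samples_per_label = {"center": 802, "left": 784, "right": 788}
num_iterations = 20
batch_size = 32
num_epochs = 3

n_samples = sum(samples_per_label.values())        # 2374
# Assumption: one positive and one negative pair per sample per iteration
n_pairs = 2 * num_iterations * n_samples           # 94960
steps_per_epoch = math.ceil(n_pairs / batch_size)  # 2968
total_steps = steps_per_epoch * num_epochs         # 8904

print(n_samples, n_pairs, steps_per_epoch, total_steps)
print(round(50 / steps_per_epoch, 4))    # 0.0168, matches the row for step 50
print(round(8900 / steps_per_epoch, 4))  # 2.9987, matches the last logged row
```

Under the same assumption, the run ends at step 8904. The loss is logged every 50 steps, so the last logged row is step 8900.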

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.16.1
  • Tokenizers: 0.15.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}