SentenceTransformer based on microsoft/unixcoder-base-unimodal

This is a sentence-transformers model finetuned from microsoft/unixcoder-base-unimodal on the soco_train_java dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: microsoft/unixcoder-base-unimodal
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Training Dataset: soco_train_java

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
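
The Pooling module listed above has only pooling_mode_mean_tokens enabled, so each input is represented by the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padding positions are excluded through an attention mask. The helper name `mean_pool` and the toy 2-dimensional vectors are illustrative only and are not part of the library.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (illustrative helper)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only real tokens; padding tokens carry a mask value of 0.
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one unmasked token is required")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


# Toy input: three real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]]
mask = [1, 1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

In the real model the same averaging would run over 768-dimensional token vectors for up to 512 tokens.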

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("buelfhood/SOCO-Java-UnixCoder-Softmax-PairClass-VAST-ep2-bs32-noEval")
# Run inference
sentences = [
    '\n\n\n\nimport java.util.*;\nimport java.net.*;\nimport java.io.*;\nimport javax.swing.*;\n\npublic class PasswordCombination\n{\n    private int      pwdCounter = 0;\n    private    int  startTime;\n    private String   str1,str2,str3;\n    private String   url = "http://sec-crack.cs.rmit.edu./SEC/2/";\n    private String   loginPwd;\n    private String[] password;\n    private HoldSharedData data;\n    private char[] chars = {\'A\',\'B\',\'C\',\'D\',\'E\',\'F\',\'G\',\'H\',\'I\',\'J\',\'K\',\'L\',\'M\',\n                            \'N\',\'O\',\'P\',\'Q\',\'R\',\'S\',\'T\',\'U\',\'V\',\'W\',\'X\',\'Y\',\'Z\',\n                            \'a\',\'b\',\'c\',\'d\',\'e\',\'f\',\'g\',\'h\',\'i\',\'j\',\'k\',\'l\',\'m\',\n                            \'n\',\'o\',\'p\',\'q\',\'r\',\'s\',\'t\',\'u\',\'v\',\'w\',\'x\',\'y\',\'z\'};\n\n    public PasswordCombination()\n    {\n        System.out.println("Programmed by   for INTE1070 Assignment 2");\n\n        String input = JOptionPane.showInputDialog( "Enter number of threads" );\n        if(  input == null  )\n           System.exit(0);\n\n        int numOfConnections = Integer.parseInt( input );\n        startTime = System.currentTimeMillis();\n        int pwdCounter  = 52*52*52 + 52*52 + 52;\n        password = new String[pwdCounter];\n\n        doPwdCombination();\n\n        System.out.println("Total Number of Passwords Generated: " + pwdCounter);\n        createConnectionThread( numOfConnections );\n    }\n\n    private void doPwdCombination()\n    {\n        for( int i = 0; i < 52; i ++ )\n        {\n            str1 = "" + chars[i];\n            password[pwdCounter++] = "" + chars[i];\n            System.err.print( str1 + " | " );\n\n            for( int j = 0; j < 52; j ++ )\n            {\n                str2 = str1 + chars[j];\n                password[pwdCounter++] = str1 + chars[j];\n\n                for( int k = 0; k < 52; k ++ )\n                {\n                    str3 = str2 + chars[k];\n           
         password[pwdCounter++] = str2 + chars[k];\n                }\n            }\n        }\n\n        System.err.println( "\\n" );\n    }\n\n    private void loadPasswords( )\n    {\n        FileReader     fRead;\n        BufferedReader buf;\n        String         line = null;\n        String         fileName = "words";\n\n        try\n        {\n            fRead = new FileReader( fileName );\n            buf = new BufferedReader(fRead);\n\n            while((line = buf.readLine( )) != null)\n            {\n                password[pwdCounter++] = line;\n            }\n        }\n        catch(FileNotFoundException e)\n        {\n            System.err.println("File not found: " + fileName);\n        }\n        catch(IOException ioe)\n        {\n            System.err.println("IO Error " + ioe);\n        }\n    }\n\n    private void createConnectionThread( int input )\n    {\n        data = new HoldSharedData( startTime, password, pwdCounter );\n\n        int numOfThreads = input;\n        int batch = pwdCounter/numOfThreads + 1;\n        numOfThreads = pwdCounter/batch + 1;\n        System.out.println("Number of Connection Threads Used:" + numOfThreads);\n        ConnectionThread[] connThread = new ConnectionThread[numOfThreads];\n\n        for( int index = 0; index < numOfThreads; index ++ )\n        {\n            connThread[index] = new ConnectionThread( url, index, batch, data );\n            connThread[index].conn();\n        }\n    }\n}  ',
    '\nimport java.util.*;\n\n\npublic class Cracker\n{\n   private char[] letters = {\'a\', \'b\', \'c\', \'d\', \'e\', \'f\', \'g\', \'h\', \'i\', \'j\', \'k\', \'l\', \'m\', \'n\', \'o\', \'p\', \'q\', \'r\', \'s\', \'t\', \'u\', \'v\', \'w\', \'x\', \'y\', \'z\', \'A\', \'B\', \'C\', \'D\', \'E\', \'F\', \'G\', \'H\', \'I\', \'J\', \'K\', \'L\', \'M\', \'N\', \'O\', \'P\', \'Q\', \'R\', \'S\', \'T\', \'U\', \'V\', \'W\', \'X\', \'Y\', \'Z\'};\n   private Vector v;\n\n   public Cracker()\n   {\n      v = new Vector( 52);\n   }\n   public void loadLetters()\n   {\n      int i;\n\n      for( i = 0; i < letters.length; i++)\n      {\n\t String s = new StringBuffer().append( letters[i]).toString();\n         v.add( s);\n      }\n   }\n   public Vector getVictor()\n   {\n      return ;\n   }\n   public void loadPairs()\n   {\n      int i,j;\n\n      for( i = 0; i < letters.length - 1; i++)\n      {\n         for( j = i + 1; j < letters.length; j++)\n         {\n            String s1 = new StringBuffer().append( letters[i]).append( letters[j]).toString();\n\t    String s2 = new StringBuffer().append( letters[j]).append( letters[i]).toString();\n\t    v.add( s1);\n\t    v.add( s2);\n\t }\n      }\n      for( i = 0; i < letters.length; i++)\n      {\n         String s3 = new StringBuffer().append( letters[i]).append( letters[i]).toString();\n\t v.add( s3);\n      }\n   }\n   public void loadTriples()\n   {\n      int i, j, k;\n      \n      for( i = 0; i < letters.length; i++)\n      {\n         String s4 = new StringBuffer().append( letters[i]).append( letters[i]).append( letters[i]).toString();\n\t v.add( s4);\n      }\n      for( i = 0; i < letters.length - 1; i++)\n      {\n         for( j = i + 1; j < letters.length; j++)\n\t {\n\t    String s5 = new StringBuffer().append( letters[i]).append( letters[j]).append( letters[j]).toString();\n\t    String s6 = new StringBuffer().append( letters[j]).append( letters[i]).append( letters[j]).toString();\n\t    String s7 = 
new StringBuffer().append( letters[j]).append( letters[j]).append( letters[i]).toString();\n\t    String s8 = new StringBuffer().append( letters[j]).append( letters[i]).append( letters[i]).toString();\n\t    String s9 = new StringBuffer().append( letters[i]).append( letters[j]).append( letters[i]).toString();\n\t    String s10 = new StringBuffer().append( letters[i]).append( letters[i]).append( letters[j]).toString();\n\t    v.add( s5);\n\t    v.add( s6);\n\t    v.add( s7);\n\t    v.add( s8);\n\t    v.add( s9);\n\t    v.add( s10);\n\t }\n      }\n      for( i = 0; i < letters.length - 2; i++)\n      {\n         for( j = i + 1; j < letters.length - 1; j++)\n\t {\n\t    for( k = i + 2; k < letters.length; k++)\n\t    {\n\t       String s11 = new StringBuffer().append( letters[i]).append( letters[j]).append(letters[k]).toString();\n\t       String s12 = new StringBuffer().append( letters[i]).append( letters[k]).append(letters[j]).toString();\n\t       String s13 = new StringBuffer().append( letters[k]).append( letters[j]).append(letters[i]).toString();\n\t       String s14 = new StringBuffer().append( letters[k]).append( letters[i]).append(letters[j]).toString();\n\t       String s15 = new StringBuffer().append( letters[j]).append( letters[i]).append(letters[k]).toString();\n\t       String s16 = new StringBuffer().append( letters[j]).append( letters[k]).append(letters[i]).toString();\n\t       v.add( s11);\n\t       v.add( s12);\n\t       v.add( s13);\n\t       v.add( s14);\n\t       v.add( s15);\n\t       v.add( s16);\n\t    }\n\t }\n      }\n   }\n         \n   public static void main( String[] args)\n   {\n      Cracker cr = new Cracker();\n      cr.loadLetters();\n      cr.loadPairs();\n      cr.loadTriples();\n      System.out.println(" far "+cr.getVictor().size()+" elements loaded");\n   }\n}\n      \n',
    '\nimport java.util.*;\n\n\npublic class Cracker\n{\n   private char[] letters = {\'a\', \'b\', \'c\', \'d\', \'e\', \'f\', \'g\', \'h\', \'i\', \'j\', \'k\', \'l\', \'m\', \'n\', \'o\', \'p\', \'q\', \'r\', \'s\', \'t\', \'u\', \'v\', \'w\', \'x\', \'y\', \'z\', \'A\', \'B\', \'C\', \'D\', \'E\', \'F\', \'G\', \'H\', \'I\', \'J\', \'K\', \'L\', \'M\', \'N\', \'O\', \'P\', \'Q\', \'R\', \'S\', \'T\', \'U\', \'V\', \'W\', \'X\', \'Y\', \'Z\'};\n   private Vector v;\n\n   public Cracker()\n   {\n      v = new Vector( 52);\n   }\n   public void loadLetters()\n   {\n      int i;\n\n      for( i = 0; i < letters.length; i++)\n      {\n\t String s = new StringBuffer().append( letters[i]).toString();\n         v.add( s);\n      }\n   }\n   public Vector getVictor()\n   {\n      return ;\n   }\n   public void loadPairs()\n   {\n      int i,j;\n\n      for( i = 0; i < letters.length - 1; i++)\n      {\n         for( j = i + 1; j < letters.length; j++)\n         {\n            String s1 = new StringBuffer().append( letters[i]).append( letters[j]).toString();\n\t    String s2 = new StringBuffer().append( letters[j]).append( letters[i]).toString();\n\t    v.add( s1);\n\t    v.add( s2);\n\t }\n      }\n      for( i = 0; i < letters.length; i++)\n      {\n         String s3 = new StringBuffer().append( letters[i]).append( letters[i]).toString();\n\t v.add( s3);\n      }\n   }\n   public void loadTriples()\n   {\n      int i, j, k;\n      \n      for( i = 0; i < letters.length; i++)\n      {\n         String s4 = new StringBuffer().append( letters[i]).append( letters[i]).append( letters[i]).toString();\n\t v.add( s4);\n      }\n      for( i = 0; i < letters.length - 1; i++)\n      {\n         for( j = i + 1; j < letters.length; j++)\n\t {\n\t    String s5 = new StringBuffer().append( letters[i]).append( letters[j]).append( letters[j]).toString();\n\t    String s6 = new StringBuffer().append( letters[j]).append( letters[i]).append( letters[j]).toString();\n\t    String s7 = 
new StringBuffer().append( letters[j]).append( letters[j]).append( letters[i]).toString();\n\t    String s8 = new StringBuffer().append( letters[j]).append( letters[i]).append( letters[i]).toString();\n\t    String s9 = new StringBuffer().append( letters[i]).append( letters[j]).append( letters[i]).toString();\n\t    String s10 = new StringBuffer().append( letters[i]).append( letters[i]).append( letters[j]).toString();\n\t    v.add( s5);\n\t    v.add( s6);\n\t    v.add( s7);\n\t    v.add( s8);\n\t    v.add( s9);\n\t    v.add( s10);\n\t }\n      }\n      for( i = 0; i < letters.length - 2; i++)\n      {\n         for( j = i + 1; j < letters.length - 1; j++)\n\t {\n\t    for( k = i + 2; k < letters.length; k++)\n\t    {\n\t       String s11 = new StringBuffer().append( letters[i]).append( letters[j]).append(letters[k]).toString();\n\t       String s12 = new StringBuffer().append( letters[i]).append( letters[k]).append(letters[j]).toString();\n\t       String s13 = new StringBuffer().append( letters[k]).append( letters[j]).append(letters[i]).toString();\n\t       String s14 = new StringBuffer().append( letters[k]).append( letters[i]).append(letters[j]).toString();\n\t       String s15 = new StringBuffer().append( letters[j]).append( letters[i]).append(letters[k]).toString();\n\t       String s16 = new StringBuffer().append( letters[j]).append( letters[k]).append(letters[i]).toString();\n\t       v.add( s11);\n\t       v.add( s12);\n\t       v.add( s13);\n\t       v.add( s14);\n\t       v.add( s15);\n\t       v.add( s16);\n\t    }\n\t }\n      }\n   }\n         \n   public static void main( String[] args)\n   {\n      Cracker cr = new Cracker();\n      cr.loadLetters();\n      cr.loadPairs();\n      cr.loadTriples();\n      System.out.println(" far "+cr.getVictor().size()+" elements loaded");\n   }\n}\n      \n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
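
The similarity call above returns one score per pair of inputs. As a rough illustration of what such a matrix contains, here is a pure-Python sketch that assumes the similarity function is cosine similarity (the library default unless a model is configured otherwise). The function names and the toy 2-dimensional vectors are hypothetical stand-ins for the real 768-dimensional embeddings.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarities, one row per input vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins: the second and third vectors are identical,
# mirroring the repeated Java snippet in the example input.
toy_embeddings = [[1.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
sims = similarity_matrix(toy_embeddings)
for row in sims:
    print(row)
```

Identical inputs score 1.0 against each other, which is why a duplicated snippet is a convenient sanity check when trying the model.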

Training Details

Training Dataset

soco_train_java

  • Dataset: soco_train_java at 44ca4ff
  • Size: 33,411 training samples
  • Columns: label, text_1, and text_2
  • Approximate statistics based on the first 1000 samples:
    • label (int): 0: ~99.80%, 1: ~0.20%
    • text_1 (string): min 51 tokens, mean 457.49 tokens, max 512 tokens
    • text_2 (string): min 512 tokens, mean 512.0 tokens, max 512 tokens
  • Samples:
    label text_1 text_2
    label: 0
    text_1:
    import java.net.*;
    import java.io.*;
    import java.util.*;


    public class Dictionary {

    public static void main(String args[])
    {
    int i,j,k;
    String pass = new String();
    String UserPass = new String();
    String status = new String();
    String status1 = new String();
    BasicAuth auth = new BasicAuth();
    URLConnection connect;
    int start,end,diff;
    try {
    URL url = new URL ("http://sec-crack.cs.rmit.edu./SEC/2/");



    start =System.currentTimeMillis();

    BufferedReader dis = new BufferedReader(new FileReader("words"));


    while ((pass = dis.readLine()) != null)
    {


    UserPass= auth.encode("",pass);

    connect = url.openConnection();
    connect.setDoInput(true);
    connect.setDoOutput(true);

    connect.setRequestProperty("Host","sec-crack.cs.rmit.edu.");
    connect.setRequestProperty("Get","/SEC/2/ HTTP/1.1");
    connect.setRequestProperty(...


    text_2:
    import java.*;
    import java.io.*;
    import java.util.*;

    public class BruteForce
    {
    public final static int TOTAL_TIMES=52*52*52;
    public char[] passwd;
    public static void main(String[] args) throws IOException
    {
    BruteForce bf=new BruteForce();
    System.out.println(" cracking...");
    time1=new Date().getTime();
    bf.doBruteForce(time1);
    time2=new Date().getTime();
    System.out.println("Finish cracking.");
    System.out.println(" password found.");
    System.out.println("costs "+(time2-time1)+" milliseconds");
    System.exit(1);
    }

    void doBruteForce(int time1) throws IOException
    {
    passwd=new char[3];
    Runtime rt=Runtime.getRuntime();
    num=0;
    for(int i=(int)'z';i>=(int)'A';i--)
    {
    if(i==96)
    i=90;
    passwd[0]=(char)i;
    for(int j=(int)'z';j>=(int)'A';j--)
    {
    if(j==96)
    j=90;
    passwd[1]=(char)j;
    for(int k=(int)'z';k>=(int)'A';k--)
    {
    if(k==96)
    k=90;
    passwd[2]=(char)k;
    String password=new String(passwd);
    try
    ...
    label: 0
    text_1:
    import java.util.*;


    public class Cracker
    {
    private char[] letters = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'};
    private Vector v;

    public Cracker()
    {
    v = new Vector( 52);
    }
    public void loadLetters()
    {
    int i;

    for( i = 0; i < letters.length; i++)
    {
    String s = new StringBuffer().append( letters[i]).toString();
    v.add( s);
    }
    }
    public Vector getVictor()
    {
    return ;
    }
    public void loadPairs()
    {
    int i,j;

    for( i = 0; i < letters.length - 1; i++)
    {
    for( j = i + 1; j < letters.length; j++)
    {
    String s1 = new StringBuffer().append( letters[i]).append( letters[j]).toString();
    String s2 = new StringBuffer().append( letters[j]).append( letters[i])....


    text_2:
    import java.*;
    import java.io.*;
    import java.util.*;

    public class BruteForce
    {
    public final static int TOTAL_TIMES=52*52*52;
    public char[] passwd;
    public static void main(String[] args) throws IOException
    {
    BruteForce bf=new BruteForce();
    System.out.println(" cracking...");
    time1=new Date().getTime();
    bf.doBruteForce(time1);
    time2=new Date().getTime();
    System.out.println("Finish cracking.");
    System.out.println(" password found.");
    System.out.println("costs "+(time2-time1)+" milliseconds");
    System.exit(1);
    }

    void doBruteForce(int time1) throws IOException
    {
    passwd=new char[3];
    Runtime rt=Runtime.getRuntime();
    num=0;
    for(int i=(int)'z';i>=(int)'A';i--)
    {
    if(i==96)
    i=90;
    passwd[0]=(char)i;
    for(int j=(int)'z';j>=(int)'A';j--)
    {
    if(j==96)
    j=90;
    passwd[1]=(char)j;
    for(int k=(int)'z';k>=(int)'A';k--)
    {
    if(k==96)
    k=90;
    passwd[2]=(char)k;
    String password=new String(passwd);
    try
    ...
    label: 0
    text_1:
    import java.io.*;
    import java.*;
    import java.util.StringTokenizer;

    public class Dictionary
    {
    public static void main(String args[])
    {
    final String DICT_FILE = "/usr/share/lib/dict/words";
    String basic_url = "http://sec-crack.cs.rmit.edu./SEC/2/";
    String password;
    String s = null;
    int num_tries = 0;

    try
    {

    BufferedReader dict_word = new BufferedReader
    (new FileReader (DICT_FILE));


    while((password = dict_word.readLine())!= null)
    {
    try
    {

    Process p = Runtime.getRuntime().exec("wget --http-user= --http-passwd=" + password + " " + basic_url);

    BufferedReader stdInput = new BufferedReader(new
    InputStreamReader(p.getInputStream()));

    BufferedReader stdError = new BufferedReader(new
    InputStreamReader(p.g...


    text_2:
    import java.*;
    import java.io.*;
    import java.util.*;

    public class BruteForce
    {
    public final static int TOTAL_TIMES=52*52*52;
    public char[] passwd;
    public static void main(String[] args) throws IOException
    {
    BruteForce bf=new BruteForce();
    System.out.println(" cracking...");
    time1=new Date().getTime();
    bf.doBruteForce(time1);
    time2=new Date().getTime();
    System.out.println("Finish cracking.");
    System.out.println(" password found.");
    System.out.println("costs "+(time2-time1)+" milliseconds");
    System.exit(1);
    }

    void doBruteForce(int time1) throws IOException
    {
    passwd=new char[3];
    Runtime rt=Runtime.getRuntime();
    num=0;
    for(int i=(int)'z';i>=(int)'A';i--)
    {
    if(i==96)
    i=90;
    passwd[0]=(char)i;
    for(int j=(int)'z';j>=(int)'A';j--)
    {
    if(j==96)
    j=90;
    passwd[1]=(char)j;
    for(int k=(int)'z';k>=(int)'A';k--)
    {
    if(k==96)
    k=90;
    passwd[2]=(char)k;
    String password=new String(passwd);
    try
    ...
  • Loss: SoftmaxLoss
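
To make the loss concrete, here is a small pure-Python sketch of how a softmax classification loss over a sentence pair can be computed. It assumes the default SoftmaxLoss configuration in Sentence Transformers, where the two embeddings u and v are concatenated with their element-wise absolute difference |u - v| before a linear classification layer. The weights, bias, and 2-dimensional embeddings below are placeholder values chosen for illustration, not parameters of this model.

```python
import math
from typing import List


def pair_features(u: List[float], v: List[float]) -> List[float]:
    """Concatenate u, v and |u - v| into one feature vector."""
    return u + v + [abs(a - b) for a, b in zip(u, v)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(u, v, weights, bias, label: int) -> float:
    """Cross-entropy of a linear classifier applied to the pair features."""
    feats = pair_features(u, v)
    logits = [sum(w * x for w, x in zip(row, feats)) + b
              for row, b in zip(weights, bias)]
    return -math.log(softmax(logits)[label])


# Placeholder embeddings and an untrained (all-zero) two-class classifier.
u = [0.5, -1.0]
v = [0.5, 1.0]
weights = [[0.0] * 6, [0.0] * 6]
bias = [0.0, 0.0]
loss = softmax_loss(u, v, weights, bias, label=0)
print(pair_features(u, v), loss)
```

With all-zero weights both classes get probability 0.5, so the loss equals ln 2; training would lower it by adjusting both the classifier and the encoder that produces u and v.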

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • num_train_epochs: 2

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
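
The listed values learning_rate: 5e-05, lr_scheduler_type: linear, and warmup_steps: 0 imply a learning rate that decays linearly from its initial value to zero over the run. The sketch below works through that arithmetic under the assumption of a single training device (the card does not state the device count); gradient_accumulation_steps: 1 and dataloader_drop_last: False are taken from the list above. The function name `linear_lr` is illustrative.

```python
import math

TRAIN_SAMPLES = 33_411   # dataset size reported in this card
BATCH_SIZE = 32          # per_device_train_batch_size
EPOCHS = 2               # num_train_epochs
BASE_LR = 5e-5           # learning_rate
WARMUP_STEPS = 0         # warmup_steps

# The last partial batch is kept (dataloader_drop_last: False), hence ceil.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - WARMUP_STEPS)


for step in (0, steps_per_epoch, total_steps):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved at the end of the first epoch and reaches zero at the final step.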

Training Logs

Epoch   Step  Training Loss
0.4785  500   0.0175
0.9569  1000  0.012
1.4354  1500  0.0098
1.9139  2000  0.0037
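
The Epoch column in the log can be reproduced from the step count, the dataset size, and the batch size. The short check below assumes a single training device with no gradient accumulation, so that one step consumes one batch of 32 samples; the variable names are illustrative.

```python
import math

TRAIN_SAMPLES = 33_411   # dataset size reported in this card
BATCH_SIZE = 32          # per_device_train_batch_size
LOGGED_STEPS = [500, 1000, 1500, 2000]

# One epoch is a full pass over the data, including the final partial batch.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)

# Fraction of epochs completed at each logged step, rounded as in the log.
epochs = [round(step / steps_per_epoch, 4) for step in LOGGED_STEPS]
print(steps_per_epoch, epochs)
```

The computed fractions match the logged Epoch values, which is consistent with the single-device assumption.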

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.8.0.dev20250319+cu128
  • Accelerate: 1.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers and SoftmaxLoss

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Model size: 125M params (Safetensors, tensor type F32)