commitGen / README.md
metadata
datasets:
  - bigcode/commitpackft
language:
  - en
base_model:
  - Qwen/Qwen2.5-Coder-1.5B-Instruct
  • Developed by: seniruk
  • License: apache-2.0
  • Fine-tuned from model: unsloth/Qwen2.5-Coder-1.5B-Instruct

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.


base_model: unsloth/Qwen2.5-Coder-1.5B-Instruct
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen2
  - trl
license: apache-2.0
language:
  - en



Purpose

Generates high-quality commit messages from a given git diff.

Model Description

  • Produced by fine-tuning Qwen2.5-Coder-1.5B-Instruct on the bigcode/commitpackft dataset for 2 epochs.
  • Trained on a total of 277 languages.
  • Final training loss ranged from 1 to 1.7; the spread arises because the dataset does not contain an equal number of rows for each language.
  • For common languages (Python, Java, JavaScript, C, etc.), the loss reached a minimum of 1.0335.

Environmental Impact

  • Hardware Type: GeForce RTX 4060 Ti (16 GB)
  • Hours used: 10 hours
  • Cloud Provider: local

Results

[Figures: two result images]

Inference input format (mainly relevant when calling the model through an API)

<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
{instructions}
{git_diff}<|im_end|>
<|im_start|>assistant

The model then generates the rest of the sequence: {assistant output}<|im_end|>
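The prompt format above can be assembled programmatically. The following is a minimal sketch rather than the author's own tooling: the helper names `build_prompt` and `extract_message`, and the example instruction text, are hypothetical and chosen only for illustration. It assumes the instructions and the diff are separated by a single newline, as the template suggests.

```python
# Minimal sketch: build the ChatML-style prompt shown above and trim the reply.
# Function names and the example instruction are illustrative assumptions.

DEFAULT_SYSTEM = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def build_prompt(instructions: str, git_diff: str, system: str = DEFAULT_SYSTEM) -> str:
    """Return the raw prompt string, ending where the assistant turn begins."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{instructions}\n{git_diff}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


def extract_message(completion: str) -> str:
    """Keep only the text generated before the first <|im_end|> marker."""
    return completion.split("<|im_end|>", 1)[0].strip()


if __name__ == "__main__":
    # Hypothetical instruction and a tiny diff, for demonstration only.
    diff = (
        "diff --git a/app.py b/app.py\n"
        "-print('helo')\n"
        "+print('hello')"
    )
    prompt = build_prompt("Write a commit message for the following diff.", diff)
    print(prompt)
    # A raw completion from the model would then be cleaned up like this:
    print(extract_message("Fix typo in greeting<|im_end|>"))
```

When the model is loaded through the transformers library, the tokenizer's `apply_chat_template` method with `add_generation_prompt=True` should yield an equivalent string from a list of system and user messages, so building the string by hand is mostly useful for raw text-completion endpoints.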