Rawkney committed · Commit 65bb48e · verified · 1 Parent(s): 0f8497f

Update model card

Files changed (1)
  1. README.md +33 -25
README.md CHANGED
@@ -1,51 +1,59 @@
  ---
  language: en
- license: apache-2.0
  tags:
- - mathematics
- - chain-of-thought
- - question-answering
  ---

- # KnullAI v2 - Fine-tuned on GAIR/o1-journey

- This model is a fine-tuned version of KnullAI v2, specifically trained on mathematical problem-solving using the GAIR/o1-journey dataset.

  ## Training Data
- The model was fine-tuned on the GAIR/o1-journey dataset, which contains:
- - Mathematical questions
- - Detailed answers
- - Step-by-step explanations (Chain of Thought)

  ## Usage
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  # Load model and tokenizer
- model = AutoModelForCausalLM.from_pretrained("Rawkney/knullAi_v2")
- tokenizer = AutoTokenizer.from_pretrained("Rawkney/knullAi_v2")

  # Example usage
- question = "What is the area of a triangle with vertices at (0,0), (3,0), and (0,4)?"
- input_text = f"Question: {question}\nAnswer:"
- inputs = tokenizer(input_text, return_tensors="pt")

  outputs = model.generate(
  inputs["input_ids"],
- max_length=512,
  temperature=0.7,
- top_p=0.9
  )

  response = tokenizer.decode(outputs[0], skip_special_tokens=True)
  print(response)
  ```

- ## Training Procedure
- - Fine-tuned using the Transformers library
- - Training parameters:
- - Learning rate: 2e-5
- - Epochs: 3
- - Batch size: 2
- - Gradient accumulation steps: 4
- - Mixed precision training (fp16)

  ---
  language: en
  tags:
+ - arxiv
+ - research-papers
+ - text-generation
+ license: apache-2.0
  ---

+ # KnullAI v2 - Fine-tuned on ArXiver Dataset

+ This model is a fine-tuned version of KnullAI v2, specifically trained on the ArXiver dataset containing research paper information.

  ## Training Data
+ The model was fine-tuned on the neuralwork/arxiver dataset, which contains:
+ - Paper titles
+ - Abstracts
+ - Authors
+ - Publication dates
+ - Links
+
+ ## Model Details
+ - Base model: Rawkney/knullAi_v2
+ - Training type: Causal language modeling
+ - Hardware: T4 GPU
+ - Mixed precision: FP16

  ## Usage
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  # Load model and tokenizer
+ model = AutoModelForCausalLM.from_pretrained("YOUR_REPO_ID")
+ tokenizer = AutoTokenizer.from_pretrained("YOUR_REPO_ID")

  # Example usage
+ title = "Your paper title"
+ input_text = f"Title: {title}\nAbstract:"
+ inputs = tokenizer(input_text, return_tensors="pt").to("cuda")

  outputs = model.generate(
  inputs["input_ids"],
+ max_length=256,
  temperature=0.7,
+ top_p=0.9,
+ pad_token_id=tokenizer.eos_token_id
  )

  response = tokenizer.decode(outputs[0], skip_special_tokens=True)
  print(response)
  ```

+ ## Training Parameters
+ - Learning rate: 1e-5
+ - Epochs: 1
+ - Batch size: 1
+ - Gradient accumulation steps: 16
+ - Mixed precision training (fp16)
+ - Max sequence length: 512
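
The updated card lists the dataset and hyperparameters, but the commit does not include the training script itself. Below is a minimal sketch of a matching setup with the Hugging Face `Trainer`, assuming the neuralwork/arxiver columns `title` and `abstract` and the `Title:`/`Abstract:` prompt format from the card's usage snippet; it illustrates the stated configuration and is not the author's actual code.

```python
# Hypothetical reconstruction of the fine-tuning run described in the card.
# Assumptions: arxiver exposes "title" and "abstract" columns, and training text
# follows the "Title: ...\nAbstract: ..." format used in the card's usage example.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "Rawkney/knullAi_v2"  # base model named under "Model Details"

tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # causal LM tokenizers often lack a pad token

model = AutoModelForCausalLM.from_pretrained(base_model)

dataset = load_dataset("neuralwork/arxiver", split="train")

def to_text(example):
    # Mirror the prompt format shown in the card's usage section
    return {"text": f"Title: {example['title']}\nAbstract: {example['abstract']}"}

def tokenize(batch):
    # Max sequence length: 512, as listed under "Training Parameters"
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = (
    dataset.map(to_text)
           .map(tokenize, batched=True, remove_columns=dataset.column_names + ["text"])
)

args = TrainingArguments(
    output_dir="knullai-v2-arxiver",
    learning_rate=1e-5,              # Learning rate: 1e-5
    num_train_epochs=1,              # Epochs: 1
    per_device_train_batch_size=1,   # Batch size: 1
    gradient_accumulation_steps=16,  # Gradient accumulation steps: 16
    fp16=True,                       # Mixed precision (fp16), T4 GPU
    logging_steps=50,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

When running the card's usage snippet, the model must sit on the same device as the inputs (e.g. `model.to("cuda")` alongside the `.to("cuda")` applied to the tokenized inputs), and `temperature`/`top_p` only take effect when `generate` is called with `do_sample=True`.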