ainow-mk committed
Commit d8d0d92 · verified · 1 Parent(s): 77267ac

Update README.md

Files changed (1)
  1. README.md +75 -3
README.md CHANGED
@@ -1,3 +1,75 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ README.md for Hugging Face - MK-LLM-Mistral
+ This README will help contributors, developers, and AI enthusiasts understand the MK-LLM-Mistral project.
+
+ 🚀 MK-LLM-Mistral: The First Macedonian LLM
+ 📢 MK-LLM-Mistral is the first Macedonian-language large language model 🇲🇰, developed by AI Now - Association for Artificial Intelligence in Macedonia.
+
+ 🔗 Website: www.ainow.mk
+ 📩 Contact: [email protected]
+ 🛠 GitHub Repository: MK-LLM Project
+
+ 📌 Model Overview
+ Model Name: MK-LLM-Mistral
+ Base Model: Mistral-7B
+ Language: Macedonian 🇲🇰
+ Fine-tuned on: Wikipedia, news articles, legal documents, and public datasets in Macedonian
+ Tasks: Chatbot, Text Completion, Q&A, Macedonian NLP tasks
+ 📌 How to Use the Model Locally
+ 1️⃣ Install Required Libraries
+ pip install transformers torch huggingface_hub
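After installing, it can help to confirm that the three packages are importable before trying to load a 7B model. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the project.

```python
from importlib.util import find_spec

# Import names of the packages installed in step 1.
REQUIRED = ("transformers", "torch", "huggingface_hub")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

An empty list means the environment is ready for the loading step.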
+ 2️⃣ Load the Model in Python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Load Model
+ MODEL_NAME = "ainowmk/MK-LLM-Mistral"
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+ model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
+
+ # Move model to GPU if available
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ model.to(device)
+
+ # Test the Model
+ input_text = "Здраво, како си?"  # Macedonian for "Hello, how are you?"
+ inputs = tokenizer(input_text, return_tensors="pt").to(device)
+ outputs = model.generate(**inputs, max_length=100)
+
+ # Decode and print the result
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
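A 7B-parameter model loaded in the default float32 precision needs on the order of 28 GB for the weights alone (7 billion parameters × 4 bytes), so loading in half precision on a GPU is a common variation. The sketch below is one possible adaptation of the snippet above, not the project's official recipe. The function names `pick_device_and_dtype` and `generate_reply` are hypothetical, and `max_new_tokens=100` is an arbitrary choice. Unlike `max_length`, `max_new_tokens` does not count the prompt tokens toward the limit.

```python
def pick_device_and_dtype(cuda_available: bool):
    """Return (device, dtype name): half precision on GPU, full precision on CPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def generate_reply(prompt: str,
                   model_name: str = "ainowmk/MK-LLM-Mistral",
                   max_new_tokens: int = 100) -> str:
    """Load the model and generate a continuation for `prompt`."""
    # Heavy imports stay inside the function so the helper above
    # can be used without torch or transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=getattr(torch, dtype_name)
    )
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Only shows the device choice; calling generate_reply would download the weights.
    print(pick_device_and_dtype(cuda_available=False))
```

Calling `generate_reply("Здраво, како си?")` downloads the weights on first use, so it needs network access and enough memory for the model.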
+ 📌 Model Files
+ | File Name | Description |
+ | --- | --- |
+ | pytorch_model.bin | The fine-tuned weights of the model |
+ | config.json | Configuration for the model architecture |
+ | tokenizer.json | Tokenizer used for the Macedonian language |
+ | README.md | Documentation for the model |
+ | .gitattributes | Git LFS tracking for large files |
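To confirm that a local copy of the repository contains the files listed above, a quick check like the following can be used. This is only a sketch: `missing_model_files` is a hypothetical helper, and the directory path is whatever location the files were downloaded to.

```python
from pathlib import Path

# File names taken from the Model Files list above.
EXPECTED_FILES = (
    "pytorch_model.bin",
    "config.json",
    "tokenizer.json",
    "README.md",
    ".gitattributes",
)


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # "." is a placeholder; point this at the downloaded model directory.
    print(missing_model_files("."))
```

An empty result means every listed file is present. It does not verify that the weights themselves are complete or uncorrupted.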
+ 📌 Training Details
+ Dataset: Collected Macedonian texts (Wikipedia, news, government websites)
+ Training Compute: GPU-based training on NVIDIA A100
+ Training Time: Estimated XX hours
+ Fine-tuned using: Hugging Face Transformers & PyTorch
+ 📌 Contributing
+ MK-LLM-Mistral is an open-source project, and contributions are welcome! 🎯
+
+ Open issues on GitHub
+ Submit pull requests for improvements
+ Join discussions in the Hugging Face Community
+ 💡 If you want to help with data collection, fine-tuning, or evaluation, reach out at [email protected]
+
+ 📌 License
+ This model is licensed under Apache 2.0.
+ You are free to use, distribute, and modify it, provided you keep the license text and attribution notices.
+
+ 🚀 Let's build the future of Macedonian AI together! 🇲🇰
+ 👉 AI Now - Association for Artificial Intelligence in Macedonia
+ 📩 [email protected] | 🔗 www.ainow.mk