Transformers
GGUF
Marathi
English
conversational
amitagh commited on
Commit
9ca6b24
·
1 Parent(s): 944b4fe
Files changed (1) hide show
  1. README.md +109 -0
README.md ADDED
@@ -0,0 +1,109 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ library_name: transformers
3
+ license: llama3
4
+ datasets:
5
+ - smallstepai/marathi-instruction-tuning-alpaca
6
+ - ai4bharat/indic-align
7
+ language:
8
+ - mr
9
+ - en
10
+ ---
11
+
12
+ # Model Card for Model ID
13
+
14
+ <!-- -->
15
+
16
+
17
+
18
+ ## Model Details
19
+ Shivneri Marathi LLM is being built with the wish to bring the benefits of Generative AI to non-English (especially Marathi) speaking population of India.
20
+ Marathi has the third largest number of native speakers in India, after Hindi and Bengali.
21
+ Almost 83 million people speak the language.
22
+ This is a preliminary version of our Marathi LLM (Large Language Model)!
23
+ Built on the mighty Gemma 7B base model, Shivneri LLM can generate creative and informative text in both Marathi and English. This is just the beginning – we're constantly improving Shivneri, and even more exciting features are on the horizon!
24
+
25
+
26
+ ### Model Description
27
+
28
+ <!-- -->
29
+
30
+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
31
+
32
+ - **Developed by:** Amit Ghadge
33
+ - **Funded by [optional]:** [More Information Needed]
34
+ - **Shared by [optional]:** [Amit Ghadge]
35
+ - **Model type:** [ Decoder-only large language model (LLM) with a transformer architecture]
36
+ - **Language(s) (NLP):** [Marathi, English]
37
+ - **License:** [More Information Needed]
38
+ - **Finetuned from model [optional]:** [Meta-Llama-3-8B-Instruct]
39
+
40
+ ### Model Sources [optional]
41
+
42
+ <!-- Provide the basic links for the model. -->
43
+
44
+ - **Repository:** [https://github.com/amitagh/shivneri-llm]
45
+ - **Paper [optional]:** [https://medium.com/@amitagh/shivneri-marathi-llm-e823f0a045d8]
46
+ - **Demo [optional]:** [Coming soon]
47
+
48
+ ## Uses
49
+
50
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
51
+ This is a very preliminary version. Please use with caution. Would suggest to more updates and final models to try out.
52
+
53
+
54
+ ## Training Details
55
+
56
+ ### Training Data
57
+
58
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
59
+
60
+ [SFT with Lora on mentioned datasets above]
61
+
62
+ ### Training Procedure
63
+
64
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
65
+ SFT with Lora
66
+
67
+
68
+
69
+
70
+ ### Model Architecture and Objective
71
+
72
+ [ Decoder-only large language model (LLM) with a transformer architecture]
73
+
74
+ ### Compute Infrastructure
75
+
76
+ [A100 80 GB]
77
+
78
+ ## Meet the Developers
79
+
80
+ Get to know the creators behind this innovative model and follow their contributions to the field:
81
+
82
+ - [Amit Ghadge](https://www.linkedin.com/in/amit-ghadge-a162a115/)
83
+
84
+ ## Model Release Date May 1st, 2024.
85
+
86
+ Status This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.
87
+
88
+ ## License
89
+ The model inherits the license from meta-llama3.
90
+
91
+ ## How to use
92
+ Use pretty much remains the same as original Meta-Llama-3-8B-Instruct model. Visit its page for more details.
93
+ With this model you can now use Marathi prompts and build conversational apps using it.
94
+
95
+ ## Citation [optional]
96
+
97
+ If you use this model in your research, please cite:
98
+
99
+ ```bibtex
100
+ @misc{amitghadge2024ShivneriLLMv01,
101
+ title={Shivneri-LLM: Your Bilingual Marathi and English Text Generation LLM},
102
+ author={Amit Ghadge},
103
+ year={2024},
104
+ eprint={https://medium.com/@amitagh/shivneri-marathi-llm-e823f0a045d8},
105
+
106
+ }
107
+ ```
108
+
109
+ We hope this model serves as a valuable tool in your NLP toolkit and look forward to seeing the advancements it will enable in the understanding and generation of the Marathi language.