Improve model card

This PR improves the model card by adding relevant metadata, including the pipeline tag, library name, and license. It also adds a link to the paper and the GitHub repository, as well as a basic code snippet demonstrating how to use the model.
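
As a rough sanity check of the metadata this PR adds, the sketch below reads the YAML front matter of a README and confirms the three fields are present. It uses only the standard library and a deliberately simple parser for flat `key: value` pairs. The helper name `read_front_matter` and the inline sample are illustrative assumptions, not part of the repository; a real check would more likely use a proper YAML library.

```python
def read_front_matter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading `---` delimited block.

    Hypothetical helper: handles only simple scalar pairs, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter found: treat as no front matter


# Sample mirrors the front matter added in this PR.
sample = """---
library_name: transformers
license: mit
tags: []
pipeline_tag: text-generation
---
# Model Card for Model ID
"""

meta = read_front_matter(sample)
missing = [k for k in ("library_name", "license", "pipeline_tag") if k not in meta]
print(missing or "all required metadata present")
```

The parser returns an empty dict when the front matter is absent or unterminated, so a missing block shows up as all three fields missing rather than as an exception.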

Files changed (1)

README.md +44 -20

@@ -1,13 +1,15 @@
 ---
 library_name: transformers
 tags: []
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
@@ -17,21 +19,21 @@ tags: []
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
 <!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 ## Uses
@@ -41,19 +43,19 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
 ### Downstream Use [optional]
 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
 ### Out-of-Scope Use
 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
 ## Bias, Risks, and Limitations
@@ -71,7 +73,20 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 Use the code below to get started with the model.
-[More Information Needed]
 ## Training Details
@@ -79,7 +94,7 @@ Use the code below to get started with the model.
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
 ### Training Procedure
@@ -92,7 +107,7 @@ Use the code below to get started with the model.
 #### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 #### Speeds, Sizes, Times [optional]
@@ -174,7 +189,15 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 **BibTeX:**
-[More Information Needed]
 **APA:**
@@ -196,4 +219,5 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 ## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+license: mit
 tags: []
+pipeline_tag: text-generation
 ---
+```markdown
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
+Sky-T1-32B-Preview is a 32B-parameter model described in the paper [LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!](https://hf.co/papers/2502.07374).
 ## Model Details
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+- **Developed by:** NovaSky Team
+- **Funded by [optional]:** Berkeley Sky Computing Lab
+- **Shared by [optional]:** NovaSky-AI
+- **Model type:** Qwen2ForCausalLM
+- **Language(s) (NLP):** English
+- **License:** MIT
+- **Finetuned from model [optional]:** Qwen2.5-32B-Instruct
 ### Model Sources [optional]
 <!-- Provide the basic links for the model. -->
+- **Repository:** https://github.com/NovaSky-AI/SkyThought
+- **Paper [optional]:** [LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!](https://hf.co/papers/2502.07374)
+- **Demo [optional]:** http://164.152.23.196:3000/
 ## Uses
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+Text generation, reasoning
 ### Downstream Use [optional]
 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+Text generation, reasoning
 ### Out-of-Scope Use
 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+Malicious uses
 ## Bias, Risks, and Limitations
 Use the code below to get started with the model.
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = "NovaSky-AI/Sky-T1-32B-Preview"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+# device_map="auto" lets accelerate place the weights on the available devices
+model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+prompt = "This is a test prompt"
+# Move inputs to the model's device rather than hard-coding "cuda"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+generated_ids = model.generate(**inputs)
+decoded_output = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
+print(decoded_output[0])
+```
 ## Training Details
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+https://huggingface.co/datasets/NovaSky-AI/Sky-T1_data_17k
 ### Training Procedure
 #### Training Hyperparameters
+- **Training regime:** bf16 mixed precision <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 #### Speeds, Sizes, Times [optional]
 **BibTeX:**
+```
+@misc{sky_t1_2025,
+  author       = {NovaSky Team},
+  title        = {Sky-T1: Train your own O1 preview model within $450},
+  howpublished = {https://novasky-ai.github.io/posts/sky-t1},
+  note         = {Accessed: 2025-01-09},
+  year         = {2025}
+}
+```
 **APA:**
 ## Model Card Contact
+[More Information Needed]
+```