SustcZhangYX committed 3bc2906 (parent: fbac713)

Add README

Files changed (2):
  1. LOGO.PNG +0 -0
  2. README.md +81 -1
LOGO.PNG ADDED
README.md CHANGED
@@ -6,4 +6,84 @@
language:
  - en
tags:
  - Environmental Science
---
<div align="center">
<img src="LOGO.PNG" width="600px">
<h1 align="center">EnvGPT: A Framework for Applying Large Language Models in Environmental Science</h1>

</div>

EnvGPT, built on LLaMA 3.1-8B-Instruct, is the first domain-specific large language model tailored for environmental science tasks.

*Environmental science presents unique challenges for LLMs due to its interdisciplinary nature. EnvGPT was developed to address these challenges by leveraging domain-specific instruction datasets and benchmarks.*

## 🚀 Getting Started

### Download the model

Download the model: [EnvGPT](https://huggingface.co/SustcZhangYX/EnvGPT)

```shell
git lfs install
git clone https://huggingface.co/SustcZhangYX/EnvGPT
```
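If you prefer a programmatic download over `git clone`, the `huggingface_hub` package's `snapshot_download` can fetch the same files. A minimal sketch, assuming `huggingface_hub` is installed (`pip install huggingface_hub`); `download_envgpt` is an illustrative helper, not part of the EnvGPT release:

```python
def download_envgpt(local_dir="EnvGPT"):
    """Fetch all files of the EnvGPT repository and return the local path."""
    # Imported lazily so this helper stays optional if you use git clone instead.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id="SustcZhangYX/EnvGPT", local_dir=local_dir)
```

The returned path can then be passed as the local model path in the usage example below.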

### Model Usage

Here is a Python snippet that loads the model with the `transformers` pipeline API and generates text using EnvGPT.

```python
import torch
import transformers

# Set the path to your local model
model_path = "YOUR_LOCAL_MODEL_PATH"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_path,  # local model path (or "SustcZhangYX/EnvGPT")
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are EnvGPT, an expert assistant in environmental science."},
    {"role": "user", "content": "What is the definition of environmental science?"},
]

# Pass sampling parameters directly in the pipeline call
outputs = pipeline(
    messages,
    max_new_tokens=512,
    do_sample=True,   # enable sampling so top_p/temperature take effect
    top_p=0.7,        # nucleus sampling
    temperature=0.9,  # temperature control
)

# With chat-style input, generated_text holds the full message list;
# the last entry is the newly generated assistant reply.
print(outputs[0]["generated_text"][-1]["content"])
```

This snippet loads the model from your local path, defines environmental-science-specific prompts, and generates responses using sampling techniques such as top-p and temperature.
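To make the two sampling knobs concrete, here is a toy, stdlib-only re-implementation of temperature scaling and top-p (nucleus) filtering on a hand-made score table. This is an illustration of the concepts only; the real sampling happens inside `generate()` in `transformers`, and the token names are made up:

```python
import math
import random

def sample_token(logits, temperature=0.9, top_p=0.7, rng=None):
    """Toy illustration of the temperature and top_p parameters used above.

    `logits` maps candidate tokens to raw scores.
    """
    rng = rng or random.Random(0)
    # Temperature: scale logits before softmax (lower -> sharper distribution).
    scaled = {tok: score / temperature for tok, score in logits.items()}
    z = sum(math.exp(v) for v in scaled.values())
    probs = sorted(
        ((tok, math.exp(v) / z) for tok, v in scaled.items()),
        key=lambda kv: kv[1],
        reverse=True,
    )
    # Top-p (nucleus): keep the smallest prefix whose probability mass >= top_p.
    kept, mass = [], 0.0
    for tok, p in probs:
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalise the kept tokens and draw one at random.
    total = sum(p for _, p in kept)
    r, acc = rng.random() * total, 0.0
    for tok, p in kept:
        acc += p
        if acc >= r:
            return tok
    return kept[-1][0]

print(sample_token({"atmosphere": 2.0, "hydrosphere": 1.0, "biosphere": 0.2}))
```

Lower `temperature` concentrates probability on the top token, while lower `top_p` shrinks the candidate set; the values above (0.9 and 0.7) trade diversity against focus.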

## 🌏 Acknowledgement

EnvGPT is fine-tuned from the open-source [LLaMA](https://huggingface.co/meta-llama) family. We thank Meta AI for their contributions to the community.

## ❗ Disclaimer

This project is intended solely for academic research and exploration. Please note that, like all large language models, this model may exhibit limitations, including potential inaccuracies or hallucinations in generated outputs.

## Limitations

- The model may produce hallucinated or inaccurate outputs, a limitation inherent to large language models.
- The model's identity has not been specifically optimized, so it may generate content that resembles outputs from other LLaMA-based models or similar architectures.
- Generated outputs can vary between attempts due to sampling and sensitivity to prompt phrasing and token context.
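The run-to-run variability noted above comes from sampling (`do_sample`, `top_p`, `temperature`). For repeatable experiments you can fix the random seed before generating; `transformers` provides `set_seed()` for this. A stdlib-only sketch of the underlying idea (the vocabulary here is invented for illustration):

```python
import random

def generate_with_seed(seed, n=5):
    # Stand-in for seeding a sampler: with the same seed, stochastic
    # draws are identical run to run (transformers.set_seed() does the
    # analogous job across Python, NumPy, and PyTorch RNGs).
    rng = random.Random(seed)
    vocab = ["air", "water", "soil", "climate"]
    return [rng.choice(vocab) for _ in range(n)]

first = generate_with_seed(42)
second = generate_with_seed(42)  # identical to `first`
```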

## 🚩 Citation

If you use EnvGPT in your research or applications, please cite this work as follows:

```markdown
[Placeholder for Citation]
Please refer to the forthcoming publication for details about EnvGPT.
This section will be updated with the citation once the paper is officially published.
```