---
license: mit
language:
- en
metrics:
- f1
tags:
- medical
---
# Introduction

MentaLLaMA-33B-lora is part of the [MentaLLaMA](https://github.com/SteveKGYang/MentalLLaMA) project, the first open-source large language model (LLM) series for
interpretable mental health analysis with instruction-following capability. This model is fine-tuned from the Vicuna-33B foundation model on the full IMHI instruction tuning data,
using LoRA due to limited computational resources.
The model is expected to perform complex mental health analysis for various mental health conditions and give reliable explanations for each of its predictions.
It is fine-tuned on the IMHI dataset with 75K high-quality natural language instructions to boost its performance in downstream tasks.
We perform a comprehensive evaluation on the IMHI benchmark with 20K test samples. The results show that MentaLLaMA approaches state-of-the-art discriminative
methods in correctness and generates high-quality explanations.

# Ethical Consideration

Although experiments on MentaLLaMA show promising performance on interpretable mental health analysis, we stress that
all predicted results and generated explanations should only be used
for non-clinical research, and help-seekers should get assistance
from professional psychiatrists or clinical practitioners. In addition,
recent studies have indicated that LLMs may introduce potential
biases, such as gender gaps. Meanwhile, incorrect prediction results, inappropriate explanations, and over-generalization
also illustrate the potential risks of current LLMs. Therefore, there
are still many challenges in applying the model to real-world
mental health monitoring systems.

## Other Models in MentaLLaMA

In addition to MentaLLaMA-33B-lora, the MentaLLaMA project includes other models: MentaLLaMA-chat-13B, MentaLLaMA-chat-7B, MentalBART, and MentalT5.

- **MentaLLaMA-chat-13B**: This model is fine-tuned based on the Meta LLaMA2-chat-13B foundation model and the full IMHI instruction tuning data. The training data covers
10 mental health analysis tasks.

- **MentaLLaMA-chat-7B**: This model is fine-tuned based on the Meta LLaMA2-chat-7B foundation model and the full IMHI instruction tuning data. The training data covers
10 mental health analysis tasks.

- **MentalBART**: This model is fine-tuned based on the BART-large foundation model and the full IMHI-completion data. The training data covers
10 mental health analysis tasks. This model doesn't have instruction-following ability but is more lightweight and performs well in interpretable mental health analysis
in a completion-based manner (a usage sketch follows this list).

- **MentalT5**: This model is fine-tuned based on the T5-large foundation model and the full IMHI-completion data. The training data covers
10 mental health analysis tasks. This model doesn't have instruction-following ability but is more lightweight and performs well in interpretable mental health analysis
in a completion-based manner.

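For contrast with the instruction-following models, a completion-based model such as MentalBART or MentalT5 is queried by letting it complete the input text directly rather than follow a natural-language instruction. The sketch below is illustrative only: it uses the standard Hugging Face seq2seq API, the checkpoint path is a placeholder rather than an official repository id, and the post is a made-up example.

```python
from transformers import BartTokenizer, BartForConditionalGeneration

# Placeholder path: substitute the location of the released MentalBART checkpoint.
checkpoint = './MentalBART'
tokenizer = BartTokenizer.from_pretrained(checkpoint)
model = BartForConditionalGeneration.from_pretrained(checkpoint)

# Completion-based query: the post is passed directly, without an instruction wrapper.
post = "I feel exhausted all the time and I have stopped talking to my friends."
inputs = tokenizer(post, return_tensors='pt')
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
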
## Usage

You can use the MentaLLaMA-33B-lora model in your Python project with the Hugging Face Transformers library. A simple example of how to load the model is shown below.

Since our model is based on the Vicuna-33B foundation model, you need to first download the Vicuna-33B model [here](https://huggingface.co/lmsys/vicuna-33b-v1.3)
and put it under the `./Vicuna-33B` directory.

```python
from transformers import LlamaTokenizer, LlamaForCausalLM

# Load the Vicuna-33B foundation model from the local ./Vicuna-33B directory.
tokenizer = LlamaTokenizer.from_pretrained('./Vicuna-33B')
model = LlamaForCausalLM.from_pretrained('./Vicuna-33B', device_map='auto')
```

In this example, `LlamaTokenizer` is used to load the tokenizer and `LlamaForCausalLM` is used to load the Vicuna-33B base model. The `device_map='auto'` argument automatically places the model on a GPU if one is available.
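Since this model was tuned with LoRA, the released weights are typically applied as an adapter on top of the Vicuna-33B base model loaded above. The following is a minimal sketch assuming the adapter is in PEFT format; the adapter repository id `klyang/MentaLLaMA-33B-lora` and the example instruction are illustrative assumptions, not an official loading script or IMHI prompt template.

```python
from peft import PeftModel  # assumes the LoRA adapter is released in PEFT format

# Apply the MentaLLaMA-33B-lora adapter to the Vicuna-33B base model loaded above
# ('klyang/MentaLLaMA-33B-lora' is assumed to be this repository's id).
model = PeftModel.from_pretrained(model, 'klyang/MentaLLaMA-33B-lora')
model.eval()

# Illustrative instruction-style query (not an official IMHI prompt template).
prompt = ("Consider this post: \"I can't sleep and nothing feels worth doing anymore.\" "
          "Question: Does the poster suffer from depression? Explain your reasoning.")
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the repository instead contains fully merged weights, the model can be loaded directly with `LlamaForCausalLM.from_pretrained`, as in the snippet above.
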
## License

MentaLLaMA-33B-lora is licensed under the MIT License. For more details, please see the MIT License.

## Citation

If you use MentaLLaMA-33B-lora in your work, please cite our paper:

```bibtex
@misc{yang2023mentalllama,
      title={MentalLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models},
      author={Kailai Yang and Tianlin Zhang and Ziyan Kuang and Qianqian Xie and Sophia Ananiadou},
      year={2023},
      eprint={2309.13567},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```