xcjthu committed
Commit b68acaf · verified · 1 Parent(s): c9bac65

Update README.md

Files changed (1):
  1. README.md +8 -7
README.md CHANGED
@@ -20,11 +20,12 @@ library_name: transformers
 
 ## What's New
 - [2025.06.06] **MiniCPM4** series are released! This model achieves ultimate efficiency improvements while maintaining optimal performance at the same scale! It can achieve over 5x generation acceleration on typical end-side chips! You can find the technical report [here](https://github.com/OpenBMB/MiniCPM/tree/main/report/MiniCPM_4_Technical_Report.pdf).🔥🔥🔥
-- [2025.06.09] **MiniCPM4-8B-mlx** is available and you can run MiniCPM4 on your Apple devices! Thanks to [pzc163](https://huggingface.co/pzc163) for providing this converted model version and related usage instructions.
+- [2025.06.09] **MiniCPM4-8B-mlx** and **MiniCPM4-0.5B-mlx** are available and you can run MiniCPM4 on your Apple devices! Thanks to [pzc163](https://huggingface.co/pzc163) for providing these converted model versions and related usage instructions.
 
 ## MiniCPM4 Series
 MiniCPM4 series are highly efficient large language models (LLMs) designed explicitly for end-side devices, which achieve this efficiency through systematic innovation in four key dimensions: model architecture, training data, training algorithms, and inference systems.
 - [MiniCPM4-8B-mlx](https://huggingface.co/openbmb/MiniCPM4-8B-mlx): MiniCPM4-8B in mlx format, which can be used on Apple silicon. (**<-- you are here**)
+- [MiniCPM4-0.5B-mlx](https://huggingface.co/openbmb/MiniCPM4-0.5B-mlx): MiniCPM4-0.5B in mlx format, which can be used on Apple silicon.
 - [MiniCPM4-8B](https://huggingface.co/openbmb/MiniCPM4-8B): The flagship of MiniCPM4, with 8B parameters, trained on 8T tokens.
 - [MiniCPM4-0.5B](https://huggingface.co/openbmb/MiniCPM4-0.5B): The small version of MiniCPM4, with 0.5B parameters, trained on 1T tokens.
 - [MiniCPM4-8B-Eagle-FRSpec](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-FRSpec): Eagle head for FRSpec, accelerating speculative inference for MiniCPM4-8B.
@@ -56,19 +57,19 @@ MiniCPM 4 is an extremely efficient edge-side large model that has undergone eff
 - ArkInfer -- Cross-platform Deployment System: Supports efficient deployment across multiple backend environments, providing flexible cross-platform adaptation capabilities
 
 
-## How to Run MiniCPM-8B-mlx
+## How to Run MiniCPM4-8B-mlx
 
-Here is a guide on how to run the `MiniCPM-8B-mlx` model from the command line using `mlx-lm`. You can use mlx-lm to interact with the `MiniCPM-8B-mlx` model directly from your command line. This is a powerful tool that allows you to quickly test and use LLMs in the MLX format.
+Here is a guide on how to run the `MiniCPM4-8B-mlx` model from the command line with `mlx-lm`, a tool that lets you quickly test and use LLMs in the MLX format.
 
 ### Basic Usage
-Here is a specific example. This command will load the `openbmb/MiniCPM-8B-mlx` model and generate text based on the prompt you provide: "hello, pls tell me which one is the most powerful LLM in the World".
+Here is a specific example. This command will load the `openbmb/MiniCPM4-8B-mlx` model and generate text based on the prompt you provide: "hello, pls tell me which one is the most powerful LLM in the World".
 ```Bash
-mlx_lm.generate --model openbmb/MiniCPM-8B-mlx --prompt "hello, pls tell me which one is the most powerful LLM in the World"
+mlx_lm.generate --model openbmb/MiniCPM4-8B-mlx --prompt "hello, pls tell me which one is the most powerful LLM in the World"
 ```
 
 ### MLX-LM Command Line Parameters
 - `mlx_lm.generate`: This is the primary command in the mlx-lm toolkit used for text generation.
-- `--model openbmb/MiniCPM-8B-mlx`: This parameter specifies the model to be loaded. `openbmb/MiniCPM-8B-mlx` is the model's identifier on the Hugging Face Hub. mlx-lm will automatically download and cache the model from there.
+- `--model openbmb/MiniCPM4-8B-mlx`: This parameter specifies the model to be loaded. `openbmb/MiniCPM4-8B-mlx` is the model's identifier on the Hugging Face Hub. mlx-lm will automatically download and cache the model from there.
 - `--prompt "..."`: This parameter is used to provide the initial text that you want the model to respond to or complete.
 - `--max-tokens`: Sets the maximum number of tokens to generate. For example, `--max-tokens 200` will limit the output to 200 tokens.
 - `--temp`: Controls the randomness of the output. Higher temperature values (like 0.8) will produce more diverse and creative outputs, while lower values (like 0.2) will make the output more deterministic and focused. The default value is usually 0.6.
@@ -78,7 +79,7 @@ mlx_lm.generate --model openbmb/MiniCPM-8B-mlx --prompt "hello, pls tell me whic
 
 The following command will use a higher temperature value and limit the output length:
 ```bash
-mlx_lm.generate --model openbmb/MiniCPM-8B-mlx \
+mlx_lm.generate --model openbmb/MiniCPM4-8B-mlx \
 --prompt "tell me a story about a robot who discovered music" \
 --max-tokens 500 \
 --temp 0.8
 
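For reference, the `mlx_lm.*` commands in the updated README assume the mlx-lm toolkit is already installed. The sketch below shows that prerequisite together with the basic example, plus the same invocation for the newly added 0.5B conversion; the `pip` package name (`mlx-lm`) and the chosen `--max-tokens` value are assumptions based on common mlx-lm usage rather than anything stated in this diff.

```bash
# Assumed prerequisite (not stated in this diff): install the MLX LM toolkit,
# which provides the `mlx_lm.generate` command used above (Apple silicon only).
pip install -U mlx-lm

# Basic example from the README: the weights are downloaded from the
# Hugging Face Hub and cached locally on first use.
mlx_lm.generate --model openbmb/MiniCPM4-8B-mlx \
  --prompt "hello, pls tell me which one is the most powerful LLM in the World" \
  --max-tokens 256

# The newly added 0.5B conversion runs the same way; only the repo id changes.
mlx_lm.generate --model openbmb/MiniCPM4-0.5B-mlx \
  --prompt "hello, pls tell me which one is the most powerful LLM in the World" \
  --max-tokens 256
```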