---
license: bsd
---

INT8/INT4 model files converted from [baichuan2-13b-chat](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat) for use with [fastllm](https://github.com/ztxz16/fastllm).

Download directly from Baidu Netdisk:

Link: https://pan.baidu.com/s/1Xsiif_1VzDyWFei1u5oJcA

Extraction code: wxbo

Last updated: 2023/09/11
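
If you prefer to convert the checkpoint yourself instead of downloading, quantized `.flm` files can be produced with fastllm's export API. A minimal sketch, assuming the `llm.from_hf()`/`save()` interface described in the fastllm README (not necessarily the exact commands used for the files above):

```python
# Sketch: convert the original HF checkpoint to a quantized fastllm .flm file.
# llm.from_hf()/save() and the dtype values follow the fastllm README;
# verify against the fastllm version you have installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import llm

hf_repo = "baichuan-inc/Baichuan2-13B-Chat"
tokenizer = AutoTokenizer.from_pretrained(hf_repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    hf_repo, torch_dtype=torch.float16, trust_remote_code=True
)

# dtype may be "float16", "int8" or "int4"
flm_model = llm.from_hf(model, tokenizer, dtype="int8")
flm_model.save("baichuan2-13b-chat-int8.flm")
```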

GPU memory usage with `baichuan2-13b-chat-int8.flm` loaded, as reported by `nvidia-smi`:

```
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:05:00.0 Off |                  N/A |
| 31%   36C    P8    28W / 250W |  15420MiB / 22528MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```

Example usage with fastllm:

```python
from fastllm_pytools import llm

# Load the quantized fastllm model file
model = llm.model("baichuan2-13b-chat-int8.flm")

# Stream the reply; the prompt means "Introduce Nanjing"
for response in model.stream_response("介绍一下南京"):
    print(response, flush=True, end="")
```

> Note: please use the latest version of fastllm (main branch from 2023/09/11 or later).
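
For a single non-streaming reply, fastllm's Python bindings also provide a blocking call; a minimal sketch, assuming the `model.response()` method from the fastllm README:

```python
from fastllm_pytools import llm

model = llm.model("baichuan2-13b-chat-int8.flm")

# Blocking call: returns the whole reply at once instead of streaming it.
# The prompt means "Introduce Nanjing".
reply = model.response("介绍一下南京")
print(reply)
```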