vincentoh
/

llama3-70b-GGUF

Model card Files Files and versions Community

llama3-70b-GGUF / README.md

vincentoh's picture

Update README.md

3cf3b2b verified 9 months ago

|

history blame contribute delete

2.8 kB

	```

	# code
	https://huggingface.co/vincentoh/llama3_70b_no_robot_fsdp_qlora


	# model
	wget "https://huggingface.co/vincentoh/llama3-70b-GGUF/blob/main/vincentoh/llama3-70b-GGUF"

	# memory usage
	Thu May 16 15:53:07 2024
	+---------------------------------------------------------------------------------------+
	\| NVIDIA-SMI 535.129.03 Driver Version: 535.129.03 CUDA Version: 12.2 \|
	\|-----------------------------------------+----------------------+----------------------+
	\| GPU Name Persistence-M \| Bus-Id Disp.A \| Volatile Uncorr. ECC \|
	\| Fan Temp Perf Pwr:Usage/Cap \| Memory-Usage \| GPU-Util Compute M. \|
	\| \| \| MIG M. \|
	\|=========================================+======================+======================\|
	\| 0 NVIDIA H100 PCIe On \| 00000000:08:00.0 Off \| 0 \|
	\| N/A 37C P0 76W / 350W \| 40441MiB / 81559MiB \| 24% Default \|
	\| \| \| Disabled \|
	+-----------------------------------------+----------------------+----------------------+

	+---------------------------------------------------------------------------------------+
	\| Processes: \|
	\| GPU GI CI PID Type Process name GPU Memory \|
	\| ID ID Usage \|
	\|=======================================================================================\|
	\| 0 N/A N/A 17735 C ./main 40428MiB \|
	+---------------------------------------------------------------------------------------+


	# token speed
	<\|begin_of_text\|>Why is the sky blue? The sky is blue due to a phenomenon called Rayleigh scattering. This scattering refers to the scattering of electromagnetic radiation (light) by particles much smaller than the wavelength of the light. The short-wavelength blue light is scattered more than the other colors of visible light, resulting in more blue light reaching the observer than the other colors of light.<\|end_of_text\|> [end of text]

	llama_print_timings: load time = 6244.37 ms
	llama_print_timings: sample time = 4.39 ms / 69 runs ( 0.06 ms per token, 15710.38 tokens per second)
	llama_print_timings: prompt eval time = 90.86 ms / 7 tokens ( 12.98 ms per token, 77.05 tokens per second)
	llama_print_timings: eval time = 2334.73 ms / 68 runs ( 34.33 ms per token, 29.13 tokens per second)
	llama_print_timings: total time = 2486.72 ms / 75 tokens
	Log end


	```