q5_k_m: Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K. (commit 71fc4da, verified; datatab committed on Feb 28, 2024)
q3_k_m: Uses Q4_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else Q3_K. (commit d79d13a, verified; datatab committed on Feb 28, 2024)
q4_k_m: Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K. (commit d1a2db2, verified; datatab committed on Feb 28, 2024)
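
The three mixes listed here can be read as a small lookup rule: a base type for most tensors, plus a higher-precision type for a few named ones. The sketch below is only an illustration of that reading, not the quantizer's actual code. The names `QUANT_MIXES` and `tensor_quant_type` are hypothetical, and because the entries do not say which tensors make up the "half" that gets Q6_K, that choice is passed in as the assumed flag `in_upgraded_half`.

```python
# Illustrative sketch only: the table layout and function name are assumptions,
# not part of any real quantization tool.
QUANT_MIXES = {
    "q3_k_m": {
        "base": "Q3_K",
        "upgraded": "Q4_K",
        "tensors": ("attention.wv", "attention.wo", "feed_forward.w2"),
        "half_only": False,  # every listed tensor is upgraded
    },
    "q4_k_m": {
        "base": "Q4_K",
        "upgraded": "Q6_K",
        "tensors": ("attention.wv", "feed_forward.w2"),
        "half_only": True,  # only half of the listed tensors are upgraded
    },
    "q5_k_m": {
        "base": "Q5_K",
        "upgraded": "Q6_K",
        "tensors": ("attention.wv", "feed_forward.w2"),
        "half_only": True,
    },
}


def tensor_quant_type(mix: str, tensor_name: str, in_upgraded_half: bool = True) -> str:
    """Return the quantization type a tensor would get under the given mix.

    `in_upgraded_half` is an assumed input: the listing does not specify
    which tensors belong to the upgraded half, so the caller decides.
    """
    rule = QUANT_MIXES[mix]
    # Tensors not named in the rule always use the base type.
    if not any(name in tensor_name for name in rule["tensors"]):
        return rule["base"]
    # For "half" mixes, tensors outside the upgraded half fall back to base.
    if rule["half_only"] and not in_upgraded_half:
        return rule["base"]
    return rule["upgraded"]


if __name__ == "__main__":
    for mix in ("q3_k_m", "q4_k_m", "q5_k_m"):
        print(mix, tensor_quant_type(mix, "layers.0.attention.wv"))
```

Under this reading, attention.wo is upgraded only in q3_k_m; in q4_k_m and q5_k_m it stays at the base type.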