As far as I know (correct me if I'm wrong), llama-quantize now supports layer pruning via the --prune-layers flag, so is it possible to prune the model?