aws-neuron / optimum-neuron-cache
Organization: AWS Inferentia and Trainium
License: apache-2.0
Revision: cc4c887
Directory: optimum-neuron-cache / inference-cache-config
3 contributors, history: 62 commits
Latest commit: "Add phi4 cached configurations" by dacorvo (HF staff), c564534, verified, 16 days ago
| File | Size | Last commit | When |
| --- | --- | --- | --- |
| gpt2.json | 398 Bytes | Add more gpt2 configurations | 12 months ago |
| granite.json | 1.3 kB | Add configuration for granite models | 3 months ago |
| llama-variants.json | 1.45 kB | Add DeepSeek distilled versions of LLama 8B | about 2 months ago |
| llama.json | 1.67 kB | Update inference-cache-config/llama.json | 6 months ago |
| llama2-70b.json | 287 Bytes | Create llama2-70b.json | 9 months ago |
| llama3-70b.json | 584 Bytes | Add DeepSeek distilled model | about 2 months ago |
| llama3.1-70b.json | 289 Bytes | Rename inference-cache-config/Llama3.1-70b.json to inference-cache-config/llama3.1-70b.json | 6 months ago |
| mistral-variants.json | 1.04 kB | Remove obsolete mistral variants | 6 months ago |
| mistral.json | 1.87 kB | Update inference-cache-config/mistral.json | 3 months ago |
| mixtral.json | 583 Bytes | Update inference-cache-config/mixtral.json | 6 months ago |
| phi4.json | 556 Bytes | Add phi4 cached configurations | 16 days ago |
| qwen2.5-large.json | 849 Bytes | Update inference-cache-config/qwen2.5-large.json | about 2 months ago |
| qwen2.5.json | 2.69 kB | Add DeepSeek distilled models | about 2 months ago |
| stable-diffusion.json | 1.91 kB | Update inference-cache-config/stable-diffusion.json | 6 months ago |
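The files above are plain JSON configuration lists stored in a public Hub repository, so they can be fetched programmatically. A minimal sketch, assuming the `huggingface_hub` package is installed and network access is available (the `repo_id` and `filename` are taken from the listing above; the structure of the JSON itself is not shown here and may vary per file):

```python
import json

from huggingface_hub import hf_hub_download

# Download one of the cached-configuration files listed above from the Hub.
# The file is cached locally, so repeated calls do not re-download it.
path = hf_hub_download(
    repo_id="aws-neuron/optimum-neuron-cache",
    filename="inference-cache-config/llama.json",
)

# The file is plain JSON; load it and show its top-level entries.
with open(path) as f:
    config = json.load(f)

print(sorted(config))
```

Passing `revision="cc4c887"` to `hf_hub_download` would pin the download to the exact commit shown in the header above instead of the repository's current head.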