Add Deepseek-R1-Distill-Qwen-32B results #3278
At 15.1% this is a reasonable score, and much better than the Llama-70B distill. This run used the Q4_K_L quant with a 32k context and the KV cache quantized to Q8, which all fits on a 24GB GPU and is fast to run. It should also be possible to run a 64k context on this model with a bit more quantization on the KV cache.
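For intuition on why 64k should fit in the same budget, here's a rough back-of-the-envelope KV cache calculation (a sketch only: the layer/head counts are assumptions based on the published Qwen2.5-32B architecture, and the bytes-per-element figures come from llama.cpp's q8_0/q4_0 block layouts):

```python
# Rough KV cache sizing for DeepSeek-R1-Distill-Qwen-32B (assumed Qwen2.5-32B
# architecture: 64 layers, 8 GQA KV heads, head_dim 128 -- verify against the
# model's config.json before relying on these numbers).
N_LAYERS = 64
N_KV_HEADS = 8
HEAD_DIM = 128

# Approximate bytes per element for llama.cpp cache types, block overhead
# included: f16 = 2.0, q8_0 = 34/32, q4_0 = 18/32.
BYTES_PER_ELEM = {"f16": 2.0, "q8_0": 34 / 32, "q4_0": 18 / 32}

def kv_cache_gib(n_ctx: int, cache_type: str) -> float:
    """GiB for the K and V caches combined at a given context length."""
    elems_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM  # K + V
    return n_ctx * elems_per_token * BYTES_PER_ELEM[cache_type] / 1024**3

for ctx, ctype in [(32_768, "q8_0"), (65_536, "q8_0"), (65_536, "q4_0")]:
    print(f"{ctx:>6} ctx @ {ctype}: {kv_cache_gib(ctx, ctype):.2f} GiB")
```

Under those assumptions a 32k context at q8_0 is about 4.25 GiB, and dropping the cache to q4_0 roughly halves the per-token cost, so 64k at q4_0 lands around 4.5 GiB, about the same budget. (These figures ignore the model weights and compute buffers.)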
I might try `edit-format: whole`, and maybe the Q5_K_S quant with a quantized or smaller context, to see if that is even better; a sketch of that run follows.
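Here's a minimal sketch of what that follow-up run might look like, launching llama.cpp's llama-server with the heavier weight quant and a more aggressively quantized cache. The GGUF filename is a placeholder; `-fa`, `--cache-type-k`, and `--cache-type-v` are existing llama-server flags, and as I understand it quantizing the V cache requires flash attention to be enabled.

```python
# Hypothetical launcher for the proposed experiment: Q5_K_S weights,
# 64k context, q4_0-quantized KV cache. Flag names are llama-server's;
# the model path is a placeholder, not a real release filename.
import subprocess

cmd = [
    "llama-server",
    "-m", "DeepSeek-R1-Distill-Qwen-32B-Q5_K_S.gguf",  # placeholder path
    "-c", "65536",             # 64k context
    "-ngl", "99",              # offload all layers to the GPU
    "-fa",                     # flash attention, needed for quantized V cache
    "--cache-type-k", "q4_0",  # quantize the K cache
    "--cache-type-v", "q4_0",  # quantize the V cache
]
subprocess.run(cmd, check=True)
```

The aider side of the experiment would then add `--edit-format whole` (the CLI equivalent of the `edit-format: whole` config setting) when pointing at that server.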