Every processed story in reverse chronological order, with the newest coverage first. Filter by tag, source, or score to drill in.
MiniPIC removes the requirement that prompts share an identical prefix before KV cache entries can be reused. This enables efficient caching of recurring structured inputs in retrieval-augmented and agentic workloads, without the large server-side code changes or host-to-device transfer overhead of prior position-independent caching (PIC) approaches.
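To make the prefix requirement concrete, here is a minimal sketch of why lookup by position misses reuse that lookup by content would catch. It is a toy model of cache lookup only, not MiniPIC's implementation: the function names, the chunking, and the token IDs are all hypothetical, and a real position-independent cache also has to deal with positional encodings and attention across chunks, which this ignores.

```python
import hashlib

def chunk_key(tokens):
    """Hash a chunk's token IDs so identical chunks map to the same key."""
    return hashlib.sha256(repr(tuple(tokens)).encode()).hexdigest()

def prefix_hits(cached_prompt, new_prompt):
    """Prefix caching: chunks are reusable only up to the first mismatch."""
    hits = 0
    for old, new in zip(cached_prompt, new_prompt):
        if old != new:
            break
        hits += 1
    return hits

def position_independent_hits(cached_prompt, new_prompt):
    """Content-keyed lookup: a chunk is reusable wherever it appears."""
    cached = {chunk_key(chunk) for chunk in cached_prompt}
    return sum(chunk_key(chunk) in cached for chunk in new_prompt)

# Hypothetical prompts, each a list of chunks (lists of token IDs).
system = [1, 2, 3]
doc_a = [10, 11]
doc_b = [20, 21]

first = [system, doc_a, doc_b, [30]]   # earlier request, already cached
second = [system, doc_b, doc_a, [31]]  # same documents, retrieved in a different order

print(prefix_hits(first, second))                # 1
print(position_independent_hits(first, second))  # 3
```

In this toy case, reordering two retrieved documents leaves prefix matching with only the system chunk, while content-keyed lookup still finds all three recurring chunks.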
This benchmark directly addresses a gap the post identifies — the lack of tool-calling quality evaluations for popular local GGUF quants — and provides concrete, reproducible evidence that KV cache quantization level and context length have measurable effects on tool-calling accuracy for Qwen3.6-35B-A3B.