Large Language Models (LLMs) have demonstrated remarkable performance in solving complex reasoning tasks through mechanisms like Chain-of-Thought (CoT) prompting, which emphasizes verbose, step-by-step reasoning. However, humans typically employ a more efficient strategy: drafting concise intermediate thoughts that capture only essential information. In this work, we propose Chain of Draft (CoD), a novel paradigm inspired by human cognitive processes, where LLMs generate minimalistic yet informative intermediate reasoning outputs while solving tasks. By reducing verbosity and focusing on critical insights, CoD matches or surpasses CoT in accuracy while using as little as 7.6% of the tokens, significantly reducing cost and latency across various reasoning tasks.
Standard: Answer the question directly. Do not return any preamble, explanation, or reasoning.
Chain-of-Thought: Think step by step to answer the following question. Return the answer at the end of the response after a separator ####.
Chain-of-Draft: Think step by step, but only keep a minimum draft for each thinking step, with 5 words at most. Return the answer at the end of the response after a separator ####.
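If you want to try the comparison yourself, here's a minimal sketch that runs all three prompts against an OpenAI-compatible chat API and compares completion-token counts. The model name and sample question are placeholders of mine, not taken from the paper's repo, and the #### split just follows the separator convention in the prompts above.

```python
# Minimal sketch: compare Standard / CoT / CoD prompting styles.
# Assumes an OpenAI-compatible API; OPENAI_API_KEY must be set in the environment.
from openai import OpenAI

client = OpenAI()

PROMPTS = {
    "standard": "Answer the question directly. Do not return any preamble, "
                "explanation, or reasoning.",
    "cot": "Think step by step to answer the following question. Return the "
           "answer at the end of the response after a separator ####.",
    "cod": "Think step by step, but only keep a minimum draft for each "
           "thinking step, with 5 words at most. Return the answer at the end "
           "of the response after a separator ####.",
}

# Placeholder question; any reasoning problem works here.
question = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
            "than the ball. How much does the ball cost?")

for name, system_prompt in PROMPTS.items():
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; swap in whatever model you're testing
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    )
    text = resp.choices[0].message.content
    # The CoT and CoD prompts put the final answer after "####".
    answer = text.split("####")[-1].strip()
    # CoD should use far fewer completion tokens than CoT at similar accuracy.
    print(f"{name}: {resp.usage.completion_tokens} tokens -> {answer}")
```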
That's interesting. Good tip.
Looking at their repo, they've tested this with models that have not been trained to generate chain-of-thought outputs, simply by varying the system prompt. It's therefore more of a proof of concept, but I can imagine that if you trained a model to do this natively it could work.
Using the same prompt with QwQ made no difference for me (the chain of thought was still very long and quite verbose), while using it with Qwen2.5 Coder made the output extremely terse and not very useful for open-ended questions.