Abstract: Large Language Models (LLMs) typically reason via Chain-of-Thought (CoT) prompting or explicit training. Though many LLMs achieve similar accuracy on challenging tasks, such as math problem ...