It’s been almost a year since DeepSeek made a major AI splash.
In January, the Chinese company reported that one of its large language models rivaled an OpenAI counterpart on math and coding benchmarks designed to evaluate multi-step problem-solving capabilities, or what the AI field calls “reasoning.” DeepSeek’s buzziest claim was that it achieved this performance while keeping costs low. The implication: AI model improvements didn’t always need massive computing infrastructure or the very best computer chips but might be achieved by efficient use of cheaper hardware.

A slew of research followed that headline-grabbing announcement, all trying to better understand DeepSeek models’ reasoning methods, improve them and even …