A look under the hood of DeepSeek’s AI models doesn’t provide all the answers

It’s been almost a year since DeepSeek made a major AI splash.

In January, the Chinese company reported that one of its large language models rivaled an OpenAI counterpart on math and coding benchmarks designed to evaluate multi-step problem-solving capabilities, or what the AI field calls "reasoning." DeepSeek's buzziest claim was that it achieved this performance while keeping costs low. The implication: AI model improvements didn't always require massive computing infrastructure or the very best computer chips but might instead be achieved through efficient use of cheaper hardware. A slew of research followed that headline-grabbing announcement, all trying to better understand DeepSeek models' reasoning methods, improve them and even

→ Continue reading at Science News
