Small Language Models Are the New Rage, Researchers Say

The original version of this story appeared in Quanta Magazine.

Large language models work well because they’re so large. The latest models from OpenAI, Meta, and DeepSeek use hundreds of billions of “parameters”—the adjustable knobs that determine connections among data and get tweaked during the training process. With more parameters, the models are better able to identify patterns and connections, which in turn makes them more powerful and accurate.
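To make the idea of parameters concrete, here is a minimal sketch, not any lab's actual code, of what "tweaking" a parameter during training means: a model with a single adjustable knob, nudged by gradient descent to fit a few made-up data points. The data, learning rate, and step count are all illustrative assumptions.

```python
# A toy model with one parameter: y = w * x.
# Training "tweaks" w to reduce the error on the data.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (input, target) pairs
w = 0.0    # the single adjustable "knob"
lr = 0.01  # learning rate: how far each tweak moves the knob

for step in range(1000):
    # Gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # the tweak: move w in the direction that reduces error

print(f"learned parameter: {w:.3f}")  # converges to ~2.04, the slope of the data
```

A frontier model does essentially this, but with hundreds of billions of knobs adjusted at once over trillions of examples.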

But this power comes at a cost. Training a model with hundreds of billions of parameters takes huge computational resources. To train its Gemini 1.0 Ultra model, for example, Google reportedly spent $191 million. Large language models (LLMs) also require considerable computational power each time they answer a query, which makes them notorious energy hogs.
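A rough back-of-the-envelope calculation shows why training at this scale is so expensive. The sketch below uses the common rule of thumb that training takes roughly 6 FLOPs per parameter per training token; the parameter count, token count, and accelerator speed are illustrative assumptions, not figures from the article.

```python
# Rough training-compute estimate: ~6 * parameters * training tokens (FLOPs).
params = 500e9   # assumed: a model with hundreds of billions of parameters
tokens = 10e12   # assumed: a training set of ten trillion tokens
flops = 6 * params * tokens
print(f"training compute: ~{flops:.1e} FLOPs")  # ~3.0e+25 FLOPs

# At an assumed sustained 1e15 FLOP/s (one petaFLOP/s) per accelerator:
seconds = flops / 1e15
print(f"~{seconds / 86400 / 365:.0f} accelerator-years")  # ~951 years on one chip
```

Dividing that work across tens of thousands of chips is what turns a millennium of single-chip compute into months of training, and a nine-figure bill.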

→ Continue reading at Wired - Science
