Science

Microsoft’s small language model outperforms larger models on standardized math tests

Credit: Deepak Gautam from Pexels

A small team of AI researchers at Microsoft reports that the company's Orca-Math small language model outperforms other, larger models on standardized math tests. The group has published a paper on the arXiv preprint server describing their testing of Orca-Math on the Grade School Math 8K (GSM8K) benchmark and how it fared compared with well-known LLMs.

Many popular LLMs such as ChatGPT are known for their impressive conversational skills; less well known is that most of them can also solve math word problems. AI researchers have tested their abilities at such tasks by pitting them against the GSM8K, a dataset of 8,500 grade-school math word problems that require multistep reasoning to solve, along with their correct answers.

In this new study, the research team at Microsoft tested Orca-Math, an AI application developed by another team at Microsoft specifically designed to tackle math word problems, and compared the results with those of larger AI models.

Microsoft points out in its Research Blog post that there is a major difference between popular LLMs such as ChatGPT and Orca-Math. The former is a large language model and the latter is a small language model; the distinction lies in the number of parameters used, typically in the thousands or a few million for SLMs, rather than the billions or trillions used by LLMs. Another difference is that, as its name suggests, Orca-Math was designed specifically to solve math problems; thus, it cannot be used to carry on conversations or answer random questions.

Orca-Math is relatively large compared to other SLMs, with 7 billion parameters, but still much smaller than most of the well-known LLMs. Nevertheless, it managed to score 86.81% on the GSM8K, close to GPT-4-0613, which scored 97.0%. Others, such as Llama-2, did not fare nearly as well, with scores as low as 14.6%.

Microsoft reveals that it was able to achieve such a high score by using higher-quality training data than is available to general-use LLMs, and because it used an iterative learning process the AI team at Microsoft has been developing, a process that continually improves results by using feedback from a teacher. The team at Microsoft concludes that SLMs can perform as well as LLMs in certain applications when developed under specialized conditions.

More information:
Arindam Mitra et al, Orca-Math: Unlocking the potential of SLMs in Grade School Math, arXiv (2024). DOI: 10.48550/arxiv.2402.14830

Orca-Math: www.microsoft.com/en-us/resear … odel-specialization/
twitter.com/Arindam1408/status/1764761895473762738

Journal information:
arXiv


© 2024 Science X Community

Citation:
Microsoft's small language model outperforms larger models on standardized math tests (2024, March 8)
retrieved 8 March 2024
from https://techxplore.com/news/2024-03-microsoft-small-language-outperforms-larger.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.


