
DeepMind develops SAFE, an AI-based app that can fact-check LLMs

Credit: CC0 Public Domain

A team of artificial intelligence researchers at Google's DeepMind has developed an AI-based system called SAFE that can be used to fact-check the results of LLMs such as ChatGPT. The team has published a paper on the arXiv preprint server describing the new AI system and how well it performed.

Large language models such as ChatGPT have been in the news a lot over the past couple of years: they can write papers, answer questions and even solve math problems. But they suffer from one major problem: accuracy. Every result produced by an LLM must be checked manually to ensure it is correct, an attribute that greatly reduces their value.

In this new effort, the researchers at DeepMind created an AI application that can check the answers given by LLMs and point out inaccuracies automatically.

One of the main ways human users of LLMs fact-check results is by investigating AI responses using a search engine such as Google to find appropriate sources for verification. The team at DeepMind took the same approach. They created an LLM that breaks down the claims or facts in an answer provided by the original LLM, then used Google Search to find sites that could be used for verification, and then compared the two answers to determine accuracy. They call their new system Search-Augmented Factuality Evaluator (SAFE).
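The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming stubbed-out model and search calls; the function names and their placeholder bodies are hypothetical, not DeepMind's actual API (their implementation is in the GitHub repository linked below).

```python
# Sketch of a SAFE-style fact-checking loop: split an answer into
# individual claims, gather search evidence for each claim, and ask
# whether the evidence supports it. All three helpers are placeholders
# standing in for LLM prompts and real Google Search queries.

def split_into_facts(answer: str) -> list[str]:
    # Placeholder: the real system prompts an LLM to decompose the
    # answer into self-contained factual claims.
    return [s.strip() for s in answer.split(".") if s.strip()]


def search_evidence(fact: str) -> str:
    # Placeholder: the real system issues Google Search queries and
    # collects the returned snippets as evidence.
    return f"search results for: {fact}"


def is_supported(fact: str, evidence: str) -> bool:
    # Placeholder: the real system asks an LLM to judge whether the
    # evidence supports the claim.
    return fact.lower() in evidence.lower()


def rate_answer(answer: str) -> dict[str, bool]:
    """Rate each fact in an LLM answer as supported or not."""
    facts = split_into_facts(answer)
    return {fact: is_supported(fact, search_evidence(fact)) for fact in facts}


if __name__ == "__main__":
    print(rate_answer("Paris is in France. The moon is made of cheese."))
```

With the trivial placeholders above, every claim is rated as supported; the structure of the loop, not the verdicts, is the point of the sketch.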

To test their system, the research team used it to verify roughly 16,000 facts contained in answers given by several LLMs. They compared their results against human (crowdsourced) fact-checkers and found that SAFE matched the findings of the humans 72% of the time. When examining disagreements between SAFE and the human checkers, the researchers found SAFE to be the one that was correct 76% of the time.

The team at DeepMind has made the code for SAFE available for use by anyone who chooses to take advantage of its capabilities by posting it on the open-source website GitHub.

More information:
Jerry Wei et al, Long-form factuality in large language models, arXiv (2024). DOI: 10.48550/arxiv.2403.18802

Code release: github.com/google-deepmind/long-form-factuality

Journal information:
arXiv


© 2024 Science X Network

Citation:
DeepMind develops SAFE, an AI-based app that can fact-check LLMs (2024, March 29)
retrieved 29 March 2024
from https://techxplore.com/news/2024-03-deepmind-safe-ai-based-app.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.


