‘Self-Taught Evaluator’: Meta releases new AI tools for autonomous AI development – Times of India

muhammadharisazam151@gmail.com October 19, 2024

Meta, the parent company of Facebook, on Friday unveiled a set of new AI models developed by its research division, Reuters reported.
Among the standout tools is the “Self-Taught Evaluator,” which may reduce the need for human involvement in the AI development process. This development is an important step towards creating AI systems capable of learning from their own mistakes, potentially paving the way for more autonomous and intelligent digital agents.
In addition to the Self-Taught Evaluator, Meta also released updates to its image-identification Segment Anything model, a tool for accelerating response generation times in large language models (LLMs), and datasets designed to support the discovery of new inorganic materials.
First introduced in an August research paper, the Self-Taught Evaluator uses the same “chain of thought” technique employed by OpenAI’s latest models. This approach involves breaking complex tasks into smaller steps to increase accuracy in fields like science, coding, and mathematics.
Crucially, Meta’s researchers trained the evaluator entirely on AI-generated data, eliminating the need for human input during the training phase.
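The report gives no implementation details, so the following is only a toy sketch of how an evaluator might be trained on AI-generated data alone. It assumes, hypothetically, that each pair of answers is built so that the better one is known in advance, and that only the judge's verdicts that match this known answer are kept as training examples. The function names, the `accuracy` parameter, and the placeholder strings are illustrative inventions, and a random stand-in replaces the real language model.

```python
import random


def make_pair(instruction, rng):
    """Build a synthetic pair whose better answer is known by construction.

    Assumption: the weaker answer is produced deliberately (here, by answering
    a slightly different question), so no human label is needed.
    """
    good = f"on-topic answer to: {instruction}"
    bad = f"answer to a subtly different question than: {instruction}"
    if rng.random() < 0.5:
        return {"instruction": instruction, "A": good, "B": bad, "better": "A"}
    return {"instruction": instruction, "A": bad, "B": good, "better": "B"}


def judge(pair, rng, accuracy):
    """Stand-in for an LLM judge; returns (reasoning, verdict).

    A real judge would write out step-by-step reasoning before its verdict.
    Here the verdict is correct with probability `accuracy`.
    """
    correct = pair["better"]
    wrong = "B" if correct == "A" else "A"
    verdict = correct if rng.random() < accuracy else wrong
    reasoning = f"Step-by-step comparison of A and B; verdict: {verdict}."
    return reasoning, verdict


def collect_training_examples(instructions, accuracy=0.7, samples=4, seed=0):
    """Keep one judgment per pair, and only if it matches the known answer."""
    rng = random.Random(seed)
    kept = []
    for instruction in instructions:
        pair = make_pair(instruction, rng)
        for _ in range(samples):
            reasoning, verdict = judge(pair, rng, accuracy)
            if verdict == pair["better"]:
                kept.append({"pair": pair, "reasoning": reasoning, "verdict": verdict})
                break  # one accepted judgment per pair is enough
    return kept
```

In a setup like this, the kept examples would be used to fine-tune the judge, and the loop could then be repeated with the improved judge. No human annotator appears at any step, which is the property the article highlights.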
According to Meta researchers, the ability of AI to evaluate other AI models accurately opens new possibilities for autonomous AI systems that can self-improve. This could lead to the development of digital assistants capable of performing a wide range of tasks without human intervention.
Self-improving AI models may also reduce reliance on the costly and time-consuming process of Reinforcement Learning from Human Feedback (RLHF), which involves specialised human annotators verifying data and checking AI-generated answers for accuracy.
Jason Weston, one of Meta’s researchers, expressed hope that as AI becomes more advanced, it will become increasingly capable of verifying its own work, eventually with greater accuracy than humans.
Other companies, such as Google and Anthropic, have also been exploring the concept of Reinforcement Learning from AI Feedback (RLAIF).
However, unlike Meta, these companies have been more cautious in releasing their models to the public.