Less supervision, better results: Study shows AI models generalize more effectively on their own

MTHANNACH
Last updated: February 12, 2025 9:34 pm
MTHANNACH Published February 12, 2025

Language models can generalize better when left to create their own solutions, a new study by the University of Hong Kong and the University of California, Berkeley, shows. The findings, which apply to both large language models (LLMs) and vision language models (VLMs), challenge one of the main beliefs of the LLM community: that models require hand-labeled training examples. In fact, the researchers show that training models on too many hand-crafted examples can harm the model's ability to generalize to unseen data.

SFT vs. RL in model training

For a long time, supervised fine-tuning (SFT) has been the gold standard for training LLMs and VLMs. Once a model is pre-trained on raw text and image data, companies and labs usually post-train it on a large dataset of hand-crafted examples in question/answer or request/response format. After SFT, the model can undergo additional training stages, such as reinforcement learning from human feedback (RLHF), where the model tries to learn implicit human preferences from signals such as rankings of responses or likes and dislikes of the model's responses.

SFT is useful for steering a model's behavior toward the kind of tasks its creators designed it for. However, gathering the data is a slow and expensive process, which is a bottleneck for many companies and labs.

Recent developments in LLMs have created interest in pure reinforcement learning (RL) approaches, where the model is given a task and left to learn it on its own without hand-crafted examples. The most prominent case is DeepSeek-R1, the OpenAI o1 competitor that mostly used reinforcement learning to learn complex reasoning tasks.

Generalization vs. memorization

One of the key problems of machine learning (ML) systems is overfitting, where the model performs well on its training data but fails to generalize to unseen examples. During training, the model gives the false impression of having learned the task, while in practice it has just memorized its training examples. In large and complex AI models, separating generalization from memorization can be difficult.
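
The gap between memorizing and generalizing can be illustrated with a deliberately tiny example. The sketch below is not from the study: the even/odd task and the names `train_memorizer`, `rule_model` and `accuracy` are hypothetical, chosen only to show how a model can score perfectly on its training data and still fail on unseen inputs.

```python
def train_memorizer(examples):
    """Memorizer: stores every (input, label) pair seen in training."""
    table = dict(examples)
    # Inputs never seen in training fall back to a fixed guess (False).
    return lambda n: table.get(n, False)


def rule_model(n):
    """Generalizer: applies the underlying rule instead of a lookup."""
    return n % 2 == 0


def accuracy(model, examples):
    """Fraction of examples where the model's output matches the label."""
    return sum(model(x) == y for x, y in examples) / len(examples)


# Toy task (an assumption for illustration): label a number as even or not.
train = [(n, n % 2 == 0) for n in range(10)]
unseen = [(n, n % 2 == 0) for n in range(100, 110)]

memorizer = train_memorizer(train)
print("memorizer :", accuracy(memorizer, train), accuracy(memorizer, unseen))
print("rule model:", accuracy(rule_model, train), accuracy(rule_model, unseen))
```

Both models look identical on the training set. Only the evaluation on inputs outside the training range separates them, which is why the study tests on variants the model has not seen.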

The new study focuses on the generalization abilities of RL and SFT training in textual and visual reasoning tasks. For textual reasoning, an LLM trained on a set of rules should be able to generalize to variants of those rules. In visual reasoning, a VLM should perform the task consistently when aspects of the visual input change, such as color and spatial layout.

In their experiments, the researchers used two representative tasks. The first was GeneralPoints, a benchmark that evaluates a model's arithmetic reasoning capabilities. The model is given four cards, as textual descriptions or images, and is asked to combine them to reach a target number. To study rule-based generalization, the researchers trained the model using one set of rules, then evaluated it using a different rule. For visual generalization, they trained the model using cards of one color and tested its performance on cards of other colors and numbering schemes.
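
A card task of this kind has outcomes that a program can check, and a rule change can alter the right answer for the same hand. The sketch below is a minimal illustration, not the benchmark's implementation. The target of 24, the two face-card rules (`face_as_10` and `face_as_rank`) and the function names are assumptions made for the example, since the article does not specify them.

```python
from fractions import Fraction
from itertools import permutations


def card_values(ranks, rule):
    """Map card ranks (1-13) to numbers under a hypothetical rule."""
    if rule == "face_as_10":      # assumed rule: J, Q, K all count as 10
        return [min(r, 10) for r in ranks]
    if rule == "face_as_rank":    # assumed rule: J, Q, K count as 11, 12, 13
        return list(ranks)
    raise ValueError(f"unknown rule: {rule}")


def can_reach(values, target):
    """Brute-force check: can all values be combined with + - * / to hit target?"""
    def search(nums):
        if len(nums) == 1:
            return nums[0] == target
        # Pick an ordered pair, combine it, and recurse on the shorter list.
        for i, j in permutations(range(len(nums)), 2):
            a, b = nums[i], nums[j]
            rest = [nums[k] for k in range(len(nums)) if k not in (i, j)]
            results = [a + b, a - b, a * b]
            if b != 0:
                results.append(a / b)
            if any(search(rest + [r]) for r in results):
                return True
        return False

    # Exact fractions avoid floating-point errors in division.
    return search([Fraction(v) for v in values])


hand = [12, 12, 1, 1]  # Q, Q, A, A
for rule in ("face_as_10", "face_as_rank"):
    print(rule, can_reach(card_values(hand, rule), 24))
```

Under the first rule the hand becomes 10, 10, 1, 1 and cannot reach 24, while under the second it becomes 12, 12, 1, 1 and can (12 + 12). A model that has only memorized answers under one rule would be expected to fail when the rule changes.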

The second task is V-IRL, which tests the model's spatial reasoning capabilities in an open-world navigation domain that uses realistic visual input. This task also comes in pure-language and vision-language versions. The researchers evaluated generalization by changing the kind of instructions and visual representations the model was trained and tested on.

They ran their tests on Llama-3.2-Vision-11B, warming the model up by training it on a small SFT dataset, then creating separate versions for each task and training paradigm. For each task, they scaled up training separately with RL and SFT. The SFT process trains the model on additional hand-crafted solutions, while RL lets the model generate many solutions for each problem, evaluate the results and train itself on the correct answers.
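
The RL side of that description (generate many solutions, check them, keep the correct ones) can be sketched as a simple sample-and-verify loop. This is a toy illustration of the idea as the article describes it, not the study's training algorithm. The addition task, the random `toy_policy` standing in for a model, and all function names are hypothetical.

```python
import random


def verify(problem, answer):
    """Outcome check: is the proposed answer the sum of the two operands?"""
    a, b = problem
    return answer == a + b


def toy_policy(problem, rng):
    """Stand-in for a model: guesses near the true sum and is often wrong."""
    a, b = problem
    return a + b + rng.choice([-2, -1, 0, 1, 2])


def collect_verified_samples(problems, samples_per_problem, seed=0):
    """Sample several answers per problem and keep only the verified ones."""
    rng = random.Random(seed)
    kept = []
    for problem in problems:
        for _ in range(samples_per_problem):
            answer = toy_policy(problem, rng)
            if verify(problem, answer):
                # In a real pipeline these samples would drive the model update.
                kept.append((problem, answer))
    return kept


problems = [(1, 2), (3, 4), (10, 5)]
samples = collect_verified_samples(problems, samples_per_problem=20)
print(f"kept {len(samples)} of {len(problems) * 20} sampled answers")
```

No hand-written solutions appear anywhere in the loop. The only supervision is the verifier, which is why this style of training depends on tasks whose results can be checked automatically.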

The results show that reinforcement learning consistently improves performance on examples that are drastically different from the training data. SFT, on the other hand, seems to memorize the training rules and does not generalize to out-of-distribution (OOD) examples. These observations apply to both text-only and multimodal settings.

SFT-trained models perform well on training examples (in-distribution) while showing poor performance on unseen examples (out-of-distribution) (source: arXiv)

Implications for real-world applications

While their experiments show that RL is better at generalizing than SFT, the researchers also found that SFT is helpful for stabilizing the model's output format and is crucial to enabling RL to achieve its performance gains. Without the initial SFT stage, RL training did not achieve desirable results.

This is a bit different from the results obtained by DeepSeek-R1-Zero, which was post-trained with pure RL. The researchers suggest that this may be due to the different backbone model they used in their experiments.

It is clear that there is a lot of untapped potential in RL-heavy approaches. For use cases that have verifiable results, letting the models learn on their own can often lead to unanticipated results that humans could not have crafted themselves. This could be very useful in settings where creating hand-crafted examples is tedious and expensive.
