Less supervision, better results: Study shows AI models generalize more effectively on their own

By MTHANNACH
Published February 12, 2025. Last updated: February 12, 2025 9:34 pm

Language models can generalize better when left to create their own solutions, a new study by the University of Hong Kong and the University of California, Berkeley, shows. The findings, which apply to both large language models (LLMs) and vision language models (VLMs), challenge one of the main beliefs of the LLM community: that models require hand-labeled training examples. In fact, the researchers show that training models on too many hand-crafted examples can have adverse effects on the model's ability to generalize to unseen data.

SFT vs RL in model training

For a long time, supervised fine-tuning (SFT) has been the gold standard for training LLMs and VLMs. Once a model is pre-trained on raw text and image data, companies and AI labs usually post-train it on a large dataset of hand-crafted examples in question/answer or request/response format. After SFT, the model can undergo additional training stages, such as reinforcement learning from human feedback (RLHF), where the model tries to learn implicit human preferences based on signals such as answer rankings or likes and dislikes of the model's responses.

SFT is useful for steering a model's behavior toward the kind of tasks its creators designed it for. However, gathering the data is a slow and costly process, which is a bottleneck for many companies and labs.

Recent developments in LLMs have created interest in pure reinforcement learning (RL) approaches, where the model is given a task and left to learn it on its own without hand-crafted examples. The most important instance is DeepSeek-R1, the OpenAI o1 competitor that mostly used reinforcement learning to learn complex reasoning tasks.

Generalization vs memorization

One of the key problems of machine learning (ML) systems is overfitting, where the model performs well on its training data but fails to generalize to unseen examples. During training, the model gives the false impression of having learned the task, while in practice it has just memorized its training examples. In large and complex AI models, separating generalization from memorization can be difficult.
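The gap between memorizing and generalizing can be made concrete with a toy example. The sketch below is purely illustrative and is not taken from the study: the addition task, the `memorizer` lookup table and the `rule_learner` function are all hypothetical stand-ins. Both "models" score perfectly on the training examples, and only the evaluation on unseen inputs tells them apart.

```python
def accuracy(predict, examples):
    """Fraction of (input, expected) pairs that predict() gets right."""
    correct = sum(1 for x, y in examples if predict(x) == y)
    return correct / len(examples)

# Toy task (an assumption for illustration): add two small integers.
train = [((a, b), a + b) for a in range(5) for b in range(5)]
unseen = [((a, b), a + b) for a in range(5, 10) for b in range(5, 10)]

# A "model" that only memorized its training examples.
lookup = dict(train)

def memorizer(x):
    return lookup.get(x)  # returns None for inputs it never saw

# A "model" that learned the underlying rule.
def rule_learner(x):
    a, b = x
    return a + b

print(accuracy(memorizer, train), accuracy(memorizer, unseen))        # 1.0 0.0
print(accuracy(rule_learner, train), accuracy(rule_learner, unseen))  # 1.0 1.0
```

Training accuracy alone cannot distinguish the two, which is why a held-out set that differs from the training data is needed.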

The new study focuses on the generalization abilities of RL and SFT training in textual and visual reasoning tasks. For textual reasoning, an LLM trained on a set of rules should be able to generalize to variants of those rules. In visual reasoning, a VLM should remain consistent in task performance when different aspects of the visual input change, such as color and spatial layout.

In their experiments, the researchers used two representative tasks. The first was GeneralPoints, a benchmark that evaluates a model's arithmetic reasoning capabilities. The model is given four cards, as textual descriptions or images, and is asked to combine them to reach a target number. To study rule-based generalization, the researchers trained the model using one set of rules, then evaluated it using a different rule. For visual generalization, they trained the model using cards of one color and tested its performance on cards of other colors and numbering schemes.
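A task of this kind is attractive for RL because an answer can be checked automatically. The following is a minimal sketch of such a checker, not the benchmark's actual code. It assumes a target of 24, that every card must be used exactly once, and that a "rule" is the way face cards map to numbers; the names `RULE_FACE_AS_10`, `RULE_FACE_AS_11_12_13` and `is_correct` are hypothetical.

```python
import ast
from fractions import Fraction

# Hypothetical rule variants: how face cards are counted (an assumption).
RULE_FACE_AS_10 = {"J": 10, "Q": 10, "K": 10}
RULE_FACE_AS_11_12_13 = {"J": 11, "Q": 12, "K": 13}

def card_values(cards, face_rule):
    """Map card labels such as 'A', '7' or 'K' to integers under a rule."""
    values = []
    for card in cards:
        if card == "A":
            values.append(1)
        elif card in face_rule:
            values.append(face_rule[card])
        else:
            values.append(int(card))
    return values

def _evaluate(node, used):
    """Evaluate a restricted arithmetic AST exactly, recording numbers used."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, used)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)
    if isinstance(node, ast.BinOp):
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Sub):
            return left - right
        if isinstance(node.op, ast.Mult):
            return left * right
        if isinstance(node.op, ast.Div):
            if right == 0:
                raise ValueError("division by zero")
            return left / right
    raise ValueError("unsupported expression")

def is_correct(expression, cards, face_rule, target=24):
    """True if the expression uses each card once and equals the target."""
    try:
        used = []
        value = _evaluate(ast.parse(expression, mode="eval"), used)
    except (SyntaxError, ValueError):
        return False
    return sorted(used) == sorted(card_values(cards, face_rule)) and value == target

cards = ["K", "3", "2", "A"]
print(is_correct("13 * 2 - 3 + 1", cards, RULE_FACE_AS_11_12_13))  # True
print(is_correct("13 * 2 - 3 + 1", cards, RULE_FACE_AS_10))        # False
```

The same answer is right under one rule and wrong under the other, which is what makes a rule change a test of generalization rather than recall.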

The second task is V-IRL, which tests the model's spatial reasoning capabilities in an open-world navigation domain that uses realistic visual input. This task also comes in pure-language and vision-language versions. The researchers evaluated generalization by changing the kind of instructions and visual representations the model was trained and tested on.

They ran their tests on Llama-3.2-Vision-11B, warming up the model by training it on a small SFT dataset, then creating separate versions for each task and training paradigm. For each task, they scaled up training separately on RL and SFT. The SFT process trains the model on additional hand-crafted solutions, while RL lets the model generate many solutions for each problem, evaluate the results and train itself on the correct answers.
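The "generate many, check, keep the correct ones" part of that description can be sketched in a few lines. This is a simplified illustration under stated assumptions and not the study's training algorithm: `collect_verified_samples`, `noisy_adder` and `is_sum` are hypothetical names, the toy sampler stands in for a real model, and the actual model update that would follow is omitted.

```python
import random

def collect_verified_samples(problems, sample_fn, verify_fn, samples_per_problem):
    """Sample several candidate solutions per problem and keep the verified ones."""
    kept = []
    for problem in problems:
        for _ in range(samples_per_problem):
            candidate = sample_fn(problem)
            if verify_fn(problem, candidate):
                kept.append((problem, candidate))
    return kept

# Toy stand-ins (assumptions): the "model" adds two numbers with random error.
rng = random.Random(0)

def noisy_adder(problem):
    a, b = problem
    return a + b + rng.choice([-1, 0, 1])

def is_sum(problem, candidate):
    a, b = problem
    return candidate == a + b

problems = [(1, 2), (3, 4), (10, 5)]
kept = collect_verified_samples(problems, noisy_adder, is_sum, samples_per_problem=8)
print(f"kept {len(kept)} of {len(problems) * 8} sampled solutions")
```

No hand-written solutions appear anywhere in the loop: the only supervision is the verifier's yes-or-no outcome.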

The findings show that reinforcement learning consistently improves performance on examples that are drastically different from the training data. SFT, on the other hand, seems to memorize the training rules and does not generalize to out-of-distribution (OOD) examples. These observations apply to both text-only and multimodal settings.

SFT-trained models perform well on training examples (in-distribution) while showing poor performance on unseen examples (out-of-distribution) (source: arXiv)

Implications for real-world applications

While their experiments show that RL is better at generalizing than SFT, the researchers also found that SFT is helpful for stabilizing the model's output format and is crucial to enabling RL to achieve its performance gains. The researchers found that, without the initial SFT stage, RL training did not achieve desirable results.

This is a bit different from the results obtained by DeepSeek-R1-Zero, which was post-trained on pure RL. The researchers suggest that this may be due to the different backbone model they used in their experiments.

It is clear that there is a lot of untapped potential in RL-heavy approaches. For use cases that have verifiable results, letting the models learn on their own can often lead to unanticipated results that humans could not have crafted themselves. This could come in very handy in settings where creating hand-crafted examples is tedious and expensive.
