
Less supervision, better results: Study shows AI models generalize more effectively on their own

MTHANNACH
Last updated: February 12, 2025 9:34 pm
MTHANNACH Published February 12, 2025


Language models can generalize better when left to create their own solutions, a new study by the University of Hong Kong and the University of California, Berkeley, shows. The findings, which apply to both large language models (LLMs) and vision language models (VLMs), challenge one of the main beliefs in the LLM community: that models require hand-labeled training examples. In fact, the researchers show that training models on too many hand-crafted examples can harm the model's ability to generalize to unseen data.

SFT vs. RL in model training

For a long time, supervised fine-tuning (SFT) has been the gold standard for training LLMs and VLMs. Once a model is pre-trained on raw text and image data, companies and AI labs usually post-train it on a large dataset of hand-crafted examples in question/answer or request/response format. After SFT, the model can undergo additional training stages, such as reinforcement learning from human feedback (RLHF), where the model tries to learn implicit human preferences from signals such as rankings of answers or likes and dislikes of the model's responses.

SFT is useful for steering a model's behavior toward the kind of tasks its creators designed it for. However, collecting the data is a slow and expensive process, which is a bottleneck for many companies and labs.

Recent developments in LLMs have sparked interest in pure reinforcement learning (RL) approaches, where the model is given a task and left to learn it on its own without hand-crafted examples. The most prominent case is DeepSeek-R1, the OpenAI o1 competitor that relied mostly on reinforcement learning to learn complex reasoning tasks.

Generalization vs memorization

One of the main problems of machine learning (ML) systems is overfitting, where the model performs well on its training data but fails to generalize to unseen examples. During training, the model gives the false impression of having learned the task, while in practice it has only memorized its training examples. In large and complex AI models, separating generalization from memorization can be difficult.
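The gap between memorizing and generalizing can be made concrete with a toy example. The sketch below is purely illustrative and is not taken from the study: the addition task, the `memorizer` and `generalizer` functions, and the data are all hypothetical. It shows how a lookup table can score perfectly on its training examples while failing on every unseen one.

```python
def accuracy(predict, examples):
    """Fraction of (input, expected_output) pairs that predict() gets right."""
    correct = sum(1 for x, y in examples if predict(x) == y)
    return correct / len(examples)

# Hypothetical toy task: add two integers.
train = [((1, 2), 3), ((4, 5), 9), ((7, 1), 8)]
unseen = [((2, 2), 4), ((6, 3), 9)]

# A "memorizer" stores the training pairs and knows nothing else.
lookup = dict(train)

def memorizer(x):
    return lookup.get(x)  # returns None for inputs it has never seen

# A "generalizer" has learned the underlying rule.
def generalizer(x):
    return x[0] + x[1]

print("memorizer   train/unseen:", accuracy(memorizer, train), accuracy(memorizer, unseen))
print("generalizer train/unseen:", accuracy(generalizer, train), accuracy(generalizer, unseen))
```

Measured on training data alone, the two are indistinguishable, which is why held-out examples are needed to tell them apart.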

The new study focuses on the generalization abilities of RL and SFT training in textual and visual reasoning tasks. For textual reasoning, an LLM trained on one set of rules should be able to generalize to variants of those rules. In visual reasoning, a VLM should remain consistent in task performance when different aspects of the visual input change, such as color and spatial layout.

In their experiments, the researchers used two representative tasks. The first was GeneralPoints, a benchmark that evaluates a model's arithmetic reasoning capabilities. The model is given four cards, as textual descriptions or images, and is asked to combine them to reach a target number. To study rule-based generalization, the researchers trained the model using one set of rules, then evaluated it using a different rule. For visual generalization, they trained the model using cards of one color and tested its performance on cards of other colors and numbering schemes.
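A card-arithmetic task like this has answers that can be checked automatically, and a rule change can turn a previously correct answer into a wrong one. The following is a minimal sketch of such a checker, not the benchmark's actual code. The target of 24, the two face-card rule sets, and all function names are assumptions made for illustration.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Hypothetical rule sets: how face cards map to numbers (assumed, not from the article).
RULE_FACE_AS_10 = {"J": 10, "Q": 10, "K": 10}
RULE_FACE_AS_11_12_13 = {"J": 11, "Q": 12, "K": 13}

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def card_value(card, rule):
    """Map a card label such as 'A', '7' or 'Q' to its number under a rule."""
    if card == "A":
        return 1
    return rule[card] if card in rule else int(card)

def _evaluate(node, used):
    """Evaluate a parsed +, -, *, / expression exactly, recording each number used."""
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    if (isinstance(node, ast.Constant) and isinstance(node.value, int)
            and not isinstance(node.value, bool)):
        used.append(node.value)
        return Fraction(node.value)
    raise ValueError("unsupported expression")

def is_correct(cards, expression, rule, target=24):
    """True if the expression uses each card's value exactly once and hits the target."""
    try:
        tree = ast.parse(expression, mode="eval")
        used = []
        result = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return False
    expected = Counter(card_value(c, rule) for c in cards)
    return Counter(used) == expected and result == target

cards = ["Q", "2", "A", "A"]
print(is_correct(cards, "(10+1+1)*2", RULE_FACE_AS_10))        # Q counted as 10
print(is_correct(cards, "(10+1+1)*2", RULE_FACE_AS_11_12_13))  # same answer, new rule
```

In this sketch, an answer that treats the queen as 10 passes under the first rule and fails under the second. That is the kind of shift a model has to handle if it has learned the arithmetic rather than memorized one rule set.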

The second task is V-IRL, which tests the model's spatial reasoning capabilities in an open-world navigation domain that uses realistic visual input. This task also comes in pure-language and vision-language versions. The researchers evaluated generalization by changing the kind of instructions and visual representations the model was trained and tested on.

They ran their tests on Llama-3.2-Vision-11B, warming the model up by training it on a small SFT dataset, then creating separate versions for each task and training paradigm. For each task, they scaled up training separately on RL and SFT. The SFT process trains the model on additional hand-crafted solutions, while RL lets the model generate many solutions for each problem, evaluate the results and train itself on the correct answers.
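The generate, evaluate and keep loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the researchers' training code: `collect_verified_samples`, `noisy_adder` and `check_sum` are hypothetical names, the "model" is a random toy function, and a real RL setup would use the verified outcomes as a reward signal to update model weights rather than just collecting them.

```python
import random

def collect_verified_samples(problems, sample_fn, verify_fn, n_samples=8):
    """Sample several candidate answers per problem and keep only the verified ones."""
    kept = []
    for problem in problems:
        for _ in range(n_samples):
            candidate = sample_fn(problem)
            if verify_fn(problem, candidate):
                kept.append((problem, candidate))
    return kept

# Toy stand-ins (hypothetical): a problem is a pair of integers, the answer is their sum.
rng = random.Random(0)

def noisy_adder(problem):
    """Pretend model: sometimes right, sometimes off by one."""
    a, b = problem
    return a + b + rng.choice([-1, 0, 1])

def check_sum(problem, answer):
    """Automatic verifier: no hand-written solution is needed, only a correctness check."""
    a, b = problem
    return answer == a + b

batch = collect_verified_samples([(2, 3), (10, 5)], noisy_adder, check_sum)
print(f"kept {len(batch)} verified samples out of 16 attempts")
```

In this sketch the only human-supplied ingredient is the verifier. The solutions that survive are produced by the model itself, whereas SFT relies on solutions written by hand.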

The results show that reinforcement learning consistently improves performance on examples that are drastically different from the training data. SFT, on the other hand, seems to memorize the training rules and does not generalize to out-of-distribution (OOD) examples. These observations hold in both text-only and multimodal settings.

SFT-trained models perform well on training examples (in-distribution) while performing poorly on unseen examples (out-of-distribution) (source: arXiv)

Implications for real-world applications

Although their experiments show that RL generalizes better than SFT, the researchers also found that SFT helps stabilize the model's output format and is crucial for enabling RL to achieve its performance gains. Without the initial SFT stage, RL training did not achieve desirable results.

This differs somewhat from the results obtained by DeepSeek-R1-Zero, which was post-trained with pure RL. The researchers suggest that this may be due to the different backbone model they used in their experiments.

It is clear that there is a lot of untapped potential in RL-heavy approaches. For use cases that have verifiable results, letting the models learn on their own can often lead to unanticipated results that humans could not have crafted themselves. This could come in very handy in settings where creating hand-crafted examples is tedious and expensive.
