
Researchers find that retraining only small parts of AI models can cut costs and prevent forgetting

Fine-tuning a large language model (LLM) on company data is an effective way to make it fit for purpose, but it can also cause the model to lose some of its capabilities. Companies often notice this when they refine models: after tuning, some models “forget” how to perform tasks they had already learned.

Research from the University of Illinois Urbana-Champaign proposes a new method for retraining models that avoids “catastrophic forgetting,” in which the model loses some of its prior knowledge. The paper focuses on two vision-language models that generate responses from images: LLaVA and Qwen 2.5-VL.

The approach lets enterprises retrain only limited portions of an LLM, avoiding the significant computational cost of retraining the entire model. The team claims that catastrophic forgetting is not true memory loss, but rather a side effect of bias drift.

“Training a new LMM can take millions of dollars, weeks of time, and emit hundreds of tons of CO2, so finding ways to update existing models more efficiently and effectively is a pressing concern,” the team wrote in the paper. “Guided by this result, we explore tuning recipes that preserve learning while limiting output shift.”

The researchers focused on the multi-layer perceptron (MLP), the feed-forward block inside each transformer layer of the model.
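To make the terminology concrete, here is a minimal PyTorch sketch of the MLP block as it appears in LLaMA/Qwen-style transformer layers. The class, sizes, and projection names (`gate_proj`, `up_proj`, `down_proj`) are illustrative, mirroring common open-source implementations rather than the paper's code:

```python
import torch
import torch.nn as nn

class TransformerMLP(nn.Module):
    """Feed-forward (MLP) block of a transformer layer: gate and up
    projections expand the hidden state, SiLU gating combines them,
    and the down projection maps the result back to hidden size."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(self.act(self.gate_proj(x)) * self.up_proj(x))

mlp = TransformerMLP(hidden_size=64, intermediate_size=256)
out = mlp(torch.randn(2, 8, 64))
print(out.shape)  # torch.Size([2, 8, 64])
```

The down projection is the last matrix the hidden state passes through in this block, which is why (as discussed below) the paper singles it out as the component that most directly shapes the output distribution.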

Catastrophic forgetting

The researchers first wanted to verify the existence and cause of catastrophic forgetting in models.

To do this, they created a series of target tasks for the models to complete. The models were then fine-tuned on each task and evaluated to determine whether the tuning led to substantial forgetting. But as the process continued, the researchers found that the models regained some of their capabilities.


“We also noted a surprising result: the model’s performance would drop significantly on existing benchmarks after training on the counting task, but would largely recover after training on PathVQA, another specialized task that is not well represented in the benchmarks,” they said. “Meanwhile, while running the forgetting mitigation experiments, we also tried tuning just the self-attention projection (SA Proj) or MLP layers separately, motivated by the finding that tuning the LLM alone was generally better than tuning the full model. This led to another very surprising result: that tuning just the self-attention projection layers led to very good learning of the target tasks without a decline in performance on held-out tasks, even after training on all five target tasks in a sequence.”
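The selective-tuning recipe described in the quote can be sketched by freezing every parameter except the self-attention projection matrices. The name patterns below (`q_proj`, `k_proj`, `v_proj`, `o_proj`) are assumptions based on common Hugging Face transformer layouts, not the paper's actual code:

```python
import torch.nn as nn

# Assumed name patterns for the self-attention projection matrices,
# following common Hugging Face transformer implementations.
SA_PROJ_KEYS = ("q_proj", "k_proj", "v_proj", "o_proj")

def tune_sa_proj_only(model: nn.Module) -> list[str]:
    """Freeze all parameters, then re-enable gradients only for the
    self-attention projections. Returns the trainable parameter names."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = any(key in name for key in SA_PROJ_KEYS)
        if param.requires_grad:
            trainable.append(name)
    return trainable

# Toy stand-in for one transformer layer's weights.
layer = nn.ModuleDict({
    "q_proj": nn.Linear(8, 8, bias=False),
    "k_proj": nn.Linear(8, 8, bias=False),
    "v_proj": nn.Linear(8, 8, bias=False),
    "o_proj": nn.Linear(8, 8, bias=False),
    "gate_proj": nn.Linear(8, 32, bias=False),
})
print(tune_sa_proj_only(layer))
# ['q_proj.weight', 'k_proj.weight', 'v_proj.weight', 'o_proj.weight']
```

An optimizer built from the surviving parameters (`filter(lambda p: p.requires_grad, model.parameters())`) then updates only the SA Proj matrices during fine-tuning.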

The researchers said they believe that “what appears to be forgetting or interference after tuning to a specific target task is in fact a bias in the output distribution due to the shift in the division of labor.”

Narrow retraining

That finding turned out to be key to the experiment. The researchers noted that tuning the MLP increased the likelihood of the model outputting numeric tokens, with a highly correlated decrease in accuracy on held-out tasks. This showed that what looks like a model forgetting some of its knowledge is a temporary output bias, not a long-term loss.

“To avoid affecting the output distribution, we tune only the up/gate MLP projections while keeping the down projection frozen, and find that this achieves similar learning to full MLP tuning with little forgetting,” the researchers said.
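Under the same assumed naming convention as above, that recipe — tune the MLP's up and gate projections, keep its down projection frozen — might look like this sketch (again illustrative, not the paper's code):

```python
import torch.nn as nn

def tune_mlp_up_gate_only(model: nn.Module) -> list[str]:
    """Enable gradients only for the MLP up/gate projections; the down
    projection (and everything else) stays frozen so the output
    distribution is less disturbed. Name patterns are assumptions
    mirroring common open-source LLM implementations."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = ("up_proj" in name) or ("gate_proj" in name)
        if param.requires_grad:
            trainable.append(name)
    return trainable

# Toy stand-in for one layer's MLP weights.
mlp = nn.ModuleDict({
    "gate_proj": nn.Linear(16, 64, bias=False),
    "up_proj": nn.Linear(16, 64, bias=False),
    "down_proj": nn.Linear(64, 16, bias=False),
})
print(tune_mlp_up_gate_only(mlp))
# ['gate_proj.weight', 'up_proj.weight']
```

Freezing the down projection is the design choice doing the work here: it is the matrix that writes back into the residual stream, so keeping it fixed limits how far the model's output distribution can drift.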

This allows for a simpler and more reproducible method for refining a model.


By focusing on a narrow segment of the model rather than retraining it at scale, companies can reduce computing costs and gain better control over output drift.

However, the research covers only two models, both vision-language models. The researchers noted that, due to limited resources, they could not run the experiment on other models.

Even so, they believe their findings can extend to other LLMs, including models with different modalities.
