
REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration

Introducing REPAIR, the framework that lets your LLM evolve.

Dr. Sarah Kim

CTO & Co-Founder

October 17, 2025
12 min read

Beyond Static AI: Introducing REPAIR, the Framework That Lets Your LLM Evolve

Large Language Models (LLMs) are transforming our digital world with their impressive capabilities. But they have an inherent flaw: their knowledge is frozen in time. Correcting an error or teaching them new information requires costly retraining, a process fraught with the risk of "catastrophic forgetting" and unintended side effects.

In a world that constantly changes, an AI that can't learn and adapt is a depreciating asset. What if we could perform surgical updates to a model's knowledge precisely, safely, and at low cost, without disrupting its existing abilities?

Now, we can. We are thrilled to introduce REPAIR, a revolutionary framework from ContiAl Research designed for the lifelong editing of LLMs.

What is REPAIR?

REPAIR, which stands for Robust Editing via Progressive Adaptive Intervention and Reintegration, is a novel lifelong editing framework. Its purpose is to make model updates precise and affordable while carefully preserving non-target knowledge. Think of it as an intelligent "immune system" for your LLM, one that can integrate new knowledge and heal errors without causing systemic issues.

The Three Pillars of REPAIR's Innovation

REPAIR's remarkable effectiveness stems from its unique, integrated design built on three core strategies:

  1. Closed-Loop Feedback with Dynamic Memory Management: Unlike "fire-and-forget" editing methods, REPAIR employs a closed-loop feedback mechanism that acts as an "Error Monitor". It continuously assesses the performance of each edit. If a knowledge update leads to instability or conflicts, the system selectively prunes or re-initializes the underperforming components, preserving the model's long-term health across large-scale sequential edits (see the first sketch after this list).
  2. Distribution-Aware Optimization: To make edits robust and generalizable, REPAIR goes beyond simple prompt-matching. It groups similar samples and uses "in-batch knowledge distillation" to foster consistency, encouraging the model to learn the underlying concept rather than memorize a specific phrase, so the new knowledge applies across paraphrases and related contexts (second sketch below).
  3. Frequent Knowledge Fusion with Locality Guardrails: To prevent information loss, REPAIR merges new knowledge into the model's existing parameter base more frequently. Critically, before any integration it validates the update with "locality guards" that prevent unintended ripple effects on unrelated knowledge, ensuring that fixing one problem doesn't accidentally create another (third sketch below).
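
To make the closed loop concrete, here is a minimal sketch of what such an error monitor could look like. It assumes each edit lives in its own "slot" with a few probe queries attached; the names (`EditSlot`, `ErrorMonitor`, `answer_fn`) are illustrative stand-ins, not REPAIR's actual interface.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class EditSlot:
    edit_id: int
    probes: list = field(default_factory=list)  # (prompt, expected_answer) pairs
    retrain_count: int = 0                      # times this edit was re-queued

class ErrorMonitor:
    """Closed-loop monitor: re-probe every live edit, prune the drifters."""
    def __init__(self, answer_fn: Callable[[str], str], reliability_floor: float = 0.8):
        self.answer_fn = answer_fn              # queries the currently edited model
        self.floor = reliability_floor
        self.slots: list[EditSlot] = []

    def reliability(self, slot: EditSlot) -> float:
        # Fraction of probe queries the edited model still answers correctly.
        hits = sum(self.answer_fn(p) == y for p, y in slot.probes)
        return hits / max(len(slot.probes), 1)

    def sweep(self) -> list[EditSlot]:
        # One feedback cycle: drop unstable slots and re-queue their edits.
        flagged = [s for s in self.slots if self.reliability(s) < self.floor]
        for s in flagged:
            self.slots.remove(s)
            s.retrain_count += 1
        return flagged                          # caller re-initializes these edits
```

Running `sweep()` after every batch of edits is what turns "fire-and-forget" editing into a feedback loop: unstable edits are caught early instead of silently corroding the model.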
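
The distribution-aware objective can likewise be approximated with a simple in-batch distillation term: samples that paraphrase the same fact are pulled toward their group's average prediction. A PyTorch sketch, with the grouping scheme and temperature as assumptions rather than the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def in_batch_distillation_loss(logits: torch.Tensor,
                               group_ids: torch.Tensor,
                               temperature: float = 2.0) -> torch.Tensor:
    """Pull each sample toward the mean distribution of its paraphrase group.

    logits:    (batch, vocab) predictions at the answer position
    group_ids: (batch,) shared id for paraphrases of the same edited fact
    """
    log_p = F.log_softmax(logits / temperature, dim=-1)
    loss, n_groups = logits.new_zeros(()), 0
    for g in group_ids.unique():
        mask = group_ids == g
        if mask.sum() < 2:
            continue  # a group of one has nothing to agree with
        # The "teacher" is the detached mean distribution of the group.
        teacher = F.softmax(logits[mask] / temperature, dim=-1).mean(0).detach()
        loss = loss + F.kl_div(log_p[mask], teacher.expand_as(log_p[mask]),
                               reduction="batchmean")
        n_groups += 1
    return loss / max(n_groups, 1)
```

Added to the usual editing loss, a term like this rewards agreement across surface forms, nudging the model toward the underlying fact rather than one memorized phrasing.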
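
Finally, a locality guardrail can be as simple as a tentative merge followed by a rollback test on unrelated prompts. Again an illustrative sketch: the delta format and `answer_fn` are hypothetical stand-ins for REPAIR's actual fusion step.

```python
import copy
import torch

@torch.no_grad()
def fuse_with_locality_guard(base_model, deltas, locality_prompts, answer_fn,
                             max_flip_rate: float = 0.02):
    """Merge parameter deltas only if unrelated prompts keep their answers."""
    # Record answers to unrelated (locality) prompts before the merge.
    before = {p: answer_fn(base_model, p) for p in locality_prompts}

    # Apply the proposed fusion to a copy, never to the live model.
    candidate = copy.deepcopy(base_model)
    state = candidate.state_dict()
    for name, delta in deltas.items():
        state[name].add_(delta)

    # Guard: how many unrelated answers did the merge flip?
    flips = sum(answer_fn(candidate, p) != before[p] for p in locality_prompts)
    if flips / max(len(locality_prompts), 1) > max_flip_rate:
        return base_model    # guard tripped: keep the original weights
    return candidate         # guard passed: adopt the fused model
```

Merging into a copy and keeping the original until the guard passes is what makes frequent fusion safe: a bad update costs one rejected candidate, not a corrupted base model.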

The Results Speak for Themselves

REPAIR was rigorously tested on a diverse range of models, including LLaMA-3, Qwen-2.5, GPT-2-XL, and DeepSeek-R1, with outstanding results:

  • Accuracy Boost: Across multiple model families, REPAIR boosts editing accuracy by an impressive 10–30%.
  • Minimized Forgetting: The framework significantly reduces the knowledge forgetting that plagues other methods.
  • Superior Scalability: In large-scale sequential editing scenarios (e.g., 1000 edits), where other methods degrade sharply, REPAIR's dynamic adjustment mechanism preserves robustness and achieves the best overall performance.
  • Effective Hallucination Mitigation: REPAIR is highly effective at reducing model hallucinations while preserving its performance on unrelated queries, striking a crucial balance between correction and stability.

The tables below report results as the number of sequential edits (N) grows. Rel. = Reliability, Gen. = Generalization, Loc. = Locality, OP. = Overall Performance; an ↑ flags the blocks where REPAIR posts the best overall score outright.

N = 1

| Method | Model (Dataset) | Rel. | Gen. | Loc. | OP. |
| --- | --- | --- | --- | --- | --- |
| FT-L | LLaMA-3-8B (ZsRE) | 0.57 | 0.52 | 0.96 | 0.66 |
| FT-EWC | LLaMA-3-8B (ZsRE) | 0.96 | 0.93 | 0.02 | 0.26 |
| MEND | LLaMA-3-8B (ZsRE) | 0.95 | 0.93 | 0.96 | 0.95 |
| ROME | LLaMA-3-8B (ZsRE) | 0.85 | 0.80 | 0.99 | 0.88 |
| MEMIT-M | LLaMA-3-8B (ZsRE) | 0.84 | 0.81 | 0.99 | 0.88 |
| DEFER | LLaMA-3-8B (ZsRE) | 0.68 | 0.58 | 0.56 | 0.61 |
| GRACE | LLaMA-3-8B (ZsRE) | 0.97 | 0.36 | 1.00 | 0.71 |
| WISE | LLaMA-3-8B (ZsRE) | 0.94 | 0.92 | 1.00 | 0.95 |
| REPAIR | LLaMA-3-8B (ZsRE) | 0.94 | 0.92 | 1.00 | 0.95 |
| FT-L | Qwen2.5-7B (ZsRE) | 0.68 | 0.63 | 0.93 | 0.74 |
| FT-EWC | Qwen2.5-7B (ZsRE) | 0.97 | 0.92 | 0.05 | 0.35 |
| MEND | Qwen2.5-7B (ZsRE) | 0.96 | 0.95 | 0.96 | 0.96 |
| ROME | Qwen2.5-7B (ZsRE) | 0.90 | 0.89 | 0.99 | 0.93 |
| MEMIT-M | Qwen2.5-7B (ZsRE) | 0.84 | 0.81 | 0.99 | 0.88 |
| DEFER | Qwen2.5-7B (ZsRE) | 0.74 | 0.67 | 0.88 | 0.76 |
| GRACE | Qwen2.5-7B (ZsRE) | 0.97 | 0.41 | 0.98 | 0.73 |
| WISE | Qwen2.5-7B (ZsRE) | 0.97 | 0.95 | 0.98 | 0.97 |
| REPAIR | Qwen2.5-7B (ZsRE) | 0.98 | 0.95 | 1.00 | 0.98 ↑ |
| FT-L | DeepSeek-R1-1.5B (WikiBigEdit) | 0.71 | 0.68 | 0.93 | 0.77 |
| FT-EWC | DeepSeek-R1-1.5B (WikiBigEdit) | 0.93 | 0.91 | 0.33 | 0.65 |
| MEND | DeepSeek-R1-1.5B (WikiBigEdit) | 0.91 | 0.87 | 0.95 | 0.91 |
| ROME | DeepSeek-R1-1.5B (WikiBigEdit) | 0.86 | 0.83 | 0.97 | 0.88 |
| MEMIT-M | DeepSeek-R1-1.5B (WikiBigEdit) | 0.86 | 0.87 | 0.97 | 0.90 |
| DEFER | DeepSeek-R1-1.5B (WikiBigEdit) | 0.68 | 0.58 | 0.47 | 0.35 |
| GRACE | DeepSeek-R1-1.5B (WikiBigEdit) | 0.96 | 0.47 | 0.99 | 0.76 |
| WISE | DeepSeek-R1-1.5B (WikiBigEdit) | 0.89 | 0.91 | 0.98 | 0.93 |
| REPAIR | DeepSeek-R1-1.5B (WikiBigEdit) | 0.98 | 0.93 | 0.98 | 0.96 ↑ |

N = 30

| Method | Model (Dataset) | Rel. | Gen. | Loc. | OP. |
| --- | --- | --- | --- | --- | --- |
| FT-L | LLaMA-3-8B (ZsRE) | 0.35 | 0.35 | 0.52 | 0.39 |
| FT-EWC | LLaMA-3-8B (ZsRE) | 0.78 | 0.76 | 0.02 | 0.23 |
| MEND | LLaMA-3-8B (ZsRE) | 0.24 | 0.25 | 0.18 | 0.22 |
| ROME | LLaMA-3-8B (ZsRE) | 0.61 | 0.60 | 0.68 | 0.63 |
| MEMIT-M | LLaMA-3-8B (ZsRE) | 0.73 | 0.72 | 0.95 | 0.79 |
| DEFER | LLaMA-3-8B (ZsRE) | 0.65 | 0.47 | 0.36 | 0.49 |
| GRACE | LLaMA-3-8B (ZsRE) | 0.96 | 0.17 | 1.00 | 0.55 |
| WISE | LLaMA-3-8B (ZsRE) | 0.62 | 0.60 | 0.86 | 0.68 |
| REPAIR | LLaMA-3-8B (ZsRE) | 0.93 | 0.90 | 0.87 | 0.89 ↑ |
| FT-L | Qwen2.5-7B (ZsRE) | 0.28 | 0.23 | 0.44 | 0.30 |
| FT-EWC | Qwen2.5-7B (ZsRE) | 0.82 | 0.80 | 0.02 | 0.24 |
| MEND | Qwen2.5-7B (ZsRE) | 0.31 | 0.31 | 0.27 | 0.29 |
| ROME | Qwen2.5-7B (ZsRE) | 0.77 | 0.73 | 0.52 | 0.66 |
| MEMIT-M | Qwen2.5-7B (ZsRE) | 0.73 | 0.72 | 0.95 | 0.79 |
| DEFER | Qwen2.5-7B (ZsRE) | 0.58 | 0.51 | 0.44 | 0.51 |
| GRACE | Qwen2.5-7B (ZsRE) | 0.97 | 0.20 | 1.00 | 0.58 |
| WISE | Qwen2.5-7B (ZsRE) | 0.79 | 0.73 | 0.91 | 0.80 |
| REPAIR | Qwen2.5-7B (ZsRE) | 0.93 | 0.90 | 0.93 | 0.92 ↑ |
| FT-L | DeepSeek-R1-1.5B (WikiBigEdit) | 0.26 | 0.20 | 0.76 | 0.34 |
| FT-EWC | DeepSeek-R1-1.5B (WikiBigEdit) | 0.70 | 0.70 | 0.18 | 0.45 |
| MEND | DeepSeek-R1-1.5B (WikiBigEdit) | 0.43 | 0.38 | 0.10 | 0.25 |
| ROME | DeepSeek-R1-1.5B (WikiBigEdit) | 0.72 | 0.71 | 0.67 | 0.70 |
| MEMIT-M | DeepSeek-R1-1.5B (WikiBigEdit) | 0.78 | 0.77 | 0.82 | 0.79 |
| DEFER | DeepSeek-R1-1.5B (WikiBigEdit) | 0.63 | 0.61 | 0.51 | 0.58 |
| GRACE | DeepSeek-R1-1.5B (WikiBigEdit) | 0.93 | 0.24 | 0.91 | 0.59 |
| WISE | DeepSeek-R1-1.5B (WikiBigEdit) | 0.76 | 0.74 | 0.89 | 0.79 |
| REPAIR | DeepSeek-R1-1.5B (WikiBigEdit) | 0.84 | 0.83 | 0.91 | 0.86 ↑ |

N = 120

| Method | Model (Dataset) | Rel. | Gen. | Loc. | OP. |
| --- | --- | --- | --- | --- | --- |
| FT-L | LLaMA-3-8B (ZsRE) | 0.29 | 0.26 | 0.21 | 0.25 |
| FT-EWC | LLaMA-3-8B (ZsRE) | 0.76 | 0.76 | 0.08 | 0.36 |
| MEND | LLaMA-3-8B (ZsRE) | 0.08 | 0.07 | 0.00 | 0.00 |
| ROME | LLaMA-3-8B (ZsRE) | 0.22 | 0.22 | 0.04 | 0.12 |
| MEMIT-M | LLaMA-3-8B (ZsRE) | 0.70 | 0.65 | 0.82 | 0.72 |
| DEFER | LLaMA-3-8B (ZsRE) | 0.20 | 0.12 | 0.27 | 0.20 |
| GRACE | LLaMA-3-8B (ZsRE) | 0.94 | 0.14 | 1.00 | 0.51 |
| WISE | LLaMA-3-8B (ZsRE) | 0.57 | 0.58 | 0.87 | 0.66 |
| REPAIR | LLaMA-3-8B (ZsRE) | 0.76 | 0.74 | 1.00 | 0.83 ↑ |
| FT-L | Qwen2.5-7B (ZsRE) | 0.13 | 0.11 | 0.10 | 0.11 |
| FT-EWC | Qwen2.5-7B (ZsRE) | 0.71 | 0.69 | 0.05 | 0.29 |
| MEND | Qwen2.5-7B (ZsRE) | 0.15 | 0.14 | 0.03 | 0.09 |
| ROME | Qwen2.5-7B (ZsRE) | 0.31 | 0.28 | 0.03 | 0.14 |
| MEMIT-M | Qwen2.5-7B (ZsRE) | 0.70 | 0.65 | 0.82 | 0.72 |
| DEFER | Qwen2.5-7B (ZsRE) | 0.22 | 0.21 | 0.43 | 0.27 |
| GRACE | Qwen2.5-7B (ZsRE) | 0.95 | 0.08 | 0.98 | 0.42 |
| WISE | Qwen2.5-7B (ZsRE) | 0.59 | 0.57 | 0.92 | 0.68 |
| REPAIR | Qwen2.5-7B (ZsRE) | 0.81 | 0.80 | 0.92 | 0.84 ↑ |
| FT-L | DeepSeek-R1-1.5B (WikiBigEdit) | 0.13 | 0.11 | 0.37 | 0.17 |
| FT-EWC | DeepSeek-R1-1.5B (WikiBigEdit) | 0.42 | 0.41 | 0.07 | 0.23 |
| MEND | DeepSeek-R1-1.5B (WikiBigEdit) | 0.24 | 0.23 | 0.08 | 0.16 |
| ROME | DeepSeek-R1-1.5B (WikiBigEdit) | 0.18 | 0.18 | 0.02 | 0.09 |
| MEMIT-M | DeepSeek-R1-1.5B (WikiBigEdit) | 0.54 | 0.51 | 0.77 | 0.60 |
| DEFER | DeepSeek-R1-1.5B (WikiBigEdit) | 0.17 | 0.15 | 0.33 | 0.20 |
| GRACE | DeepSeek-R1-1.5B (WikiBigEdit) | 0.76 | 0.13 | 0.89 | 0.44 |
| WISE | DeepSeek-R1-1.5B (WikiBigEdit) | 0.64 | 0.65 | 0.83 | 0.70 |
| REPAIR | DeepSeek-R1-1.5B (WikiBigEdit) | 0.71 | 0.69 | 0.90 | 0.76 ↑ |

N = 1000

| Method | Model (Dataset) | Rel. | Gen. | Loc. | OP. |
| --- | --- | --- | --- | --- | --- |
| FT-L | LLaMA-3-8B (ZsRE) | 0.19 | 0.15 | 0.02 | 0.08 |
| FT-EWC | LLaMA-3-8B (ZsRE) | 0.69 | 0.67 | 0.08 | 0.33 |
| MEND | LLaMA-3-8B (ZsRE) | 0.00 | 0.00 | 0.00 | 0.00 |
| ROME | LLaMA-3-8B (ZsRE) | 0.01 | 0.01 | 0.01 | 0.01 |
| MEMIT-M | LLaMA-3-8B (ZsRE) | 0.63 | 0.63 | 0.62 | 0.63 |
| DEFER | LLaMA-3-8B (ZsRE) | 0.03 | 0.03 | 0.74 | 0.27 |
| GRACE | LLaMA-3-8B (ZsRE) | 0.93 | 0.08 | 1.00 | 0.42 |
| WISE | LLaMA-3-8B (ZsRE) | 0.45 | 0.44 | 0.51 | 0.47 |
| REPAIR | LLaMA-3-8B (ZsRE) | 0.68 | 0.65 | 0.89 | 0.73 ↑ |
| FT-L | Qwen2.5-7B (ZsRE) | 0.08 | 0.06 | 0.02 | 0.05 |
| FT-EWC | Qwen2.5-7B (ZsRE) | 0.58 | 0.56 | 0.03 | 0.21 |
| MEND | Qwen2.5-7B (ZsRE) | 0.02 | 0.02 | 0.00 | 0.00 |
| ROME | Qwen2.5-7B (ZsRE) | 0.01 | 0.02 | 0.00 | 0.00 |
| MEMIT-M | Qwen2.5-7B (ZsRE) | 0.52 | 0.51 | 0.57 | 0.53 |
| DEFER | Qwen2.5-7B (ZsRE) | 0.14 | 0.08 | 0.25 | 0.14 |
| GRACE | Qwen2.5-7B (ZsRE) | 0.94 | 0.02 | 1.00 | 0.27 |
| WISE | Qwen2.5-7B (ZsRE) | 0.44 | 0.41 | 0.72 | 0.51 |
| REPAIR | Qwen2.5-7B (ZsRE) | 0.72 | 0.70 | 0.67 | 0.69 ↑ |
| FT-L | DeepSeek-R1-1.5B (WikiBigEdit) | 0.02 | 0.02 | 0.08 | 0.03 |
| FT-EWC | DeepSeek-R1-1.5B (WikiBigEdit) | 0.18 | 0.15 | 0.02 | 0.08 |
| MEND | DeepSeek-R1-1.5B (WikiBigEdit) | 0.03 | 0.03 | 0.02 | 0.05 |
| ROME | DeepSeek-R1-1.5B (WikiBigEdit) | 0.01 | 0.00 | 0.01 | 0.00 |
| MEMIT-M | DeepSeek-R1-1.5B (WikiBigEdit) | 0.38 | 0.38 | 0.62 | 0.45 |
| DEFER | DeepSeek-R1-1.5B (WikiBigEdit) | 0.07 | 0.07 | 0.12 | 0.08 |
| GRACE | DeepSeek-R1-1.5B (WikiBigEdit) | 0.63 | 0.07 | 0.81 | 0.33 |
| WISE | DeepSeek-R1-1.5B (WikiBigEdit) | 0.47 | 0.38 | 0.61 | 0.48 |
| REPAIR | DeepSeek-R1-1.5B (WikiBigEdit) | 0.58 | 0.54 | 0.81 | 0.63 ↑ |

Comparative results show REPAIR consistently achieving a higher Overall Performance (OP) across different models and edit scales, demonstrating its superior balance of Reliability, Generalization, and Locality.
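
A note for readers skimming the tables: the OP column behaves multiplicatively. Most rows are consistent with a geometric mean of the three metrics, e.g. FT-EWC on LLaMA-3-8B: (0.96 · 0.93 · 0.02)^(1/3) ≈ 0.26, which is exactly why collapsing on a single axis (FT-EWC on Locality, GRACE on Generalization) sinks the overall score. A quick check, assuming that definition (the paper may compute OP differently):

```python
def overall_performance(rel: float, gen: float, loc: float) -> float:
    """Geometric mean of Reliability, Generalization, and Locality: a
    near-zero score on any one axis drags the aggregate toward zero."""
    return (rel * gen * loc) ** (1 / 3)

# FT-EWC on LLaMA-3-8B (N = 1): strong Rel./Gen., collapsed Loc.
print(round(overall_performance(0.96, 0.93, 0.02), 2))  # 0.26
# REPAIR on Qwen2.5-7B (N = 1): balanced on all three axes.
print(round(overall_performance(0.98, 0.95, 1.00), 2))  # 0.98
```

An arithmetic mean would have scored FT-EWC at about 0.64 and hidden the locality collapse entirely.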

The Path Forward

REPAIR introduces a robust and scalable framework for developing the next generation of reliable and continually evolving LLMs. It is a critical step towards creating truly dynamic AI systems that can learn, adapt, and stay relevant over time. This work opens up new possibilities for anyone looking to maintain and enhance the capabilities of large-scale models in a safe and efficient manner.

Ready to dive deeper into the technical details? Read the full paper on arXiv: arXiv:2510.01879v1.

Tags: REPAIR, Model Editing, Lifelong Learning, Catastrophic Forgetting, Continual Learning, Closed-Loop Feedback, Hallucination Mitigation