Parameter-Efficient Fine-Tuning (PEFT) methods aim to reduce the number of tuned parameters when applying Large Language Models (LLMs) to downstream tasks, and they have drawn plenty of attention with the rapid development of LLMs. One of the representative methods is Low-Rank Adaptation (LoRA), which decomposes the incremental weight matrix ∆W ∈ ℝ^{d×d} into low-rank matrices A ∈ ℝ^{r×d} and B ∈ ℝ^{d×r} (where r ≪ d) as follows:
h = W0x + ∆Wx = W0x + BAx.
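For reference, a minimal PyTorch-style sketch of such a LoRA layer is shown below; the class name, initialization scale, and scaling factor are illustrative assumptions rather than details taken from the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight W0 plus a trainable low-rank update BA (illustrative sketch)."""
    def __init__(self, d: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.W0 = nn.Linear(d, d, bias=False)
        self.W0.weight.requires_grad_(False)             # pretrained weight stays frozen
        self.A = nn.Parameter(torch.randn(r, d) * 0.01)  # A ∈ R^{r×d}
        self.B = nn.Parameter(torch.zeros(d, r))         # B ∈ R^{d×r}, zero-initialized
        self.scaling = alpha / r                         # common LoRA scaling convention (assumed)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h = W0 x + B A x
        return self.W0(x) + self.scaling * (x @ self.A.T @ self.B.T)
```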
Despite this progress, LoRA still has shortcomings. First, it lacks a granular consideration of the relative importance and optimal rank allocation within the decomposed matrices A and B. Second, in multi-task fine-tuning scenarios, LoRA fails to account for the varying rank requirements inherent to different tasks.
To address these problems and improve the capability of LoRA-based fine-tuning, Kun Zhang and his team published their research on 15 May 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposed to add more flexibility to the ranks of A and B to improve LoRA-based fine-tuning performance. Specifically, they first explored distinct rank settings of A and B and designed a novel Enhanced Matrix Decomposition for single-task scenarios. By adding an additional matrix, they can assign different ranks to the learned matrices to improve their flexibility as follows:
h = W0x + ∆Wx = W0x + B'TA'x,
where A' ∈ ℝ^{a×d}, B' ∈ ℝ^{d×b}, and T ∈ ℝ^{b×a}. Moreover, since {a, b, r} ≪ d, the proposed strategy does not increase the computational complexity.
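To make the role of the extra matrix T concrete, here is a hedged sketch of that factorization in the same PyTorch style; the class name and the example rank values a and b are assumptions for illustration, not the authors' exact implementation.

```python
class EnhancedLoRALinear(nn.Module):
    """Sketch of a ∆W = B'TA' update with distinct ranks a and b (illustrative)."""
    def __init__(self, d: int, a: int = 4, b: int = 8):
        super().__init__()
        self.W0 = nn.Linear(d, d, bias=False)
        self.W0.weight.requires_grad_(False)             # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(a, d) * 0.01)  # A' ∈ R^{a×d}
        self.T = nn.Parameter(torch.randn(b, a) * 0.01)  # T ∈ R^{b×a} couples the two ranks
        self.B = nn.Parameter(torch.zeros(d, b))         # B' ∈ R^{d×b}

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h = W0 x + B' T A' x; extra parameters are O((a+b)d + ab), still far below d^2
        return self.W0(x) + (x @ self.A.T) @ self.T.T @ self.B.T
```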
For multi-task learning, they treated each rank in the LoRA module as an expert and then used a routing mechanism to select suitable experts for each task to perform computations. Therefore, different tasks can use different parts of the LoRA module to realize fine-tuning, as sketched below. Along this line, the capability of LoRA-based fine-tuning can be enhanced in multi-task learning scenarios.
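The paper's exact routing formulation is not reproduced here, but the idea of treating each rank component as an expert can be sketched roughly as follows; the router design, top-k gating, and task-embedding lookup are assumptions for illustration only.

```python
class RankExpertLoRA(nn.Module):
    """Speculative sketch: each of the r rank components acts as an 'expert',
    and a per-task router gates which components contribute to the update."""
    def __init__(self, d: int, r: int, num_tasks: int, k: int = 2):
        super().__init__()
        self.A = nn.Parameter(torch.randn(r, d) * 0.01)  # shared A ∈ R^{r×d}
        self.B = nn.Parameter(torch.zeros(d, r))         # shared B ∈ R^{d×r}
        self.router = nn.Embedding(num_tasks, r)         # one logit per rank component per task (assumed design)
        self.k = k                                       # number of rank-experts kept per task

    def forward(self, x: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        logits = self.router(task_id)                    # (batch, r) routing scores
        top = logits.topk(self.k, dim=-1)
        gates = torch.zeros_like(logits).scatter(-1, top.indices, top.values.softmax(-1))
        z = x @ self.A.T                                 # (batch, r) rank activations
        return (z * gates) @ self.B.T                    # gated low-rank update ∆W x, to be added to W0 x
```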
DOI: 10.1007/s11704-024-40317-w