Name: Xiao Yao (PhD 1.5Y)

Education:  2013-2017 Tianjin U

                 2019-2022 Shanghai JiaoTong U

                 2023-Now  SUTD

                 Supervisors: Lu Wei, Li Xiaoli

Working Experience: Ximalaya, Meituan

Current interests: LLM reasoning, alignment, MoE

 

Date: 0703

Decomposed Prompt Tuning via Low-Rank Reparameterization

Xiao Yao, Xu Lu, Li JiaXi, Lu Wei and Li XiaoLi

EMNLP 2023

 

Background

  • Update only a small number of parameters
  • Achieve performance comparable to (or even better than) full fine-tuning

✅   Less storage

✅   Less GPU memory

✅   Less computational overhead (faster training)

Parameter-Efficient Fine-tuning

Background

Prompt Tuning

[Figure: prompt tuning. A trainable soft prompt (e × c) is prepended to the frozen input embeddings x_1 … x_n and fed into a frozen PLM (BERT/T5/GPT); only the soft prompt is updated.]
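A minimal PyTorch sketch of this setup (my illustration; the class and names are not from the paper): only the soft prompt is trainable, and it is prepended to the input embeddings of the frozen PLM.

import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    # Vanilla prompt tuning: the deck writes the prompt as e x c
    # (embedding dim x prompt length); here it is stored sequence-first
    # as (c, e), which is the usual PyTorch convention.
    def __init__(self, prompt_len: int = 100, embed_dim: int = 512):
        super().__init__()
        # The only trainable parameters; the PLM itself stays frozen.
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, n, e) -> output: (batch, c + n, e)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)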

Observation

U \in \mathbb{R}^{e \times e}, \quad V \in \mathbb{R}^{c \times c}, \quad \Sigma \in \mathbb{R}^{e \times c} \text{ is a diagonal matrix}
P = U \Sigma V
P = U\,\mathrm{ReLU}(\Sigma)\,V

The motivation of this reparameterization is to let the soft prompt flexibly adjust its own rank: ReLU zeroes out the negative entries of Σ, lowering the effective rank of P.
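A PyTorch rendering of the formulas above (my sketch; the paper's actual implementation may differ): Σ is stored as a vector placed on the diagonal, and ReLU can push entries to zero, shrinking the effective rank of P during training.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RankAdaptivePrompt(nn.Module):
    # P = U ReLU(Σ) V with Σ diagonal; ReLU zeroes negative diagonal
    # entries, so the prompt can lower its own effective rank.
    def __init__(self, e: int = 512, c: int = 100):
        super().__init__()
        self.U = nn.Parameter(torch.randn(e, e) * 0.02)    # e x e
        self.sigma = nn.Parameter(torch.randn(min(e, c)))  # diagonal entries of Σ
        self.V = nn.Parameter(torch.randn(c, c) * 0.02)    # c x c

    def forward(self) -> torch.Tensor:
        e, c = self.U.size(0), self.V.size(0)
        Sigma = torch.zeros(e, c, device=self.sigma.device)
        k = self.sigma.numel()
        Sigma[:k, :k] = torch.diag(F.relu(self.sigma))
        # Effective rank can be read off as (F.relu(self.sigma) > 0).sum().
        return self.U @ Sigma @ self.V  # e x c soft prompt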

Observation

The observation is consistent across settings: the trained soft prompt converges to a low effective rank.

Method

P = AB
A \in \mathbb{R}^{e \times b}, \quad B \in \mathbb{R}^{b \times c}, \quad b \ll \min(e, c)

In practice, c is 100, and e is 512, 768, and 1024 for T5-Small, T5-Base, and T5-Large, respectively. We set b to 10.

ec \gg eb + bc

Based on the observation above, we directly decompose the soft prompt into two low-rank trainable matrices.
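A minimal sketch of the decomposition (names are mine). With e = 512, c = 100, b = 10: the full prompt has ec = 51,200 trainable values, while A and B together have eb + bc = 5,120 + 1,000 = 6,120, roughly 8x fewer.

import torch
import torch.nn as nn

class DecomposedPrompt(nn.Module):
    # Soft prompt parameterized as P = A @ B, with rank at most b.
    def __init__(self, e: int = 512, c: int = 100, b: int = 10):
        super().__init__()
        self.A = nn.Parameter(torch.randn(e, b) * 0.02)  # e x b
        self.B = nn.Parameter(torch.randn(b, c) * 0.02)  # b x c

    def forward(self) -> torch.Tensor:
        return self.A @ self.B  # e x c

prompt = DecomposedPrompt()
print(sum(p.numel() for p in prompt.parameters()))  # 6120, vs. 51200 for a full 512 x 100 prompt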

Results

Model: T5-Small, T5-Base, T5-Large

Datasets: SuperGLUE

Few-shot Results

8-shot, 16-shot, and 32-shot

WiC, CB, RTE, and COPA

Our method is consistently better across these settings.

Bottleneck

If we increase b to a large value, performance becomes unstable.

Prompt Length

Our method works even when the soft prompt is extremely short.

Same idea

DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning

UCL

ICLR 2024

Self-Adaptive In-Context Chains-of-Thought for Enhanced Mathematical Reasoning

Xiao Yao, Xu Lu, Li JiaXi, Lu Wei and Li XiaoLi

EMNLP 2024 (under review)

Motivation

Recently, researchers have explored self-improvement approaches for LLMs. These works can be classified into two categories:

  • Update the parameters of the LLM: RFT, DPO
  • ICL without parameter updates: use external feedback such as compiler errors or gold labels

 

Can an LLM self-improve without parameter updates or external feedback?

Method

Intuition: Why It Works

Similar questions have similar solutions. Although a generated solution may contain mistakes, the model can reason over the solutions to similar questions to help solve the target question.
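A rough sketch of the pipeline as the slides describe it (the llm helper and prompt wording are placeholders, not the paper's implementation): the model generates similar questions, produces CoT solutions for them, and uses those pairs as in-context demonstrations.

def llm(prompt: str) -> str:
    # Placeholder for any text-generation call; not an API from the paper.
    raise NotImplementedError

def self_adaptive_cot(question: str, n: int = 4) -> str:
    # 1. The model writes questions similar to the target question.
    similar = [llm("Write a math question similar to:\n" + question)
               for _ in range(n)]
    # 2. The model solves each generated question step by step (CoT).
    #    These solutions may contain mistakes; the results below compare
    #    selection strategies (Random vs. Longest) for picking n of them.
    demos = []
    for q in similar:
        solution = llm("Solve step by step:\n" + q)
        demos.append("Q: " + q + "\nA: " + solution)
    # 3. Use the generated (question, solution) pairs as in-context
    #    demonstrations for the target question.
    context = "\n\n".join(demos)
    return llm(context + "\n\nQ: " + question + "\nA: Let's think step by step.")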

Results

Random: choose n generated samples randomly

Long: choose the n longest generated samples

Results

If we feed the generated samples to other models (especially weaker ones), we observe significant empirical gains.

Effects of Question Generator

The more powerful the question generator is, the better the performance.

Effects of Generated Solutions' Accuracy

The more accurate the generated solution is, the better the performance.

Robustness

The performance is relatively stable.

Thank you!
