Topic: RLVR without labels from teacher models or humans, to enhance the long-context capability of LLMs
Method: Given a long document (p1, p2, p3, ..., pn)
First, remove a few randomly chosen paragraphs (e.g., p4, p9, p15) and present them as options.
Then train the model to place them back so that the original document is reconstructed in order.
Input:
p1, p2, p3, <missing>, p5, ..., p8, <missing>, ..., p14, <missing>, p16, ...
A. p9, B. p4, C. p15
Ground truth: {B, A, C}
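
A minimal sketch of how one such training example could be constructed (illustrative Python; the function name and output fields are assumptions, not part of the original write-up):

```python
import random

def build_reorder_example(paragraphs, num_missing=3, seed=0):
    """Turn an ordered document (p1, ..., pn) into a fill-back-in-order example."""
    rng = random.Random(seed)
    missing_idx = sorted(rng.sample(range(len(paragraphs)), num_missing))

    # Blank out the sampled paragraphs in the context.
    context = ["<missing>" if i in missing_idx else p
               for i, p in enumerate(paragraphs)]

    # Shuffle the removed paragraphs and label them A, B, C, ...
    options = [(i, paragraphs[i]) for i in missing_idx]
    rng.shuffle(options)
    letters = [chr(ord("A") + k) for k in range(num_missing)]

    # Label: option letters ordered by original position,
    # e.g. ["B", "A", "C"] for the example above.
    order = sorted(range(num_missing), key=lambda k: options[k][0])
    answer = [letters[k] for k in order]

    return {
        "context": context,
        "options": dict(zip(letters, (p for _, p in options))),
        "answer": answer,
    }
```

The verifiable reward can then be, for example, an exact match between the model's predicted order and answer, requiring no teacher model or human label.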
Currently, this shows some positive results on RULER, but they are not significant (we are still trying more data).
Decomposed Prompt Tuning via Low-Rank Reparameterization
Xiao Yao, Xu Lu, Li JiaXi, Lu Wei and Li XiaoLi
EMNLP 2023
Background
- Update a small number of parameters
- Achieve performance comparable to (or even better than) full finetuning
✅ Less storage
✅ Less GPU memory
✅ Less computational overhead (faster training)
Parameter Efficient Finetuning
Background
Prompt Tuning
[Figure: a trainable soft prompt (e × c) is prepended to the input (n tokens) of a frozen PLM (BERT/T5/GPT)]
Observation

The motivation for this investigation is that the soft prompt can flexibly adjust its rank.
Observation

Observation results are consistent.
Method
In practice, c is 100, and e is 512, 768, and 1024 for T5-Small, T5-Base, and T5-Large, respectively. We set b to 10.
e·c ≫ e·b + b·c

Based on the observation above, we directly decompose the soft prompt into the product of two low-rank trainable matrices.
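
A sketch of what this decomposition can look like (a hypothetical PyTorch-style module; the names, initialization, and the (c, e) orientation are illustrative assumptions, not taken from the paper). With c = 100, e = 768, and b = 10, the trainable parameters drop from e·c = 76,800 to e·b + b·c = 8,680:

```python
import torch
import torch.nn as nn

class LowRankSoftPrompt(nn.Module):
    """Soft prompt of shape (c, e) stored as the product of two low-rank
    matrices A (c x b) and B (b x e); only c*b + b*e values are trained."""

    def __init__(self, c: int = 100, e: int = 768, b: int = 10):
        super().__init__()
        self.A = nn.Parameter(torch.randn(c, b) * 0.02)  # (c, b)
        self.B = nn.Parameter(torch.randn(b, e) * 0.02)  # (b, e)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # Materialize the full prompt and prepend it to the (frozen) input embeddings.
        prompt = (self.A @ self.B).unsqueeze(0)               # (1, c, e)
        prompt = prompt.expand(input_embeds.size(0), -1, -1)  # (batch, c, e)
        return torch.cat([prompt, input_embeds], dim=1)       # (batch, c + n, e)
```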
Results

Models: T5-Small, T5-Base, T5-Large
Dataset: SuperGLUE

Few-shot Results

8-shot, 16-shot, and 32-shot
WiC, CB, RTE, and COPA
Our method is consistently better.
Bottleneck


If we increase b to a large value, performance becomes unstable.
Prompt Length



Our method works even when the soft prompt is extremely short.
Same idea
DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
UCL
ICLR 2024
Self-Adaptive In-Context Chains-of-Thought for Enhanced Mathematical Reasoning
Xiao Yao, Xu Lu, Li JiaXi, Lu Wei and Li XiaoLi
EMNLP 2024 under review
Motivation
Recently, researchers have explored self-improvement approaches for LLMs. These works can be classified into two categories:
- update the parameters of the LLM: RFT, DPO
- ICL without parameter updates: use external feedback such as compiler errors or gold labels
Can LLM self-improve without parameter update or external feedback?
Method

Intuition: Why It Works
Similar questions have similar solutions. Although the generated solutions may contain mistakes, the model can reason over these similar solutions to help solve the target question.
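
A minimal sketch of this self-improvement loop (hypothetical Python; the callables stand in for LLM calls, and the selection rule shown is just the "Long" heuristic compared below, not necessarily the paper's final strategy):

```python
from typing import Callable, List, Tuple

def self_adaptive_icl_cot(
    question: str,
    generate_similar_questions: Callable[[str, int], List[str]],
    generate_solution: Callable[[str], str],
    answer_with_demos: Callable[[str, List[Tuple[str, str]]], str],
    num_candidates: int = 8,
    num_demos: int = 4,
) -> str:
    # 1. Generate questions similar to the target question (no external feedback).
    similar = generate_similar_questions(question, num_candidates)

    # 2. Let the model write chain-of-thought solutions for them.
    demos = [(q, generate_solution(q)) for q in similar]

    # 3. Select n self-generated (question, solution) pairs as in-context
    #    demonstrations; here, the "Long" baseline (keep the longest solutions).
    demos = sorted(demos, key=lambda qs: len(qs[1]), reverse=True)[:num_demos]

    # 4. Answer the original question with these demonstrations in context.
    return answer_with_demos(question, demos)
```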

Results

Random: choose n generated samples at random
Long: choose the n longest generated samples
Results

If we feed the generated samples to other models (especially weaker ones), we find significant empirical gains.
Effects of Question Generator

The more powerful the question generator is, the better the performance.
Effects of Generated Solutions' Accuracy



The more accurate the generated solutions are, the better the performance.
Robustness

The performance is relatively stable.
Thank you!