Hesitation in LLMs
Background
Idea: Reinforced Hesitation
\text{Reward} = \begin{cases} 1 & \text{Correct Answer} \\ 0 & \text{Hesitation} \\ -x & \text{Wrong Answer} \end{cases}
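As a rough illustration of the reward above, here is a minimal Python sketch. The names `hesitation_reward`, `answer_threshold`, the outcome labels, and the `penalty` parameter (standing in for x) are hypothetical and not taken from the slides; the sketch also assumes x is non-negative. The second helper shows one consequence of the reward: if the model's answer is correct with probability p, answering has expected reward p - (1 - p)x, which beats the hesitation reward of 0 only when p > x / (1 + x).

```python
def hesitation_reward(outcome: str, penalty: float) -> float:
    """Return the reward for one response.

    outcome: "correct", "hesitation", or "wrong" (hypothetical labels).
    penalty: the x in the formula; assumed here to be non-negative.
    """
    if penalty < 0:
        raise ValueError("penalty is assumed to be non-negative")
    rewards = {
        "correct": 1.0,      # correct answer
        "hesitation": 0.0,   # model declines to answer
        "wrong": -penalty,   # wrong answer costs x
    }
    if outcome not in rewards:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return rewards[outcome]


def answer_threshold(penalty: float) -> float:
    """Confidence p above which answering beats hesitating in expectation.

    Solves p * 1 + (1 - p) * (-penalty) > 0 for p, giving p > x / (1 + x).
    """
    if penalty < 0:
        raise ValueError("penalty is assumed to be non-negative")
    return penalty / (1.0 + penalty)
```

Under these assumptions, a larger penalty raises the confidence the model needs before answering is worthwhile: `answer_threshold(1.0)` gives 0.5, and the threshold approaches 1 as the penalty grows.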
Results: Reinforced Hesitation
Ideas: Boosting And Cascading