The researchers also use reinforcement learning with a carefully designed reward function to calibrate the confidence estimates: the reward encourages LLMs to produce accurate, high-confidence predictions and penalizes overconfidence in incorrect outputs. Experimental results show that SaySelf effectively reduces confidence calibration error while maintaining task performance, and the generated self-reflective rationales prove reasonable and further improve calibration.
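The paper's exact reward formula is not reproduced here; as a hedged illustration, the sketch below implements one plausible calibration-oriented reward consistent with the description above. It assumes the model states a confidence `c` in [0, 1] alongside its answer and that correctness is known at training time; the function name and the linear scaling are hypothetical choices for this example.

```python
def calibration_reward(confidence: float, is_correct: bool) -> float:
    """Hypothetical reward in the spirit of SaySelf's objective:
    reward confident correct answers, penalize confident errors.

    confidence : model-stated confidence in [0, 1]
    is_correct : whether the sampled answer matched the reference
    """
    if is_correct:
        # Higher stated confidence on a correct answer earns more reward.
        return confidence
    # Overconfidence on a wrong answer is penalized proportionally.
    return -confidence


# Example: a wrong answer stated with 0.9 confidence is penalized
# harder (-0.9) than one stated with 0.2 confidence (-0.2).
print(calibration_reward(0.9, False))  # -0.9
print(calibration_reward(0.8, True))   #  0.8
```

A reward shaped this way makes the expected-reward-maximizing strategy one of matching stated confidence to the actual probability of being correct, which is the calibration behavior the framework targets.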
Key takeaways:
- The paper introduces SaySelf, a training framework that teaches Large Language Models (LLMs) to express more accurate, fine-grained confidence estimates.
- SaySelf also directs LLMs to produce self-reflective rationales that identify gaps in their parametric knowledge and explain their uncertainty.
- The framework uses reinforcement learning with a carefully designed reward function to calibrate the confidence estimates, encouraging LLMs to provide accurate, high-confidence predictions and penalizing overconfidence in incorrect outputs.
- Experimental results show that SaySelf reduces confidence calibration error while maintaining task performance, with the generated self-reflective rationales contributing further to calibration (see the calibration-error sketch after this list).
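Calibration error in this setting is commonly measured with Expected Calibration Error (ECE). The paper's exact metric is not restated here, so the following is a minimal NumPy sketch of the standard binned ECE; the bin count and the toy inputs are assumptions for illustration.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: mean |accuracy - confidence| per bin,
    weighted by the fraction of samples falling in each bin.

    confidences : array of stated confidences in [0, 1]
    correct     : boolean array, True where the answer was right
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        bin_conf = confidences[mask].mean()  # average stated confidence
        bin_acc = correct[mask].mean()       # empirical accuracy in bin
        ece += mask.mean() * abs(bin_acc - bin_conf)
    return ece


# Example: overconfident wrong answers inflate the ECE.
print(expected_calibration_error([0.9, 0.9, 0.8, 0.3],
                                 [True, False, False, True]))
```

Lower ECE means the model's stated confidences track its actual accuracy more closely, which is the sense in which SaySelf "reduces calibration error" in the results above.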