The research highlights a correlation between the diversity of counterfactual training instances and gains in model performance on extractive question answering. The study introduces two generation techniques, Solo-QAG and Duo-QAG, for producing diverse, relevant counterfactual instances efficiently, and finds that models trained on counterfactual augmented data yield better-calibrated prediction probabilities.
Key takeaways:
- Researchers from the Ubiquitous Knowledge Processing Lab at the Technical University of Darmstadt have developed an approach to improve the performance of smaller language models in extractive question answering by using counterfactual instances for training.
- The research indicates that the diversity of counterfactual instances, in terms of surface form and semantic content, is crucial for achieving robustness against spurious correlations and bridging data distribution gaps.
- The team introduced two techniques, Solo-QAG and Duo-QAG, for efficient counterfactual generation; combined with filtering steps for quality assurance, these ensure high-quality, diverse, and relevant counterfactual instances.
- Models trained with counterfactual augmented data show better-calibrated prediction probabilities, improving their reliability in real-world scenarios. The research also found that rationale-augmented calibrator models prefer concise explanations over comprehensive ones.
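To make the calibration claim concrete, the sketch below computes expected calibration error (ECE), a standard metric for comparing how well a model's confidence matches its accuracy; this is an illustrative implementation with toy data, not code from the study, and the bin count and predictions are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Average |accuracy - confidence| per confidence bin, weighted by bin size.

    A lower ECE means the model's predicted probabilities track its actual
    accuracy more closely, i.e. it is better calibrated.
    """
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # Half-open bins (lo, hi]; put an exact 0.0 confidence in the first bin.
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == 0.0)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        ece += (len(idx) / n) * abs(accuracy - avg_conf)
    return ece

# Toy example: confidence of each predicted answer span and whether it matched
# the gold answer (hypothetical values for illustration only).
confs = [0.95, 0.90, 0.80, 0.60, 0.55]
hits = [1, 1, 0, 1, 0]
print(round(expected_calibration_error(confs, hits), 3))  # → 0.22
```

In this setup, a model trained with counterfactual augmented data would be expected to score a lower ECE than its unaugmented counterpart on the same evaluation set.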