A human preference study on 100 CNN/DailyMail articles found that humans prefer GPT-4 summaries that are denser than those generated by a vanilla prompt and almost as dense as human-written summaries. The study also found a tradeoff between informativeness and readability. The authors have released 500 annotated CoD summaries and an additional 5,000 unannotated summaries on HuggingFace.
Key takeaways:
- The study explores the challenge of selecting the right amount of information for a summary and introduces a prompting method called 'Chain of Density' (CoD).
- CoD prompts GPT-4 to generate an initial entity-sparse summary before iteratively incorporating missing salient entities without increasing the length.
- Summaries generated by CoD are more abstractive, exhibit more fusion, and have less of a lead bias than GPT-4 summaries generated by a vanilla prompt.
- A human preference study found that humans prefer GPT-4 summaries that are denser than those generated by a vanilla prompt and almost as dense as human-written summaries.
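
The densification loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' prompt or code: `complete` is a hypothetical stand-in for any function that sends a prompt to a language model and returns its text, the prompt wording is invented for this example, and the `steps` and `max_words` defaults are assumptions. It also issues one model call per step, which is a simplification. `entity_density` uses a simple entities-per-token ratio with a caller-supplied entity list as a rough proxy for how dense a summary is.

```python
from typing import Callable, List


def entity_density(summary: str, entities: List[str]) -> float:
    """Rough density proxy: entities found in the summary per whitespace token.

    The entity list is supplied by the caller; this sketch does no entity
    extraction of its own and uses case-insensitive substring matching.
    """
    tokens = summary.split()
    if not tokens:
        return 0.0
    lowered = summary.lower()
    present = [e for e in entities if e.lower() in lowered]
    return len(present) / len(tokens)


def chain_of_density(
    article: str,
    complete: Callable[[str], str],  # hypothetical LLM call: prompt in, text out
    steps: int = 5,                  # assumed number of summaries to produce
    max_words: int = 80,             # assumed fixed length budget
) -> List[str]:
    """Return a list of summaries, each asked to be denser than the previous."""
    # Step 1: an initial, deliberately entity-sparse summary.
    first_prompt = (
        f"Article:\n{article}\n\n"
        f"Write an entity-sparse summary of at most {max_words} words."
    )
    summaries = [complete(first_prompt)]

    # Remaining steps: add missing salient entities without growing the length.
    for _ in range(steps - 1):
        rewrite_prompt = (
            f"Article:\n{article}\n\n"
            f"Previous summary:\n{summaries[-1]}\n\n"
            "Identify salient entities from the article that are missing from "
            "the previous summary, then rewrite the summary to include them. "
            f"Keep it at most {max_words} words and do not drop existing entities."
        )
        summaries.append(complete(rewrite_prompt))
    return summaries
```

To use the sketch, pass in whatever function wraps your model of choice as `complete`, then compare `entity_density` across the returned summaries to see whether each step actually packs in more entities at a similar length.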