To address this, the authors propose a new contamination detection method called the LLM decontaminator. For each test case, the method first uses embedding similarity search to retrieve the top-k most similar items from the training set, then prompts an advanced LLM to judge whether each of the k resulting pairs is a rephrasing of the same content. The LLM decontaminator was found to be more effective at flagging rephrased samples than existing methods. The authors encourage the community to adopt stronger decontamination tools and develop fresh one-time exams to accurately evaluate LLMs.
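The two-stage pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embedding model is abstracted into precomputed vectors, and `judge` stands in for the LLM call (in the paper's pipeline, a prompt to an advanced LLM such as GPT-4 asking whether a pair is a rephrasing). All function names here are hypothetical.

```python
import numpy as np

def top_k_similar(test_emb, train_embs, k=3):
    # Stage 1: cosine similarity between one test embedding and all
    # training embeddings; return indices of the k closest training items.
    test_n = test_emb / np.linalg.norm(test_emb)
    train_n = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    sims = train_n @ test_n
    return np.argsort(sims)[::-1][:k]

def detect_contamination(test_cases, test_embs, train_items, train_embs,
                         judge, k=3):
    # Stage 2: for each test case, ask the judge (an LLM in the paper's
    # method; any callable here) whether each top-k pair is a rephrasing.
    flagged = set()
    for i, emb in enumerate(test_embs):
        for j in top_k_similar(emb, train_embs, k):
            if judge(test_cases[i], train_items[j]):
                flagged.add(int(j))
    return flagged
```

Keeping k small matters in practice: the expensive LLM judgment runs only on the k candidates per test case that the cheap similarity search surfaces, rather than on the full cross-product of test and training items.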
Key takeaways:
- The authors introduced Llama-rephraser, a 13B model that reaches GPT-4-level performance on major benchmarks simply by training on rephrased test samples or on test samples translated into another language.
- They proposed a new contamination detection method called "LLM decontaminator" which uses embedding similarity search and an advanced LLM to identify and remove rephrased samples from the training set.
- The LLM decontaminator was found to detect rephrased contamination more effectively than existing methods such as n-gram overlap and embedding similarity search.
- The authors applied the LLM decontaminator to real-world datasets and found a significant number of rephrased samples, suggesting that contamination may be more prevalent than previously thought.