The study also explores how the performance of these large language models scales with the number of in-context examples. Drawing on the concept of regret from online learning, it empirically demonstrates that large language models can achieve sub-linear regret, indicating that they learn more efficiently as the number of examples grows.
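Regret here is the gap between the model's cumulative prediction loss and that of the best fixed predictor chosen in hindsight; sub-linear growth means the average per-example gap shrinks over time. The Python sketch below illustrates how such a check can be run, under assumptions that are not taken from the paper: squared-error loss, a synthetic linear task, and a simple online least-squares learner standing in for the LLM's in-context predictions.

```python
# A minimal sketch of a sub-linear regret check (illustrative assumptions:
# squared-error loss, synthetic linear data, and an online least-squares
# learner as a stand-in for the LLM's in-context prediction at each round).
import numpy as np

rng = np.random.default_rng(0)
T, d = 500, 3
X = rng.normal(size=(T, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=T)

def predict_from_history(X_hist, y_hist, x_new):
    """Stand-in for the LLM: least squares fit on the examples seen so far."""
    if len(y_hist) < d:
        return 0.0  # not enough examples for an informative prediction yet
    w, *_ = np.linalg.lstsq(X_hist, y_hist, rcond=None)
    return float(x_new @ w)

# Cumulative loss of the online learner, predicting each point from its prefix.
online_losses = []
for t in range(T):
    pred = predict_from_history(X[:t], y[:t], X[t])
    online_losses.append((pred - y[t]) ** 2)
cum_online = np.cumsum(online_losses)

# Cumulative loss of the best fixed linear predictor in hindsight.
w_best, *_ = np.linalg.lstsq(X, y, rcond=None)
cum_best = np.cumsum((X @ w_best - y) ** 2)

# Regret after t rounds; a log-log growth exponent below 1 indicates
# sub-linear regret.
regret = cum_online - cum_best
ts = np.arange(50, T)  # skip the earliest, noisiest rounds
slope = np.polyfit(np.log(ts), np.log(np.maximum(regret[ts], 1e-8)), 1)[0]
print(f"final regret: {regret[-1]:.2f}, log-log growth exponent: {slope:.2f}")
```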
Key takeaways:
- The study analyzes how pre-trained large language models such as Llama2, GPT-4, and Claude 3 perform on linear and non-linear regression tasks when given only in-context examples (a sketch of this setup follows the list).
- Several large language models, such as GPT-4 and Claude 3, perform regression with an accuracy that rivals, and sometimes exceeds, traditional supervised methods such as Random Forest, Bagging, and Gradient Boosting.
- On the challenging Friedman #2 regression dataset, Claude 3 outperformed many supervised methods, including AdaBoost, SVM, Random Forest, KNN, and Gradient Boosting.
- The study also investigates how performance scales with the number of in-context examples, showing that these models can achieve sub-linear regret.
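The exact prompt format and decoding settings used in the study are not reproduced here; the sketch below shows one plausible way to serialize (x, y) pairs from scikit-learn's Friedman #2 generator into an in-context prompt, alongside a Gradient Boosting baseline trained on the same examples. The `query_llm` call is a hypothetical placeholder for whichever model API is being evaluated.

```python
# A minimal sketch, assuming scikit-learn's Friedman #2 generator and a
# hypothetical `query_llm(prompt) -> str` wrapper around the model API.
# The prompt format is illustrative, not the one used in the paper.
import numpy as np
from sklearn.datasets import make_friedman2
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

X, y = make_friedman2(n_samples=120, noise=0.1, random_state=0)
X_train, y_train = X[:100], y[:100]
X_test, y_test = X[100:], y[100:]

def build_prompt(X_ctx, y_ctx, x_query):
    """Serialize in-context examples as feature/output lines, then the query."""
    lines = []
    for features, target in zip(X_ctx, y_ctx):
        feats = ", ".join(f"x{i}={v:.2f}" for i, v in enumerate(features))
        lines.append(f"{feats} -> y={target:.2f}")
    feats = ", ".join(f"x{i}={v:.2f}" for i, v in enumerate(x_query))
    lines.append(f"{feats} -> y=")
    return "\n".join(lines)

# LLM predictions would be obtained by sending each prompt to the model and
# parsing the numeric completion, e.g.:
#   preds_llm = [float(query_llm(build_prompt(X_train, y_train, x))) for x in X_test]

# Traditional supervised baseline trained on the same in-context examples.
gbr = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
preds_gbr = gbr.predict(X_test)
print("Gradient Boosting MAE:", mean_absolute_error(y_test, preds_gbr))
print(build_prompt(X_train[:2], y_train[:2], X_test[0]))  # peek at the prompt format
```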