Analyzing the Impact of Context Length for Large Language Models

Supervised by: Corina Masanti

If you are interested in this topic or have further questions, do not hesitate to contact me.

Context/Background/Current State

The length of the input sequence can affect the performance of Large Language Models (LLMs) in terms of both accuracy and processing time. The goal of this thesis is to investigate the effect of input sequence length on LLMs and to compare the accuracy and efficiency of document-wise and sentence-wise processing approaches in the context of automatic error detection and correction in text documents.

Goal(s)

  • Quantitatively evaluate the influence of input sequence length on the accuracy and processing time of LLMs for automatic error detection and correction in text documents.
  • Compare the performance of LLMs when processing entire documents versus individual sentences for the same task.

Approach

Design experiments for automatic error detection and correction with input sequences of different lengths. Evaluate accuracy and processing time for document-wise and sentence-wise processing. Analyze the trade-offs between accuracy and efficiency as a function of input sequence length.
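
As a rough illustration, a minimal experiment harness could compare the two processing modes and record wall-clock time. This is only a sketch: the dummy_correct function below is a hypothetical placeholder for the actual LLM call, and the sentence splitting is deliberately naive.

    import time
    from typing import Callable, Tuple

    def correct_document_wise(correct: Callable[[str], str], document: str) -> Tuple[str, float]:
        """Run the correction model once on the whole document and measure wall-clock time."""
        start = time.perf_counter()
        output = correct(document)
        return output, time.perf_counter() - start

    def correct_sentence_wise(correct: Callable[[str], str], document: str) -> Tuple[str, float]:
        """Correct each sentence separately and measure the total wall-clock time."""
        # Naive sentence splitting; a real experiment would use a proper sentence segmenter.
        sentences = [s.strip() for s in document.split(".") if s.strip()]
        start = time.perf_counter()
        corrected = [correct(s) for s in sentences]
        return ". ".join(corrected) + ".", time.perf_counter() - start

    if __name__ == "__main__":
        # Placeholder for the actual LLM call (e.g., a locally hosted or API-based model).
        def dummy_correct(text: str) -> str:
            return text  # identity "correction", used only to exercise the harness

        doc = "Ths is a first sentense. Here is anoter one. And a thrid."
        _, doc_time = correct_document_wise(dummy_correct, doc)
        _, sent_time = correct_sentence_wise(dummy_correct, doc)
        print(f"document-wise: {doc_time:.6f}s, sentence-wise: {sent_time:.6f}s")

In the actual thesis, the corrected output would additionally be scored against a gold-standard reference (e.g., per-error precision and recall) so that accuracy and processing time can be related to input sequence length.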

Required Skills

  • Good programming skills.
  • Good understanding of machine learning concepts.

Remarks

Further Reading

  • Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017).
  • Shi, Freda, et al. "Large language models can be easily distracted by irrelevant context." International Conference on Machine Learning, PMLR (2023).