Data Behind the Large Language Models (LLM), GPT, and Beyond

Decoding the Data that Powers Large Language Models (LLMs) and GPT, and Looking Ahead to Their Future

Korkrid Kyle Akepanidtaworn
5 min read · Apr 11, 2023

A Brief History of Large Language Models

Large language models are computer programs that generate natural-language text from a given input. Language modeling dates back to Claude Shannon, who founded information theory in 1948 with his seminal paper, A Mathematical Theory of Communication. The first large language models, such as Google’s N-gram model and Microsoft’s Web N-gram model, were developed in the late 2000s and early 2010s.

An n-gram is a sequence of n consecutive items in a text document; the items may be words, numbers, symbols, or punctuation.
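For example, sliding a three-token window over a short sentence yields its trigrams. The snippet below is a minimal illustrative sketch (the ngrams helper is my own naming, not from the article):

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

print(ngrams("the cat ate the mouse".split(), 3))
# [('the', 'cat', 'ate'), ('cat', 'ate', 'the'), ('ate', 'the', 'mouse')]
```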

An n-gram language model assigns a probability to each possible next word given the preceding n − 1 words. These probabilities are computed from the number of times various n-grams (e.g., ate the mouse, ate the cheese) occur in a large corpus of text, and are appropriately smoothed to avoid overfitting.
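As a rough sketch of how such counting and smoothing might work, here is a toy trigram estimator. The function names, the tiny corpus, and the choice of add-one (Laplace) smoothing are illustrative assumptions, not something specified in the article:

```python
from collections import defaultdict

def train_trigram_counts(tokens):
    """Count each trigram and its two-word context in a token list."""
    trigram_counts = defaultdict(int)
    context_counts = defaultdict(int)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        trigram_counts[(w1, w2, w3)] += 1
        context_counts[(w1, w2)] += 1
    return trigram_counts, context_counts

def trigram_probability(w1, w2, w3, trigram_counts, context_counts, vocab_size):
    """Estimate P(w3 | w1, w2) with add-one (Laplace) smoothing,
    so unseen trigrams still receive a small nonzero probability."""
    return (trigram_counts[(w1, w2, w3)] + 1) / (context_counts[(w1, w2)] + vocab_size)

# A toy corpus standing in for the "large corpus of text".
corpus = "the cat ate the mouse and the mouse ate the cheese".split()
tri, ctx = train_trigram_counts(corpus)
vocab_size = len(set(corpus))

print(trigram_probability("ate", "the", "mouse", tri, ctx, vocab_size))    # seen trigram
print(trigram_probability("ate", "the", "cheese", tri, ctx, vocab_size))   # seen trigram
print(trigram_probability("mouse", "ate", "cheese", tri, ctx, vocab_size)) # unseen trigram, still > 0
```

In a real n-gram model the counts would come from a web-scale corpus and a more sophisticated smoothing scheme (e.g., Kneser–Ney) would typically replace add-one smoothing, but the count-normalize-smooth structure is the same.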

N-gram models are extremely computationally efficient but statistically inefficient. Neural-network language models, on the other hand, are statistically efficient but computationally expensive.
