
Data Behind the Large Language Models (LLM), GPT, and Beyond

Decoding the Data that Powers Large Language Models (LLMs) and GPT, and Looking Ahead to Their Future

Korkrid Kyle Akepanidtaworn
5 min read · Apr 11, 2023

A Brief History of Large Language Models

Large language models are computer programs that generate natural language text from a given input. Language models date back to Claude Shannon, who founded information theory in 1948 with his seminal paper, A Mathematical Theory of Communication. The first large language models, such as Google’s N-gram model and Microsoft’s Web N-gram model, appeared in the late 2000s and early 2010s.

An n-gram is a sequence of n successive items in a text document; the items may be words, numbers, symbols, or punctuation.
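As a minimal sketch of the idea (not taken from the article), the snippet below extracts n-grams from a whitespace-tokenized sentence; the `ngrams` helper and the example sentence are invented for illustration.

```python
def ngrams(tokens, n):
    """Return all sequences of n successive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the cat ate the cheese".split()
print(ngrams(tokens, 2))  # bigrams: ('the', 'cat'), ('cat', 'ate'), ('ate', 'the'), ('the', 'cheese')
print(ngrams(tokens, 3))  # trigrams: ('the', 'cat', 'ate'), ('cat', 'ate', 'the'), ('ate', 'the', 'cheese')
```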

An n-gram model assigns a probability to each candidate next word. These probabilities are computed from the number of times various n-grams (e.g., ate the mouse, ate the cheese) occur in a large corpus of text, and are appropriately smoothed to avoid overfitting.
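To make this concrete, here is a hedged sketch of estimating P(next word | previous word) from corpus counts. The tiny corpus, the bigram context, and the add-one (Laplace) smoothing are assumptions made purely for illustration; the article does not specify a smoothing method, and real n-gram models use far larger corpora, longer contexts, and more sophisticated smoothing.

```python
from collections import Counter

# Toy corpus, invented for illustration.
corpus = "the cat ate the mouse . the dog ate the cheese .".split()

unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigram_counts)

def smoothed_prob(prev_word, word):
    """P(word | prev_word) with add-one smoothing, so unseen bigrams get a small nonzero probability."""
    return (bigram_counts[(prev_word, word)] + 1) / (unigram_counts[prev_word] + vocab_size)

print(smoothed_prob("ate", "the"))    # seen bigram  -> relatively high probability
print(smoothed_prob("ate", "mouse"))  # unseen bigram -> small but nonzero probability
```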

N-gram models are extremely computationally efficient but statistically inefficient. Neural-network language models, on the other hand, are statistically efficient but computationally inefficient.
