Large Language Model (LLM)

A Large Language Model (LLM) is a type of artificial intelligence model trained on vast amounts of text data to understand and generate human-like text. By learning to predict the next token in a sequence, these models acquire the ability to perform a wide range of language tasks.
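
In practice, an LLM generates text by repeatedly predicting the next token given everything it has seen so far. The minimal sketch below, which assumes the Hugging Face transformers library is installed and uses the small gpt2 model purely for illustration, shows this behaviour through the high-level text-generation pipeline; any causal language model would be used the same way.

```python
# Minimal sketch: generating text with an open causal language model.
# Assumes the Hugging Face `transformers` library; "gpt2" is used only
# because it is small, not because it is representative of modern LLMs.
from transformers import pipeline

# Build a text-generation pipeline around a small causal language model.
generator = pipeline("text-generation", model="gpt2")

# The model continues the prompt by predicting one token at a time.
result = generator("A large language model is", max_new_tokens=30)
print(result[0]["generated_text"])
```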

Key Characteristics

  • Scale: LLMs typically have billions or trillions of parameters
  • Training Data: Trained on massive datasets of text from the internet, books, articles, and other sources
  • Architecture: Usually based on the transformer architecture, whose core operation is the attention mechanism (a minimal sketch follows this list)
  • Capabilities: Can generate text, answer questions, summarize content, translate languages, write code, and more
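
To make the transformer/attention bullet concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside transformer layers. It is an illustration rather than the code of any particular model; the shapes and variable names are chosen for clarity, and real LLMs run many attention heads in parallel over learned projections.

```python
# Minimal sketch of scaled dot-product attention (illustrative only).
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (sequence_length, head_dim)."""
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled for numerical stability.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys turns scores into attention weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted average of the value vectors.
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 tokens, 8-dimensional head
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```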

Examples of LLMs

  • GPT-4 (OpenAI)
  • Claude (Anthropic)
  • Llama 2 (Meta)
  • Mistral (Mistral AI)
  • PaLM (Google)
  • Gemini (Google)

Limitations

  • Knowledge Cutoff: LLMs have no knowledge of events after their training data cutoff date
  • Hallucinations: May generate plausible-sounding but incorrect information
  • Context Window: Can only process a limited amount of text (measured in tokens) at once (see the token-counting sketch after this list)
  • Bias: May reflect biases present in their training data
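
The context-window limitation is easiest to see in terms of tokens rather than characters. The sketch below, which assumes the tiktoken tokenizer library and uses a hypothetical 8,192-token limit purely for illustration, checks whether a prompt still leaves room for the model's reply.

```python
# Minimal sketch of why the context window matters: text is measured in
# tokens, and anything beyond the window must be truncated or summarized.
# Assumes the `tiktoken` library; the 8,192-token limit is an illustrative
# value, not the limit of any specific model.
import tiktoken

CONTEXT_WINDOW = 8192  # hypothetical limit, for illustration only

encoding = tiktoken.get_encoding("cl100k_base")

def fits_in_context(prompt: str, reserved_for_reply: int = 512) -> bool:
    """Check whether a prompt leaves room for the model's reply."""
    prompt_tokens = len(encoding.encode(prompt))
    return prompt_tokens + reserved_for_reply <= CONTEXT_WINDOW

print(fits_in_context("Summarize the following document: ..."))  # True
```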