How does ChatGPT work? As explained by the ChatGPT team.

The article explains how ChatGPT, a large language model developed by OpenAI, works under the hood. It provides insights from Evan Morikawa, the lead of the Applied engineering team at OpenAI that created ChatGPT.

[01] How ChatGPT Works

1. What are the key steps involved in how ChatGPT generates responses?

  • The input text is first split into tokens (chunks of text that are often whole words or word fragments)
  • These tokens are then converted into numerical embeddings, which capture semantic relationships between words
  • The embeddings are multiplied by hundreds of billions of model weights to produce a probability distribution over possible next tokens
  • The model then samples the next token based on this probability distribution, and the process repeats to generate the full response
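
The four steps above can be sketched as a toy loop. This is only an illustration under heavy assumptions, not ChatGPT's implementation: the six-word vocabulary, the whitespace tokenizer, the random `EMBEDDINGS` and `WEIGHTS` tables, and every function name are hypothetical stand-ins, and the toy looks only at the most recent token, whereas a real model conditions on the whole sequence.

```python
import math
import random

# Hypothetical toy vocabulary; real tokenizers use far larger sub-word vocabularies.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}
EMBED_DIM = 4

rng = random.Random(0)
# Random stand-ins for learned parameters (assumption: a real model uses trained weights).
EMBEDDINGS = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in VOCAB]
WEIGHTS = [[rng.uniform(-1, 1) for _ in VOCAB] for _ in range(EMBED_DIM)]


def tokenize(text):
    """Step 1: map text to token ids (whitespace splitting is a simplification)."""
    return [TOKEN_TO_ID[word] for word in text.lower().split()]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probs(token_ids):
    """Steps 2-3: embed the latest token, multiply by the weights, normalize."""
    vec = EMBEDDINGS[token_ids[-1]]
    logits = [
        sum(vec[d] * WEIGHTS[d][v] for d in range(EMBED_DIM))
        for v in range(len(VOCAB))
    ]
    return softmax(logits)


def generate(prompt, max_new_tokens=5):
    """Step 4: sample a token from the distribution, append it, and repeat."""
    token_ids = tokenize(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(token_ids)
        next_id = rng.choices(range(len(VOCAB)), weights=probs, k=1)[0]
        token_ids.append(next_id)
    return " ".join(VOCAB[i] for i in token_ids)


print(generate("the cat"))
```

Because the weights here are random rather than trained, the output is gibberish; the point is only the shape of the loop, in which each sampled token becomes part of the input for the next prediction.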

2. How are the model weights that power ChatGPT generated?

  • The model weights are trained through a process called pretraining, where the model is exposed to a large corpus of text data and learns to predict the next token in a sequence
  • This pretraining is done using gradient descent, a mathematical optimization technique, to gradually update the model weights
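
A minimal sketch of that idea, assuming a drastically simplified setup: the nine-token `CORPUS`, the 3x3 `weights` table, and the learning rate are hypothetical, and the "model" predicts the next token from the previous token alone. It shows gradient descent lowering the next-token prediction loss and makes no claim about how OpenAI trains its models.

```python
import math

# Hypothetical toy corpus of token ids; real pretraining uses a very large text corpus.
CORPUS = [0, 1, 2, 0, 1, 2, 0, 1, 2]
VOCAB_SIZE = 3
LEARNING_RATE = 0.5

# weights[prev][nxt] is the score for token `nxt` following token `prev`.
weights = [[0.0] * VOCAB_SIZE for _ in range(VOCAB_SIZE)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_loss():
    """Mean cross-entropy of predicting each next token in the corpus."""
    pairs = list(zip(CORPUS, CORPUS[1:]))
    return sum(-math.log(softmax(weights[prev])[nxt]) for prev, nxt in pairs) / len(pairs)


def train_step():
    """One pass over the corpus, updating the weights after each (previous, next) pair."""
    for prev, nxt in zip(CORPUS, CORPUS[1:]):
        probs = softmax(weights[prev])
        for v in range(VOCAB_SIZE):
            # Gradient of cross-entropy w.r.t. each score: predicted probability minus target.
            grad = probs[v] - (1.0 if v == nxt else 0.0)
            weights[prev][v] -= LEARNING_RATE * grad


loss_before = average_loss()
for _ in range(50):
    train_step()
loss_after = average_loss()
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Each update nudges the scores so the token that actually came next becomes slightly more probable; repeated over many examples, the weights come to encode the patterns in the corpus.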

3. What are the limitations of how ChatGPT works?

  • ChatGPT and other large language models do not truly "think" or "understand" like humans, but rather generate text based on statistical patterns in the training data
  • The model is limited to generating text based on its training, and cannot access or use external tools or information beyond what it was trained on

[02] Comparison to Human Capabilities

1. How do ChatGPT's capabilities compare to human capabilities?

  • ChatGPT can access and process far more information than any individual human, and can generate coherent and human-like text
  • However, it does not have true understanding or reasoning capabilities like humans, and is limited to the patterns and knowledge encoded in its training data

2. What are some key limitations of ChatGPT compared to human intelligence?

  • ChatGPT lacks the ability to truly think, reason, and understand like humans
  • It is limited to generating text based on statistical patterns, without the deeper comprehension and creativity that humans possess
