Richard Sutton believes that the primary function of Large Language Models (LLMs) is to mimic people and what they say, rather than to understand the world or figure out what to do. He contrasts this with Reinforcement Learning, which he sees as fundamental to AI and focused on understanding the world.
Richard Sutton views Large Language Models (LLMs) as systems that mimic human language and actions; in his view they neither genuinely understand the world nor pursue a clear objective. He contrasts this with Reinforcement Learning, which he considers fundamental to AI and which seeks to understand the world through experience and rewards.
Richard Sutton defines intelligence in AI as the ability to understand your world. He sees Reinforcement Learning as the fundamental approach to achieving this understanding.
This video discusses the fundamental differences between Reinforcement Learning (RL) and Large Language Models (LLMs) in the context of Artificial Intelligence. Richard Sutton argues that RL is more aligned with the core concept of intelligence, which involves understanding and acting in the world based on experience and goals, whereas LLMs primarily focus on mimicking human language and behavior without a genuine understanding of the world or a defined goal.
According to Richard Sutton, LLMs lack "ground truth" because their framework contains no definition of what the "right" thing to say or do is. In reinforcement learning the goal is to obtain reward, which provides a clear metric for correctness; LLMs, he argues, have no such explicit goal. Without that standard there is nothing against which an LLM's output can be judged correct or incorrect, and so, in his view, nothing that could count as prior knowledge grounded in truth.
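One way to make this contrast concrete is to write the two training objectives side by side. This is a standard textbook formulation, not something stated in the video, and the symbols ($\pi$, $\gamma$, $R_{t+1}$, $\theta$, $x_t$) are notation assumed here for illustration.

```latex
% RL: choose a policy \pi that maximizes expected discounted reward
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} R_{t+1} \right]

% LLM pretraining: choose parameters \theta that best predict the next token of human-written text
\min_{\theta} \; -\sum_{t=1}^{T} \log p_{\theta}\!\left(x_t \mid x_{<t}\right)
```

The first objective scores outcomes in the world. The second scores agreement with what people wrote, which is arguably the sense in which Sutton says no "right" answer is defined within the LLM framework.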
The key difference between RL and LLMs lies in their learning mechanisms. RL agents learn from direct experience: they interact with the world, take actions, and observe the consequences in order to maximize reward. LLMs, in contrast, learn by mimicking human output from vast amounts of text, essentially learning "what a person did" in a given situation, with no direct experience and no defined goal to guide their actions.
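The two learning mechanisms can be sketched with a toy example. This is a minimal illustration under stated assumptions, not how either kind of system is actually built: the two-action world, its reward probabilities, and the function names are all hypothetical, and counting the most frequent demonstrated action is only a loose stand-in for the way an LLM is fitted to human text.

```python
import random
from collections import Counter

# Hypothetical two-action world: action 1 pays off more often than action 0.
REWARD_PROB = {0: 0.2, 1: 0.8}


def step(action, rng):
    """Return a reward of 1.0 or 0.0 for the chosen action."""
    return 1.0 if rng.random() < REWARD_PROB[action] else 0.0


def learn_from_experience(n_steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: act, observe the reward, update value estimates."""
    rng = random.Random(seed)
    values = {0: 0.0, 1: 0.0}  # running average reward per action
    counts = {0: 0, 1: 0}
    for _ in range(n_steps):
        if rng.random() < epsilon:
            action = rng.choice([0, 1])            # explore
        else:
            action = max(values, key=values.get)   # exploit current estimate
        reward = step(action, rng)
        counts[action] += 1
        # Incremental mean update toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return max(values, key=values.get)


def learn_by_imitation(demonstrations):
    """Copy the most frequent demonstrated action; no reward is consulted."""
    return Counter(demonstrations).most_common(1)[0][0]


# Hypothetical demonstrator that mostly picks the worse action.
demos = [0] * 70 + [1] * 30

rl_choice = learn_from_experience()
imitation_choice = learn_by_imitation(demos)
print("experience-based choice:", rl_choice)
print("imitation-based choice:", imitation_choice)
```

In this sketch the experience-based learner should settle on the action that actually pays off, because reward tells it which choice was better. The imitation learner reproduces whatever the demonstrator did most often, whether or not that choice was a good one.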