The Pursuit of General Problem Solvers in AI: From Early Attempts to Modern LLMs
The quest for a General Problem Solver (GPS) has been a cornerstone of artificial intelligence research for decades. This article traces the history of attempts to achieve this goal, culminating in the development of Large Language Models (LLMs), and discusses why even modern AI systems fall short of true general problem-solving capabilities. We’ll explore how general problem-solving aligns with the broader goal of general intelligence, and why this holy grail remains elusive.
Defining General Problems
At the heart of GPS is the concept of general problems, which share certain characteristics:
- States and Actions: Problems consist of different states, and actions cause transitions between these states.
- Initial and Goal States: Every problem has a starting point and a desired end state.
- State Transition: Actions modify the state of the system.
- Preconditions: Each action may require specific conditions to be met before it can be executed.
- Domain Knowledge: Facts and knowledge about the problem domain are essential for solving it.
The solution to a general problem involves transitioning from the initial state to the goal state efficiently. Ideally, a true general problem solver should only require the initial and goal states to devise a solution based on prior knowledge and reasoning.
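The characteristics listed above can be written down as a small data structure. What follows is a minimal sketch, assuming a STRIPS-like encoding in which a state is a set of facts; the `Action` and `Problem` classes and the door example are illustrative assumptions, not something the article defines.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

State = FrozenSet[str]  # assumption: a state is the set of facts that currently hold


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: FrozenSet[str]  # facts that must hold before the action can run
    add: FrozenSet[str]            # facts the action makes true
    remove: FrozenSet[str]         # facts the action makes false

    def applicable(self, state: State) -> bool:
        return self.preconditions <= state

    def apply(self, state: State) -> State:
        # State transition: drop the removed facts, then add the new ones.
        return (state - self.remove) | self.add


@dataclass(frozen=True)
class Problem:
    initial: State
    goal: FrozenSet[str]
    actions: Tuple[Action, ...]

    def is_goal(self, state: State) -> bool:
        return self.goal <= state


# Hypothetical toy domain: unlock a door, then open it.
unlock = Action("unlock", frozenset({"door_locked", "has_key"}),
                add=frozenset({"door_unlocked"}), remove=frozenset({"door_locked"}))
open_door = Action("open", frozenset({"door_unlocked"}),
                   add=frozenset({"door_open"}), remove=frozenset())
problem = Problem(initial=frozenset({"door_locked", "has_key"}),
                  goal=frozenset({"door_open"}),
                  actions=(unlock, open_door))

state = open_door.apply(unlock.apply(problem.initial))
print(problem.is_goal(state))
```

A solver in this framing is anything that finds a sequence of applicable actions leading from `problem.initial` to a state where `is_goal` holds. Here the sequence is supplied by hand.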
In a perfect scenario, this would be a zero-shot general problem solver, meaning no additional task-specific data is provided. No such system exists yet. Modern AI systems, including LLMs, instead draw on the vast knowledge encoded in their pre-trained weights. However, even LLMs fall short of this ideal, especially when handling Out-of-Distribution (OOD) problems: tasks that deviate significantly from what they were trained on.
Traditional Approaches to General Problem-Solving
Several approaches have been proposed over the years to achieve general problem solving, but all have limitations:
1. Search-Based Solutions
Search-based methods treat problem-solving as navigating a graph where states are nodes and actions are edges. Techniques like A* search are widely used. A* expands states in order of their estimated total cost, which is the sum of two components:
- The cost to reach the current state.
- A heuristic estimate of the cost to reach the goal state.
While A* and other search-based algorithms are more efficient than brute-force methods, they require a problem-specific search graph, making them unsuitable for zero-shot problem-solving. Additionally, as problem complexity grows, these techniques become computationally expensive and hard to scale.
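A minimal sketch of A* on a small grid illustrates the two cost components. It assumes unit step costs and a Manhattan-distance heuristic; the grid size, `WALLS`, and the helper names are illustrative choices, not taken from the article.

```python
import heapq

ROWS, COLS = 3, 4
WALLS = {(0, 2), (1, 2)}  # hypothetical blocked cells


def neighbors(cell):
    """Yield (next_cell, step_cost) for the four grid moves that stay in bounds."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (r + dr, c + dc)
        if 0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS and nxt not in WALLS:
            yield nxt, 1


def manhattan(cell, goal):
    """Heuristic estimate of the remaining cost; never overestimates on this grid."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])


def a_star(start, goal):
    """Return a lowest-cost path from start to goal, or None if none exists."""
    # Frontier entries are (f, g, cell, path) with f = g + heuristic.
    frontier = [(manhattan(start, goal), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if g > best_g.get(cell, float("inf")):
            continue  # stale entry: a cheaper route to this cell was found later
        for nxt, step_cost in neighbors(cell):
            new_g = g + step_cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                f = new_g + manhattan(nxt, goal)
                heapq.heappush(frontier, (f, new_g, nxt, path + [nxt]))
    return None


path = a_star((0, 0), (2, 3))
print(path)
```

Note that `neighbors` and `manhattan` are both specific to this grid. That is the limitation in practice: a new problem needs its own state graph and its own heuristic before A* can run at all.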
2. Deductive Logic-Based Solutions
Logic-based approaches encode all relevant knowledge as rules and facts, as in expert systems. In this framework, an inference engine solves problems over Horn clauses, using either forward or backward chaining.
- Forward chaining starts from the known facts and applies rules to generate new facts, aiming to reach the goal state.
- Backward chaining starts with the goal state and works backward to find a set of facts that lead to it.
While this approach could in theory be zero-shot (given a comprehensive and correct knowledge base), the reality is far more complex. Real-world problems often involve exceptions and incomplete knowledge, which limits the applicability of deductive systems. These systems also scale poorly as the problem space expands.
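Forward chaining over Horn clauses can be sketched in a few lines. The version below is a simplified propositional form (no variables), and the rules and fact names are made up for illustration.

```python
def forward_chain(facts, rules, goal):
    """Apply rules of the form (body, head) until the goal is derived or nothing changes.

    Each rule is a Horn clause: if every fact in `body` is known, `head` becomes known.
    Returns (goal_derived, all_known_facts).
    """
    known = set(facts)
    changed = True
    while changed and goal not in known:
        changed = False
        for body, head in rules:
            if head not in known and set(body) <= known:
                known.add(head)
                changed = True
    return goal in known, known


# Hypothetical knowledge base.
RULES = [
    (("has_fever", "has_cough"), "has_flu"),
    (("has_flu",), "needs_rest"),
]

derived, known = forward_chain({"has_fever", "has_cough"}, RULES, "needs_rest")
print(derived)

# With one fact missing, no rule fires and the goal cannot be derived.
derived_incomplete, _ = forward_chain({"has_fever"}, RULES, "needs_rest")
print(derived_incomplete)
```

The second call shows the incomplete-knowledge problem on a small scale: when a single fact is missing from the knowledge base, the engine cannot reach the goal, however sound its inference procedure is.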
3. Reinforcement Learning (RL) Solutions
Unlike symbolic methods, Reinforcement Learning (RL) approaches are data-driven and rely on trial and error. In Q-learning, for example, the model explores actions within states, updating its understanding of which actions lead to rewards over time. The goal is to learn a policy that selects the optimal action for any given state.
While RL can handle a wide variety of problems and scales better with the help of Deep Learning, it is still not a general-purpose problem solver. RL models are trained for specific tasks and struggle with OOD problems. Even advancements in meta-learning, which aims to generalize across similar tasks, have achieved limited success.
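As a rough illustration of the Q-learning loop described above, here is a minimal tabular sketch on a five-cell corridor. The environment, reward scheme, and hyperparameters are all assumptions chosen to keep the example small.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right


def step(state, action_index):
    """Move within the corridor; reward 1.0 only on reaching the goal cell."""
    nxt = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=1000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])       # exploit, breaking ties at random
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Q-learning update: move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal cells (1 means "move right").
policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The learned table is only meaningful for this corridor. Changing the layout or the reward means retraining from scratch, which is the task-specificity the paragraph above points to.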
Modern Approaches: LLMs as Problem Solvers
Large Language Models (LLMs), like those based on the Transformer architecture, represent a different approach. These models are trained to predict the next token in a sequence, generating coherent text based on the input provided. Recently, models like GPT-4 and OpenAI's o1 have been touted as potential problem solvers, using techniques like Reinforcement Learning from Human Feedback (RLHF) to improve their reasoning abilities.
LLMs generate multi-step solutions by breaking down problems into sequences of steps. However, LLMs don’t truly “reason”; instead, they perform associative pattern matching, selecting the next step in a sequence based on probabilities learned from large datasets. This mimicking of reasoning falls short when faced with novel or OOD problems, where the model has no prior training data to rely on.
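The idea of choosing the next step from learned probabilities can be made concrete with a deliberately crude stand-in: a bigram model that counts which token follows which. This is only a toy under strong assumptions. Real LLMs use neural networks over long contexts and generalize far beyond raw counts, and the corpus here is invented for the example.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each token, how often each other token follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Return the learned next-token distribution, or {} for an unseen context."""
    followers = counts.get(prev)
    if not followers:
        return {}
    total = sum(followers.values())
    return {token: n / total for token, n in followers.items()}


# Hypothetical training corpus.
corpus = ["pick up the block", "put down the block", "pick up the key"]
counts = train_bigram(corpus)

seen = next_token_probs(counts, "the")      # context that appeared in training
unseen = next_token_probs(counts, "stack")  # context that never appeared
print(seen, unseen)
```

For a context seen in training, the model returns a sensible distribution. For an unseen one it has nothing to offer, a much exaggerated version of the OOD failure discussed above.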
Even models like o1, which have been described as Large Reasoning Models (LRMs), perform poorly on tests like ARC-AGI (an OOD reasoning problem set), where they score a mere 20%. Similarly, they struggle with Mystery Blocksworld, a planning benchmark. Despite improvements in coherence and fluency, LLMs are far from achieving true general problem-solving capabilities.
Challenges and Limitations
The fundamental challenge in developing a GPS is the need for general intelligence, which no current AI system possesses. Search-based, logic-based, RL, and LLM approaches all have significant limitations, particularly in scaling to complex problems and handling OOD tasks.
While LLMs are scalable and offer zero-shot problem-solving within familiar domains, they cannot generalize across completely new types of problems. This is because they rely heavily on patterns in the training data and lack the intrinsic reasoning capabilities required for true general problem-solving.
Final Thoughts
The dream of a General Problem Solver remains unfulfilled. Despite significant advancements, no AI system today can claim to solve all problems with the ease and flexibility of a human mind. LLMs, while powerful, are not the answer to general problem-solving. The pursuit of GPS continues, and achieving true general intelligence in AI will be essential to realizing this goal.