AI Hallucination: When Machines Start to “Dream”


Imagine you are chatting with an AI assistant, and suddenly it starts telling a story about “flying cats ruling the world”. Confused, you think: “Is this AI on drugs?” Don’t worry, this is not a science fiction movie; it is an AI “hallucination”, a phenomenon in which generated content deviates from facts or logic.

AI hallucination is like a machine’s “daydream”: the model may make up non-existent facts, give wrong answers, or produce outright ridiculous content. For example, if you ask an AI, “What is the diameter of the Earth?”, it may answer: “The diameter of the Earth is 42 kilometers because it is stretched by a huge rubber band.” Creative, perhaps, but totally wrong (the Earth’s mean diameter is roughly 12,742 kilometers).

Causes of AI Hallucinations

Many factors can cause AI models to hallucinate, including insufficient or biased training data, model overfitting, loss of context, limited domain knowledge, model architecture, and ambiguous prompts.

  1. Insufficient or biased training data: If the model has not seen enough correct data, it ends up “guessing blindly”.
  2. Model overfitting: A model that overfits its training data produces outputs that are too specific to that data and do not generalize well to new inputs, which can lead to hallucinated or irrelevant answers.
  3. Loss of context: In long conversations, the model may lose track of earlier messages, resulting in incoherent logic.
  4. Limited domain knowledge: Models built for a specific domain or task tend to “improvise” when they receive inputs from other domains, because they lack the relevant background knowledge. Even a model trained on a large, multilingual vocabulary may lack the cultural context, history, and nuance needed to string concepts together correctly.
  5. Model architecture: Architecture affects how easily a model hallucinates; models with more layers or parameters may be more prone to it because of the added complexity.
  6. Ambiguous prompts: If the user’s question is not clear enough, the AI may “improvise”.

Real Examples of AI Hallucination

1. AI Hallucination in Website Development

Code Generation Error:
You ask the AI to generate a piece of JavaScript code, but it writes an infinite loop that freezes the browser tab.

while (true) {
  // Never terminates: blocks the main thread and freezes the page
  console.log("This is an infinite loop!");
}

Solution: Explicitly ask the AI to generate runnable code and to include test cases; a corrected version is sketched below.
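
For example, a minimal fix (my own illustration, not output from any particular model) simply gives the loop a termination condition:

// A bounded loop with an explicit termination condition
for (let i = 0; i < 5; i++) {
  console.log(`Iteration ${i + 1} of 5`);
}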

Misleading API documentation:
AI may generate non-existent API endpoints or incorrect parameter descriptions.

GET /api/v1/users/{id}
Parameters:
  - id: The user's favorite color (string)

Solution: Ask the AI to base its answer on the official documentation and to provide sample requests and responses; a sketch of verifying the response shape follows.
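
As an illustration of what “sample request plus verification” can look like, here is a minimal sketch. The base URL, endpoint, and field names are hypothetical placeholders, not a real API:

// Hypothetical endpoint and response shape, used only for illustration
const BASE_URL = "https://api.example.com/api/v1";

async function getUser(id) {
  const response = await fetch(`${BASE_URL}/users/${id}`);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const user = await response.json();
  // Sanity-check that the documented fields actually exist
  if (user.id === undefined || typeof user.name !== "string") {
    throw new Error("Response does not match the documented schema");
  }
  return user;
}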

Outrageous design suggestions:
The AI suggests setting the website background to “flashing rainbow colors” and adding auto-playing background music.
Solution: Clarify the design requirements, such as “simple, modern, and in line with the brand style.”

2. AI Hallucinations in Financial Analysis

False Data Predictions:
The AI predicts that a stock will rise 500% next week, citing a government document that does not actually exist.
Solution: Require the AI to base its analysis on historical data and market trends and to cite credible sources, then check whether those sources actually exist (see the sketch after these examples).

False Economic Indicators:
The AI claims that a country’s GDP growth rate is 1000% because “aliens invested in the country’s economy.”
Solution: Explicitly require the AI to use authoritative data sources (such as the World Bank and the IMF) for its analysis.

Outrageous Investment Advice:
AI recommends investing all funds in “FlyingCatCoin” because “it will become the next Bitcoin.”
Solution: Require the AI to provide diversified investment advice based on a risk assessment.
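
The solutions above all come down to “make the model cite its sources, then verify them.” A minimal sketch of that verification step is shown below; the URL is a placeholder, and a real check would also validate the content rather than mere reachability:

// Check whether a cited source URL is at least reachable.
// Reachability alone does not prove a claim; a real pipeline would also
// inspect the content or query an authoritative dataset.
async function sourceExists(url) {
  try {
    const response = await fetch(url, { method: "HEAD" });
    return response.ok;
  } catch {
    return false; // network error or malformed URL
  }
}

// Example usage with a placeholder URL
sourceExists("https://example.com/cited-report").then((ok) =>
  console.log(ok ? "Source reachable" : "Source missing or unreachable")
);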

How to reduce AI hallucinations?

1. Use well-designed prompts
Carefully designed prompts can guide the AI toward more accurate and reliable content. Here are some practical prompting tips; a small sketch for assembling such prompts follows the examples.

Explicit instructions:
Wrong example: “Tell me about the Earth.”
Correct example: “Please provide the exact value of the Earth’s diameter and attach the source.”

Limited scope:
Wrong example: “Write a story.”
Correct example: “Write a 200-word story about scientists discovering a new planet. Make sure the content is in line with scientific common sense.”

Require verification:
Wrong example: “Will AI rule the world?”
Correct example: “Based on current technology trends, analyze whether AI could rule the world and cite credible evidence.”

Think step by step:
Wrong example: “How to solve global warming?”
Correct example: “Please explain the causes of global warming in steps and propose three feasible solutions.”
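
As a toy illustration of combining explicit instructions, a limited scope, and a verification requirement, here is a small prompt-building helper. The template wording is my own assumption, not a standard recipe:

// Assemble a prompt that asks for precision, scope, and sources.
// The template wording here is illustrative, not a standard.
function buildPrompt(question, { wordLimit, requireSources = true } = {}) {
  const parts = [question.trim()];
  if (wordLimit) parts.push(`Answer in at most ${wordLimit} words.`);
  if (requireSources) {
    parts.push("Cite a source for every factual claim.");
    parts.push("If you are not sure, say so instead of guessing.");
  }
  return parts.join(" ");
}

console.log(buildPrompt("What is the diameter of the Earth?", { wordLimit: 50 }));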

2. Introduce a fact-checking mechanism
External knowledge base: Let the AI consult external databases or authoritative sources when generating answers.
Multi-model validation: Use several AI models to cross-check whether their answers agree; a minimal sketch follows below.
User feedback: Allow users to flag incorrect answers to help the system improve.
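
A toy sketch of the multi-model idea is shown below. The two "models" are placeholders that return canned answers; in practice they would be calls to different AI services, and the agreement check would need to be far more robust than comparing numbers:

// Placeholder "models" returning canned answers; in practice these would
// be calls to two different AI services.
const askModelA = async () => "The Earth's diameter is about 12,742 km.";
const askModelB = async () => "Roughly 12,742 kilometers.";

// Very crude agreement check: do both answers mention the same number?
function extractNumbers(text) {
  return (text.match(/\d[\d,]*/g) || []).map((n) => n.replace(/,/g, ""));
}

async function crossValidate(question) {
  const [a, b] = await Promise.all([askModelA(question), askModelB(question)]);
  const agree = extractNumbers(a).some((n) => extractNumbers(b).includes(n));
  return { a, b, agree };
}

crossValidate("What is the diameter of the Earth?").then(({ agree }) =>
  console.log(agree ? "Answers agree" : "Answers disagree, flag for review")
);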

3. Optimize model training
High-quality data: Use more comprehensive and accurate datasets to train AI.
Reinforcement learning: Encourage AI to generate accurate content through reward mechanisms.
Context management: Improve the model’s ability to retain long-range context so that earlier parts of the conversation are not lost.

4. Set up a “hallucination detector”
Logical consistency check: Detect whether the generated content is internally consistent.
Factual scoring: Assign a confidence score to generated content and filter out low-confidence answers; a small sketch follows below.
Abnormal content tagging: Automatically identify and tag content that is likely hallucinated.
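
As a toy sketch of the “score and filter” step, assume each answer already comes with a confidence score from some upstream scorer; the scores and the threshold below are made-up values:

// Filter out answers whose confidence score falls below a threshold.
// The scores and the 0.7 threshold are illustrative assumptions.
const CONFIDENCE_THRESHOLD = 0.7;

const answers = [
  { text: "The Earth's diameter is about 12,742 km.", confidence: 0.95 },
  { text: "The Earth's diameter is 42 km.", confidence: 0.12 },
];

const trusted = answers.filter((a) => a.confidence >= CONFIDENCE_THRESHOLD);
const flagged = answers.filter((a) => a.confidence < CONFIDENCE_THRESHOLD);

console.log("Trusted:", trusted.map((a) => a.text));
console.log("Flagged as possible hallucinations:", flagged.map((a) => a.text));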

Summary

Although AI hallucinations can be amusing, they cause real problems in practical applications. By using well-designed prompts, introducing fact-checking mechanisms, optimizing model training, and setting up hallucination detectors, we can greatly reduce AI’s “daydreams” and make it more reliable and practical. After all, what we need is a dependable assistant, not a “science fiction writer”! The AI model I use most often is DeepSeek-R1; what about you?

If you have other questions about AI hallucinations, feel free to let me know!