The Fallacy of Prompt Engineering
There is a widespread misconception in the AI engineering community that hallucinations can be solved with better words. Developers spend hours appending phrases like “Think step-by-step,” “You are a helpful expert,” “Output strictly in JSON,” and “Do not lie under any circumstances” to their system prompts.
No amount of prompt engineering can completely eradicate LLM hallucinations in a production agentic system. The fundamental flaw is treating the LLM as a magical black box that will always output valid, parseable text.
The Architectural Challenge: Parsing Unpredictable Text
When an LLM generates a response, standard systems attempt to parse it using regular expressions, string splitting, or loose json.loads() wrappers. If the model hallucinates an extra sentence, forgets a trailing comma, or decides to wrap its JSON in markdown backticks (```json), your downstream Python logic crashes.
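To make the failure mode concrete, here is a minimal stdlib-only sketch of what happens when the model wraps its JSON in markdown fences and your code calls `json.loads()` on the raw reply (the string contents and the fence-stripping workaround are illustrative, not from any specific model):

```python
import json

# What you expect vs. what the model actually sends back when it
# wraps its answer in markdown fences (a common failure mode).
raw_reply = '```json\n{"action": "refund", "confidence": 0.9}\n```'

try:
    decision = json.loads(raw_reply)
except json.JSONDecodeError as err:
    # The parse fails on the very first backtick.
    decision = None
    print(f"Parse failed: {err}")

# The usual brittle workaround: strip the fences by hand and hope
# the model never changes its wrapping style.
stripped = raw_reply.removeprefix("```json\n").removesuffix("\n```")
decision = json.loads(stripped)
print(decision["action"])  # -> refund, until the format drifts
```

Every such workaround is a bet that the model's formatting quirks stay stable, which is exactly the bet that loses in production.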
In multi-agent systems, hallucinations aren’t just factual errors (like stating the wrong capital of a country); they are structural errors. If the Routing Agent hallucinates an invalid route name, the entire orchestration graph fails.
The Fix: Strict Pydantic Enforcement
The secret to reliable AI agents is removing the LLM’s ability to be creative with its output format. By using Pydantic models combined with modern Structured Output APIs (introduced by OpenAI in 2024), you can force the model at the API level to conform strictly to a predefined JSON schema.
The 3 Lines of Code
Using LangChain’s wrapper around OpenAI’s structured outputs, we can bind a rigid Python class to the generation pipeline.
```python
from typing import Literal

from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

# 1. Define the absolute constraints of the Agent's decision
class AgentDecision(BaseModel):
    confidence_score: float = Field(ge=0.0, le=1.0, description="Confidence metric")
    action: Literal["refund", "escalate", "ignore"] = Field(description="Strict enum routing")
    reasoning: str = Field(description="Brief explanation for audit logs")

llm = ChatOpenAI(model="gpt-4o")

# 2. The Magic Line: Bind the schema to the LLM natively
structured_llm = llm.with_structured_output(AgentDecision)

# 3. Invoke. The output is a guaranteed Pydantic object, not a string!
output = structured_llm.invoke(
    "The customer is furiously demanding their money back for a broken monitor."
)

# No parsing, no regex, no crashes. Just raw object properties.
print(f"Action chosen: {output.action} with {output.confidence_score} confidence.")
```
Why This Kills Hallucinations
- Schema Coercion at the Token Level: OpenAI’s native structured outputs actually constrain the token generation probabilities on the server side. If the model tries to output an action like “send_email” instead of the allowed “refund”, the API refuses to generate the invalid token.
- Grounding via Data Types: Forcing the LLM to output specific variable types (like floats strictly between 0 and 1) anchors its generation process, leaving less computational room for erratic generation.
- Deterministic Routing: You can now use standard Python `if`/`elif` statements for your LangGraph edges, completely eliminating string-matching bugs.
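A minimal sketch of that deterministic routing, assuming the `action` field is already constrained to three values as in the Pydantic model above (the node names and `route` function are hypothetical placeholders for your own graph):

```python
from typing import Literal

# Because the schema guarantees `action` is one of exactly three strings,
# routing collapses to plain branching: no regex, no fuzzy matching.
Action = Literal["refund", "escalate", "ignore"]

def route(action: Action) -> str:
    # Each branch maps to a named node in a hypothetical LangGraph graph.
    if action == "refund":
        return "refund_node"
    elif action == "escalate":
        return "human_review_node"
    else:
        return "end_node"

print(route("escalate"))  # -> human_review_node
```

Because the type system enumerates every possible value, the `else` branch is genuinely exhaustive rather than a catch-all for malformed strings.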
Stop asking models to format things nicely using prompts. Force them using Pydantic.
Related Reading: We discuss how structured outputs also drastically cut costs in I Saved 80k Tokens a Day, and how to trace these outputs in Silent Failures in Production.
