Agentic AI refers to AI systems that can autonomously plan and execute multi-step tasks — not just respond to a single prompt. An AI agent uses tools (code execution, web search, file I/O, APIs), observes results, adapts its plan, and iterates until a goal is achieved. This is a qualitative leap beyond conversational AI.
What makes AI 'agentic'
| Property | Conversational AI | Agentic AI |
|---|---|---|
| Action model | Single response to single message | Sequence of actions toward a goal |
| Tool access | Text generation only | Code execution, search, APIs, file I/O, browser |
| Memory | Context window only (ephemeral) | Can write to files, databases, memory stores |
| Error handling | Errors surface to user | Observes error, diagnoses, retries autonomously |
| Task scope | One exchange = one task | One prompt = many sequential tasks over minutes/hours |
| Human involvement | Every step requires human input | Human sets goal; agent executes; human reviews result |
The agentic loop
The core pattern: (1) Observe current state. (2) Reason about what action to take next toward the goal. (3) Call a tool (code, search, API). (4) Observe result. (5) Update plan based on what was learned. (6) Repeat until goal achieved or stuck. Each iteration, the agent updates its understanding of the world — enabling course-correction, error recovery, and adaptive planning that single-turn LLMs cannot do.
Tool use in AI agents
Agents are defined by their tools. Tool use is implemented via function calling — the LLM outputs a structured JSON call, the runtime executes it, and the result is fed back as the next model input.
Anthropic function calling — tool use loop
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools = [
  {
    name: "execute_code",
    description: "Run Python code in a sandbox and return stdout/stderr",
    input_schema: {
      type: "object",
      properties: {
        code: { type: "string", description: "Python code to execute" }
      },
      required: ["code"]
    }
  },
  {
    name: "web_search",
    description: "Search the web and return top results",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string" }
      },
      required: ["query"]
    }
  }
];
// Agentic loop
let messages = [{ role: "user", content: "Scrape the top 10 HN stories and plot upvotes" }];
while (true) {
  const response = await client.messages.create({
    model: "claude-opus-4-5", max_tokens: 4096, tools, messages
  });
  // Append the assistant turn once, then stop unless the model requested tools
  messages.push({ role: "assistant", content: response.content });
  if (response.stop_reason !== "tool_use") break; // done
  // Execute every tool call and return all results in a single user message
  const toolResults = [];
  for (const block of response.content) {
    if (block.type === "tool_use") {
      const result = await executeTool(block.name, block.input);
      toolResults.push({ type: "tool_result", tool_use_id: block.id, content: result });
    }
  }
  messages.push({ role: "user", content: toolResults });
}

| Tool category | Examples | Risk level |
|---|---|---|
| Code execution | Python sandbox, JavaScript runner, shell commands | High — can modify state; use sandboxing |
| Web search | Bing/Google search, URL fetching, scraping | Low — read-only |
| File I/O | Read/write files, create documents, parse PDFs | Medium — can overwrite files |
| API calls | REST APIs, database queries, external services | High — irreversible actions (email, payments) |
| Browser control | Playwright, Selenium — click, fill forms, navigate | High — can submit forms, make purchases |
| Memory | Vector DB writes, note-taking, state persistence | Low — but affects future context |
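The executeTool function used in the loop above is left to the runtime to define. A minimal sketch of one possible dispatcher, with a hypothetical confirmation gate for high-risk tools — the risk mapping, handler bodies, and confirm callback are illustrative assumptions, not part of any SDK:

```typescript
// Risk tier per tool, mirroring the table above (assumed mapping)
const TOOL_RISK: Record<string, "low" | "medium" | "high"> = {
  web_search: "low",
  read_file: "medium",
  execute_code: "high",
};

type ToolHandler = (input: Record<string, unknown>) => Promise<string>;

// Hypothetical handlers — real ones would call a sandbox or a search API
const handlers: Record<string, ToolHandler> = {
  web_search: async (input) => `results for: ${input.query}`,
  execute_code: async (input) => `stdout of: ${input.code}`,
  read_file: async (input) => `contents of: ${input.path}`,
};

async function executeTool(
  name: string,
  input: Record<string, unknown>,
  confirm: (msg: string) => Promise<boolean> = async () => true,
): Promise<string> {
  const handler = handlers[name];
  if (!handler) return `Error: unknown tool "${name}"`; // surface as a tool_result, not a crash
  // Gate high-risk tools behind an explicit confirmation callback
  if (TOOL_RISK[name] === "high" && !(await confirm(`Allow ${name}?`))) {
    return "Error: action rejected by user";
  }
  return handler(input);
}
```

Returning errors as strings keeps failures inside the loop, so the model can observe them and adapt rather than the runtime crashing.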
The agentic loop in detail
Concretely: a user gives the goal "Analyze this CSV and create a visualization." Here is what the agent does autonomously:
- Plan: "I should load the CSV, examine its structure and data types, then create an appropriate visualization."
- Act: Calls execute_code with Python to load the CSV and print df.head(), df.dtypes, df.describe().
- Observe: Sees columns [date, revenue, users, region], revenue and users are numeric, region is categorical.
- Re-plan: "A grouped bar chart of revenue by region over time would show the key trends."
- Act: Calls execute_code with matplotlib/seaborn code to generate the chart.
- Observe: Code raises a ValueError — date column is a string, not datetime.
- Self-correct: Adds pd.to_datetime(df["date"]) to the code, re-executes.
- Observe: Chart generated successfully at /tmp/chart.png.
- Complete: Returns the chart + written interpretation of the trends it shows.
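The self-correction step above is just another execute_code call: the agent resends its script with the fix added. A hypothetical retry payload, using the column names from the walkthrough (the exact plotting code is an assumption):

```typescript
// The corrected tool call the agent sends on its retry — note the added
// pd.to_datetime() line that fixes the ValueError from the first attempt
const retryCall = {
  type: "tool_use",
  name: "execute_code",
  input: {
    code: [
      "import pandas as pd",
      "import matplotlib.pyplot as plt",
      "df = pd.read_csv('data.csv')",
      "df['date'] = pd.to_datetime(df['date'])  # the fix: parse strings as datetimes",
      "pivot = df.pivot_table(values='revenue', index='date', columns='region', aggfunc='sum')",
      "pivot.plot(kind='bar')",
      "plt.savefig('/tmp/chart.png')",
    ].join("\n"),
  },
};
```

From the model's perspective nothing special happened: it observed an error string in a tool_result, reasoned about it, and emitted a new tool_use block.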
ReAct: the standard agent prompting pattern
ReAct (Yao et al., 2022) formalizes the agentic loop as alternating Thought → Action → Observation cycles: (1) Thought: the agent reasons about current state. (2) Action: calls a tool with specific arguments. (3) Observation: receives the result. Prompting the model to always produce explicit Thought steps before acting dramatically improves reliability — the model reasons before committing to an action. Most production agent frameworks (LangChain, LlamaIndex, AutoGPT) implement ReAct-style loops.
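A bare-bones ReAct loop can be built without any framework: parse the model's text output for an Action line, run the tool, and splice the Observation back into the transcript. A sketch under assumptions — the `tool[argument]` action syntax, the callModel signature, and the step limit are illustrative choices, not a standard:

```typescript
type Model = (prompt: string) => Promise<string>;
type Tool = (arg: string) => Promise<string>;

// Parse one "Action: tool[argument]" line from the model's output
function parseAction(text: string): { tool: string; arg: string } | null {
  const m = text.match(/Action:\s*(\w+)\[([^\]]*)\]/);
  return m ? { tool: m[1], arg: m[2] } : null;
}

async function reactLoop(
  goal: string, callModel: Model, tools: Record<string, Tool>, maxSteps = 10,
): Promise<string> {
  let transcript = `Goal: ${goal}\n`;
  for (let step = 0; step < maxSteps; step++) {
    // Model emits "Thought: ... Action: ..." or "Thought: ... Final answer: ..."
    const output = await callModel(transcript);
    transcript += output + "\n";
    const final = output.match(/Final answer:\s*(.*)/);
    if (final) return final[1];
    const action = parseAction(output);
    if (!action) continue; // no parsable action — let the model try again
    const observation = await (tools[action.tool]?.(action.arg) ?? Promise.resolve("unknown tool"));
    transcript += `Observation: ${observation}\n`; // splice the result back in
  }
  return "stopped: max steps reached";
}
```

Because the Thought lines stay in the transcript, every later step conditions on the agent's earlier reasoning — which is exactly what makes ReAct traces interpretable and debuggable.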
Multi-agent systems
Complex tasks benefit from specialization — an orchestrator agent breaks the goal into sub-tasks and delegates to specialist agents. The key challenge: errors compound across agents.
| Agent role | Responsibility | Example |
|---|---|---|
| Orchestrator | Decomposes goal, assigns tasks, synthesizes results | "Build a full-stack app" → assigns frontend, backend, DB agents |
| Coder | Writes and iterates on code | GitHub Copilot, Claude Code, Devin |
| Researcher | Searches web, reads papers, synthesizes findings | Perplexity-style deep research |
| QA / Critic | Reviews outputs for errors, runs tests, suggests fixes | Code review agent, fact-checker |
| Memory manager | Maintains shared context — writes/reads from vector DB | Stores progress notes for other agents |
| Tool specialist | Calls external APIs (Stripe, Slack, Salesforce) | Zapier/Make.com-style automation |
Error compounding is the main risk
In a 5-agent pipeline where each agent operates at 90% accuracy, the end-to-end result is correct only 0.9⁵ ≈ 59% of the time. This is why multi-agent systems need checkpointing (human review at key milestones), confidence scoring (agents flag uncertainty), and extensive testing. Current best practice: keep pipelines short (3–4 agents max), verify outputs between stages, and always include a QA/critic agent.
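The compounding arithmetic generalizes: n independent stages at per-stage accuracy p give end-to-end reliability pⁿ. A quick check of the numbers, plus a simple (assumed) model of what a QA stage buys you:

```typescript
// End-to-end reliability of a pipeline of n independent stages,
// each correct with probability p
function pipelineReliability(p: number, n: number): number {
  return Math.pow(p, n);
}

// The 5-agent example from the text: 0.9^5 ≈ 0.59
console.log(pipelineReliability(0.9, 5).toFixed(2)); // "0.59"

// Toy model: if verification between stages catches half of each stage's
// errors, effective per-stage accuracy rises from 0.90 to 0.95
console.log(pipelineReliability(0.95, 5).toFixed(2)); // "0.77"
```

The exponent is the lever: cutting a pipeline from 5 agents to 3 at the same 90% accuracy raises reliability from ~59% to ~73%, which is why the text recommends keeping pipelines short.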
Risks and safety in agentic systems
| Risk | Description | Mitigation |
|---|---|---|
| Consequential actions | Deleting files, sending emails, making purchases — hard to undo | Sandboxing; confirmation prompts for irreversible actions; dry-run mode |
| Prompt injection | Malicious text in the environment (webpage, file, API response) hijacks agent goals | Input sanitization; don't execute instructions from retrieved content |
| Goal misinterpretation | "Delete all errors" → deletes error-handling code; agent achieves literal goal, not intent | Structured goal specification; intermediate confirmation; critic agents |
| Infinite loops | Agent repeats same failing action; no exit condition | Max iteration limits; detecting repeated actions; cost budgets |
| Cost blowouts | Many tool calls = many tokens; unbounded agent runs can cost $100s | Token budgets; iteration limits; cost alerts |
| Over-automation | Removing human oversight from critical decisions | Human-in-the-loop checkpoints; irreversible action gates |
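The iteration-limit and cost-budget mitigations from the table can be wired directly into an agentic loop. A sketch of one possible guard — the thresholds and per-token prices are illustrative assumptions, not real pricing:

```typescript
// Tracks iterations and approximate spend across one agent run
class RunBudget {
  private iterations = 0;
  private costUsd = 0;

  constructor(
    private maxIterations = 25,
    private maxCostUsd = 5.0,
    // Illustrative per-token prices — real prices vary by model
    private inputPricePerTok = 3e-6,
    private outputPricePerTok = 15e-6,
  ) {}

  // Call once per loop iteration with the token usage reported by the API
  record(inputTokens: number, outputTokens: number): void {
    this.iterations += 1;
    this.costUsd += inputTokens * this.inputPricePerTok + outputTokens * this.outputPricePerTok;
  }

  // Returns a reason string if the run should stop, else null
  exceeded(): string | null {
    if (this.iterations >= this.maxIterations) return "max iterations reached";
    if (this.costUsd >= this.maxCostUsd) return "cost budget exhausted";
    return null;
  }
}
```

Inside the loop, record the usage from each API response and break with a logged reason when exceeded() fires — this turns a potential infinite loop or cost blowout into a bounded, auditable failure.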
LumiChats Agent Mode sandbox
LumiChats Agent Mode runs entirely inside a WebContainer — a sandboxed Node.js environment in the browser. There is no filesystem access beyond the container, no network access to external services, and no persistence after the session. All code execution is isolated to the browser tab. Actions are logged in real time so you can monitor every step, and you can interrupt the agent at any point. This eliminates the irreversible-action risk entirely for the LumiChats use case.
Practice questions
- What is the ReAct (Reasoning + Acting) pattern in agentic AI? (Answer: ReAct (Yao et al. 2022): the agent alternates between Reasoning steps (internal thought about what to do next) and Acting steps (calling a tool). Format: Thought: I need to find the current stock price... Action: search[AAPL stock price today] Observation: AAPL is at $189.32 Thought: Now I can answer... Final answer: AAPL is $189.32. The explicit reasoning steps make the agent's decisions interpretable and allow self-correction if a reasoning step is wrong.)
- What are the main failure modes of AI agents? (Answer: (1) Hallucination of tool results: agent 'recalls' information instead of calling the tool, fabricating results. (2) Infinite loops: agent calls the same tool repeatedly without progress. (3) Prompt injection from tool results: malicious content in search/document results hijacks agent behaviour. (4) Over-permissive tool use: agent takes irreversible actions (send email, delete file) without confirming. (5) Context overflow: long agentic runs accumulate context until the window fills, causing the agent to lose track of its goal.)
- What is the difference between a single-agent and multi-agent system? (Answer: Single-agent: one LLM with multiple tools, handles all subtasks sequentially. Simpler, easier to debug. Multi-agent: multiple specialised LLM agents, each with specific tools and expertise — an orchestrator coordinates them. Example: researcher agent (web search) + analyst agent (code execution) + writer agent (drafting). Advantages: specialisation (each agent optimised for its role), parallelism (agents work concurrently), scalability. Disadvantages: coordination overhead, error propagation between agents, debugging difficulty.)
- How does human-in-the-loop oversight work in production AI agent systems? (Answer: Checkpoints: the agent pauses at predefined high-risk action points (sending email, making API calls with financial impact, deleting data) and requires explicit user approval before proceeding. Confirmation prompts: 'I plan to send this email to 500 contacts — confirm?' Audit logging: every tool call, result, and decision is logged for review. Rollback: reversible actions use transactions or staging environments before committing. Interrupt handlers: users can stop the agent mid-run. Minimal footprint principle: agents request only necessary permissions and prefer reversible actions.)
- What is the difference between agentic AI and traditional RPA (Robotic Process Automation)? (Answer: RPA: rule-based automation of structured workflows — scrapes data from fixed UI positions, follows rigid decision trees. Brittle: breaks when UI changes. Cannot handle ambiguity or novel situations. Agentic AI: understands intent from natural language, adapts to UI changes, handles ambiguous situations through reasoning, and can decompose novel tasks it has never seen before. RPA is scripted; agents are reasoning. Hybrid approaches (AI + RPA) use agents for decision-making and RPA for reliable execution of structured steps.)
On LumiChats
LumiChats Agent Mode gives the AI a full sandboxed Node.js environment (WebContainer) running in your browser. The agent writes code, executes it, reads output, fixes errors, and generates downloadable files — all without needing a server.
Try it free