
Agentic AI

AI that takes actions, not just gives answers.


Definition

Agentic AI refers to AI systems that can autonomously plan and execute multi-step tasks — not just respond to a single prompt. An AI agent uses tools (code execution, web search, file I/O, APIs), observes results, adapts its plan, and iterates until a goal is achieved. This is a qualitative leap beyond conversational AI.

What makes AI 'agentic'

| Property | Conversational AI | Agentic AI |
| --- | --- | --- |
| Action model | Single response to single message | Sequence of actions toward a goal |
| Tool access | Text generation only | Code execution, search, APIs, file I/O, browser |
| Memory | Context window only (ephemeral) | Can write to files, databases, memory stores |
| Error handling | Errors surface to user | Observes error, diagnoses, retries autonomously |
| Task scope | One exchange = one task | One prompt = many sequential tasks over minutes/hours |
| Human involvement | Every step requires human input | Human sets goal; agent executes; human reviews result |

The agentic loop

The core pattern: (1) Observe current state. (2) Reason about what action to take next toward the goal. (3) Call a tool (code, search, API). (4) Observe result. (5) Update plan based on what was learned. (6) Repeat until goal achieved or stuck. Each iteration, the agent updates its understanding of the world — enabling course-correction, error recovery, and adaptive planning that single-turn LLMs cannot do.
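The six steps above can be sketched as a minimal loop. This is an illustrative skeleton, not the LumiChats implementation; `reason` and `callTool` are hypothetical stand-ins for the LLM call and the tool runtime.

```javascript
// Minimal agentic loop skeleton. `reason` and `callTool` are hypothetical
// stand-ins for the LLM call and the tool runtime.
function runAgent(goal, reason, callTool, maxSteps = 10) {
  const state = { goal, history: [] };
  for (let step = 0; step < maxSteps; step++) {
    // (1)-(2) Observe current state and decide the next action
    const decision = reason(state);
    if (decision.done) return { ok: true, result: decision.result };
    // (3)-(4) Call the chosen tool and observe the result
    const observation = callTool(decision.tool, decision.input);
    // (5) Fold the observation back into the agent's state, then repeat
    state.history.push({ action: decision, observation });
  }
  return { ok: false, result: "max steps reached" }; // (6) gave up: stuck
}
```

The iteration cap is the simplest guard against an agent that never reaches its goal.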

Tool use in AI agents

Agents are defined by their tools. Tool use is implemented via function calling — the LLM outputs a structured JSON call, the runtime executes it, and the result is fed back as the next model input.

Anthropic function calling — tool use loop

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools = [
  {
    name: "execute_code",
    description: "Run Python code in a sandbox and return stdout/stderr",
    input_schema: {
      type: "object",
      properties: {
        code: { type: "string", description: "Python code to execute" }
      },
      required: ["code"]
    }
  },
  {
    name: "web_search",
    description: "Search the web and return top results",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string" }
      },
      required: ["query"]
    }
  }
];

// Agentic loop
let messages = [{ role: "user", content: "Scrape the top 10 HN stories and plot upvotes" }];
while (true) {
  const response = await client.messages.create({
    model: "claude-opus-4-5", max_tokens: 4096, tools, messages
  });

  if (response.stop_reason === "end_turn") break;  // done — no more tool calls

  // Append the assistant turn once, then return all tool results
  // together in a single user message
  messages.push({ role: "assistant", content: response.content });
  const toolResults = [];
  for (const block of response.content) {
    if (block.type === "tool_use") {
      const result = await executeTool(block.name, block.input);  // your tool runtime
      toolResults.push({ type: "tool_result", tool_use_id: block.id, content: result });
    }
  }
  messages.push({ role: "user", content: toolResults });
}

| Tool category | Examples | Risk level |
| --- | --- | --- |
| Code execution | Python sandbox, JavaScript runner, shell commands | High — can modify state; use sandboxing |
| Web search | Bing/Google search, URL fetching, scraping | Low — read-only |
| File I/O | Read/write files, create documents, parse PDFs | Medium — can overwrite files |
| API calls | REST APIs, database queries, external services | High — irreversible actions (email, payments) |
| Browser control | Playwright, Selenium — click, fill forms, navigate | High — can submit forms, make purchases |
| Memory | Vector DB writes, note-taking, state persistence | Low — but affects future context |

The agentic loop in detail

Concretely: a user gives the goal "Analyze this CSV and create a visualization." Here is what the agent does autonomously:

  1. Plan: "I should load the CSV, examine its structure and data types, then create an appropriate visualization."
  2. Act: Calls execute_code with Python to load the CSV and print df.head(), df.dtypes, df.describe().
  3. Observe: Sees columns [date, revenue, users, region], revenue and users are numeric, region is categorical.
  4. Re-plan: "A grouped bar chart of revenue by region over time would show the key trends."
  5. Act: Calls execute_code with matplotlib/seaborn code to generate the chart.
  6. Observe: Code raises a ValueError — date column is a string, not datetime.
  7. Self-correct: Adds pd.to_datetime(df["date"]) to the code, re-executes.
  8. Observe: Chart generated successfully at /tmp/chart.png.
  9. Complete: Returns the chart + written interpretation of the trends it shows.
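Steps 6–8 (observe an error, revise, re-execute) can be sketched as a retry wrapper. `execute` and `generateFix` are hypothetical helpers: in practice `generateFix` would be another LLM call that sees the traceback.

```javascript
// Self-correction sketch: on failure, feed the error back into a
// (hypothetical) fix-generation step and retry with the revised code.
function runWithSelfCorrection(code, execute, generateFix, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const result = execute(code);
    if (result.ok) return result.output;    // Observe: success
    code = generateFix(code, result.error); // Self-correct: revise and retry
  }
  throw new Error("could not self-correct within the retry budget");
}
```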

ReAct: the standard agent prompting pattern

ReAct (Yao et al., 2022) formalizes the agentic loop as alternating Thought → Action → Observation cycles: (1) Thought: the agent reasons about current state. (2) Action: calls a tool with specific arguments. (3) Observation: receives the result. Prompting the model to always produce explicit Thought steps before acting dramatically improves reliability — the model reasons before committing to an action. Most production agent frameworks (LangChain, LlamaIndex, AutoGPT) implement ReAct-style loops.
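A ReAct turn is plain text, so frameworks that do not use native function calling parse it with a format convention. A sketch, assuming the common `Action: tool[input]` convention from the paper:

```javascript
// Parse one ReAct-formatted model turn of the form:
//   "Thought: ...\nAction: tool[input]"
// The exact format is a convention, not a standard; production systems
// increasingly use structured function calling instead.
function parseReActTurn(text) {
  const thought = text.match(/Thought:\s*(.+)/)?.[1] ?? null;
  const action = text.match(/Action:\s*(\w+)\[(.*)\]/);
  return {
    thought,
    action: action ? { tool: action[1], input: action[2] } : null,
  };
}
```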

Multi-agent systems

Complex tasks benefit from specialization — an orchestrator agent breaks the goal into sub-tasks and delegates to specialist agents. The key challenge: errors compound across agents.

| Agent role | Responsibility | Example |
| --- | --- | --- |
| Orchestrator | Decomposes goal, assigns tasks, synthesizes results | "Build a full-stack app" → assigns frontend, backend, DB agents |
| Coder | Writes and iterates on code | GitHub Copilot, Claude Code, Devin |
| Researcher | Searches web, reads papers, synthesizes findings | Perplexity-style deep research |
| QA / Critic | Reviews outputs for errors, runs tests, suggests fixes | Code review agent, fact-checker |
| Memory manager | Maintains shared context — writes/reads from vector DB | Stores progress notes for other agents |
| Tool specialist | Calls external APIs (Stripe, Slack, Salesforce) | Zapier/Make.com-style automation |
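Delegation by the orchestrator can be sketched as a routing step. `specialists` maps role names to agent functions; the names are illustrative.

```javascript
// Orchestrator sketch: route each sub-task to the matching specialist
// agent and collect the outputs. In a real system a synthesis step
// (another LLM call) would merge these outputs; here we just collect them.
function orchestrate(subtasks, specialists) {
  return subtasks.map(({ role, task }) => {
    const agent = specialists[role];
    if (!agent) throw new Error(`no specialist registered for role: ${role}`);
    return { role, output: agent(task) };
  });
}
```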

Error compounding is the main risk

In a 5-agent pipeline, each agent operating at 90% accuracy produces a correct end-to-end result only 0.9⁵ ≈ 59% of the time. This is why multi-agent systems need checkpointing (human review at key milestones), confidence scoring (agents flag uncertainty), and extensive testing. Current best practice: keep pipelines short (3–4 agents max), verify outputs between stages, and always include a QA/critic agent.
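The compounding arithmetic is just the per-stage success probabilities multiplied together, assuming each stage fails independently:

```javascript
// End-to-end success of an n-stage pipeline where each stage succeeds
// independently with probability p.
function pipelineSuccess(p, n) {
  return Math.pow(p, n);
}
// Five 90%-accurate agents: 0.9^5 ≈ 0.59, i.e. ~59% end-to-end
```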

Risks and safety in agentic systems

| Risk | Description | Mitigation |
| --- | --- | --- |
| Consequential actions | Deleting files, sending emails, making purchases — hard to undo | Sandboxing; confirmation prompts for irreversible actions; dry-run mode |
| Prompt injection | Malicious text in the environment (webpage, file, API response) hijacks agent goals | Input sanitization; don't execute instructions from retrieved content |
| Goal misinterpretation | "Delete all errors" → deletes error-handling code; agent achieves literal goal, not intent | Structured goal specification; intermediate confirmation; critic agents |
| Infinite loops | Agent repeats same failing action; no exit condition | Max iteration limits; detecting repeated actions; cost budgets |
| Cost blowouts | Many tool calls = many tokens; unbounded agent runs can cost $100s | Token budgets; iteration limits; cost alerts |
| Over-automation | Removing human oversight from critical decisions | Human-in-the-loop checkpoints; irreversible action gates |
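Several of these mitigations (iteration limits, cost budgets, confirmation gates) can live in a single per-step guard. A sketch; the tool names and thresholds are illustrative:

```javascript
// Per-step guard: enforce an iteration cap and token budget, and require
// explicit confirmation before any irreversible tool runs.
const IRREVERSIBLE_TOOLS = new Set(["send_email", "make_payment", "delete_file"]);

function guardedStep(state, call, { maxSteps = 20, tokenBudget = 100000, confirm }) {
  if (state.steps >= maxSteps) throw new Error("iteration limit reached");
  if (state.tokensUsed >= tokenBudget) throw new Error("token budget exhausted");
  if (IRREVERSIBLE_TOOLS.has(call.tool) && !confirm(call)) {
    return { skipped: true, reason: "user declined irreversible action" };
  }
  state.steps += 1;
  return { skipped: false };
}
```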

LumiChats Agent Mode sandbox

LumiChats Agent Mode runs entirely inside a WebContainer — a sandboxed Node.js environment in the browser. There is no filesystem access beyond the container, no network access to external services, and no persistence after the session. All code execution is isolated to the browser tab. Actions are logged in real time so you can monitor every step, and you can interrupt the agent at any point. This eliminates the irreversible-action risk entirely for the LumiChats use case.

Practice questions

  1. What is the ReAct (Reasoning + Acting) pattern in agentic AI? (Answer: ReAct (Yao et al. 2022): the agent alternates between Reasoning steps (internal thought about what to do next) and Acting steps (calling a tool). Format: Thought: I need to find the current stock price... Action: search[AAPL stock price today] Observation: AAPL is at $189.32 Thought: Now I can answer... Final answer: AAPL is $189.32. The explicit reasoning steps make the agent's decisions interpretable and allow self-correction if a reasoning step is wrong.)
  2. What are the main failure modes of AI agents? (Answer: (1) Hallucination of tool results: agent 'recalls' information instead of calling the tool, fabricating results. (2) Infinite loops: agent calls the same tool repeatedly without progress. (3) Prompt injection from tool results: malicious content in search/document results hijacks agent behaviour. (4) Over-permissive tool use: agent takes irreversible actions (send email, delete file) without confirming. (5) Context overflow: long agentic runs accumulate context until the window fills, causing the agent to lose track of its goal.)
  3. What is the difference between a single-agent and multi-agent system? (Answer: Single-agent: one LLM with multiple tools, handles all subtasks sequentially. Simpler, easier to debug. Multi-agent: multiple specialised LLM agents, each with specific tools and expertise — an orchestrator coordinates them. Example: researcher agent (web search) + analyst agent (code execution) + writer agent (drafting). Advantages: specialisation (each agent optimised for its role), parallelism (agents work concurrently), scalability. Disadvantages: coordination overhead, error propagation between agents, debugging difficulty.)
  4. How does human-in-the-loop oversight work in production AI agent systems? (Answer: Checkpoints: the agent pauses at predefined high-risk action points (sending email, making API calls with financial impact, deleting data) and requires explicit user approval before proceeding. Confirmation prompts: 'I plan to send this email to 500 contacts — confirm?' Audit logging: every tool call, result, and decision is logged for review. Rollback: reversible actions use transactions or staging environments before committing. Interrupt handlers: users can stop the agent mid-run. Minimal footprint principle: agents request only necessary permissions and prefer reversible actions.)
  5. What is the difference between agentic AI and traditional RPA (Robotic Process Automation)? (Answer: RPA: rule-based automation of structured workflows — scrapes data from fixed UI positions, follows rigid decision trees. Brittle: breaks when UI changes. Cannot handle ambiguity or novel situations. Agentic AI: understands intent from natural language, adapts to UI changes, handles ambiguous situations through reasoning, and can decompose novel tasks it has never seen before. RPA is scripted; agents are reasoning. Hybrid approaches (AI + RPA) use agents for decision-making and RPA for reliable execution of structured steps.)

On LumiChats

LumiChats Agent Mode gives the AI a full sandboxed Node.js environment (WebContainer) running in your browser. The agent writes code, executes it, reads output, fixes errors, and generates downloadable files — all without needing a server.
