Summary
My boss showed me the white paper of sierra.ai and gave me a mission to do the same. So I converted a shopping mall backend server into an AI agent with LLM function calling, enhanced by compiler skills, and it worked fine. Impressed by the demonstration, my boss decided to open source our solution.
This is @agentica, an AI agent framework specialized in LLM function calling. You can also automate frontend development with @autoview.
- Github Repository: https://github.com/wrtnlabs/agentica
- Homepage: https://wrtnlabs.io/agentica
1. Preface
Last year, my boss showed me the white paper of sierra.ai, a $4.5 billion corporation founded by an OpenAI board member. He asked me why we couldn't do something similar to sierra.ai, and challenged me to prove why he should continue paying my salary.
Looking at sierra.ai's homepage, they appear to focus on AI agent development for e-commerce and counseling. However, their AI agents are not yet complete. It was not possible to search for or buy products from their chatbot. When I asked for a refund, sierra.ai's agent just told me:

"Contact the email address below, and request the refund by yourself."
So luckily, I was able to avoid the pink slip and find an opportunity.
Since the mission was urgent, I simply took a swagger.json file from an e-commerce backend server and converted it into LLM (Large Language Model) function calling schemas. As the number of API functions was large (289), I also composed an agent orchestration strategy that filters the proper functions from the conversation context.

After that, I created the AI agent application and demonstrated it to my boss.
In the demonstration, everything worked perfectly: searching and purchasing products, order and delivery management, customer support with refund features, discount coupons, and account deposits.
After the demonstration, my boss said:

"Hey, we should open source this. Our company and staff size are significantly smaller than sierra.ai's, so we cannot compete with them directly. Instead, we can release our solution as open source. Let's make our technology world-famous."
2. Agentic AI Framework
This is @agentica, an Agentic AI framework specialized in LLM function calling.

No complex workflows are required. Just list up the functions to call from the protocols below. If you want to make a great-scale, enterprise-level AI agent, list up a lot of functions related to the subject. Otherwise, if you want to make a simple agent, list up the few functions that you need. That's all.

By concentrating on such LLM function calling, together with the ecosystem supporting it (the compiler library typia), we could reach the new Agentic AI era.
- Github Repository: https://github.com/wrtnlabs/agentica
- Homepage: https://wrtnlabs.io/agentica
- Three protocols serving the functions to call: TypeScript classes, Swagger/OpenAPI documents, and MCP servers
```typescript
import { Agentica, assertHttpLlmApplication } from "@agentica/core";
import typia from "typia";

const agent = new Agentica({
  controllers: [
    assertHttpLlmApplication({
      model: "chatgpt",
      document: await fetch(
        "https://shopping-be.wrtn.ai/editor/swagger.json",
      ).then((r) => r.json()),
      connection: {
        host: "https://shopping-be.wrtn.ai",
        headers: {
          Authorization: "Bearer ********",
        },
      },
    }),
    typia.llm.application<MobileCamera, "chatgpt">(),
    typia.llm.application<MobileFileSystem, "chatgpt">(),
    typia.llm.application<MobilePhoneCall, "chatgpt">(),
  ],
});
await agent.conversate("I wanna buy MacBook Pro");
```
2.2. LLM Function Calling
The AI selects the proper function and fills its arguments.

Nowadays, most LLMs (Large Language Models) like OpenAI's support the "function calling" feature. "LLM function calling" means that the LLM automatically selects a proper function and fills its parameter values from the conversation with the user (mainly through chat text).

Structured output is another LLM feature. "Structured output" means that the LLM automatically transforms its conversational output into a structured data format like JSON.

@agentica concentrates on such LLM function calling, and does everything through it. The new Agentic AI era can only be realized through this function calling technology.
- https://platform.openai.com/docs/guides/function-calling
- https://platform.openai.com/docs/guides/structured-outputs
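To make this concrete, here is a minimal sketch of a hand-written function calling schema in the OpenAI Chat Completions style. The get_weather function is purely illustrative and not part of @agentica:

```typescript
// A hand-written function calling schema in the OpenAI style.
// The LLM reads this schema, selects the function, and fills the
// arguments from the conversation with the user.
const getWeatherTool = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Get the current weather for a city.",
    parameters: {
      type: "object",
      properties: {
        city: {
          type: "string",
          description: 'City name, e.g. "Seoul"',
        },
      },
      required: ["city"],
    },
  },
};
```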
2.3. Function Calling vs Workflow
Scalable, flexible, and mass-productive, or not: workflows are not scalable, but function calling is.
In the traditional agent development method, whenever the agent's functionality expanded, AI developers drew more and more complex agent workflows and graphs. However, no matter how elaborate the workflow graph, the accuracy of the agent dropped significantly as the functionality expanded.

This is because whenever a new node is added to the agent graph, the number of processes to go through increases, and the success rate decreases multiplicatively with the number of processes. For example, if five agent nodes are listed sequentially and each node has an 80% success rate, the final success rate becomes 32.77% (0.8^5 ≈ 0.3277).
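The arithmetic is easy to reproduce:

```typescript
// Five sequential workflow nodes, each with an 80% success rate:
// the compounded success rate is 0.8 ** 5.
const finalSuccessRate = Math.pow(0.8, 5);
console.log(finalSuccessRate); // 0.32768 → about 32.77%
```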
To hedge against this multiplicative decay of the success rate, AI developers need to construct a much more complex graph that partitions each event independently. This inevitably makes AI agent development difficult, and makes it hard to respond to changing requirements such as adding or modifying features.

To mitigate this compounding disaster, AI developers often create a new supervisor workflow as an add-on to the main workflow's nodes. If functionality needs to expand further, this leads to fractal patterns of workflows. To avoid the compounding disaster, AI developers must face another, fractal disaster.
Using such workflow approaches, would it be possible to create a shopping chatbot agent? Is it possible to build an enterprise-level chatbot? This explains why we mostly see special-purpose chatbots or chatbots that resemble toy projects in the world today.
The problem stems from the fact that agent workflows themselves are difficult to create and have extremely poor scalability and flexibility.
Jensen Huang's graph, and his advocacy of Agentic AI
On the contrary, if you develop an AI agent just by listing up the functions to call, the agent application becomes scalable, flexible, and mass-productive.

If you want to make a large-scale, enterprise-level AI agent, list up a lot of functions related to the subject. Otherwise, if you want to make a simple agent, list up the few functions that you need. That's all.

AI agent development becomes much easier than in the workflow case, and whenever you need to change the agent, you can do it just by adding or removing some functions to call. Such function-driven AI development is the only way to accomplish Agentic AI.
```typescript
import { Agentica } from "@agentica/core";
import typia from "typia";

const agent = new Agentica({
  controllers: [
    typia.llm.application<MobileCamera, "chatgpt">(),
    typia.llm.application<MobileFileSystem, "chatgpt">(),
    typia.llm.application<MobilePhoneCall, "chatgpt">(),
  ],
});
await agent.conversate("I wanna buy MacBook Pro");
```
3. Compiler Driven Development
I love zod, and think it is a convenient tool for schema generation. What I wanted was a comparison with hand-crafting a JSON schema, but somehow this video came out.
In fact, the function calling driven AI agent development strategy has existed since 2023. When OpenAI released its function calling spec in 2023, many AI researchers predicted that function calling would rule the world.

However, the strategy could not be widely adopted, because writing a function calling schema was too cumbersome, difficult, and risky. Because of this, the workflow agent graph came to dominate AI agent development instead of function calling.
Today, @agentica revives the function calling driven strategy with compiler skills. From now on, LLM function calling schemas will be composed automatically by the TypeScript compiler.
3.1. Problem of Hand-written Schema
| Type | Value |
|---|---|
| Number of functions | 289 |
| LOC of source code | 37,752 |
| LOC of function schemas | 212,069 |
| Compiler success rate | 100.00000 % |
| Human success rate | 0.00004 % |
The number of functions in @samchon/shopping-backend was 289, and its LOC (Lines of Code) was 37,752. Then how about the LOC of the LLM function calling schemas? Surprisingly, it is 212,069 LOC, about 5 times bigger than the original source code.

In traditional AI development, these LLM function calling schemas were hand-written by humans. A human needs to write a schema of about 500 to 1,000 lines for each function. Can a human really do that without making mistakes? I think it is a very generous assumption that humans fail at each function only 5% of the time.

However, even with that generous assumption of a 95% per-function success rate, the success rate of composing all the schemas converges to 0% (0.95^289 ≈ 0.0000004). I think this is the reason why AI agents have evolved around workflow agent graphs rather than function calling.
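The compounding works out like this:

```typescript
// 289 hand-written schemas, each with a generous 95% success rate:
// the probability of writing every one of them without a mistake.
console.log(Math.pow(0.95, 289)); // ≈ 3.7e-7, effectively 0%
```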
When OpenAI announced function calling in 2023, many people thought function calling was a magic bullet, and in fact, they presented the same vision as @agentica.

In the traditional development ecosystem, if a human (maybe a backend developer) makes a mistake in schema writing, co-workers (maybe frontend developers) can work around it by intuition. However, AI never forgives it. If there is any schema-level mistake, it breaks the whole agent application. Therefore, in 2023, AI developers could not overcome these human errors, abandoned the function calling driven strategy, and adopted the workflow agent graph strategy instead.
Today, @agentica revives the function calling strategy with its compiler skills. The compiler composes function calling schemas automatically and systematically, making the success rate 100%.
3.2. Function Calling Schema by Compiler
```typescript
import typia from "typia";

class BbsArticleService { ... }

typia.llm.application<BbsArticleService, "chatgpt">();
```
Compiler revives Agentic AI.
If the LLM function calling schema can be generated by a compiler, every problem arising from human mistakes is resolved, so that function calling driven AI agent development can be revived.
To build schemas by compiler, @agentica utilizes the typia.llm.application<Class, Model>() function. When you call this function with a class type, typia analyzes the source code of the target class and its related DTO types, then generates the LLM function calling schema at the compilation level.

@samchon/shopping-backend was also made by such compiler driven LLM schema generation. Every function and DTO type defined in the project is analyzed by the TypeScript compiler with typia (via nestia, a wrapper of typia for NestJS), so that the LLM function calling schema is generated without any error.
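As a hedged sketch of how this looks in practice (the BbsArticleService class and its create method below are illustrative, not taken from @samchon/shopping-backend, and the printed shape assumes typia's ILlmApplication result with a functions array):

```typescript
import typia from "typia";

// An illustrative service class. typia reads the TypeScript types and
// JSDoc comments at compile time and emits the function calling schema,
// so no hand-written JSON schema is needed.
class BbsArticleService {
  /**
   * Create a new article.
   *
   * @param input Title and body of the article to create
   * @returns Primary key of the newly created article
   */
  public async create(input: {
    title: string;
    body: string;
  }): Promise<string> {
    return "..."; // placeholder implementation for the sketch
  }
}

// The compiler generates the "chatgpt" function calling schema
// for every public method of the class.
const application = typia.llm.application<BbsArticleService, "chatgpt">();
console.log(application.functions.map((f) => f.name)); // ["create"]
```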
Compiler is the key component to reaching the Agentic AI era.
3.3. Schema Conversion
- Github Repository: https://github.com/samchon/openapi
- Swagger/OpenAPI Specifications
- JSON Schema Specifications in each LLM
  - IChatGptSchema: OpenAI ChatGPT
  - IClaudeSchema: Anthropic Claude
  - IDeepSeekSchema: High-Flyer DeepSeek
  - IGeminiSchema: Google Gemini
  - ILlamaSchema: Meta Llama
- Related Document: https://wrtnlabs.io/agentica/docs/core/controller/swagger/#llm-schema-conversion
You know what? There are a lot of versions of the Swagger and OpenAPI specifications, and even within the same version, ambiguous and duplicated constructs are numerous. Furthermore, JSON schema models differ between LLM vendors too, and some of them do not follow the JSON schema standard.

To overcome this confusing landscape of JSON schema specifications, @agentica utilizes the @samchon/openapi library, which converts any Swagger/OpenAPI document into an emended OpenAPI v3.1 document for clarity and unification, and then converts that into the specific LLM function calling schema of each service vendor by way of a migration schema.
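A minimal sketch of that pipeline, assuming @samchon/openapi's OpenApi.convert and HttpLlm.application entry points (exact signatures may differ across library versions):

```typescript
import { HttpLlm, OpenApi } from "@samchon/openapi";

// Fetch any Swagger/OpenAPI document, whatever its version.
const swagger = await fetch(
  "https://shopping-be.wrtn.ai/editor/swagger.json",
).then((r) => r.json());

// 1. Emend the document into the unified OpenAPI v3.1 form.
const document = OpenApi.convert(swagger);

// 2. Convert it into the vendor-specific function calling schemas.
const application = HttpLlm.application({
  model: "chatgpt", // or "claude", "gemini", "llama", ...
  document,
});
```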
Also, OpenAI GPT and Google Gemini do not support the full JSON schema standard, so they cannot express constraint specs like the format: uuid or minimum: 10 properties. To make OpenAI and Gemini honor these constraints, @samchon/openapi writes JsDocTag comments into the description field, like below:
```json
{
  "type": "string",
  "description": "Primary Key.\n\n@format uuid"
}
```

JSON schema specification of Gemini:

- Banned types
  - IJsonSchema.IReference.$ref
  - IJsonSchema.IOneOf.oneOf
  - IJsonSchema.IAnyOf.anyOf
- Banned constraint properties
  - IJsonSchema.INumber.minimum
  - IJsonSchema.INumber.maximum
  - IJsonSchema.INumber.exclusiveMinimum
  - IJsonSchema.INumber.exclusiveMaximum
  - IJsonSchema.INumber.multipleOf
  - IJsonSchema.IString.minLength
  - IJsonSchema.IString.maxLength
  - IJsonSchema.IString.format
  - IJsonSchema.IString.pattern
  - IJsonSchema.IString.contentMediaType
  - IJsonSchema.IArray.minItems
  - IJsonSchema.IArray.maxItems
  - IJsonSchema.IArray.uniqueItems
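For reference, the @format uuid comment above typically originates from a type-level tag in the TypeScript source. A minimal sketch using typia's tags (the IMember interface is illustrative):

```typescript
import { tags } from "typia";

// Constraints are declared at the type level. For vendors that cannot
// express them natively (e.g. Gemini), they end up as JsDocTag comments
// in the description field instead.
interface IMember {
  id: string & tags.Format<"uuid">; // → "@format uuid"
  age: number & tags.Minimum<10>; // → "@minimum 10"
}
```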
3.4. Validation Feedback
Is LLM Function Calling perfect? The answer is "NO".
You know what? LLMs (Large Language Models) like OpenAI's sometimes make mistakes when composing the arguments of a function call. Even when a simple type like number is defined in the parameters schema, the LLM sometimes fills it with a string typed value.
By the way, @agentica is an AI agent framework specialized in LLM function calling. If the LLM function calling feature is not perfect, doesn't that make @agentica a vulnerable framework?

To correct such LLM function calling mistakes and keep @agentica meaningful, @agentica runs a validation feedback strategy: it reports the validation errors back to the AI agent, inducing the agent to correct its mistakes on the next trial.

For example, when filling a shopping cart with products, the AI agent configures parameters of the wrong type about 60% of the time. Without the validation feedback, @samchon/shopping-backend could not have been converted into the AI chatbot, and I might have gotten the pink slip after that demonstration.
| Name | Status |
|---|---|
| ObjectConstraint | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣ |
| ObjectFunctionSchema | 2️⃣2️⃣4️⃣2️⃣2️⃣2️⃣2️⃣2️⃣5️⃣2️⃣ |
| ObjectHierarchical | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣2️⃣1️⃣1️⃣2️⃣ |
| ObjectJsonSchema | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣ |
| ObjectSimple | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣ |
| ObjectUnionExplicit | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣ |
| ObjectUnionImplicit | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣ |
| ShoppingCartCommodity | 1️⃣2️⃣2️⃣3️⃣1️⃣1️⃣4️⃣2️⃣1️⃣2️⃣ |
| ShoppingOrderCreate | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣ |
| ShoppingOrderPublish | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣❌1️⃣1️⃣1️⃣ |
| ShoppingSaleDetail | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣ |
| ShoppingSalePage | 1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣1️⃣ |
A value of 1️⃣ means that the function calling succeeded without validation feedback; any other number shows how many trials were needed with validation feedback.
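Conceptually, the feedback loop relies on typia's detailed validation result. A minimal sketch (the IShoppingCartCommodity interface here is illustrative, not the real DTO):

```typescript
import typia from "typia";

// Illustrative shape of a function calling argument.
interface IShoppingCartCommodity {
  sale_id: string;
  volume: number;
}

// Validate the arguments composed by the LLM.
const result = typia.validate<IShoppingCartCommodity>({
  sale_id: "some-sale-id",
  volume: "three", // mistake: string instead of number
});

if (result.success === false) {
  // These structured errors are returned to the LLM, inducing it to
  // correct the arguments on the next trial.
  console.log(result.errors);
  // e.g. [{ path: "$input.volume", expected: "number", value: "three" }]
}
```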
4. TypeScript Class, OpenAPI and MCP
@agentica collects the functions to call from three protocols: TypeScript classes, Swagger/OpenAPI documents, and MCP servers.

Nowadays, MCP (Model Context Protocol) has become a very popular protocol for function calling, and OpenAPI is the traditional powerhouse of API specs. @agentica supports both OpenAPI and MCP.

However, @agentica still recommends users to just use a TypeScript class for convenience. Rather than wrapping TypeScript functions and classes into MCP or HTTP/OpenAPI servers, use them directly. It is much easier and more convenient.

Furthermore, if you are developing a mobile application, you can make your AI agent call the device APIs directly by providing a TypeScript class. Below is a demonstration of calling the battery API directly from LLM function calling, something OpenAPI and MCP servers can never do. Of course, OpenAPI and MCP are great protocols, but developers are more familiar with classes.
```typescript
import { Agentica } from "@agentica/core";
import typia from "typia";

import { ReactNativeBatteryService } from "./services/ReactNativeBatteryService";

const agent = new Agentica({
  controllers: [
    {
      protocol: "class",
      name: "battery",
      application:
        typia.llm.application<ReactNativeBatteryService, "chatgpt">(),
      execute: new ReactNativeBatteryService(),
    },
  ],
});
await agent.conversate("How much battery left?");
```
5. AutoView
- Github Repository: https://github.com/wrtnlabs/autoview
- Playground Website: https://wrtnlabs.io/autoview
An AI code generator that builds UI components from type schemas.
@agentica lets you accomplish Agentic AI development just by listing up functions to call. In an AI agent built with @agentica, users can call those functions just through conversation. So, sometimes, you can skip full-scale frontend application development, especially the input component development.

However, even though @agentica can reduce the input component development time, the return value is just printed as markdown content, and you may not be satisfied with the quality of that markdown. In that case, you might consider developing viewer components manually for each type.
For that case, @agentica supports another open source library, @autoview, which generates frontend code from type schemas. If your TypeScript class has 10 functions, @autoview will make 10 React rendering TypeScript components automatically. If your OpenAPI document has 400 API functions, @autoview will automate the development of those 400 UI components.

With @agentica and @autoview, let's make the ideal Agentic AI chatbot.