2023 was supposed to be the year of AI. We've certainly all seen some changes, especially on social media feeds. For example, my LinkedIn is flooded with AI-generated texts and images.
However, I prefer to focus on the realms of Artificial Intelligence that are equally fascinating but more practically useful. I've been using AI, specifically GPT, to transform natural language into structured, machine-readable forms. This approach generalizes well beyond any single example, so the insights in this blog post can inform other equally interesting applications. In this article you will learn how to use GPT for data processing and for letting users provide natural-language input to an application. Let's take a look at some interesting use cases.
It turns out that GPT does a pretty good job of parsing natural language input into data structures defined by a schema. We can use it for transforming manuals into a JSON graph that can then be used for, e.g., presenting step-by-step instructions or visualizing dependencies.
Example: Let’s cook!
We can use GPT to transform a recipe into a graph, defined by a JSON structure. This can then be used to, for example, present step-by-step instructions or visualize the dependencies between them.
The Upside team found a pretty good spaghetti recipe available here.
We can try transforming it into JSON.
First, let’s define a JSON schema describing the output we’d like to get back from GPT. It can look like this:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://upsidelab.io/recipe.schema.json",
"title": "Recipe",
"description": "A recipe definition",
"type": "object",
"properties": {
"ingredients": {
"description": "The list of ingredients required to prepare the recipe",
"type": "array",
"items": {
"$ref": "#Ingredient"
}
},
"steps": {
"description": "The list of steps required to prepare the recipe",
"type": "array",
"items": {
"$ref": "#Step"
}
}
},
"$defs": {
"ingredient": {
"$anchor": "Ingredient",
"type": "object",
"properties": {
"id": { "type": "number" },
"name": { "type": "string" },
"quantity": { "type": "string" }
}
},
"step": {
"$anchor": "Step",
"type": "object",
"properties": {
"id": { "type": "number" },
"name": { "type": "string" },
"description": { "type": "string" },
"dependsOnSteps": {
"type": "array",
"items": { "type": "number" }
},
"dependsOnIngredients": {
"type": "array",
"items": { "type": "number" }
}
}
}
}
}
Then a prompt for GPT. We’ll split it into three parts:
- Schema definition
- The recipe to be parsed
- A summary of what we expect it to return (just the JSON, conforming to the schema)
Summarize the following recipe into a json that conforms to the following json schema delimited by three backticks
\`
${schemaDefinition}
\`
Use the following recipe, which is delimited by three backticks
\`
${recipe}
\`
Respond with the JSON only. Your output should be parsable by a JSON parser and should conform to the JSON schema defined before.
To get stable results, you may need to tweak the prompt. We’ve noticed that for more complex domains, GPT often requires a prompt that specifies our requirements more explicitly. For a spaghetti recipe, though, this is enough.
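Assembled in code, the prompt is just string concatenation. Here's a minimal sketch (the function name is my own, and the send-to-GPT step is left to whatever client your application already uses; the backtick delimiters are built programmatically to keep them out of the string literals):

```javascript
const delimiter = '`'.repeat(3); // three backticks

// Assemble the three prompt parts described above into a single string.
function buildRecipePrompt(schemaDefinition, recipe) {
  return [
    'Summarize the following recipe into a json that conforms to the following json schema delimited by three backticks',
    delimiter,
    schemaDefinition,
    delimiter,
    'Use the following recipe, which is delimited by three backticks',
    delimiter,
    recipe,
    delimiter,
    'Respond with the JSON only. Your output should be parsable by a JSON parser and should conform to the JSON schema defined before.',
  ].join('\n');
}
```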
Sample output:
{
"ingredients": [
{ "id": 1, "name": "olive oil", "quantity": "1 tbsp" },
{ "id": 2, "name": "smoked streaky bacon", "quantity": "4 rashers" },
{ "id": 3, "name": "medium onions", "quantity": "2" },
{ "id": 4, "name": "carrots", "quantity": "2" },
{ "id": 5, "name": "celery sticks", "quantity": "2" },
{ "id": 6, "name": "garlic cloves", "quantity": "2" },
{ "id": 7, "name": "rosemary sprigs", "quantity": "2-3" },
{ "id": 8, "name": "beef mince", "quantity": "500g" },
{ "id": 9, "name": "plum tomatoes", "quantity": "2 x 400g tins" },
{ "id": 10, "name": "basil", "quantity": "small pack" },
{ "id": 11, "name": "dried oregano", "quantity": "1 tsp" },
{ "id": 12, "name": "bay leaves", "quantity": "2" },
{ "id": 13, "name": "tomato purée", "quantity": "2 tbsp" },
{ "id": 14, "name": "beef stock cube", "quantity": "1" },
{ "id": 15, "name": "red chilli", "quantity": "1 (optional)" },
{ "id": 16, "name": "red wine", "quantity": "125ml" },
{ "id": 17, "name": "cherry tomatoes", "quantity": "6" },
{ "id": 18, "name": "parmesan", "quantity": "75g" },
{ "id": 19, "name": "spaghetti", "quantity": "400g" },
{ "id": 20, "name": "crusty bread", "quantity": "to serve (optional)" }
],
"steps": [
{
"id": 1,
"name": "Prepare the saucepan",
"description": "Put a large saucepan on a medium heat and add 1 tbsp olive oil.",
"dependsOnSteps": [],
"dependsOnIngredients": [1]
},
{
"id": 2,
"name": "Cook bacon",
"description": "Add 4 finely chopped bacon rashers and fry for 10 mins until golden and crisp.",
"dependsOnSteps": [1],
"dependsOnIngredients": [2]
},
{
"id": 3,
"name": "Add vegetables and herbs",
"description": "Reduce the heat and add the 2 onions, 2 carrots, 2 celery sticks, 2 garlic cloves and the leaves from 2-3 sprigs rosemary, all finely chopped, then fry for 10 mins.",
"dependsOnSteps": [2],
"dependsOnIngredients": [3, 4, 5, 6, 7]
},
{
"id": 4,
"name": "Brown the mince",
"description": "Increase the heat to medium-high, add 500g beef mince and cook stirring for 3-4 mins until the meat is browned all over.",
"dependsOnSteps": [3],
"dependsOnIngredients": [8]
},
{
"id": 5,
"name": "Add bolognese sauce ingredients",
"description": "Add 2 tins plum tomatoes, the finely chopped leaves from ¾ small pack basil, 1 tsp dried oregano, 2 bay leaves, 2 tbsp tomato purée, 1 beef stock cube, 1 deseeded and finely chopped red chilli (if using), 125ml red wine and 6 halved cherry tomatoes. Stir well.",
"dependsOnSteps": [4],
"dependsOnIngredients": [9, 10, 11, 12, 13, 14, 15, 16, 17]
},
{
"id": 6,
"name": "Simmer the sauce",
"description": "Bring to the boil, reduce to a gentle simmer and cover with a lid. Cook for 1 hr 15 mins stirring occasionally, until you have a rich, thick sauce.",
"dependsOnSteps": [5],
"dependsOnIngredients": []
},
{
"id": 7,
"name": "Add parmesan and season",
"description": "Add the 75g grated parmesan, check the seasoning and stir.",
"dependsOnSteps": [6],
"dependsOnIngredients": [18]
},
{
"id": 8,
"name": "Cook spaghetti",
"description": "When the bolognese is nearly finished, cook 400g spaghetti following the pack instructions.",
"dependsOnSteps": [7],
"dependsOnIngredients": [19]
},
{
"id": 9,
"name": "Serve",
"description": "Drain the spaghetti and either stir into the bolognese sauce, or serve the sauce on top. Serve with more grated parmesan, the remaining basil leaves and crusty bread, if you like.",
"dependsOnSteps": [8],
"dependsOnIngredients": [10, 18, 20]
}
]
}
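Since each step lists the steps it depends on, the steps array forms a directed acyclic graph, and we can derive a valid cooking order with a simple topological sort. A sketch over the parsed output (not production-grade, but enough for step-by-step instructions):

```javascript
// Returns step ids in an order where every step comes after its dependencies.
function topologicalOrder(steps) {
  // Track which dependencies of each step are still unsatisfied.
  const remainingDeps = new Map(steps.map((s) => [s.id, new Set(s.dependsOnSteps)]));
  const order = [];
  while (order.length < steps.length) {
    // Pick any step that isn't scheduled yet and has no remaining dependencies.
    const ready = steps.find(
      (s) => !order.includes(s.id) && remainingDeps.get(s.id).size === 0
    );
    if (!ready) throw new Error('Cycle detected in step dependencies');
    order.push(ready.id);
    // Mark this step as done for everyone who depends on it.
    for (const deps of remainingDeps.values()) deps.delete(ready.id);
  }
  return order;
}
```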
This is something we can easily display on a graph:
If we aren’t satisfied with the outcome, we can also use a follow-up prompt to mutate the structure. If you look at the recipe, step 8 (cook spaghetti) is independent of the previous steps.
Let’s ask GPT to change that:
Step 8 doesn't depend on any of the previous steps and can be done in parallel. Change the resulting JSON to reflect that
{
"ingredients": [
{ "id": 1, "name": "olive oil", "quantity": "1 tbsp" },
{ "id": 2, "name": "smoked streaky bacon", "quantity": "4 rashers" },
{ "id": 3, "name": "medium onions", "quantity": "2" },
{ "id": 4, "name": "carrots", "quantity": "2" },
{ "id": 5, "name": "celery sticks", "quantity": "2" },
{ "id": 6, "name": "garlic cloves", "quantity": "2" },
{ "id": 7, "name": "rosemary sprigs", "quantity": "2-3" },
{ "id": 8, "name": "beef mince", "quantity": "500g" },
{ "id": 9, "name": "plum tomatoes", "quantity": "2 x 400g tins" },
{ "id": 10, "name": "basil", "quantity": "small pack" },
{ "id": 11, "name": "dried oregano", "quantity": "1 tsp" },
{ "id": 12, "name": "bay leaves", "quantity": "2" },
{ "id": 13, "name": "tomato purée", "quantity": "2 tbsp" },
{ "id": 14, "name": "beef stock cube", "quantity": "1" },
{ "id": 15, "name": "red chilli", "quantity": "1 (optional)" },
{ "id": 16, "name": "red wine", "quantity": "125ml" },
{ "id": 17, "name": "cherry tomatoes", "quantity": "6" },
{ "id": 18, "name": "parmesan", "quantity": "75g" },
{ "id": 19, "name": "spaghetti", "quantity": "400g" },
{ "id": 20, "name": "crusty bread", "quantity": "to serve (optional)" }
],
"steps": [
{
"id": 1,
"name": "Prepare the saucepan",
"description": "Put a large saucepan on a medium heat and add 1 tbsp olive oil.",
"dependsOnSteps": [],
"dependsOnIngredients": [1]
},
{
"id": 2,
"name": "Cook bacon",
"description": "Add 4 finely chopped bacon rashers and fry for 10 mins until golden and crisp.",
"dependsOnSteps": [1],
"dependsOnIngredients": [2]
},
{
"id": 3,
"name": "Add vegetables and herbs",
"description": "Reduce the heat and add the 2 onions, 2 carrots, 2 celery sticks, 2 garlic cloves and the leaves from 2-3 sprigs rosemary, all finely chopped, then fry for 10 mins.",
"dependsOnSteps": [2],
"dependsOnIngredients": [3, 4, 5, 6, 7]
},
{
"id": 4,
"name": "Brown the mince",
"description": "Increase the heat to medium-high, add 500g beef mince and cook stirring for 3-4 mins until the meat is browned all over.",
"dependsOnSteps": [3],
"dependsOnIngredients": [8]
},
{
"id": 5,
"name": "Add bolognese sauce ingredients",
"description": "Add 2 tins plum tomatoes, the finely chopped leaves from ¾ small pack basil, 1 tsp dried oregano, 2 bay leaves, 2 tbsp tomato purée, 1 beef stock cube, 1 deseeded and finely chopped red chilli (if using), 125ml red wine and 6 halved cherry tomatoes. Stir well.",
"dependsOnSteps": [4],
"dependsOnIngredients": [9, 10, 11, 12, 13, 14, 15, 16, 17]
},
{
"id": 6,
"name": "Simmer the sauce",
"description": "Bring to the boil, reduce to a gentle simmer and cover with a lid. Cook for 1 hr 15 mins stirring occasionally, until you have a rich, thick sauce.",
"dependsOnSteps": [5],
"dependsOnIngredients": []
},
{
"id": 7,
"name": "Add parmesan and season",
"description": "Add the 75g grated parmesan, check the seasoning and stir.",
"dependsOnSteps": [6],
"dependsOnIngredients": [18]
},
{
"id": 8,
"name": "Cook spaghetti",
"description": "When the bolognese is nearly finished, cook 400g spaghetti following the pack instructions.",
"dependsOnSteps": [],
"dependsOnIngredients": [19]
},
{
"id": 9,
"name": "Serve",
"description": "Drain the spaghetti and either stir into the bolognese sauce, or serve the sauce on top. Serve with more grated parmesan, the remaining basil leaves and crusty bread, if you like.",
"dependsOnSteps": [7, 8],
"dependsOnIngredients": [10, 18, 20]
}
]
}
Because the previous response is part of the follow-up prompt's context, GPT does a pretty good job of keeping the unchanged elements stable.
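Under the hood this works because chat models are stateless: the follow-up request resends the earlier prompt together with the model's previous answer. A sketch of maintaining that history in the Chat Completions message format (the helper name is mine; your client would send the whole array, not just the last message):

```javascript
// Append the model's last reply and the user's follow-up to the history.
function withFollowUp(messages, assistantReply, followUpPrompt) {
  return [
    ...messages,
    { role: 'assistant', content: assistantReply },
    { role: 'user', content: followUpPrompt },
  ];
}
```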
Use case - build a BPMN specification of a business process
Let’s expand this further - this approach allows us to generate a BPMN definition of a business process, which can subsequently be used by a tool like Camunda.
Let’s try the following prompt:
Generate camunda xml file that describes a process delimited by three backticks
\`
${processDescription}
\`
Respond with the XML only. The XML should contain process definition and a diagram of the process. The XML should be a valid camunda xml file and should be parsable by its modeller tool. There should be nothing except the xml in the response.
Where process description could be for example:
An invoice is received by a system, parsed by an OCR system, then reviewed by an accountant. After the initial review, it's sent to a manager for an approval. When the approval is made, it's sent to account payable. If the approval is not given by the manager, it is sent to accounting for further clarification.
After sending that to GPT, we’ll get the following output:
<?xml version="1.0" encoding="UTF-8"?>
<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:camunda="http://camunda.org/schema/1.0/bpmn"
xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI"
xmlns:dc="http://www.omg.org/spec/DD/20100524/DC"
xmlns:di="http://www.omg.org/spec/DD/20100524/DI"
xsi:schemaLocation="http://www.omg.org/spec/BPMN/20100524/MODEL BPMN20.xsd"
id="Definitions_1"
targetNamespace="http://bpmn.io/schema/bpmn">
<process id="invoiceProcess" name="Invoice Processing" isExecutable="true">
<startEvent id="startEvent" name="Invoice Received">
<outgoing>Flow_1</outgoing>
</startEvent>
<sequenceFlow id="Flow_1" sourceRef="startEvent" targetRef="parseInvoice" />
<serviceTask id="parseInvoice" name="Parse Invoice with OCR" camunda:type="external" camunda:topic="parseInvoice">
<incoming>Flow_1</incoming>
<outgoing>Flow_2</outgoing>
</serviceTask>
<sequenceFlow id="Flow_2" sourceRef="parseInvoice" targetRef="reviewByAccountant" />
<userTask id="reviewByAccountant" name="Review by Accountant">
<incoming>Flow_2</incoming>
<outgoing>Flow_3</outgoing>
</userTask>
<sequenceFlow id="Flow_3" sourceRef="reviewByAccountant" targetRef="managerApproval" />
<exclusiveGateway id="managerApproval" name="Manager Approval">
<incoming>Flow_3</incoming>
<outgoing>Flow_4</outgoing>
<outgoing>Flow_5</outgoing>
</exclusiveGateway>
<sequenceFlow id="Flow_4" sourceRef="managerApproval" targetRef="sendToAccountPayable" name="Approved">
<conditionExpression xsi:type="tFormalExpression">\${approved}</conditionExpression>
</sequenceFlow>
<sequenceFlow id="Flow_5" sourceRef="managerApproval" targetRef="sendToAccounting" name="Not Approved">
<conditionExpression xsi:type="tFormalExpression">not(\${approved})</conditionExpression>
</sequenceFlow>
<userTask id="sendToAccountPayable" name="Send to Account Payable">
<incoming>Flow_4</incoming>
<outgoing>Flow_6</outgoing>
</userTask>
<userTask id="sendToAccounting" name="Send to Accounting for Clarification">
<incoming>Flow_5</incoming>
<outgoing>Flow_7</outgoing>
</userTask>
<endEvent id="endEvent" name="Process End">
<incoming>Flow_6</incoming>
<incoming>Flow_7</incoming>
</endEvent>
<sequenceFlow id="Flow_6" sourceRef="sendToAccountPayable" targetRef="endEvent" />
<sequenceFlow id="Flow_7" sourceRef="sendToAccounting" targetRef="endEvent" />
</process>
<bpmndi:BPMNDiagram id="BPMNDiagram_1">
<bpmndi:BPMNPlane id="BPMNPlane_1" bpmnElement="invoiceProcess">
<bpmndi:BPMNShape id="StartEvent_1_di" bpmnElement="startEvent">
<dc:Bounds x="173" y="102" width="36" height="36" />
</bpmndi:BPMNShape>
<bpmndi:BPMNEdge id="Flow_1_di" bpmnElement="Flow_1">
<di:waypoint x="209" y="120" />
<di:waypoint x="261" y="120" />
</bpmndi:BPMNEdge>
<bpmndi:BPMNShape id="ServiceTask_1_di" bpmnElement="parseInvoice">
<dc:Bounds x="261" y="80" width="100" height="80" />
</bpmndi:BPMNShape>
<bpmndi:BPMNEdge id="Flow_2_di" bpmnElement="Flow_2">
<di:waypoint x="361" y="120" />
<di:waypoint x="411" y="120" />
</bpmndi:BPMNEdge>
<bpmndi:BPMNShape id="UserTask_1_di" bpmnElement="reviewByAccountant">
<dc:Bounds x="411" y="80" width="100" height="80" />
</bpmndi:BPMNShape>
<bpmndi:BPMNEdge id="Flow_3_di" bpmnElement="Flow_3">
<di:waypoint x="511" y="120" />
<di:waypoint x="561" y="120" />
</bpmndi:BPMNEdge>
<bpmndi:BPMNShape id="ExclusiveGateway_1_di" bpmnElement="managerApproval" isMarkerVisible="true">
<dc:Bounds x="561" y="95" width="50" height="50" />
</bpmndi:BPMNShape>
<bpmndi:BPMNEdge id="Flow_4_di" bpmnElement="Flow_4">
<di:waypoint x="611" y="120" />
<di:waypoint x="661" y="120" />
</bpmndi:BPMNEdge>
<bpmndi:BPMNEdge id="Flow_5_di" bpmnElement="Flow_5">
<di:waypoint x="586" y="145" />
<di:waypoint x="586" y="195" />
</bpmndi:BPMNEdge>
<bpmndi:BPMNShape id="UserTask_2_di" bpmnElement="sendToAccountPayable">
<dc:Bounds x="661" y="80" width="100" height="80" />
</bpmndi:BPMNShape>
<bpmndi:BPMNShape id="UserTask_3_di" bpmnElement="sendToAccounting">
<dc:Bounds x="536" y="195" width="100" height="80" />
</bpmndi:BPMNShape>
<bpmndi:BPMNEdge id="Flow_6_di" bpmnElement="Flow_6">
<di:waypoint x="761" y="120" />
<di:waypoint x="811" y="120" />
</bpmndi:BPMNEdge>
<bpmndi:BPMNEdge id="Flow_7_di" bpmnElement="Flow_7">
<di:waypoint x="586" y="275" />
<di:waypoint x="586" y="325" />
</bpmndi:BPMNEdge>
<bpmndi:BPMNShape id="EndEvent_1_di" bpmnElement="endEvent">
<dc:Bounds x="811" y="102" width="36" height="36" />
</bpmndi:BPMNShape>
</bpmndi:BPMNPlane>
</bpmndi:BPMNDiagram>
</definitions>
A few noteworthy observations:
- GPT quite often omits some of the xmlns namespace declarations. This can be easily fixed in post-processing, which may be far more cost-effective than regenerating the response.
- I once got a response where the diagram definition looked like the following:
<bpmndi:BPMNPlane id="BPMNPlane_1" bpmnElement="invoiceProcess">
<!-- Diagram layout omitted for brevity -->
</bpmndi:BPMNPlane>
Even though it’s quite amusing, this is an edge case that we’d likely need to handle if aiming for production use. The most common approaches are to add more validations and retries to our application’s code, to improve the prompt to better direct GPT, or to try different model temperatures.
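The validate-and-retry approach can be sketched as a small wrapper. Both `generate` and `isValid` are placeholders here: one stands in for your GPT call, the other for whatever XML (or JSON) validation you apply to the response:

```javascript
// Retry the generation until the output validates, up to maxAttempts times.
async function generateWithRetries(generate, isValid, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await generate(attempt);
    if (isValid(output)) return output;
  }
  throw new Error(`No valid response after ${maxAttempts} attempts`);
}
```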
Use case - build dynamic forms in React
Another common use case that we can try to automate is building custom forms dynamically. Many applications, such as surveys and questionnaires, rely on this capability to gather feedback from their candidates, customers or potential leads.
GPT makes it incredibly easy to build a tool that allows a regular user to create such a form by providing natural language input. Let’s use a library called react-jsonschema-form. It’s a React-based library that generates forms based on a JSON definition. It also contains an extensive schema definition that such JSONs can be validated against.
We can leverage the fact that GPT was trained on a large dataset that also includes definitions of commonly used JSON schemas. That means we don’t have to provide the exact schema, and we can refer to it in our prompt by name.
Let’s create a prompt that:
- Specifies that we would like to use the schema of react-jsonschema-form
- Includes the user's definition of the ideal form
- Specifies some additional criteria for the output (e.g. making sure that we only get JSON as a response)
Generate a JSON definition that follows the schema of react-jsonschema-form. Use the following form that is delimited by three backticks
\`
${userInput}
\`
Respond with the JSON only and don't include any additional characters. Make sure that the output is a valid JSON and that it passes validation with the @rjsf/validator-ajv8 validator
Where userInput could be:
The form should allow a customer to provide an order number (starting with letter R followed by 9 digits), customer's phone number (limited only to country code and digits) and a customer support request. There should also be a field for selecting the type of the issue from a list of the following options: order missing, change in order, return request. At the end of the form, there should be a checkbox labeled "I agree to the privacy policy of company X"
In this case, GPT gets back to us with:
{
"title": "Customer Support Form",
"type": "object",
"required": [
"orderNumber",
"phoneNumber",
"supportRequest",
"issueType",
"privacyPolicyAgreement"
],
"properties": {
"orderNumber": {
"type": "string",
"title": "Order Number",
"pattern": "R[0-9]{9}"
},
"phoneNumber": {
"type": "string",
"title": "Phone Number",
"pattern": "\\+[0-9]+"
},
"supportRequest": {
"type": "string",
"title": "Customer Support Request"
},
"issueType": {
"type": "string",
"title": "Type of Issue",
"enum": [
"order missing",
"change in order",
"return request"
]
},
"privacyPolicyAgreement": {
"type": "boolean",
"title": "I agree to the privacy policy of company X",
"const": true
}
}
}
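Before wiring the result into an app, it's cheap to sanity-check the generated regex patterns locally. Note that JSON Schema `pattern` values are unanchored, so for an exact-match check we anchor them ourselves (a quick sketch; the helper name is mine):

```javascript
// Test a value against a JSON Schema pattern as a full-string match.
function matchesPattern(pattern, value) {
  return new RegExp(`^(?:${pattern})$`).test(value);
}
```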
We can quickly verify it in react-jsonschema-form’s playground: https://rjsf-team.github.io/react-jsonschema-form/
As you can see, it correctly generated a form, together with relevant validations. As with the previous examples, we can make modifications via follow-up prompts.
Let’s try making the privacy policy field checked by default. Sending the following prompt:
The privacy policy agreement should be checked by default
Will result in a new JSON definition of the form:
{
"title": "Customer Support Form",
"type": "object",
"required": ["orderNumber", "phoneNumber", "supportRequest", "issueType"],
"properties": {
"orderNumber": {
"type": "string",
"title": "Order Number",
"pattern": "R[0-9]{9}"
},
"phoneNumber": {
"type": "string",
"title": "Phone Number",
"pattern": "\\+[0-9]+"
},
"supportRequest": {
"type": "string",
"title": "Customer Support Request"
},
"issueType": {
"type": "string",
"title": "Type of Issue",
"enum": ["order missing", "change in order", "return request"]
},
"privacyPolicyAgreement": {
"type": "boolean",
"title": "I agree to the privacy policy of company X",
"default": true
}
}
}
Which looks like this:
As you can see, the checkbox is now selected by default. In the same way, we can make further tweaks to the existing form.
Use case - build dynamic UIs
In the previous example, we were bound to forms, which are useful but limited in terms of what we can do. To build more general UIs dynamically, we can use a tool called react-json-schema, combine it with GPT, and build fully dynamic complex UIs.
To start with, let’s define some components. We’ll use react-bootstrap to simplify the UI work needed.
Let’s make a wrapper component for the whole app:
function AppRoot({ children }) {
return (
<>
{children}
</>
)
}
export default AppRoot
Then a basic layout component:
import { Container } from 'react-bootstrap';
function AppMainLayout({ children }) {
return (
<Container>
{children}
</Container>
)
}
export default AppMainLayout
A customizable navbar:
import Container from 'react-bootstrap/Container';
import Nav from 'react-bootstrap/Nav';
import Navbar from 'react-bootstrap/Navbar';
function AppNavbar({ title, links }) {
return (
<Navbar bg="primary" data-bs-theme="dark">
<Container>
<Navbar.Brand href="/">{title}</Navbar.Brand>
<Nav className="me-auto">
{links && links.map((link) => {
return <Nav.Link href={link.url} key={link.title}>{link.title}</Nav.Link>
})}
</Nav>
</Container>
</Navbar>
)
}
export default AppNavbar;
A toolbar with some action buttons (that we can wire together later):
import Button from 'react-bootstrap/Button';
function AppToolbar() {
return (
<div className="py-2">
<Button variant="primary" className="mx-1">New user</Button>
<Button variant="secondary" className="mx-1">Refresh</Button>
<Button variant="secondary" className="mx-1">Deactivate</Button>
<Button variant="secondary" className="mx-1">Reset password</Button>
</div>
)
}
export default AppToolbar
And a table for displaying a list of users:
import Table from 'react-bootstrap/Table'
function AppUserList() {
return (
<Table striped bordered hover>
<thead>
<tr>
<th>#</th>
<th>Email</th>
<th>First Name</th>
<th>Last Name</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>test@example.com</td>
<td>John</td>
<td>Doe</td>
</tr>
<tr>
<td>2</td>
<td>jack@example.com</td>
<td>Jack</td>
<td>Dummy</td>
</tr>
<tr>
<td>3</td>
<td>thomas@example.com</td>
<td>Tomas</td>
<td>Jonas</td>
</tr>
<tr>
<td>4</td>
<td>paul@example.com</td>
<td>Paul</td>
<td>Test</td>
</tr>
</tbody>
</Table>
)
}
export default AppUserList
Then connect it all together in our web application:
import ReactDOM from 'react-dom/client';
import reportWebVitals from './reportWebVitals';
import ReactJsonSchema from 'react-json-schema';
import AppNavbar from './components/AppNavbar'
import AppRoot from './components/AppRoot'
import AppMainLayout from './components/AppMainLayout'
import AppToolbar from './components/AppToolbar'
import AppUserList from './components/AppUserList'
const uiRenderer = new ReactJsonSchema();
uiRenderer.setComponentMap({ AppRoot, AppNavbar, AppMainLayout, AppToolbar, AppUserList });
// fetchGPTResponse is a hypothetical helper that sends our prompt to GPT and extracts the JSON from the response
const uiSchema = await fetchGPTResponse(prompt);
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(
uiRenderer.parseSchema(uiSchema)
);
It’s easy:
- Initialize react-json-schema parser
- Configure it to recognize our custom components
- Prompt GPT for a schema based on user’s description
- Render the schema that GPT responded with
Now it’s time to define the actual prompt to be used:
Generate a JSON that follows the react-json-schema format for the following set of components and their props delimited by three backticks:
\`
AppRoot (children: Node[]) , AppNavbar (title: string, links: Link[]), AppMainLayout (children: Node[]) , AppToolbar (), AppUserList (), h1 (text: string), h2 (text: string)
\`
The following custom types are defined in the application
\`
type Link = {
title: string;
url: string;
}
\`
The JSON should build a UI following a specification delimited by three backticks
\`
The UI should start with a navbar, a title specifying that this is a dashboard for managing users. Below the title there should be a more detailed description of the dashboard. The main element of the UI should be a table with a list of users. Above the table there should be a toolbar allowing to modify the list of users.
\`
The UI must be wrapped in the AppRoot component. Specify the props directly in the same object as the component, without an additional props object. Only use the props that are defined above. The output should be a valid react-json-schema specification. Don't output any additional characters.
A few noteworthy observations: we started with a list of available components, followed by the props available for each of them. My tests have shown that it's also helpful to explicitly state how props should be provided (without a nested props object). Without that remark, GPT put them inside a nested object, which didn't work with the react-json-schema library. My assumption is that react-json-schema is not a very popular library and lacks extensive examples online, hence the need to point GPT in the right direction.
A sample GPT response will look like:
{
"component": "AppRoot",
"children": [
{
"component": "AppNavbar",
"title": "User Management Dashboard",
"links": [
{
"title": "Home",
"url": "/"
},
{
"title": "Settings",
"url": "/settings"
}
]
},
{
"component": "AppMainLayout",
"children": [
{
"component": "h1",
"text": "User Management Dashboard"
},
{
"component": "h2",
"text": "Manage and oversee user accounts and permissions"
},
{
"component": "AppToolbar",
"children": []
},
{
"component": "AppUserList",
"children": []
}
]
}
]
}
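Before handing such a schema to the renderer, it's worth verifying that every component GPT referenced actually exists in our component map; unknown names would otherwise fail at render time. A sketch over the schema shape above (the allowed plain-tag list mirrors the h1/h2 entries from our prompt):

```javascript
// Walk the schema tree and collect component names missing from the map.
function unknownComponents(node, componentMap, found = new Set()) {
  const htmlTags = new Set(['h1', 'h2']); // plain tags we allowed in the prompt
  if (node.component && !(node.component in componentMap) && !htmlTags.has(node.component)) {
    found.add(node.component);
  }
  for (const child of node.children || []) {
    unknownComponents(child, componentMap, found);
  }
  return found;
}
```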
Which renders quite a decent dashboard! GPT was able to infer that the majority of the layout should go inside a component called AppMainLayout, but didn’t put the AppNavbar there. It also filled the navbar with some example links.
Use case - querying an ElasticSearch index
ElasticSearch is commonly used for building indexes of products or application logs. We can use GPT to translate a user’s natural-language query into an ElasticSearch query.
The approach would be:
- Specify that we’d like to get an ElasticSearch query as a result
- Provide information about the structure of the index
- Provide information about the user's query
- Specify additional criteria for the output
Example: We have an ElasticSearch index with the following definition:
{
"products_production": {
"aliases": {},
"mappings":{
"spree_product": {
"properties": {
"available_on": {
"type":"date",
"format":"dateOptionalTime"
},
"brand_id": {
"type":"keyword"
},
"created_at": {
"type":"date",
"format":"dateOptionalTime"
},
"description": {
"type":"text",
"analyzer":"snowball"
},
"discontinue_on": {
"type":"date",
"format":"dateOptionalTime"
},
"featured": {
"type":"boolean"
},
"name": {
"type":"text",
"boost":100.0,
"fields": {
"untouched": {
"type":"keyword"
}
},
"analyzer":"snowball"
},
"price": {
"type":"double"
},
"sku": {
"type":"keyword"
},
"suggestion_terms": {
"type":"text",
"analyzer":"simple"
},
"taxon_ids": {
"type":"keyword"
}
}
}
},
"settings": {
"index": {
"number_of_shards":"1",
"provided_name":"products_production",
"creation_date":"1684177003621",
"analysis": {
"filter": {
"nGram_filter": {
"token_chars": [
"letter",
"digit",
"punctuation",
"symbol"
],
"min_gram": "3",
"type": "nGram",
"max_gram": "4"
}
},
"analyzer": {
"nGram_analyzer": {
"filter":[
"lowercase",
"asciifolding",
"nGram_filter"
],
"type": "custom",
"tokenizer": "whitespace"
},
"whitespace_analyzer": {
"filter": [
"lowercase",
"asciifolding"
],
"type": "custom",
"tokenizer": "whitespace"
}
}
},
"number_of_replicas": "1",
"uuid": "WV8HWbS_QsudGKL223hwrA",
"version": {
"created": "6080091"
}
}
}
}
}
Create an elasticsearch query json that operates on an index with the following definition delimited by three backticks
\`
${indexDefinition}
\`
The query should return the results described here delimited by three backticks:
\`
Return the list of the most expensive products that are currently available, aren't featured and don't have any taxons associated with them.
\`
Return only a JSON and make sure it can be interpreted by an elasticsearch 6.8 instance
It’s worth mentioning that this approach should only ever be used on an ElasticSearch cluster that is meant to be fully accessible by the user: the generated queries are effectively arbitrary, so the user can read anything the cluster exposes.
It comes back with the desired query:
{
"query": {
"bool": {
"must": [
{
"range": {
"available_on": {
"lte": "now"
}
}
},
{
"term": {
"featured": false
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "taxon_ids"
}
}
}
}
]
}
},
"sort": [
{
"price": {
"order": "desc"
}
}
]
}
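Executing the generated query is a single HTTP POST to the index's `_search` endpoint. A sketch of building that request (the host, index name, and helper name are assumptions based on the mapping above):

```javascript
// Build a fetch-compatible request for the ElasticSearch _search endpoint.
function buildSearchRequest(host, index, query) {
  return {
    url: `${host}/${index}/_search`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(query),
    },
  };
}
```

The returned object can then be passed straight to `fetch(url, options)`.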
We can now pass it to ElasticSearch and get the results:
{
"took": 51,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 435,
"max_score": null,
"hits": [
{
"_index": "products_production",
"_type": "product",
"_id": "1694",
"_score": null,
"_source": {
"name": "Asus ROG Zephyrus S Ultra Slim Gaming Laptop, 15.6\u0026quot; 144Hz IPS Type FHD, GeForce RTX 2070, Intel Core i7-9750H, 16GB DDR4, 512GB PCIe Nvme SSD, Aura Sync RGB, Windows 10, GX531GW-AB76",
"description": "\u003cp\u003eNvidia GeForce RTX 2070 8GB DDR6 (base: 1080 MHz, Boost: 1305 MHz; TDP: 90W). 230W Power Adaptor. Support Type-c PD 3.0 up to 65W Latest Intel Core i7-9750h Hexa-Core processor; Bluetooth 5.0 15.6\u0026rdquo; 144Hz 3ms IPS-Type full HD (1920x1080) Display 16GB 2666Hz DDR4 | 512GB PCIe NVMe SSD | Windows 10 Home 0.62\u0026rdquo; thin, 4.6 lbs ultraportable military-grade magnesium alloy body with premium metal cover Rog active Aerodynamic System (AAs) with upgraded 12V fans and anti-dust tunnels to preserve cooling performance and system stability Gigabit wave 2 Wi-Fi 5 (802.11AC) | customizable 4-zone ASUS Aura RGB Gaming Keyboard The all new ROG Zephyrus s gaming Laptop is redefining ultra-slim gaming laptops yet again with innovative engineering to cool Its 9th Gen Intel Core i7 processor and GeForce RTX Max-Q graphics, so you can immerse yourself in its no-compromise 144Hz/3ms Display. *The actual transfer speed of USB 3.0, 3.1 (Gen 1 and 2), and/or Type-C will vary depending on many factors including the processing speed of the host device, file attributes and other factors related to system configuration and your operating environment.\u003c/p\u003e",
"taxon_ids": [],
"brand_id": null,
"available_on": "2019-12-02T00:00:00.000Z",
"discontinue_on": null,
"featured": false,
"price": "1595.73",
"created_at": "2019-12-02T20:07:26.594Z",
"suggestion_terms": ["Asus ROG Zephyrus S Ultra Slim Gaming Laptop, 15.6\u0026quot; 144Hz IPS Type FHD, GeForce RTX 2070, Intel Core i7-9750H, 16GB DDR4, 512GB PCIe Nvme SSD, Aura Sync RGB, Windows 10, GX531GW-AB76"]
},
"sort": [1595.73]
},
{
"_index": "products_production",
"_type": "product",
"_id": "1470",
"_score": null,
"_source": {
"name": "Weber - Genesis II S-435 62006001 Stainless Steel",
"description": "\u003cp\u003eA grill that has it all. A powerful grilling engine, impressive features, and an outstanding 10 year warranty on all parts of the grill to back it up. The immense space and sear station will execute as many steaks as you need to perfection, while the side burner saut\u0026eacute;s veggies. Premium stainless steel cooking grates retain even heat and are easy to clean. The Tuck-Away warming rack keeps food warm while the main dish grills underneath. And between cookouts, store your grilling tools and accessories in the enclosed grill cabinet. Those are just a few features that set the Genesis II S-435 apart from your typical grill.\u003c/p\u003e\r\n\u003cul\u003e\r\n\u003cli\u003ePropane Grill fueled by propane\u003c/li\u003e\r\n\u003cli\u003eElectric Ignition Electronic ignition makes it easy to start\u003c/li\u003e\r\n\u003cli\u003eSide Burner Side burner for cooking versatility.\u003c/li\u003e\r\n\u003cli\u003e9,000 BTU-per-hour input Sear Station burner\u003c/li\u003e\r\n\u003cli\u003eStainless steel heat deflectors\u003c/li\u003e\r\n\u003cli\u003eStainless steel lid with thermometer\u003c/li\u003e\r\n\u003cli\u003eEnclosed cabinet with stainless steel doors\u003c/li\u003e\r\n\u003cli\u003eSix tool hooks\u003c/li\u003e\r\n\u003cli\u003eTwo heavy-duty front locking swivel casters and two large wheels\u003c/li\u003e\r\n\u003c/ul\u003e",
"sku": "62006001",
"taxon_ids": [],
"brand_id": null,
"available_on": "2019-11-03T00:00:00.000Z",
"discontinue_on" :"2019-11-04T00:00:00.000Z",
"featured": false,
"price": "1315.5",
"created_at": "2019-11-03T19:02:48.694Z",
"suggestion_terms": ["Weber - Genesis II S-435 62006001 Stainless Steel"]
},
"sort" :[1315.5]
},
// more results here
]
}
}
Follow-up use case - building a query in a query language that the user may not know
This feature is useful when a user needs to query a tool that has its own query language, or a structure the user doesn’t know. AWS appears to be A/B testing a feature that generates a CloudWatch search query from natural language input.
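As a sketch of how such a feature might be wired up, the helper below assembles a prompt asking the model to translate a user’s question into a CloudWatch Logs Insights query. The prompt wording is illustrative, not a tested recipe; the actual model call (e.g. via an API client) is left out.

```python
def build_query_prompt(user_question: str) -> str:
    """Assemble a prompt that asks the model to translate natural language
    into a CloudWatch Logs Insights query. Wording is illustrative."""
    return (
        "Translate the following request into a CloudWatch Logs Insights query.\n"
        "Return only the query and don't include any additional characters.\n"
        f"Request, delimited by three backticks: ```{user_question}```"
    )

prompt = build_query_prompt("show the 20 most recent ERROR log lines")
```

The resulting string would then be sent to the model, and the returned query shown to the user for confirmation before it is executed.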
Use case - data visualization
Another interesting use case: you allow users to pull report data (e.g. by querying an Elasticsearch index or a database). This is often combined with visualizing that data. Could we let the user also define how they’d like the data presented? Let’s say a manager needs a pie chart for their PowerPoint presentation.
Fortunately, there’s a tool called Vega, a visualization grammar that lets you define charts in JSON. It comes with Vega-Lite, a higher-level grammar that makes it easy to render these charts using JavaScript. There’s also an interactive playground that we can use for testing our diagrams.
Let’s take a sample CSV file from Vega-Lite’s demo that contains the stock price history of a few tech companies. It will be easy for us to reference it on the playground.
The file is a regular CSV that contains a few columns:
symbol,date,price
MSFT,Jan 1 2000,39.81
MSFT,Feb 1 2000,36.35
MSFT,Mar 1 2000,43.22
Let’s try to generate some charts that utilize this structure.
Generate a JSON using vega-lite schema that takes input data from file data/stocks.csv. The visualization should work on a dataset with the following columns delimited by three backticks
```
symbol date price
```
The visualization should result in the following result delimited by three backticks
```
I would like to get a line chart of each of the stocks over time. There should also be a second line chart that shows only MSFT stock. Finally, I would like to see a pie chart of the average stock prices for each symbol
```
Return only the resulting JSON and don't include any additional characters. Make sure that the visualization conforms to the vega-lite schema.
After trying it out with GPT, we get the following definition:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {"url": "data/stocks.csv"},
"vconcat": [
{
"title": "Stock Prices Over Time",
"mark": "line",
"encoding": {
"x": {"field": "date", "type": "temporal", "title": "Date"},
"y": {"field": "price", "type": "quantitative", "title": "Price"},
"color": {"field": "symbol", "type": "nominal", "title": "Symbol"}
}
},
{
"title": "MSFT Stock Price Over Time",
"transform": [{"filter": "datum.symbol === 'MSFT'"}],
"mark": "line",
"encoding": {
"x": {"field": "date", "type": "temporal", "title": "Date"},
"y": {"field": "price", "type": "quantitative", "title": "Price"}
}
},
{
"title": "Average Stock Prices",
"transform": [{"aggregate": [{"op": "mean", "field": "price", "as": "average_price"}], "groupby": ["symbol"]}],
"mark": "arc",
"encoding": {
"theta": {"field": "average_price", "type": "quantitative", "title": "Average Price"},
"color": {"field": "symbol", "type": "nominal", "title": "Symbol"}
}
}
]
}
This gives us the desired set of charts.
There’s also an interesting project on GitHub that showcases a similar concept. Its base prompt looks slightly different:
You are a great assistant at vega-lite visualization creation. No matter what the user asks, you should always respond with a valid vega-lite specification in JSON.
You should create the vega-lite specification based on user's query.
Besides, Here are some requirements:
1. Do not contain the key called 'data' in vega-lite specification.
2. If the user ask many times, you should generate the specification based on the previous context.
3. You should consider to aggregate the field if it is quantitative and the chart has a mark type of react, bar, line, area or arc.
4. Consider to use bin for field if it is a chart like heatmap or histogram.
5. The available fields in the dataset and their types are:
${metas
.map((field) => `${field.name} (${field.semanticType})`)
.join(", ")}
Good practices - learn from the Upside team's experiences
Once a request is submitted to GPT, we should run a JSON Schema (or XML) validator on the response it provides. Despite being fairly reliable, the model may deviate from the expected structure, and often a simple retry is enough to get a result consistent with the defined schema. On the user experience side there are several possible approaches, but you should tailor them to your specific needs. A good example of JSON schema validation is using Vega-Lite’s JSON Schema, available here.
You can apply the same method with any JSON schema you create, like the one for a spaghetti recipe.
Complexity - try not to bite more than GPT can chew
Prompts can’t be too long, as GPT may lose information and context with complicated inputs. You can use prompt engineering to break a task into smaller ones, gradually reducing the cognitive load and following a logical reasoning process. For example, instead of asking GPT to format a recipe into a specific JSON schema in one shot, first ask it to “Break this recipe down into steps”. Once that’s achieved, follow up with “Structure this as the following JSON schema”. Each step builds on the understanding of the previous one and aligns it with the defined structure.
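The two-step approach above can be sketched as a small chain. Here `ask` is any callable that sends a prompt to the model and returns its reply (it would wrap the real API call in practice); the prompt wording is illustrative:

```python
def structure_recipe(recipe_text, ask):
    """Two-step prompting: first extract the steps, then structure them.
    `ask` is a callable that sends one prompt and returns the reply."""
    # Step 1: reduce the raw recipe to a plain list of steps
    steps = ask(f"Break this recipe down into steps:\n{recipe_text}")
    # Step 2: structure the intermediate result, not the original text
    return ask(
        "Structure this as the following JSON schema. Return only the JSON.\n"
        + steps
    )
```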
Performance - being patient is the key
Generating outputs with GPT is quite time-consuming, especially if we need to process multiple prompts to achieve the desired use case. We can expect hardware improvements, so this won’t always be an issue, but for now we need to mitigate it at the UX level. That is challenging, and often the best solution is to stream the response in smaller parts. However, if the model needs to return a complete result (e.g. a full JSON document), this approach doesn’t work.
Cost - manage your expenses
Using e.g. GPT-4 can be quite costly at scale. I wouldn’t be too worried about that in the long term, as like any technology it will get cheaper over time, but for now it’s good to keep an eye on how much money is spent. The most sensible approach is to perform as much preprocessing as possible before pushing input to GPT, in order to limit the number of tokens the model actually has to process.
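One cheap preprocessing step, relevant to the HTML-laden product descriptions shown earlier, is stripping markup before sending the text to the model, so we don’t pay for tokens spent on tags. A minimal sketch using only the standard library:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text content, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(description: str) -> str:
    """Remove HTML markup and collapse whitespace before sending
    product data to the model."""
    extractor = TextExtractor()
    extractor.feed(description)
    return " ".join(" ".join(extractor.parts).split())


print(strip_html("<p>A grill that has it all.</p>"))
# A grill that has it all.
```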
Reliability - fine-tuning for diverse inputs
Playing around with these examples, it’s quite easy to get to a point where the model fairly consistently delivers a great result. However, once you launch a GPT-based feature to a wider audience that will use it with different input data, you may need plenty of fine-tuning to ensure the solution’s reliability. You never really know what users will type into the chat box once you deploy your app, and as developers of the feature we are usually biased. User testing these solutions is important, especially since using a chat interface to solve actual problems (rather than just getting answers) is still a pretty novel idea and people have different expectations.
Prompt injection risks - always double check
It goes without saying that you need to be careful about the responses the model may generate. A cautionary example is DPD’s customer support chatbot, which a customer prompted into writing a poem disparaging the delivery company itself.
Summary
GPT does a pretty good job of parsing data. Based on the experiments described in this post, we can define a simple framework that handles diverse input and output formats:
- Specify the desired schema
- Specify the details of available fields (when needed)
- Specify the input data
- Reinforce important details (e.g. that the response should be a JSON and only a JSON)
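The four steps above can be sketched as a single prompt builder. The exact wording is illustrative, not a tested recipe:

```python
def build_prompt(schema: str, field_notes: str, input_data: str) -> str:
    """Assemble a prompt following the framework: schema, field details,
    input data, and a reinforcement of the output constraints."""
    return "\n".join([
        "Transform the input into JSON conforming to this schema:",
        schema,
        "Details of the available fields:",
        field_notes,
        "Input, delimited by three backticks:",
        f"```{input_data}```",
        # Reinforcement: models drift less when the constraint is repeated
        "Return only the resulting JSON and don't include any additional characters.",
    ])
```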
Even now, there are many applications that can utilize GPT without being AI-first. It doesn’t need to be a whole GPT-based experience; it can supplement existing experiences and improve user productivity.