🎨 JSON Style Guides for Controlled Image Generation with GPT-4o and GPT-Image-1

Image generation with GPT-4o and GPT-Image-1 can yield visually stunning results, but without clear instructions, results can vary from one generation to the next. JSON style guides bring clarity, structure, and repeatability to your prompts. This tutorial explains why JSON style guides matter, shows how to use them effectively, and provides a reference of the fields you can define.

🚀 Why Use a JSON Style Guide?

Natural language is powerful but often ambiguous. By organizing your image prompts using JSON:

✅ You reduce ambiguity with structured fields.
✅ You improve consistency across multiple generations.
✅ You can automate or scale prompt creation for batch processing.
✅ You separate content from style, making iterations easier.
✅ Developers and designers can work together using shared, machine-readable formats.
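
The batch and content-versus-style points can be sketched in a few lines of Python. This is a minimal sketch rather than an official workflow: `SHARED_STYLE`, `build_prompt`, and the second sample scene are hypothetical names and values chosen for illustration.

```python
import json

# Hypothetical shared style block, reused for every image in a batch.
SHARED_STYLE = {
    "style": "storybook illustration",
    "color_palette": ["forest green", "gold", "midnight blue"],
    "lighting": "soft dappled sunlight",
    "mood": "whimsical and cozy",
}


def build_prompt(content: dict, style: dict = SHARED_STYLE) -> str:
    """Merge per-image content with the shared style and return JSON text."""
    # Style keys are applied last, so they win on conflict and keep the batch consistent.
    guide = {**content, **style}
    return json.dumps(guide, indent=2, ensure_ascii=False)


# Per-image content: only the scene and subjects change between generations.
contents = [
    {
        "scene": "a magical forest clearing",
        "subjects": [{"type": "fox", "description": "wearing a wizard hat", "position": "center"}],
    },
    {
        "scene": "a riverbank at dusk",
        "subjects": [{"type": "otter", "description": "holding a lantern", "position": "center"}],
    },
]

prompts = [build_prompt(c) for c in contents]
```

Each entry in `prompts` is a complete JSON style guide; changing one value in `SHARED_STYLE` restyles the whole batch without touching the content entries.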

🛠️ How to Use a JSON Style Guide

A JSON prompt is simply a structured document specifying everything you want the model to include. Here’s a simple example:

{
  "scene": "a magical forest clearing",
  "subjects": [
    {
      "type": "fox",
      "description": "wearing a wizard hat, sitting on a tree stump",
      "position": "center"
    }
  ],
  "style": "storybook illustration",
  "color_palette": ["forest green", "gold", "midnight blue"],
  "lighting": "soft dappled sunlight",
  "mood": "whimsical and cozy",
  "background": "glowing mushrooms and tall trees",
  "composition": "eye-level view, centered subject"
}

This structure gives the model explicit, interpretable instructions for what to render and how.
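
In practice the JSON is passed to the model as ordinary prompt text; the field names are a convention the model interprets, not a schema the API enforces. The sketch below shows one way to send the guide above, assuming the official `openai` Python package (version 1 or later) and an `OPENAI_API_KEY` environment variable. The helper names `guide_to_prompt` and `generate_image`, and the one-line instruction prepended to the JSON, are hypothetical.

```python
import base64
import json

guide = {
    "scene": "a magical forest clearing",
    "subjects": [
        {
            "type": "fox",
            "description": "wearing a wizard hat, sitting on a tree stump",
            "position": "center",
        }
    ],
    "style": "storybook illustration",
    "color_palette": ["forest green", "gold", "midnight blue"],
    "lighting": "soft dappled sunlight",
    "mood": "whimsical and cozy",
    "background": "glowing mushrooms and tall trees",
    "composition": "eye-level view, centered subject",
}


def guide_to_prompt(guide: dict) -> str:
    """Serialize the guide and prepend a short instruction (an assumption, not a requirement)."""
    body = json.dumps(guide, indent=2, ensure_ascii=False)
    return "Generate an image that follows this JSON style guide:\n" + body


def generate_image(guide: dict, out_path: str = "output.png") -> str:
    """Send the guide to GPT-Image-1 and save the result. Needs network access and an API key."""
    from openai import OpenAI  # imported lazily so the rest of the sketch runs without the package

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    result = client.images.generate(
        model="gpt-image-1",
        prompt=guide_to_prompt(guide),
        size="1024x1024",
    )
    # GPT-Image-1 returns the image as base64-encoded data.
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(result.data[0].b64_json))
    return out_path


prompt = guide_to_prompt(guide)
```

Calling `generate_image(guide)` would write `output.png` to the working directory. In ChatGPT, pasting the same JSON into the conversation with a short request serves the same purpose.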

📚 Parameter Reference

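Here’s a breakdown of fields you can use in a JSON style guide. The names are conventions that the model reads as part of the prompt text, not a fixed schema, so you can add, rename, or omit fields to suit your project.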

1. `scene`

A short overview of the entire setting or environment.

Example: "a futuristic city at sunset"

2. `subjects` (array of objects)

Describes each key subject in the image. Each subject can include:

{
  "type": "robot",
  "description": "silver body with glowing blue eyes",
  "position": "foreground",
  "pose": "standing upright",
  "size": "large",
  "expression": "neutral",
  "interaction": "looking at a floating screen"
}

3. `style`

The artistic or visual rendering style.

Examples: "photorealistic", "watercolor", "pixel art", "cyberpunk", "anime"

4. `color_palette`

An array of dominant and accent colors.

Example: ["emerald green", "burnt orange", "charcoal"]

5. `lighting`

How the image is lit.

Examples: "sunset backlight", "soft studio lighting", "glow from below"

6. `mood`

The emotional tone or atmosphere.

Examples: "peaceful", "dramatic", "eerie", "playful"

7. `background`

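The scenery or backdrop. This can be a short string or, for finer control, an object with its own sub-fields (as in the two-character example further down).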

Examples: "mountain landscape", "white cyclorama", "dreamy nebula sky"

8. `composition`

Overall layout and positioning.

Examples: "symmetrical", "rule of thirds", "top-down shot", "portrait orientation"

9. `camera`

Virtual photography settings.

{
  "angle": "eye-level",
  "distance": "medium shot",
  "lens": "wide-angle",
  "focus": "sharp subject, blurred background"
}

10. `medium`

Simulated medium or format.

Examples: "oil painting", "3D render", "ink drawing", "chalkboard sketch"

11. `textures`

Surface qualities and tactile impressions.

Examples: "soft velvet", "rusty metal", "wet pavement"

12. `resolution`

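Intended resolution or output size, expressed as a hint. When you call the API, the actual pixel dimensions are set by the request's size option, not by this field.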

Examples: "4K", "web banner", "Instagram square"

13. `details`

Extra fine-tuned attributes.

{
  "clothing": "flowing red cape",
  "weather": "light snowfall",
  "facial_features": "freckles and sharp jawline",
  "material": "glass and brass",
  "ornaments": "glasses, ring"
}

14. `effects`

Special effects or visual treatments.

Examples: "lens flare", "bokeh blur", "double exposure", "film grain"

15. `inspirations`

Known references to guide visual style.

Examples: "inspired by Studio Ghibli", "in the style of Van Gogh", "similar to Blade Runner"
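
Because nothing enforces these field names, a small check before sending a guide can catch typos and wrong types. The following is a minimal sketch: the `EXPECTED_TYPES` table simply mirrors the reference above (plus the `props` field used in a later example), and `check_guide` is a hypothetical helper, not part of any library.

```python
# Field name -> accepted Python type(s), mirroring the parameter reference.
EXPECTED_TYPES = {
    "scene": str,
    "subjects": list,
    "style": str,
    "color_palette": list,
    "lighting": str,
    "mood": str,
    "background": (str, dict),  # a plain string or an object with sub-fields
    "composition": str,
    "camera": dict,
    "medium": str,
    "textures": (str, list),
    "resolution": str,
    "details": dict,
    "effects": (str, list),
    "inspirations": (str, list),
    "props": list,
}


def check_guide(guide: dict) -> list[str]:
    """Return a list of readable problems; an empty list means the guide looks fine."""
    problems = []
    for key, value in guide.items():
        expected = EXPECTED_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown field: {key}")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {key}: {type(value).__name__}")
    # Each subject should be an object that at least names its type.
    subjects = guide.get("subjects", [])
    if isinstance(subjects, list):
        for i, subject in enumerate(subjects):
            if not isinstance(subject, dict) or "type" not in subject:
                problems.append(f"subjects[{i}] should be an object with a 'type'")
    return problems
```

An unknown field is reported rather than rejected outright, since the model may still interpret extra fields sensibly; treat the output as a lint warning, not a hard error.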

🧪 Example Use Cases

Fantasy Character Concept Art

{
  "scene": "mountaintop at sunrise",
  "subjects": [
    {
      "type": "warrior elf",
      "description": "leather armor, long silver hair",
      "pose": "standing with sword raised",
      "position": "foreground"
    }
  ],
  "style": "digital painting",
  "color_palette": ["misty gray", "light gold", "teal"],
  "lighting": "sunrise backlight",
  "mood": "heroic and calm",
  "background": "foggy mountains",
  "composition": "rule of thirds",
  "camera": {
    "angle": "low angle",
    "distance": "medium shot",
    "focus": "sharp on character"
  }
}

Product Mockup

{
  "scene": "minimalist white studio",
  "subjects": [
    {
      "type": "smartwatch",
      "description": "silver frame with red strap",
      "position": "center",
      "pose": "lying flat"
    }
  ],
  "style": "photorealistic",
  "lighting": "diffused light from above",
  "mood": "clean and sleek",
  "background": "white gradient",
  "composition": "centered product with top view",
  "resolution": "4K"
}
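
Because the API sets pixel dimensions through its `size` parameter, a hint like "4K" has to be translated somewhere before the request is made. A minimal sketch of that translation follows. The size strings are the ones GPT-Image-1 accepted at the time of writing (an assumption worth checking against the current API reference), and the `HINT_TO_SIZE` table and `size_for` helper are hypothetical.

```python
# Sizes assumed to be accepted by GPT-Image-1's `size` parameter; verify against current docs.
SQUARE, LANDSCAPE, PORTRAIT = "1024x1024", "1536x1024", "1024x1536"

# Hypothetical mapping from free-text resolution hints to API sizes.
HINT_TO_SIZE = {
    "instagram square": SQUARE,
    "web banner": LANDSCAPE,
    "4k": LANDSCAPE,  # no true 4K output is assumed here; fall back to the landscape size
    "phone wallpaper": PORTRAIT,
}


def size_for(guide: dict, default: str = "auto") -> str:
    """Return an API size for the guide's `resolution` hint, or the default if the hint is unknown."""
    hint = str(guide.get("resolution", "")).strip().lower()
    return HINT_TO_SIZE.get(hint, default)
```

The returned string can be passed as the `size` argument of the image request, while the `resolution` field stays in the JSON as a stylistic cue for the model.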

Realistic Scene with Two Characters

{
  "scene": "urban café terrace in Paris during golden hour",
  "subjects": [
    {
      "type": "young woman",
      "description": "30s, black hair in a bun, wearing a white blouse and tan trench coat, holding a coffee cup",
      "pose": "sitting at a café table, leaning forward slightly",
      "position": "left foreground",
      "expression": "engaged, smiling softly"
    },
    {
      "type": "young man",
      "description": "30s, light brown curly hair, wearing a navy blue jacket and scarf, gesturing with one hand",
      "pose": "sitting across from the woman, mid-conversation",
      "position": "right foreground",
      "expression": "animated, talking"
    }
  ],
  "style": "hyper-realistic photography",
  "lighting": "natural golden hour light with soft shadows and sun flare",
  "mood": "warm and intimate",
  "background": {
    "elements": ["street with bicycles", "café signage", "distant pedestrians"],
    "depth_of_field": "shallow, blurred background"
  },
  "composition": "framed using the rule of thirds, both characters centered with table between them",
  "camera": {
    "angle": "eye level",
    "distance": "medium close-up",
    "focus": "sharp on characters' faces"
  },
  "color_palette": ["warm gold", "beige", "navy", "soft rose", "espresso brown"],
  "props": ["ceramic coffee cups", "croissants on a small plate", "notebook and pen on table"],
  "resolution": "4K"
}

Using JSON style guides gives you a consistent, modular, and precise way to control image generation. Whether you're creating a portfolio of characters, designing branded assets, or prototyping environments, structured prompts give you the power to communicate with clarity and scale with confidence.

And don’t hesitate to use ChatGPT to refine or co-create your JSON Style Guides! It can turn vague ideas into structured, generation-ready prompts in seconds.

raphiki @raphiki