Audio 2 Text 2 Image Generation with Analog & Cloudflare Worker AI
Dale Nguyen

Publish Date: Apr 4 '24

This is a submission for the Cloudflare AI Challenge.

What I Built

This is a simple app that generates images from text input. The text can be typed or dictated by voice, and the app can also describe the generated image.

Demo

Demo link: https://cloudflare-challange.pages.dev/

My Code

You can check my code at: https://github.com/dalenguyen/cloudflare-challenge

Journey

This was an interesting challenge because I hadn't used Cloudflare Pages to deploy web applications before. It turns out that the deployment process is really straightforward and can be done via the Cloudflare dashboard.

Another thing is that this was built with Analog - a full-stack Angular meta framework, which means that you can create an entire application with full backend support.
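
To illustrate why the backend half matters, here is a minimal sketch of a server-side handler that keeps the AI token out of the browser: the frontend only sends a prompt, and the token is read from the server environment. The names `handleGenerate`, `validateGenerateRequest`, `ServerEnv`, `CF_API_TOKEN`, and the injected `callAi` function are hypothetical and not taken from the repository.

```typescript
// Hypothetical server-side environment: the token lives only on the backend.
interface ServerEnv {
  CF_API_TOKEN?: string;
}

// Injected function that performs the actual AI call (an assumption for this sketch).
// It returns the generated image in whatever encoding the backend chooses, e.g. base64.
type CallAi = (prompt: string, apiToken: string) => Promise<string>;

type Validation =
  | { ok: true; prompt: string; apiToken: string }
  | { ok: false; status: number; error: string };

interface HandlerResult {
  status: number;
  body: { image?: string; error?: string };
}

// Check the server configuration and the incoming request body before calling the model.
function validateGenerateRequest(requestBody: unknown, env: ServerEnv): Validation {
  const apiToken = env.CF_API_TOKEN;
  if (!apiToken) {
    return { ok: false, status: 500, error: "AI token is not configured on the server" };
  }
  const prompt =
    typeof requestBody === "object" && requestBody !== null
      ? (requestBody as { prompt?: unknown }).prompt
      : undefined;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return { ok: false, status: 400, error: "A non-empty text prompt is required" };
  }
  return { ok: true, prompt: prompt.trim(), apiToken };
}

// The handler never echoes the token back; only the image or an error reaches the client.
async function handleGenerate(
  requestBody: unknown,
  env: ServerEnv,
  callAi: CallAi,
): Promise<HandlerResult> {
  const checked = validateGenerateRequest(requestBody, env);
  if (checked.ok === false) {
    return { status: checked.status, body: { error: checked.error } };
  }
  const image = await callAi(checked.prompt, checked.apiToken);
  return { status: 200, body: { image } };
}
```

In a meta framework, logic like this would typically sit in a server route, so the browser talks only to your own endpoint and never sees the token.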

Here is the stack detail:

  • Analog
  • Nx Workspace
  • GitHub
  • Cloudflare Pages
  • Workers AI
  • @cf/bytedance/stable-diffusion-xl-lightning for text-to-image generation
  • @cf/openai/whisper for audio-to-text
  • uform-gen2-qwen-500m for image-to-text
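
As a rough illustration of how one of these models can be called, the sketch below builds a request for the Workers AI REST endpoint using the text-to-image model listed above. It assumes the app talks to Workers AI over REST with an API token, which the post does not confirm (a Workers AI binding is another option). `buildTextToImageRequest`, `AiRequest`, and the placeholder account ID and token are hypothetical names for this example.

```typescript
// Model identifier as listed in the stack above.
const TEXT_TO_IMAGE_MODEL = "@cf/bytedance/stable-diffusion-xl-lightning";

// Plain description of an HTTP request, so the sketch stays independent of any runtime.
interface AiRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a Workers AI REST request for text-to-image generation.
// accountId and apiToken are placeholders supplied by the caller (server side only).
function buildTextToImageRequest(
  accountId: string,
  apiToken: string,
  prompt: string,
): AiRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${TEXT_TO_IMAGE_MODEL}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    // Text-to-image models take the text as a `prompt` field.
    body: JSON.stringify({ prompt }),
  };
}
```

The resulting object can be passed to `fetch(request.url, request)` on the server. For this model the response is expected to carry image bytes rather than JSON, so it would be read as binary data.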

Multiple Models and/or Triple Task Types

I combined three models that handle different tasks in support of image generation:

  • Audio to text: listen to a voice command and apply the transcription to the input field
  • Text to image: generate an image from the text input
  • Image to text: provide a further description of the generated image
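
The three tasks above can be sketched as one chained flow. This is a simplified illustration, not the app's actual code: in the app the transcription fills the input field rather than triggering generation automatically, and the `PipelineSteps` interface, `toPrompt`, and `runPipeline` are hypothetical names. Each step is injected so the model calls can be swapped or mocked.

```typescript
// Each step wraps one model call; the implementations are supplied by the caller.
interface PipelineSteps {
  transcribe(audio: Uint8Array): Promise<string>;      // audio to text
  generateImage(prompt: string): Promise<Uint8Array>;  // text to image
  describeImage(image: Uint8Array): Promise<string>;   // image to text
}

interface PipelineResult {
  prompt: string;
  image: Uint8Array;
  description: string;
}

// Collapse whitespace in a transcript so it can be used as a prompt.
function toPrompt(transcript: string): string {
  return transcript.replace(/\s+/g, " ").trim();
}

// Run the three tasks in order, feeding each output into the next step.
async function runPipeline(
  steps: PipelineSteps,
  audio: Uint8Array,
): Promise<PipelineResult> {
  const prompt = toPrompt(await steps.transcribe(audio));
  if (prompt === "") {
    throw new Error("No speech was recognized in the audio");
  }
  const image = await steps.generateImage(prompt);
  const description = await steps.describeImage(image);
  return { prompt, image, description };
}
```

Keeping the steps behind an interface means the same flow works whether the models are reached through a REST call or a platform binding.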

Comments 7 total

  • Uzondu
    Apr 5, 2024

    Great job, Dale Nguyen. You were fast ⚡. For me, though, it isn't that straightforward. At first I thought you could create any frontend application locally and then deploy it to Cloudflare Workers, integrating its AI models before or during deployment. So I thought maybe I would only use HTML, CSS, and JavaScript. But then I realized that applications built with Cloudflare features use other files and languages, such as TypeScript and .json files, which I haven't been exposed to. So how do I do this? Am I supposed to learn new languages? If so, which?

    • Dale Nguyen
      Apr 5, 2024

      All you need is JavaScript. TypeScript is basically JavaScript with types.

      In my case, there are a frontend app and a backend that handles requests from the frontend. That's because the AI token shouldn't be exposed on the frontend, so you have to hide it in the backend. Any meta framework, such as Next.js, can help you with that.

      If you want a quick start, try the example from Cloudflare and go from there: developers.cloudflare.com/workers-...

      • Uzondu
        Apr 5, 2024

        Thanks a lot for this information, Dale Nguyen. I am very grateful 🤝💌.
        I'll check it out.

        • Dale Nguyen
          Apr 5, 2024

          Cool. Keep me posted. If you have any questions, just let me know :D

          • Uzondu
            Apr 5, 2024

            I have a little problem. While I was trying to create a new Worker on the Cloudflare dashboard, the 10020 error message kept appearing at the bottom of the page. If you have experienced this or know about it, maybe you can help me out. If you don't know about it at all, then sorry for the trouble. By the way, this problem occurred on Edge, Chrome, and Firefox.

  • Jess Lee
    Apr 5, 2024

    Nice work!
