Audio 2 Text 2 Image Generation with Analog & Cloudflare Workers AI

Publish Date: Apr 4 '24


This is a submission for the Cloudflare AI Challenge.

What I Built

This is a simple app that generates images from text input.

Demo

Demo link: https://cloudflare-challange.pages.dev/

My Code

You can check my code at: https://github.com/dalenguyen/cloudflare-challenge

Journey

This was an interesting challenge because I hadn't used Cloudflare Pages to deploy web applications before. It turns out that the deployment process is really straightforward and can be done from the Cloudflare dashboard.

The app is built with Analog, a full-stack Angular meta framework, which means you can create an entire application, backend included, in one project.

Here is the stack detail:

Analog
Nx Workspace
GitHub
Cloudflare Pages
Workers AI
@cf/bytedance/stable-diffusion-xl-lightning for text-to-image generation
@cf/openai/whisper for audio-to-text
uform-gen2-qwen-500m for image-to-text
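
As a rough illustration of how a backend route might call one of these models, here is a minimal sketch that uses the Workers AI REST endpoint. It is not taken from the project's repository: the helper names (`buildRunRequest`, `generateImage`), the account ID and token parameters, and the assumption that the text-to-image model returns raw image bytes are all mine. The API token stays on the server side and is never sent to the browser.

```typescript
// Sketch only: helper names and parameter handling are hypothetical,
// not the author's actual implementation.
const TEXT_TO_IMAGE_MODEL = "@cf/bytedance/stable-diffusion-xl-lightning";

interface RunRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Minimal shape of the fetch function this sketch needs. The global fetch
// in Workers and Node 18+ is expected to satisfy it.
type FetchLike = (
  url: string,
  init: RunRequest["init"],
) => Promise<{ ok: boolean; status: number; arrayBuffer(): Promise<ArrayBuffer> }>;

// Build the REST call for a Workers AI model. The token is only ever used
// here, on the backend.
function buildRunRequest(
  accountId: string,
  apiToken: string,
  model: string,
  input: unknown,
): RunRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(input),
    },
  };
}

// Assumption: the text-to-image model responds with the raw image bytes.
async function generateImage(
  fetchFn: FetchLike,
  accountId: string,
  apiToken: string,
  prompt: string,
): Promise<Uint8Array> {
  const { url, init } = buildRunRequest(accountId, apiToken, TEXT_TO_IMAGE_MODEL, { prompt });
  const res = await fetchFn(url, init);
  if (!res.ok) {
    throw new Error(`Workers AI request failed with status ${res.status}`);
  }
  return new Uint8Array(await res.arrayBuffer());
}
```

In a server route, `generateImage(fetch, accountId, apiToken, prompt)` would be called with values read from environment variables or secrets. Passing the fetch function in as a parameter keeps the helper easy to test with a stub.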

Multiple Models and/or Triple Task Types

I combined three models that handle different tasks in support of image generation:

Audio to text: listens to a voice command and writes it into the input field
Text to image: generates an image from the text input
Image to text: provides a further description of the generated image
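
The three tasks above can be chained into one flow. The sketch below shows one possible way to wire them together. It assumes a hypothetical `ModelRunner` function that sends an input to a model and resolves with its output, and the input and output shapes (`audio`, `text`, `prompt`, `image`, `description`) are assumptions rather than a confirmed API contract. The function name `voiceToDescribedImage` and the caption prompt are also illustrative.

```typescript
// Hypothetical runner: given a model name and an input object, resolves with
// the model's output. In a real app this would wrap the Workers AI call.
type ModelRunner = (model: string, input: Record<string, unknown>) => Promise<unknown>;

const MODELS = {
  audioToText: "@cf/openai/whisper",
  textToImage: "@cf/bytedance/stable-diffusion-xl-lightning",
  imageToText: "uform-gen2-qwen-500m", // name as listed in the post
};

interface PipelineResult {
  prompt: string;
  image: Uint8Array;
  description: string;
}

async function voiceToDescribedImage(
  run: ModelRunner,
  audio: Uint8Array,
): Promise<PipelineResult> {
  // 1. Audio to text: transcribe the voice command (assumed output: { text }).
  const transcript = (await run(MODELS.audioToText, {
    audio: Array.from(audio),
  })) as { text: string };
  const prompt = transcript.text.trim();

  // 2. Text to image: generate an image from the transcribed prompt
  //    (assumed output: raw image bytes).
  const image = (await run(MODELS.textToImage, { prompt })) as Uint8Array;

  // 3. Image to text: describe the generated image
  //    (assumed output: { description }).
  const caption = (await run(MODELS.imageToText, {
    image: Array.from(image),
    prompt: "Describe this image",
  })) as { description: string };

  return { prompt, image, description: caption.description };
}
```

Each step only depends on the output of the previous one, so any step can also be used on its own, for example generating an image from typed text without the voice command.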

Comments 7 total

Uzondu · Apr 5, 2024
Great job, Dale Nguyen. You were fast ⚡. For me, though, it isn't that straightforward. At first I thought you could create any frontend application locally, then deploy it to Cloudflare Workers, and integrate its AI models before or while deploying. So I thought maybe I would only use HTML, CSS, and JavaScript. But then I realized that applications built with Cloudflare features use other files and languages, such as TypeScript, .json files, and others that I haven't been exposed to. So how do I do this? Am I supposed to learn new languages? If so, which?
- Dale Nguyen · Apr 5, 2024
  All you need is JavaScript. TypeScript is basically JavaScript with types.
  
  In my case, there's a frontend app and a backend that handles requests from the frontend. That's because the AI token shouldn't be exposed on the frontend, so you have to hide it in the backend. Any meta framework, such as Next.js, can help you with that.
  
  If you want a quick start, try the example from Cloudflare and go from there: developers.cloudflare.com/workers-...
  - Uzondu · Apr 5, 2024
    Thanks a lot for this information, Dale Nguyen. I am very grateful 🤝💌.
    I'll check it out.
    - Dale Nguyen · Apr 5, 2024
      Cool. Keep me posted. If you have any questions, just let me know :D
      - Uzondu · Apr 5, 2024
        I have a little problem. While I was trying to create a new worker on the Cloudflare dashboard, the 10020 error message kept showing up at the bottom of the page. If you have experienced this or know about it, maybe you can help me out. If you don't know about it at all, then sorry for the trouble. By the way, this problem occurred on Edge, Chrome, and Firefox.
        
        Dale Nguyen · Apr 5, 2024
        I haven't seen it. If you want to chat about Cloudflare, you can join the Discord. I'm also there :D
        
        discord.gg/cloudflaredev
Jess Lee · Apr 5, 2024
Nice work!


Dale Nguyen @dalenguyen