# Would You Try a GPT-4 Vision to Web Development Tool?

## Metadata

- **Published:** 11/20/2023
- **Duration:** 7 minutes
- **YouTube URL:** https://youtube.com/watch?v=sj9KZtFfF1E
- **Channel:** nerding.io

## Description

Use GPT-4 Vision Transform Screenshots into HTML, CSS, JS Code
🌟 Discover the magic of "screenshot-to-code", an open source AI tool that converts screenshots into HTML, Tailwind CSS, and JavaScript code. This tutorial delves into the capabilities of OpenAI's GPT-4 Vision and DALL-E 3 in web design and development.

What You'll Learn:
- AI in Web Development: Unveiling the potential of GPT-4 Vision and DALL-E 3 in coding.
- Easy Code Generation: How to convert designs into clean HTML, Tailwind CSS, and JavaScript.
- Practical Application: Real-world examples of the tool in action.

📰 FREE eBooks & News: https://sendfox.com/nerdingio
👉🏻 Product of the Day:  https://www.producthunt.com/posts/ever-efficient-ai
📞 Book a Call: https://calendar.app.google/M1iU6X2x18metzDeA

🎥 Chapters
00:00 Introduction
00:23 Screenshot to Code
01:08 Dashboard
01:38 Configuration
02:12 Code streaming
02:32 Generation
03:04 Repository
04:17 Prompt
06:18 Image API
06:50 Logging LLM runs
07:05 Conclusion

🔗 Links
https://github.com/abi/screenshot-to-code/tree/main
https://picoapps.xyz/free-tools/screenshot-to-code

⤵️ Let's Connect
https://everefficient.ai
https://nerding.io
https://twitter.com/nerding_io
https://www.linkedin.com/in/jdfiscus/
https://www.linkedin.com/company/ever-efficient-ai/

## Key Highlights

### 1. GPT-4 Vision Powers Screenshot-to-Code

The tool uses GPT-4 Vision to convert a screenshot into HTML code using Tailwind CSS. It's open source and acts as a no-code solution. 

### 2. Live Code Generation & Preview

The tool streams the generated code in real-time, allowing users to see the HTML being built and preview the output simultaneously. CDN scripts & comments are generated. 

### 3. DALL-E Integration for Image Generation

The tool uses DALL-E to generate images for placeholders in the HTML, although this feature can be disabled. Useful for populating dynamic content. 

### 4. Customizable Prompt Engineering

The prompt defines the AI as a Tailwind developer, focusing on exact screenshot replication, background colors, and code comments. The prompt can be modified for accessibility (WCAG) compliance.

### 5. Low Cost of Generation

Generating the HTML from the screenshot cost about 21 cents. This is highlighted as a reasonable cost for the outcome.


## Summary

## Summary: GPT-4 Vision Powered "Screenshot-to-Code" Tool

**1. Executive Summary:** This video explores "screenshot-to-code," an open-source AI tool powered by GPT-4 Vision and DALL-E 3, that converts screenshots into functional HTML code using Tailwind CSS and JavaScript. The video demonstrates its capabilities, discusses its cost, and highlights opportunities for customization and expansion.

**2. Main Topics Covered:**

*   **Introduction to Screenshot-to-Code:** Overview of the tool, its purpose, and open-source nature.
*   **Live Demonstration:** Using the tool to convert a screenshot of an OpenAI API usage graph into HTML.
*   **Code Generation and Preview:** Observing the live code streaming, generated HTML, and preview output.
*   **DALL-E Integration:** Understanding the use of DALL-E to generate placeholder images, and the option to disable this feature.
*   **Repository Overview:** Examining the underlying code, including Python backend and React/Tailwind frontend.
*   **Prompt Engineering:** Analysis of the AI prompt, highlighting its instructions for Tailwind development, screenshot replication, and code commenting.
*   **Image API:** How the tool uses the GPT-4 vision preview.
*   **Logging LLM runs:** Explanation of tracking model usage.
*   **Customization & Expansion:** Potential for modifying the prompt to improve accessibility (WCAG compliance) and tailor the output.

**3. Key Takeaways:**

*   **GPT-4 Vision's Capabilities:** Demonstrates the power of GPT-4 Vision in transforming visual designs into functional code.
*   **Accessibility:** The tool acts as a no-code solution for quick web development.
*   **Customization:** The AI prompt is customizable, allowing users to fine-tune the output and add features like WCAG compliance.
*   **Low cost of generation:** The tool is very inexpensive, costing around 21 cents to generate an image.
*   **Real-time Code Streaming:** Live feedback on code generation and HTML output.
*   **Integration:** Can be used with other tools for greater customization.

**4. Notable Quotes or Examples:**

*   "It's kind of like a no code solution but it's open source and what it's using is chat GPT Vision 4 to be able to take a screenshot and convert it into HTML specifically using Tailwind CSS."
*   "Generating the HTML from the screenshot cost about 21 cents."
*   The prompt is defined as a "Tailwind developer... building single page apps using Tailwind HTML and JavaScript" to focus the output.
*   Discussing prompt customization: "One thing that we could talk about here in this prompt is we could actually tell it to be more compliant with WCAG."

**5. Target Audience:**

*   Web developers interested in AI-assisted coding.
*   No-code/low-code enthusiasts looking for innovative development tools.
*   Designers seeking a way to quickly prototype designs into functional HTML.
*   Anyone interested in the capabilities of GPT-4 Vision and DALL-E 3 for web development.


## Full Transcript

hey everyone welcome to nerding IO I'm JD and today we're going to be looking at a free tool called screenshot to code it's kind of like a no code solution but it's open source and what it's using is chat GPT Vision 4 to be able to take a screenshot and convert it into HTML specifically using Tailwind CSS all right so let's go ahead and dive in we're going to be looking at the screenshot to code repository they have some interesting examples here it's just a YouTube clone they you can notice that it has like a a live preview we're going to go through some of that uh they do have this try it here and that's what we're going to be using today before we actually go through the code and so all you really need to do is bring your own API key and make sure that you have access to GPT for vision if you haven't seen our previous video on GPT 4 Vision we have a couple different videos go ahead and check them out they go go through some Basics and some pretty neat tips and tricks all right so what we're going to use for this is we're actually going to use our open AI API usage uh graph and see how well it does with a screenshot of this um I did track it costs about like 21 cents to run the generator um which is fairly good for I think the cost I mean if you look it it actually tells you which image models and when it's using gp4 uh the turbo actually might be from usage today so we'll take a screenshot of this and then what we're going to do is actually go to the example so on this example the first thing you need to do is go ahead and put in your open AI key and save that and you can then click the uh to upload your image so while it's generating you can actually watch it being built and see the code that it's streaming so you can see right here it's starting to pull different information in uh it's even recognizing different color changes and states and you can actually watch the code as it's streaming in I thought this was really interesting and really cool the other things to note is it has things like code comments as well as the CDN scripts in order for it to work and different views to try the resp responsiveness I have noticed though that every single time this is a little bit different that's fairly normal I would say um sometimes I've actually gotten the progress bar to show up um and now that it's finished what it did is it took those placeholder images and actually made them into images using Dolly real quick if you haven't liked and subscribed already please do we also have a newsletter where you can get extra free resources it helps more than you know and with that let's get back to it so now we're going to jump back into the code and just kind of look at some of the things that they're using if you wanted to download this yourself so you do need python you're going to need something called poetry which can be in installed via pip uh and then you just need to run yarn on your front end so we're going to take a quick look at like the package to understand what's in the front end and then specifically we're going to look at the back end uh so they're using radx UI and then react uh and then Tailwind as well so but for the back end it's really interesting what they're using so not only is it generating images by using beautiful soup uh which is actually using the um the placeholder or placehold doco but then it's also generating the images which we saw after the fact using Dolly again this this can be turned off with a flag so it can be uh either used or not used and then what they're doing is if you look at the prompt itself it's it's doing a couple different things it's defining itself as a Tailwind developer it does say that you're building uh single page apps using Tailwind HTML and JavaScript you might be given a screenshot so that's pretty interesting that they're they're explaining you might have a screenshot take the screenshot make it look exact it does talk about paying close attention to the background color um which we saw that in like even in the uh the font changes and things like that the uh other things it actually addresses the the code comments which I thought was pretty interesting so giving it a little bit more than uh like navigation and things like that and actually explaining different items one thing that we could talk about here in this prompt is we could actually tell it to be U more compliant with wayag and that's one again we've covered this previously in both in the chat GPT Vision side but then also in looking at uh a gp4 uh assist in so if you're interested in those videos please uh check at the very end we'll we'll make sure that they're linked the other thing to note is that it's actually calling out the scripts that it wants to use so the CDN for Tailwind CSS as well as the font awesome icons it's specifically calling out that version uh and then it's telling it not to not only to return the full code but also not to include any of the markdown so so basically when you're you're using the open AI dashboard uh it it wraps everything in these markdown ticks so that it can actually display the code uh for you visually so I found that really interesting so now we have looked at the promets there was one other thing I just wanted to go over and it's how it's actually getting called in the llm so it's actually using the gp4 vision preview in order to send that uh that message so defining the model and then right here we have our message which is going to be coming through uh as part of the The Prompt itself again it's streaming which we saw the code coming in live so Not only was it building a live preview but it was also letting us uh look at the code as we were reviewing it so again I thought this was a a pretty cool kind of no code solution um and with that you know I you could definitely expand on the prompt and and really take it to wherever you want to go all right thanks everyone today we learned how to use screenshot to code we went through their online tool as well as dug into their prompt looked at some of the code that's being generated which is Tailwind CSS and if you haven't already please remember to like And subscribe we also have a free newsletter with different types of resources and with that happy nerding

---

*Generated for LLM consumption from nerding.io video library*