# 10 Ways Chat GPT Vision Can Help Developers

## Metadata

- **Published:** 10/18/2023
- **Duration:** 18 minutes
- **YouTube URL:** https://youtube.com/watch?v=glZ_LZpdp_8
- **Channel:** nerding.io

## Description

Unlock the power of Chat GPT-4 Vision for software developers! Dive deep into 10 ways ChatGPT Vision can revolutionize how we approach design, code, and user experience. From optimizing SEO via visual webpage analysis to transforming static mockups into interactive prototypes, GPT-4 is redefining the boundaries of software development. 🚀

Have you ever considered how a simple screenshot could help improve your website's accessibility? Or how a visual scan of a YouTube code tutorial could lead to more precise understanding and optimization tips? Welcome to the next-gen tech world, where GPT-4 Vision leads.

📰 FREE snippets & news: https://sendfox.com/nerdingio

👉🏻 Ranked #1 Product of the Day: https://www.producthunt.com/posts/ever-efficient-ai

📞 Book a Call: https://calendar.app.google/M1iU6X2x18metzDeA

🎥 Chapters

- 0:00 Introduction
- 0:13 Logo Feedback
- 1:29 SEO
- 2:57 Web Accessibility
- 5:58 Light and Dark Mode
- 8:19 YouTube Video Code Explainer
- 9:31 Debugger
- 12:29 Interactive Enhancement
- 14:31 Game Developer
- 16:36 AR App Guidance
- 18:29 Conclusion

🔗 Links

- https://chat.openai.com/
- https://openai.com/blog/chatgpt-can-now-see-hear-and-speak

⤵️ Let's Connect

- https://everefficient.ai
- https://nerding.io
- https://twitter.com/nerding_io
- https://www.linkedin.com/in/jdfiscus/
- https://www.linkedin.com/company/ever-efficient-ai/

## Key Highlights

### 1. Logo Feedback & Design Analysis

Chat GPT Vision analyzes a logo (Nerding IO) for color theory, symmetry, typography, and design principles. Demonstrates potential for design feedback and brand analysis from images.

### 2. SEO & Accessibility Audits from Images

Vision identifies SEO issues (keywords, alt text) and accessibility violations (contrast, keyboard navigation) from a website screenshot.
Offers a quick, visual way to audit websites.

### 3. Code Debugging & Best Practices

The tool can identify errors in code snippets from an image, catching undefined variables and best-practice violations (a missing try-catch) and pointing out unused imports, showcasing its debugging capabilities.

### 4. UI/UX Enhancement Suggestions

Given an image of a website, Chat GPT Vision suggests improvements to UI/UX, including interactivity, feedback mechanisms, and responsive design, acting like a visual consultant.

### 5. AR Augmentation Ideas from Images

The tool can analyze real-world images (city, park, home) and suggest virtual augmentations for AR applications. It envisions holographic signs, virtual pets, and interactive elements, demonstrating creative possibilities.

## Summary

## Chat GPT Vision for Developers: Video Summary

**1. Executive Summary:**

This video explores 10 practical applications of Chat GPT-4 Vision for software developers, demonstrating its potential to revolutionize workflows across design, debugging, UI/UX enhancement, and even AR/game development. By analyzing images, Chat GPT Vision can provide valuable insights, suggest improvements, and generate code, streamlining development processes.

**2. Main Topics Covered:**

* **Logo Feedback:** Using Vision to analyze logo design principles.
* **SEO & Accessibility Audits:** Extracting SEO insights and identifying accessibility violations from website screenshots.
* **Light and Dark Mode CSS:** Generating CSS code for light and dark mode themes from website images.
* **YouTube Video Code Explainer:** Analyzing code snippets from YouTube tutorials to understand their functionality.
* **Code Debugger:** Identifying errors and suggesting best practices in code snippets from images.
* **Interactive Enhancement:** Suggesting UI/UX improvements for static website mockups.
* **AR App Guidance:** Generating ideas for augmented reality applications based on real-world images.
* **Game Developer:** Provides insights and suggestions to improve video game design from an image of the game.

**3. Key Takeaways:**

* Chat GPT Vision allows developers to analyze visual data (images) and extract actionable insights relevant to their work.
* The tool can perform tasks ranging from design feedback and SEO auditing to code debugging and UI/UX enhancement, all from images.
* Chat GPT Vision can be leveraged for creative brainstorming, generating ideas for AR applications and game development.
* Using Vision can potentially speed up development cycles by providing quick, visually driven feedback and suggestions.
* Although helpful, the tool isn't perfect: the debugging output is impressive, but it can be incorrect at times.

**4. Notable Quotes or Examples:**

* **Logo Feedback:** "The imagery of the glasses goes with the nerd stereotype... simplicity... clean and able to be used in different ways."
* **SEO & Accessibility Audits:** "Talking about the file names, right, like making sure that they're descriptive instead of just 'image one.'" & "It talks about the contrast right out of the gate. This could be super helpful."
* **Code Debugger:** "This one was kind of interesting: on line 22 you are using prompt.format, and it's not clear where prompt is coming from. So this is kind of a mistake, but kind of an interesting one at the same time, because prompt is actually commented out, and that's why PromptTemplate isn't getting called." & "It recognized the best practice that even though I'm doing a fetch call, it doesn't have a try/catch block..."
* **Interactive Enhancement:** "...It talks about keyboard accessibility; this is the third time that we've seen it bring up WCAG..."
* **AR Augmentation Ideas:** "...holographic street signs above the road pointing out directions, superimposed floating ads or billboards in the sky...virtual pets...a virtual magazine that you could flip through."
* **Game Designer/Developer (Super Mario Bros.
image):** The platforms should drop as soon as Mario stands on them, instead of letting him just sit there. Also, checkpoints should be added to help mitigate gamer fatigue.

**5. Target Audience:**

This video is primarily targeted at software developers, UI/UX designers, game developers, AR/VR developers, web developers, and anyone interested in leveraging AI and computer vision to improve their workflows.

## Full Transcript

Hey everyone, welcome to Nerding IO. We've been testing out ChatGPT Vision recently, and we came up with 10 different use cases that you can use as developers today. So let's go ahead and get started.

All right, so keep in mind we're doing tests that will specifically help software developers. The first one that we came up with was this idea of taking a logo and getting some feedback. So what I did is I just took the Nerding IO logo and I asked it for color theory, symmetry, and other design principles, and this is what it came back with. It's really interesting. It gives the color choice, it being vibrant, the emotion that's associated with it. It talks about the symmetry, goes into the typography, even noting that the "io" has a nice touch. The imagery of the glasses goes with the nerd stereotype. Whitespace, composition, the simplicity. It even talks about how it's clean and able to be used in different ways, so it works well with grayscale and on different backgrounds. So that was really cool, to just put it in as a logo test for design. You could probably do the same thing with a web page as well.

So this got me thinking: all right, let's really try and go after some technical pieces. The next example was trying to do SEO, and again we're going to be using just Nerding IO. So using the Nerding IO website, I actually did "act as an SEO specialist" to see if I could get some different results, and asked for specific examples, and it actually did. Not only did it recognize the title,
but it's talking about keywords and then associating the H1 tags with the descriptive text. It says to include relevant words without stuffing, which I thought was interesting. It talks about internal linking, even some WCAG stuff here, making sure that all of the alt tags are there, which also gave me a good idea, and talking about the file names, right, like making sure that they're descriptive instead of just "image one." Then it starts to talk about some of the areas where there could be some clarity, talks a little bit about the UX (this also gave me another idea), and then talks about metadata. So these are actually really interesting things that you can go back and look at, and it talks about very specific examples, from the YouTube channel to the images.

So based on that, the next one that we did was WCAG: acting as a web accessibility expert, evaluating the page for WCAG standards, and highlighting some non-compliance or fixes, again asking it for specific examples. It talks about the contrast right out of the gate. This could be super helpful. Some of these things I'm sure you could figure out through Chrome and Lighthouse and axe and things like that, but it's pretty interesting. It's even talking about the 3:1 for larger text and the ratios. That's really cool. It's talking about how the site should be responsive. Again we're mentioning the alt text, talking about keyboard accessibility, going down into audio, if any, talking about link descriptions, all of these things. And then I think it's pretty cool that it was talking about error handling as well. Now, it doesn't give a score or anything, but still, this is almost like a good checklist that you could use right out of the gate, just by taking a screenshot of your website and putting it into ChatGPT Vision.

So again, it started making me think, when it was suggesting focus indication and stuff like
that, what about UX? So that's the next one we're going to look at. Using the same image, I actually made it a little simpler, just saying "offer feedback," but still asking for specific examples; I wanted it to call out the image. So it starts talking about clear hierarchy. It's talking about the CTA and the informational flow, which is definitely interesting. It actually pulls out specific interface elements, so talking about the navigation, the illustrations, and typography, right, the color scheme. Again the YouTube section is being brought up. And then it gives a long list of potential enhancements: talking about responsiveness again, search functionality, hover events, testimonials (this is almost like marketing as well), and then the footer and the loading speed, so now we're talking about performance. Again bringing up accessibility, that's awesome, even specifically calling out WCAG without being instructed to. And then these are the specific examples that it's talking about maybe addressing. So very cool, talking about hover animations and things.

All right, so for this next one, I had the idea of taking that same image but then seeing if we could do a light and dark mode. When I first did the light and dark mode, it came up with our base styles, which I thought was pretty cool, even looking at the button transition and understanding the margin for the container. And then it did it where it basically has class names associated with each one, so we can see the colors that it's pulling out of that. Even with dark mode, same thing: it has the class attached to each element. Well, that's great, and it even comes up with the toggle so you can toggle between the two. That's not necessarily what I wanted. I wanted it to actually detect what the mode was and write all the CSS for the page. So I tried that, and it's using prefers-color-scheme, and what
it's doing is, again, it's doing the base styles. It added a header, so it's trying to do a little bit better association with the image. Light mode by default; this almost seems a little redundant, but that's all right. And then it's going into this media query, and it's actually using prefers-color-scheme set to dark, so that based on your system it'll actually change from light to dark mode, and then you can customize it. So it's pretty interesting that it was able to figure that out. Again, it's trying to write the HTML. It was kind of cool that it was pulling key elements out of there. Obviously it didn't pull any of the image tags; it just did, like, web accessibility, H1s, and things like that. But I think if you wanted to take screenshots of your web page and try to have it do some basic styling, even for good contrast of what your light and dark mode should look like, this might be a good way to test it out.

So with that, we're going to go into a little bit more code, and we'll jump over to the next one. What I did is, again using a YouTube example, I looked at the code that I had open on that screen and I just asked, "What is the code doing?", just trying to see if it could break some of that down and actually explain it to me. This actually includes code that uses LangChain, which is not in ChatGPT's knowledge base. It did recognize certain things, like import statements. It recognized that it was using React. It also recognized that it was using Next.js, going down into the prompt function, even calling out the API calls. This I thought was really cool: it even noticed that there was a sidebar, and then specifically called out that the structure was Next.js based on the folders. So I thought that was pretty interesting.

Hey everyone, I hope you're enjoying these use cases with ChatGPT Vision. I just
wanted to ask really quickly: if you haven't already, please like and subscribe. It helps more than you know. And leave us any comments about what you want to see next.

So next I kind of wanted to trip this up, and I wanted to see if I could just take that same piece of code and make it fail. So I just put in this undefined variable, just called fail, and I just asked, "What's wrong with this code?" I didn't give it a role or any other kind of pointers, and it actually came up with some pretty interesting stuff. The first thing it obviously figured out is that there's an unexplained fail keyword that's not part of a JavaScript statement. It also found that the PromptTemplate is not used, which I thought was pretty cool, and you could maybe say, "Oh, because it's not showing up, it's different." But then it keeps going, and this one was kind of interesting: "On line 22 you are using prompt.format, and it's not clear where prompt is coming from." So this is kind of a mistake, but kind of an interesting one at the same time, because prompt is actually commented out, and that's why PromptTemplate isn't getting called. But later it's thinking that this is actually running even though that's commented out. So it's kind of an error, but kind of an interesting error.

The next part is talking about the undefined function or method, preventDefault. Then it's talking about environment variables. This one I also thought was really cool: it recognized the best practice that even though I'm doing a fetch call, it doesn't have a try/catch block. Best practices would state that we would have that and be able to return a specific result and probably also do a console.error. It's also talking about the endpoints and then setting the state. But again, it had the unused imports, which I thought was super cool, because this information is actually commented out. And then it even talks about the comments themselves. I
don't know that these lines are correct, but it actually did call out the fact that there are comments that need to be commented out, and then it goes on to say that the first thing to fix is the fail statement, which is accurate. So again, I thought this was really cool that you could actually do some debugging and get some best practices.

So the next one is a little different. What we're going to do is take what this code represents and take a look at that. For this one, this is the actual website. It's just an input and a submit that goes and runs a LangChain process. What was cool about this is that it has the "nerding" at the top, it talks about the submit button, and we're taking a static mockup and actually looking at how we could make this better. We've seen some examples of drawing on a piece of paper and actually building the code out, but I wanted to just take a static website and say, "As a developer, what are some things that we can do to make this better?" So we start seeing the interactivity. It talks about the submit button. It talks about the tooltip as a helper icon; I thought that was pretty interesting. It talks about a progress loader or indicator, and some feedback, so some error handling would be a best practice. Again, this doesn't even know about the code; it's just looking at a flat file of the web page. It talks about keyboard accessibility; this is the third time that we've seen it bring up WCAG. And then responsive design for the UI as well. I just think that's really cool. Even a dark mode toggle, that's pretty interesting, and then an autocomplete feature.

So I wanted to take it a step further and say, "Could you code these in Next.js and Tailwind?" It goes through actually building out the project, and it's using pages, so it's not using the App Router, but still, I mean, this code is definitely working. It does a
mock so that you can actually do the indicator and the progress testing. It's got accurate placeholders. You can even see the loading for the button itself, as well as a styling change there, and then it tells you how to do your configuration.

All right, so on the last two I was super surprised, and I also tried to trick it. I took this old picture of Super Mario Bros. on NES, and I was kind of just seeing if it would actually help me beat the game; that's why I put in "and ways to test." However, what it did, by identifying improvements and balance issues, is it actually acted as a game designer or developer looking at this and trying to figure out specifically the varying types of shapes. So it recognizes the platforms. It understands that maybe, instead of having Mario just sit there, as soon as he stands on them they have to drop. It talks about enemy variety. It talks about the power-ups and blocks. It even goes on to talk about how Mario is encouraged to explore and to have some sort of vertical gameplay.

The next ones I thought were super interesting. It's talking about gamer fatigue: if the level is long, incorporating checkpoints to make sure that the user doesn't get frustrated if they die, and also this idea of having a breather or safe areas. Again, I just thought it was incredibly cool that it actually went through ways to improve as a game designer just from an image, and then talked about ways to engage with the player. As far as testing goes, it talks about ways to continuously iterate and observe and provide feedback, and to have diverse player skill levels, right, so easy, medium, hard. So this one was a lot of fun to see in action. It was definitely a little different, and it really inspired me for the last one.

The last one's a little bit out there, but I figured we might as well go big or go home. What I did was I took three different images and I
said, "All right, analyze these real-world images and suggest virtual augmentations." So basically, if you were looking at the world and you had AR goggles or an AR interface, what would it suggest to interact with? I gave it a busy city, a park setting, and just a home, and this was super cool. It talks about having holographic street signs above the road pointing out directions, superimposed floating ads or billboards in the sky, and robots and futuristic characters that could be walking alongside real people. On the flower garden, it talks about almost making it a fantasy land, with butterflies that land on flowers, mythical creatures, rainbows, and, it says it right here, "a magical ambience." So I thought that was really interesting and fun.

And then the last part was the living room, right, and so it's talking about how interactive screens on the walls could be used for video calls, browsing, or entertainment. You have virtual pets, even a virtual fish tank, and then it even gives something that you can interact with: the coffee table could have a virtual magazine that you could flip through. So again, it really pushed the limits of how we could use this tool as developers and designers, and really start thinking of unique ways to take still images and try to push the limits of what we're doing in software.

All right, I hope you enjoyed those use cases. We went through everything from SEO and WCAG to some pretty interesting and obscure ideas related to AR and even game design, which kind of came out of nowhere, in my opinion. So if you like this content, again, please like and subscribe. We're going to release an upcoming video specifically about the tips and tools we used in order to make this quickly. And with that, we'll see you in the next one. Happy nerding!

---

*Generated for LLM consumption from nerding.io video library*
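
The accessibility segment mentions that the model called out contrast and "the 3:1 for larger text." As a rough companion to that point, here is a minimal TypeScript sketch of how a WCAG 2.x contrast ratio can be computed and checked. It is not code from the video: the function names (`linearize`, `relativeLuminance`, `contrastRatio`, `meetsAA`) are made up for illustration, and the 4.5:1 threshold for normal-size text and the luminance formula come from the WCAG 2.x definitions rather than from anything shown on screen.

```typescript
// Illustrative sketch only; these names are hypothetical, not from the video.

type RGB = [number, number, number]; // 8-bit sRGB channels, 0-255

// Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.x formula).
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance: weighted sum of the linearized channels.
function relativeLuminance([r, g, b]: RGB): number {
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio between two colors, from 1 (identical) up to 21 (black on white).
function contrastRatio(a: RGB, b: RGB): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG AA minimums: 3:1 for large text, 4.5:1 for normal-size text.
function meetsAA(a: RGB, b: RGB, largeText: boolean): boolean {
  return contrastRatio(a, b) >= (largeText ? 3 : 4.5);
}
```

For example, a mid-gray such as `[119, 119, 119]` on a white background would pass `meetsAA` only when `largeText` is true, which is the kind of case where the larger-text ratio matters. A check like this gives you a number to compare against whatever a screenshot-based review suggests.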