# Create a Web AI Memory Game in the Browser

## Metadata

- **Published:** 2/26/2025
- **Duration:** 15 minutes
- **YouTube URL:** https://youtube.com/watch?v=2MXk0YdrDsk
- **Channel:** nerding.io

## Description

In this tutorial, we dive into how you can build an AI-powered memory game that runs entirely in the browser, no backend required! Using WebGPU, we load and cache Gemma 2 (a 2-billion-parameter LLM) directly in the browser, enabling offline AI processing for interactive gameplay.

🔥 What You’ll Learn:
- How to run AI models locally in the browser using WebGPU
- Implementing LLMs for game logic without backend calls
- Creating a custom memory game from markdown notes
- Integrating speech recognition and file parsing into a web app
- Optimizing browser cache storage for AI models

💡 Whether you’re an AI engineer, web developer, or just a tech enthusiast, this guide will show you how to harness client-side AI to create interactive, privacy-first applications.

👉🏻 Text Yourself: https://textyourself.app
📰 Newsletter: https://sendfox.com/nerdingio
📞 Book a Call: https://calendar.app.google/M1iU6X2x18metzDeA

🎥 Chapters
00:00 Introduction
00:20 Background
01:50 Demo
05:47 Code
14:26 Cache
15:17 Conclusion

🔗 Links
Source: https://github.com/nerding-io/memgame-webai
Jason Mayes: https://github.com/jasonmayes/WebAIAgent

⤵️ Let's Connect
https://everefficient.ai
https://nerding.io
https://twitter.com/nerding_io
https://www.linkedin.com/in/jdfiscus/
https://www.linkedin.com/company/ever-efficient-ai/

## Key Highlights

### 1. Offline AI Memory Game in Browser
The presenter demonstrates a memory game that runs entirely in the browser, leveraging a large language model (LLM) cached locally for offline functionality and privacy. No backend communication is required after the initial load.

### 2. Gemma 2 LLM Integration with WebGPU
The game uses the Gemma 2 (2-billion-parameter) LLM, loaded and run with WebGPU for efficient client-side inference.
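A minimal sketch of this kind of in-browser load using MediaPipe's `tasks-genai` LLM Inference API. The CDN path, model URLs, and helper names below are illustrative assumptions; the actual project routes the download through Jason Mayes' file proxy cache, for which a plain URL check stands in here.

```javascript
// Hypothetical sketch: FilesetResolver and LlmInference come from the
// MediaPipe tasks-genai bundle (loaded via a <script> tag); the model
// URLs are placeholders, not the ones used in the video.
const LOCAL_MODEL_URL = "/models/gemma2-2b.bin";          // assumed local copy
const REMOTE_MODEL_URL = "https://example.com/gemma2-2b.bin"; // assumed remote fallback

// Pure helper: prefer the local copy when it exists, else fall back to remote.
function resolveModelUrl(localExists) {
  return localExists ? LOCAL_MODEL_URL : REMOTE_MODEL_URL;
}

// Probe a URL without downloading the whole multi-gigabyte file.
async function urlExists(url) {
  try {
    return (await fetch(url, { method: "HEAD" })).ok;
  } catch {
    return false;
  }
}

async function initLlm(onStatus) {
  onStatus("Initializing LLM…");
  // Resolve the WASM backend for the GenAI tasks, then create the
  // inference object pointing at whichever model URL is reachable.
  const genai = await FilesetResolver.forGenAiTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm"
  );
  const llm = await LlmInference.createFromOptions(genai, {
    baseOptions: {
      modelAssetPath: resolveModelUrl(await urlExists(LOCAL_MODEL_URL)),
    },
  });
  onStatus("Model ready");
  return llm;
}
```

After the first load the browser's Cache Storage holds the model bytes, so subsequent calls to `initLlm` read from cache instead of the network.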
The model is fetched remotely on first use and then cached in the browser for subsequent loads.

### 3. Custom Game Creation via File Upload
Users can create custom memory games by uploading markdown files or using voice input. The LLM then parses the content to generate matching pairs and explanations dynamically.

### 4. Leveraging Browser Cache for Model Storage
The video shows how browser caching can store large AI models locally, enabling offline use and avoiding repeated downloads. This optimizes performance and reduces server load.

### 5. File Proxy Cache by Jason Mayes
The presenter credits Jason Mayes' file proxy cache as critical to making the Gemma model accessible and usable, highlighting Jason's work on client-side web AI.

## Summary

**Video Title: Create a Web AI Memory Game in the Browser**

**1. Executive Summary:**

This tutorial demonstrates how to build an AI-powered memory game that runs entirely in the browser using WebGPU and a cached LLM (Gemma 2). It showcases the possibilities of client-side AI processing for interactive gameplay, offering a privacy-first, offline experience.

**2. Main Topics Covered:**

* **Offline AI Implementation:** Running large language models (LLMs) locally in the browser.
* **WebGPU Utilization:** Using WebGPU for efficient client-side AI inference with Gemma 2 (a 2-billion-parameter model).
* **Custom Game Creation:** Generating memory game pairs and explanations from uploaded markdown files or voice input.
* **Browser Caching:** Storing large AI models in the browser cache for offline use and reduced download frequency.
* **Speech Recognition Integration:** Using voice for game input.
* **File Parsing:** Reading and processing markdown files for game creation.
* **Model Loading & Initialization:** Steps for initializing the LLM and checking for local/remote availability.
* **UI Updates:** Updating the UI and providing user feedback based on file-loading progress.

**3. Key Takeaways:**

* AI models can be run directly in the browser using WebGPU, eliminating backend calls during gameplay.
* Browser caching allows persistent storage of large AI models, enabling offline functionality and improving performance.
* Users can create personalized gaming experiences by uploading custom content, which the LLM processes into unique memory game pairs.
* Client-side AI offers privacy benefits, since no user data is sent to a backend server after the initial model load.
* The demo showcases a fully functional web-based AI memory game built with HTML, CSS, and JavaScript, demonstrating the ease of integrating AI directly into a web page without a backend.

**4. Notable Quotes or Examples:**

* "…you can actually run a model in the browser and locally where it'll cache it offline and then you can actually use it."
* "This is great for a couple of different reasons… first and foremost it's free, it's private…"
* "Generating a JSON object of a unique pair of matching and an explanation of why they match, right, so that's for the hints as well as the reasoning behind it."
* Mention of Jason Mayes' file proxy cache as critical to making the Gemma model accessible and usable for client-side web AI.

**5. Target Audience:**

* AI Engineers
* Web Developers
* JavaScript Developers
* Tech Enthusiasts interested in client-side AI
* Individuals interested in privacy-focused applications

## Full Transcript

Hey everyone, welcome to nerding IO. I'm JD, and today we're going to be going through a memory game that I built while here at AI Engineering Summit 2025. The idea is that you can actually run a model in the browser, locally, where it'll cache it offline, and then you can actually use it. So with that, let's go ahead and get started.

All right, so I just wanted to start with a quick demo. You can see I don't have any data stored locally, I'm on my localhost, no network calls, nothing in the console. What I'm going to do is refresh this, so this will be like loading the page for the first time, and then go through this memory game that I built so that I could play it on the airplane. What's happening is it's actually going out and getting the Gemma 2 two-billion-parameter file and loading it, so you can see, as the bits are being loaded, it's coming across the wire, and that's how we're doing our progress bar. Once this loads for the first time, it's going to be stored in cache, and what that means is that I have an AI model stored in cache in the browser, so that rather than hitting the backend, I can do everything on the front end, no matter what. This is great for a couple of different reasons: first and foremost it's free, and it's private, so we aren't storing any information and we aren't hitting the backend for any kind of information, and we can essentially play it offline, because it's loading the model itself into the browser. You can see it's going to take about a minute, a couple of minutes really, and you have to be on
really good Wi-Fi in order to actually process this. This is all being done through WebGPU, and I'm going to go through what the demo actually is, because we're going to be making calls directly to this model, again without ever leaving the front end, and we'll go through the code to see what that looks like.

All right, so now we can see that it's doing its final checks, we've gotten all of the data, we've successfully loaded, and now we have our cards. Each one of these cards is just a label and an SVG for the icon, and I put in a bunch of them. What I'm going to do is call out to an LLM (again, we're calling Gemma 2 locally, as a file), and when I click this, we're now generating the memory pairs. I'm looking to get a response back from the model that is a JSON object that I can play against. You can see I got markdown back, I cleaned up that data, and now I have pairs that I can run my check on. So now I can play this game. It randomizes, so I don't know where everything is. I can look for a hint, so now I know where England is, and now I have a match, and it gives me a little bit of information. I can also randomize the categories, so again it's generating new pairs; as you can see, it's sending a prompt to the LLM, and again this is all in the browser, not local through something like Ollama. Now it's refreshing the pairs here. I can also change categories at random. Ironically it chose the capitals again, but there we go, now we have chemistry. I actually clicked refresh when I meant to click switch up, but you can switch all of that up.

The other thing we can do is create our own game. We can type in here, or we can use voice (this is using speech recognition), or we can drop a file in here, have it parsed through FileReader, and create a game that way. All right, so I'm just going to take some notes from a markdown file that I have from the conference. I'll take this, which is just an example, and drop it in. You can see it read all that information right there when I dropped the file. I can even edit this information if I want; say I didn't want to do the core concepts, I just wanted to do tools or something. Then I can add that and click generate memory game, and it'll go through again. I'm not loading the model anymore, I'm literally just sending a prompt to the model that is local, and I'm expecting it to return another JSON object based on the information in the notes from my markdown file, to create a custom game. So now I can actually test myself on the notes that I took at the conference. I just thought this was a super nifty tool: again, it's just HTML, CSS, JavaScript, and web AI to give it a different kind of flavor.

So now let's jump into the code. But before we do, I want to give a shout-out to Jason Mayes. If you haven't, you should definitely check him out; I'm standing on his shoulders with this, with AI agentic behaviors, and he has a lot of different examples of client-side web AI. Definitely go check him out; there are examples for Spotify as well as this flight suggester.

All right, let's dive into the code real quick. Everyone, if you haven't already, please remember to like and subscribe, it helps more than you know. Also please go check out Text Yourself; it's a simple application that I built that helps keep me on track. All you have to do is SMS the different tasks and reminders that you need sent back to yourself. With that, let's get back to it.

Cool, so as you can see, there's only HTML, JavaScript, and CSS. We have our core HTML structure here, and all it really is is some SVGs and some containers. Then we have three different JavaScript files: first is how we load the LLM, then our categories, which is basically just some consts, and then our game logic, and we'll go through each one.

What happens first is that inside the game LLM logic we have a couple of different things going on. We have our text-to-speech, so if we wanted to actually speak back to the user in a particular voice we could; I think it's called Cra, I could be saying that wrong. But what we're really focusing on is that we're going to pull in MediaPipe's GenAI tasks, and in order to pull in the Gemma LLM we're going to use this file proxy cache. Again, Jason Mayes put this together, so shout-out to him. And we're going to initialize all of that logic.

The first thing we do is initialize our LLM. We update our state, so the user knows we're initializing the LLM. We take the FilesetResolver for GenAI tasks from MediaPipe, and then we check locally to see if we actually have this file; if not, we look for it remotely. So when we were seeing all those requests come in, it wasn't loading the bin from our localhost, it was loading it from a remote server. You can do it either way; if you don't want the two-billion-parameter model sitting on your host, you don't have to. I actually have all of this stored in an S3 bucket. Then, since the local model is going to fail, we say okay, if that doesn't exist we pull the remote one, and we create our LLM inference. Once this is done loading, we set up our LLM, we give another update, and we'll actually
say that this is complete. While it's going through the loading process, we update our file progress callback; this passes through a status, and then in the game itself we look at how we pull that status back and display it.

This is our call to the LLM. Again, we're updating our modal, and this is our prompt: we're saying, for whatever our category is (and in the custom instance we're putting in all our notes), generate a JSON object of unique matching pairs and an explanation of why they match. That's for the hints as well as the reasoning behind them. When that's called, we generate a response through the LLM that we already have loaded in the browser, and then we do our cleanup. Once we get this response, we know it's going to contain markdown, for example code fences around the JSON, so we want to strip all of that out, and if there are any excess lines or anything else, we trim everything so that we end up with just the JSON object. Then we make sure that JSON object is actually valid JSON; if not, we still try to clean it, and otherwise throw an error message. That's the basic part of the LLM that we then pipe into the game.

If we go and look at the game itself, we're doing the same kind of thing. We initialize all our elements, we initialize our listeners, and we update our UI. This is where we get our initial load UI, the HTML that fills up the progress bar. We start with nothing and say that we're loading our AI model, and then based on that information we update our loading callback. After we try to initialize the LLM, we bind this update-loading status, so that as the LLM call passes back the progress (right here, the file progress callback), we're sending that information on to our progress-loading status, and what that's doing is giving us updates. The response we get back is actually a string, so we've got a little bit of logic here that basically says: if it's a string and it contains "loading file", we convert that to a percentage, a number, so that we can set a width of x%, and that updates the progress bar.

Then we get into our game logic. We have our initial game: once the LLM is ready, we update our modal again. If we're generating pairs (this is where we saw the click on the initial category card), we check whether it's custom, because if it's custom we take the custom prompt value and pass all of that; if not, we just send the category straight as is. We also look to see if it's part of those consts; you could actually have predefined pairs in the categories if you wanted, but instead we generate them on the fly every time, again by hitting the prompt itself. So this generate-category-pairs call just hits the model, and then we have all our logic for shuffling cards, removing cards, matching them directly, and doing the flip as well.

For the file reading itself, we put that over in the LLM file. We have a drop zone, and we're also listening for our change events, like the drag events as well as the drop. As soon as we handle the file, we use this thing called FileReader. We make sure we have the right types; if there is a file, we look for things like text files and markdown, which is what I specified in the HTML. I haven't tried a PDF; you could try that, it'd be kind of interesting. I just chose markdown because it's really great for LLMs to process. You can use this function called readAsText, and it'll pull the text out of the file itself, and then we put it into our custom prompt, again sending it all to the local LLM stored in our browser.

If we come back here, what I want to show you is that in the application itself we now have our cache. We can see that it's around 2,600 megabytes, all cached locally. So when we refresh, we're not going to see another network call go out; we're just going to see the cache itself. Let's go ahead and refresh. We can see that we're initializing our LLM and it's verifying the file in our local blob, so this is the cache itself, already loaded. It did not increase our cache or anything; it literally just pulls it straight from here. We're not pulling the file from the network, we're just pulling it from our blob.

All right everyone, that's it for us today. Basically, what we learned is how you can build a vanilla JavaScript memory game, but also incorporate features like voice, as well as loading markdown files directly into your browser and then using AI to process them into a game, so that every single time you get something different and you can quiz yourself on your notes. Happy nerding!

---
*Generated for LLM consumption from nerding.io video library*
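The response-cleanup step described in the walkthrough (strip the markdown fences the model wraps around its JSON, trim stray text, then validate with a parse) can be sketched roughly like this; the function name and regex are illustrative, not taken from the repo:

```javascript
// Hypothetical sketch of the cleanup step: the model usually wraps its
// answer in a ```json fence and may add surrounding chatter, so strip
// the fences, keep only the outermost object, and let JSON.parse
// validate whatever remains (it throws if the text is still invalid).
function extractJson(raw) {
  const text = raw.replace(/```(?:json)?/gi, "").trim();
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end === -1) throw new Error("no JSON object found");
  return JSON.parse(text.slice(start, end + 1));
}
```

The caller can catch the thrown error to surface the "try to clean it, otherwise show an error" behavior the video describes.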
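The progress handling described in the walkthrough (a status string arrives from the file-progress callback and gets converted into a progress-bar width) can be sketched like this; the exact status wording is an assumption:

```javascript
// Hypothetical sketch: the file-progress callback reports strings such
// as "Loading file: 42%" (wording assumed). Pull the number out, clamp
// it to 0-100, and return a CSS width value, or null if the status
// string carries no percentage.
function progressToWidth(status) {
  if (typeof status !== "string") return null;
  const match = status.match(/(\d+(?:\.\d+)?)\s*%/);
  if (!match) return null;
  const pct = Math.min(100, Math.max(0, parseFloat(match[1])));
  return pct + "%";
}
```

In the page this would be wired up as something like `bar.style.width = progressToWidth(status) ?? bar.style.width;`.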
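The drag-and-drop flow (accept text or markdown files, read them with `FileReader.readAsText`, and feed the text into the custom prompt) can be sketched as below; the element handles and helper names are illustrative, not from the repo:

```javascript
// Pure helper: accept plain text and markdown, matching the file types
// the video says are allowed in the HTML.
function isSupportedFile(file) {
  return /\.(md|markdown|txt)$/i.test(file.name) || file.type === "text/plain";
}

// Hypothetical wiring: dropZone and promptBox are DOM elements chosen by
// the page (ids are assumptions). readAsText fires "load" with the file
// contents available on reader.result as a string.
function wireDropZone(dropZone, promptBox) {
  dropZone.addEventListener("dragover", (e) => e.preventDefault());
  dropZone.addEventListener("drop", (e) => {
    e.preventDefault();
    const file = e.dataTransfer.files[0];
    if (!file || !isSupportedFile(file)) return;
    const reader = new FileReader();
    reader.onload = () => {
      promptBox.value = reader.result; // text then goes into the LLM prompt
    };
    reader.readAsText(file);
  });
}
```

Calling `preventDefault` on `dragover` is what allows the drop to land on the element instead of the browser navigating to the file.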