# How Transformer.js Can Help You Create Smarter AI In Your Browser #webml #ai

## Metadata

- **Published:** 9/29/2023
- **Duration:** 13 minutes
- **YouTube URL:** https://youtube.com/watch?v=MNJHu9zjpqg
- **Channel:** nerding.io

## Description

In this video, I'll introduce you to WebML and Transformers.js, which is all about using AI in the browser. We'll explore a library called transformers.js that lets us use Hugging Face's Transformers Python library from JavaScript. I'll show you how to perform tasks like natural language processing, computer vision, and even multimodal tasks directly in your browser without needing a server. We'll also dive into examples of using WebML in vanilla JavaScript, React, and Next.js. So let's get started!

📰 FREE eBooks & News: https://sendfox.com/nerdingio
👉🏻 Course: https://www.udemy.com/course/ai-in-the-browser-with-js-chrome-extensions-huggingface/?couponCode=55A07DD0215025B3E590
📞 Book a Call: https://calendar.app.google/M1iU6X2x18metzDeA

🎥 Chapters
00:00 Introduction to WebML
01:02 Chrome Extension Example
04:18 Behind the Scenes of WebML
06:44 Defining Pre-trained Models
07:20 Convert Pre-trained Models
08:22 Demos
09:00 React Translator
10:54 Blind Chat
11:41 Code Gen
13:00 Review

🔗 Links
https://huggingface.co/docs/transformers.js/index
https://github.com/nerding-io/questionable/tree/main
https://huggingface.co/collections/Xenova/transformersjs-demos-64f9c4f49c099d93dbc611df
https://huggingface.co/spaces/Xenova/react-translator
https://huggingface.co/spaces/mithril-security/blind_chat

⤵️ Let's Connect
https://everefficient.ai
https://nerding.io
https://twitter.com/nerding_io
https://www.linkedin.com/in/jdfiscus/
https://www.linkedin.com/company/ever-efficient-ai/

## Key Highlights

### 1. Browser-Based AI with Transformers.js
Run ML models directly in the browser using Transformers.js, enabling NLP, computer vision, audio processing, and more without a backend server.

### 2. No-Server Transformer Power
Transformers.js brings Hugging Face's Transformers library to JavaScript, eliminating the need for a server and enabling browser-based AI processing.

### 3. ONNX Runtime Integration for Performance
Transformers.js leverages the ONNX runtime to load and run models efficiently in the browser, with tooling to convert PyTorch, TensorFlow, and JAX models.

### 4. Caching Models for Speed
Transformers.js caches loaded models in the browser, making subsequent use significantly faster by bypassing repeated model loading during interactions.

### 5. Customize Models and Tasks
You can define pre-trained models and configure local paths for them. Transformers.js can also pull models from Hugging Face, allowing flexibility in model selection and usage.

## Summary

Here's a structured summary of the video, designed for quick understanding:

**Video Title:** How Transformer.js Can Help You Create Smarter AI In Your Browser #webml #ai

**1. Executive Summary:**
This video introduces Transformers.js, a library enabling browser-based AI processing using Hugging Face's Transformers in JavaScript. It demonstrates how to perform tasks like NLP, computer vision, and code generation directly in the browser without relying on a backend server, highlighting the potential for faster, offline-capable AI applications.

**2. Main Topics Covered:**

* **Introduction to WebML and Transformers.js:** Defining WebML as AI in the browser and introducing Transformers.js as a way to use Hugging Face's Transformers in JavaScript.
* **Benefits of Browser-Based AI:** Eliminating the need for a server for AI processing, enabling faster and potentially offline AI applications.
* **ONNX Runtime and Model Conversion:** Using the ONNX runtime to load and run models in the browser, and converting PyTorch, TensorFlow, or JAX models into ONNX format.
* **Implementation Examples:** Demonstrations of WebML using vanilla JavaScript, React, and Next.js, including a Chrome extension and live translation.
* **Caching and Model Management:** Caching models in the browser for faster subsequent use and defining pre-trained models with local paths.
* **Code Generation Demonstration:** Showcasing a practical example of using Transformers.js for code completion directly in the browser.

**3. Key Takeaways:**

* **Browser-Based AI is Possible:** Transformers.js brings the power of Hugging Face's Transformers library directly to the browser.
* **No Server Required:** Process NLP, computer vision, audio, and multimodal tasks without a backend server.
* **Performance and Efficiency:** The ONNX runtime enables efficient model execution, and caching significantly speeds up subsequent use.
* **Customization and Flexibility:** Models can be loaded from the Hugging Face Hub or from local files, offering flexibility in model selection.
* **Practical Applications:** The video showcases real-world examples such as Chrome extensions, live translation, and code completion.

**4. Notable Quotes or Examples:**

* "What's meant by that is you can actually use different machine learning practices and run them in the web like in the browser without any backend." - Defining WebML.
* Example of real-time sentiment analysis: demonstrating positivity/sentiment analysis on every keystroke within an input field.
* "Even though we're opening and closing this Chrome extension, that Chrome extension has only had to load the model a single time." - Highlighting the benefits of caching in Transformers.js.
* "Now that this has loaded the model and it's running it through ONNX, it's pretty fast, like faster than actually hitting an API and returning it." - Comparing performance with traditional server-side AI.
* Code generation example: a demonstration of getting code completion within the browser, using a dedicated code generation model.

**5. Target Audience:**

* Web developers interested in integrating AI capabilities into their web applications.
* JavaScript developers looking for a serverless AI solution.
* AI/ML engineers exploring browser-based AI deployment options.
* Anyone interested in using Hugging Face's Transformers library in a JavaScript environment.

## Full Transcript

Hey everyone, welcome to nerding.io. I'm JD, and today we're going to be learning about WebML, or you can think of it as AI in the browser. What's meant by that is you can actually use different machine learning practices and run them in the web, like in the browser, without any backend. So I'm really excited to look at this library called Transformers.js. What this means is that they're taking Hugging Face's Transformers Python library and using it in JavaScript specifically. The cool thing is, right out of the gate it tells you that there's no need for a server: you can do Transformers directly in your browser. So you can do things like natural language processing, computer vision, audio, and even multimodal tasks, like zero-shot image classification. And the main piece of this is that Transformers.
JS can use the ONNX runtime to load models and run them directly in the browser, and they even have a way to convert your pre-trained PyTorch, TensorFlow, or JAX models into ONNX, so this is extremely awesome. You can see over here that they have examples for vanilla JS, React, and Next.js. We're going to be doing a lot more with Next.js, so we can just kind of look at this really awesome tutorial. What you'll see is that they're just typing in an input field and actually getting a positivity or sentiment analysis while they're typing, so it's actually doing the analysis on every single keystroke. This is incredibly powerful. The other thing is that you can even look at things like a browser extension or Electron. With the browser extension they don't actually have a full tutorial, but a few months ago we built an open-source example of a Chrome extension, so we're going to take a look at that. Here we built this Chrome extension called questionable, and it's a way to search your web pages with questions. We did this as a course, so if you're interested in the full course I'll leave a link in the description, but you can also just check it out. We're going to take a look at what this means: it basically allows you to search the information that you're viewing and ask questions in a Chrome extension. So if we go over to our Ever Efficient AI website, we just use this question bubble (we also have a hotkey), and we can type in "can you build Chrome extensions". It's going to take a minute; what it's doing is actually pulling all this information from the page, chunking it, and then doing the analysis in the background of the Chrome extension. Again, this is all written in React; there's no backend at all, it's just using web workers. And now it's highlighted, found, and scrolled to where we can show the browser extension capability. You can also do some web accessibility, so you're looking at just being able to tab and go through
things, hit enter, and scroll. What's interesting about this, though, is that with a Chrome extension, every single time it opens you would think it's loading the model. What's cool, and we'll go through this, is that because of the way Transformers.js works, you only have to load the model once. So even though we're opening and closing this Chrome extension, that Chrome extension has only had to load the model a single time. Let's look at some of the usage and how this actually works. We saw a Chrome extension example; well, what's going on behind the scenes? First, you have these things called tasks that are associated with how you're going to build out your pipeline, and sentiment analysis is one of them: we're setting the task we want our pipeline to be. Then we can say, okay, with this classifier we want to say "we love Transformers", and it's giving our sentiment analysis of positive along with a score. You can also pass in arrays, which is pretty awesome. And this is where it gets really interesting: you can actually define your model right here. If we look back at how this works in Python, it's pretty much exactly the same. We'll see that we have the Python import, we have our sentiment analysis pipeline, and we have our pipe as well; really the only difference is that in JavaScript we're using await, and then we can define our different models. You can see here there's a ton of different predefined tasks and models that we can use. They have things for vision, they have things for audio, tabular data is coming, and then they have multimodal models, so you can actually do question answering on an image, which is pretty awesome. And then they have these presets. If we look at the pipeline API, what we saw is that not only are we defining our models, but we can actually send this information to Hugging Face. What this means is that the options you have available for loading
can either be on the Hugging Face Hub or locally on your system, and when I say your system, I mean in your browser. So again, you have a pipeline, then you have your model, and then you have different parameters that you can set for your task. The cool thing again is that we can actually define pre-trained models, so we're just going to look at how relatively easy this is. You just have to configure in your JavaScript the local path for your model; you want to turn off remote models too, so we just set that to false; and then we're defining where our wasm files are. Wasm is WebAssembly, and that allows the ONNX files to actually be pulled into the browser. If you want to convert your model, there's a conversion script. You need to define what the model ID is, and you need to make sure that you have a PyTorch, TensorFlow, or JAX model, so you would need all your Python libraries set up locally. Then with a single command you can actually convert these particular models to run locally. What it's going to do is give you an output in this particular model path, and that will allow you to load in this ONNX model and use the models in your pipeline, just as we saw right here. So this is our task again, and here is our model, so this is where it could be defined as a local model. All right, one of the things that I really liked is Xenova's Transformers.js collection: it has a bunch of different examples, from audio to images, as well as a chat. What we're going to do today is look at a hello world example, which is the React translator, and then something a little more complex, which is BlindChat. I definitely recommend taking a look at some of these if audio or images pique your interest; there's definitely some cool stuff here. So if we take a look at the React translator, what this is going to show us is a couple
different things. It's going to show us different parameters that we can set, and it's going to show us how we can actually interface with this model live, just in the browser. But the really important thing I want you to see is that when we start this translation, it's actually going to load in the model. You can see it's relatively quick (that can vary, obviously, based on speeds), and you see this only runs once, so what it's doing is actually caching it; if you need to, you can refresh to get rid of the cache. Now it's actually streaming the information back to us. And if we go in and change this to say, I don't know, "cat", we can translate it and we don't have that load time again, because the model is already loaded; now it's stored in the browser, essentially. Now we can just switch: let's say we want to do German, and we see it stream instantly. We can also do, I don't know, Spanish, and you'll see that you can quickly switch between languages without any load time. So now that this has loaded the model and it's running it through ONNX, it's pretty fast, like faster than actually hitting an API and returning it. If you think about it, every time you clicked translate, technically you would be going out to something like the OpenAI API and waiting however long it took to go out, process, and come back, whereas here you have that one heavy initial load, but everything after that is incredibly fast. The next example we're going to look at is BlindChat, and the reason I kind of like this one is that you can actually change the model you're working with. So we'll just ask "who is Ada Lovelace"; again, it's loading in the model because it's relevant to this tab and this page altogether, and this one takes a little bit longer, which is fine, so I'm going to pause. And then we have: Ada Lovelace was a British mathematician. So
cool. Now let's actually try to get some code out of this, which would be a totally different model. If we do a new chat, we're going to go over here to where it says current model. You can see that some others are coming soon, and you can notice that we have essentially the chat model, but we also have the ability to do a code gen example, so we're going to select that. You can also see that you have a system prompt; you can think of that as a role, the part of the object in the parameters that lets you define what the system is actually supposed to be. Here we'll just say "software engineer" as an example and apply it. You can see that it's changed, and now we're going to try to get some code completion. We'll just do a hello world, and we'll say we're defining a name, and we'll see what comes up. Right now it's again loading the new model, which only happens once, and we'll see what our output is. Cool. So what it's doing now is taking what we put in, defining what it thinks it should be as a function, and then actually giving us an example of the function itself. We can also copy this. So let's go over what we learned today. Basically, we learned that we can run natural language processing, computer vision, audio, and multimodal tasks, and actually convert a model into the ONNX runtime and use similar code in JavaScript in the browser. That's the key part: this is running in the browser, so it's actually loading the model into our web browser interface. Again, I think this is going to be really awesome and really huge in JavaScript. We'll continue with some more examples in later videos. If you have any suggestions for us, please let us know, and don't forget to like and subscribe. Happy nerding!

---

*Generated for LLM consumption from nerding.io video library*
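## Appendix: Illustrative Code Sketches

The Chrome extension walkthrough in the transcript describes pulling the page's text, chunking it, and running question answering over the chunks in a web worker. As an illustration only, here is a minimal chunking helper in plain JavaScript; `chunkText` and its parameters are hypothetical names for this sketch, not code from the actual questionable extension.

```javascript
// Sketch: split page text into overlapping word-based chunks, so each chunk
// can be fed to a question-answering pipeline separately. The overlap keeps
// some shared context between adjacent chunks. Illustrative only.
function chunkText(text, maxWords = 100, overlap = 20) {
  const words = text.trim().split(/\s+/);
  const chunks = [];
  const step = maxWords - overlap; // how far the window slides each time
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + maxWords).join(" "));
    if (start + maxWords >= words.length) break; // final window covers the tail
  }
  return chunks;
}
```

Each chunk could then be posted to a web worker that scores it against the user's question, with the best-scoring answer highlighted and scrolled to, as the extension demo shows.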
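The pipeline usage and local-model configuration described in the transcript can be sketched as below. This is a configuration sketch based on the `@xenova/transformers` package API as documented around the time of the video; the model directory and wasm paths are placeholders for wherever you host those files, and loading requires the model assets to actually be available.

```javascript
import { pipeline, env } from '@xenova/transformers';

// Optional: serve model and wasm files yourself instead of fetching from
// the Hugging Face Hub (paths below are placeholders).
env.localModelPath = '/models/';              // directory with converted ONNX models
env.allowRemoteModels = false;                // turn off remote (Hub) model fetches
env.backends.onnx.wasm.wasmPaths = '/wasm/';  // ONNX Runtime WebAssembly files

// The first call loads (and caches) the model; later calls reuse it,
// which is why the Chrome extension only loads the model once.
const classifier = await pipeline('sentiment-analysis');

const single = await classifier('We love Transformers!');
// Per the video, this yields a 'POSITIVE' label with a confidence score.

// Arrays are also accepted, classifying each string in one call.
const batch = await classifier(['I love this', 'I am not sure about this']);
```

For conversion, the transcript's "single command" refers to the conversion script shipped in the transformers.js repository, which takes a model ID and emits ONNX files into an output directory that can then be served from `env.localModelPath`.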