# How to create a Custom LLM Integration with Hume's EVI

## Metadata
- **Published:** 8/19/2024
- **Duration:** 14 minutes
- **YouTube URL:** https://youtube.com/watch?v=uOo7qTCleT4
- **Channel:** nerding.io

## Description
Elevate your user experiences with Hume's Empathic Voice Interface (EVI) by integrating your custom language models! This feature is perfect for developers seeking deep configurability and personalized interactions.

Key Highlights:
- Seamless integration with your custom language models
- Full control over conversation flow and text outputs
- Ideal for advanced use cases like conversation steering, regulatory compliance, and context-aware text generation
- Leverage real-time data and Retrieval Augmented Generation (RAG) to enhance interactions

Discover how to tailor your user interfaces with precision using Hume's EVI. Like, share, and subscribe for more advanced development tutorials!

🎥 Chapters
00:00 Introduction

🔗 Links
https://www.hume.ai/
https://github.com/HumeAI/hume-api-examples/tree/main/evi-custom-language-model
https://dev.hume.ai/docs/empathic-voice-interface-evi/tool-use#create-a-configuration
https://smith.langchain.com/

⤵️ Let's Connect
https://everefficient.ai
https://nerding.io
https://twitter.com/nerding_io
https://www.linkedin.com/in/jdfiscus/
https://www.linkedin.com/company/ever-efficient-ai/

#chatgpt #voice #ai #programming

## Key Highlights

### 1. Custom LLM Integration with Hume AI
The video demonstrates integrating a custom LLM with Hume AI using websockets, LangChain, and LangSmith for tracing and debugging. This approach offers advantages such as advanced conversation steering and better context awareness.

### 2. Websocket Connection Requirement
A key takeaway is the necessity of a websocket connection for the custom LLM integration. The LLM needs to be deployed on a websocket instance, as Hume expects communication via this protocol.

### 3. LangSmith Tracing for Emotion Analysis
The integration allows emotion analysis of both inputs and outputs using Hume's emotion detection capabilities within LangSmith traces. This provides valuable insights for improving the LLM's interactions and responses.

### 4. Flexibility with Custom LLMs & RAG
Using a custom LLM with Hume enables flexibility in model selection (e.g., Claude, Mistral) and integration with Retrieval Augmented Generation (RAG) for enhanced context and knowledge retrieval.

## Summary

**1. Executive Summary:**
This video tutorial demonstrates how to integrate a custom Language Model (LLM) with Hume AI's Empathic Voice Interface (EVI) using websockets, LangChain, and LangSmith. It highlights the benefits of custom LLM integration, including enhanced conversation steering, context awareness, and emotion analysis capabilities within LangSmith tracing.

**2. Main Topics Covered:**
* **Introduction to Custom LLM Integration with Hume AI:** Overview of the advantages and use cases for connecting custom LLMs to Hume's EVI, such as conversation steering, regulatory compliance, and improved context.
* **Setting up a Websocket Connection:** Explanation of the websocket requirement for custom LLM integration and how to satisfy it using tools like poetry, uvicorn, and ngrok.
* **Configuration within Hume AI:** Step-by-step guide to configuring the EVI in Hume AI to connect to the custom LLM via the websocket endpoint.
* **Leveraging LangChain and LangSmith:** Demonstrates the use of LangChain for orchestrating the AI flow and LangSmith for tracing, debugging, and visualizing emotion analysis of both inputs and outputs.
* **Code Walkthrough:** Exploration of the code structure, focusing on agent definition, prompt templates, and handling Hume AI message structures.
* **Testing and Demonstration:** A live demo of the integration, showcasing the interaction between Hume AI and the custom LLM, and examining the traces in LangSmith.

**3. Key Takeaways:**
* **Websocket Requirement:** A functional websocket connection is crucial for integrating custom LLMs with Hume AI. The LLM needs to be deployed and accessible via a websocket endpoint.
* **Custom LLM Benefits:** Using a custom LLM offers greater control over the conversation flow, allows for the integration of RAG, and provides the flexibility to choose different LLMs (e.g., Claude, Mistral); see the agent sketch after this section.
* **Emotion Analysis via LangSmith:** The integration enables emotion analysis of both user input and LLM output, which can be traced and visualized in LangSmith for insights and improvements.
* **LangChain Orchestration:** LangChain facilitates the AI flow, including tool calling, prompt management, and integration with external resources.
* **Real-time Data Integration:** Custom LLMs can access real-time data and integrate with Retrieval Augmented Generation (RAG) to enhance the quality and relevance of interactions.

**4. Notable Quotes or Examples:**
* "The custom model needs a socket connection, and this socket connection is going to be how we actually hit the model; then we're using it to structure our payloads and send them back to Hume."
* "Using LangChain to be more deterministic about your workflow, by programming it, gives you more flexibility in controlling the steering of the conversation itself."
* (In the context of LangSmith tracing) "It is showing the emotion in the responses... it's just really interesting if you wanted to annotate any of that information, or build a dataset from your findings; since you are using a custom LLM, this would be a really interesting way to associate that with it."
* The code demonstrates that custom LLMs let you connect to other OpenAI-compatible models, as well as models that incorporate RAG.

**5. Target Audience:**
* Developers interested in integrating custom language models with Hume AI.
* AI engineers looking to enhance user experiences through emotion analysis and personalized interactions.
* Those seeking greater control over conversation flow and adherence to regulatory compliance within AI applications.
* Individuals interested in using LangChain and LangSmith for building and debugging AI applications with Hume AI.
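To make the orchestration concrete, here is a minimal sketch of a LangChain tool-calling agent along the lines described above. It is not the repository's exact code (the video uses a JSON chat agent): the hub prompt handle, the model name, and the use of SerpAPI as the search tool are assumptions. Swapping in Claude, Mistral, or another OpenAI-compatible model would only change the `llm` line.

```python
# Minimal sketch (not the repo's exact code) of a LangChain tool-calling
# agent that a custom EVI backend could run. Assumes OPENAI_API_KEY,
# SERPAPI_API_KEY, and LangSmith tracing variables are already set.
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_community.utilities import SerpAPIWrapper
from langchain_core.tools import Tool
from langchain_openai import ChatOpenAI

# A single search tool, standing in for the video's Serp-based search.
search = SerpAPIWrapper()
tools = [Tool(name="search", func=search.run,
              description="Look up current information on the web.")]

# Prompt pulled from the LangChain Hub, as in the walkthrough;
# this particular handle is an assumption.
prompt = hub.pull("hwchase17/openai-tools-agent")

# Any chat model could be substituted here (Claude, Mistral, or another
# OpenAI-compatible endpoint) without changing the rest of the flow.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

if __name__ == "__main__":
    print(executor.invoke({"input": "What is LangSmith?"})["output"])
```

Because the tracing environment variables are picked up automatically by LangChain, every `invoke` call in a sketch like this would show up as a run in the LangSmith project without extra code.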
## Full Transcript

Hey everyone, welcome to nerding.io, I'm JD, and today we're going to be going through Hume AI and the ability to connect to a custom model. We'll look at how to set this up with websockets, we'll be using LangChain, and then we're also going to look at the trace itself in LangSmith. With that, let's go ahead and get started.

All right, so we're going to be looking at Hume again, and we're just going to dive right into some of the documentation. We're going to be using the custom language model feature. This is not a custom model in the sense of training one; it means that we're going to be connecting to a different LLM. They have an example repository that we're going to go through, and then we're going to trace it.

I thought it was interesting to look at some of the reasons you might want to use a custom LLM. There's the ability to control advanced conversation steering: when you have control and you're using something like LangChain, you can be more deterministic about your workflow by programming it, which gives you more flexibility in controlling the steering of the conversation itself. You can adhere to stricter regulatory compliance, making sure that you have your moderation or any other guardrails you need in place. It can have better context awareness and real-time access; we can do some of this with Hume's own tools and search, but this other one is also really big: if we want to access our RAG, there are a couple of ways to do it, where either we use their tooling or we implement an LLM that has the RAG in place.

This is the workflow of how we're going to do it, because, as you can see here, the custom model needs a socket connection. This socket connection is going to be how we actually hit the model, and then we're using it to structure our payloads and send them back to Hume.

Let's dive into what that means. First, we'll pull down their example repository. We've actually used this before, if you've seen the other videos; they have examples like function calling. We're going to use the custom language model example, and what's different is that it uses a handful of things: Python, poetry, uvicorn, and then ngrok. ngrok is how we're going to make the connection to our local machine, and then of course LangChain for our AI flow.

A couple of things to note to get poetry up and running: you can see here I have an instance running, so the first thing you'll want to do is `poetry shell` to start your environment (I've already activated mine), and then `poetry install` to pull in all the dependencies. I already have the dependencies in place; that's one thing I noticed wasn't in the setup instructions, where it talks about starting the connection. Now that we have this, we can go ahead and start our connection with poetry, but before we even do that, we want to set up our environment variables, so let's go ahead and do that as well.
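As a rough sketch, the environment might look like the following before the server is launched. The variable names follow OpenAI, SerpAPI, and LangSmith conventions rather than being confirmed from the repo, so treat them as assumptions.

```python
# Hypothetical sketch of the environment the example expects; variable
# names come from OpenAI/SerpAPI/LangSmith conventions, not the repo.
import os

os.environ["OPENAI_API_KEY"] = "sk-..."        # the LLM behind the agent
os.environ["SERPAPI_API_KEY"] = "..."          # the search tool
os.environ["LANGCHAIN_TRACING_V2"] = "true"    # turn on LangSmith tracing
os.environ["LANGCHAIN_API_KEY"] = "ls-..."     # LangSmith project access
os.environ["LANGCHAIN_PROJECT"] = "evi-custom-language-model"  # trace bucket
```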
If we jump over to the code, the first thing to see is that we're going to use the Serp API, we need an OpenAI key, and then we want to put in our information for tracing. This is specifically if we're using LangSmith, which comes out of the box with LangChain, so you can just set these as environment variables and get off and running. The reason we need those two keys is that if we look at the agent file, we're using the search tool and we're connecting to OpenAI. Now, we could change this so that it could be any LLM we wanted to connect to, whether that's Claude or Mistral or any OpenAI-compatible model. This is also where we could wire in RAG instances and any other information sources we want to connect to. So let's get those variables set; again, we're using Serp as the search tool to call out, and we're using OpenAI as our LLM. If we now try this, we're up and running with no problems.

The other thing we need is ngrok. ngrok is a tool that lets you proxy into your local machine; you might have used it for something like Stripe webhooks. Let me blow this up a little bit. What we're going to do now is run ngrok, which means this URL is pointing at our local server; you can't see it on screen, but behind this web interface it's localhost on port 8000.

Now we go back into our configuration; again, right here we have our ngrok instance. Next, it tells us to go into our voice configuration in order to create a socket. If we go directly into a configuration in Hume, we can reuse an old one or create a new one. If you're creating a new one, I just went with the default system prompt, because once you get to this stage you'll have to change it to the custom language model. Now we have to put in our websocket URL. If you look at the instructions, they tell you that you want the websocket of your custom language model, which sits here. So we pull that URL from ngrok, go into our custom model configuration, switch to the websocket protocol, point it there, and then add /llm.

The reason we have to add /llm is that if we look at our code base and go into main, which is how we're launching this application, it's looking for this websocket endpoint; it's not serving plain HTTP. If we browse to our localhost, it shows "not found": even though we're running on port 8000, there's no HTTP endpoint to hit, so hitting /llm over HTTP shows nothing, but over the websocket it works. What this websocket does is, once it accepts a connection, it pulls in the agent and starts a stream, and as long as information is coming back over this socket, we'll be getting the responses and sending text back to Hume.
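A minimal sketch of that websocket shape, using FastAPI, is below. This is not the repo's exact main.py: the helper functions are hypothetical stand-ins, and the outgoing payload types (`assistant_input`, `assistant_end`) follow Hume's custom language model docs but should be verified against the current API.

```python
# Sketch of a /llm websocket endpoint for Hume's custom language model
# integration. Payload field names are assumptions to verify; the
# helpers extract_user_text and run_agent are hypothetical stand-ins.
import json

from fastapi import FastAPI, WebSocket

app = FastAPI()

def extract_user_text(payload: dict) -> str:
    """Hypothetical helper: pull the latest user utterance from Hume's payload."""
    return payload.get("messages", [{}])[-1].get("message", {}).get("content", "")

def run_agent(text: str) -> str:
    """Hypothetical stand-in for the LangChain agent call shown earlier."""
    return f"You said: {text}"

@app.websocket("/llm")                 # Hume connects to wss://<ngrok-host>/llm
async def llm_socket(ws: WebSocket):
    await ws.accept()                  # accept Hume's connection
    while True:
        payload = json.loads(await ws.receive_text())   # message from Hume
        reply = run_agent(extract_user_text(payload))
        # Send the text back in the structure Hume expects, then close the turn.
        await ws.send_text(json.dumps({"type": "assistant_input", "text": reply}))
        await ws.send_text(json.dumps({"type": "assistant_end"}))
```

Run locally with `uvicorn main:app --port 8000`, expose it with `ngrok http 8000`, and point the Hume config at `wss://<your-ngrok-host>/llm`, as the walkthrough describes.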
Now let's look at our agent and walk through a couple of things. Again, all of this is pulling in LangChain; this is how we're able to trace it and how we're able to use our agent. Down here we have our tool, which is no different from function calling or tool calling. We have our prompt, which we're pulling from a template repository, the LangChain Hub, and then we're using our OpenAI LLM and creating our JSON chat agent. The functions down here handle the parsing and the Hume message structure, because it's a little different from an OpenAI-compatible schema. Then we do our responses; it gives us our history and even the number of words. So this is just a pretty simple way of showing how it will respond with information.

What we're going to do now is test our model. Let me double-check that this configuration (because I was playing with it earlier) is the same here, and we'll save, because you can edit it in both places. Let's give this a shot and start a call.

Real quick, everyone: if you haven't already, please remember to like and subscribe; it helps more than you know. And with that, let's get back to it.

Cool, so now we'll start our call.

"Hey, I'm just testing our LangSmith tracing with EVI."

"Hey, hello! I'm here to help with any questions or tasks you have. Please feel free to ask anything you'd like assistance with."

Cool. I've muted myself, and now we're going to take a look at the chain. We can see the information coming back here, where it's saying that if there are any questions, we can ask it, and it's here to help. So let's ask another question.

"Can you get me information about LangSmith?"

"Sure. It's a development platform for every step of the LLM application life cycle. It provides tools for debugging and updating business information. Additionally, it's associated with music."

Now we can see all this information starting to come back. It messed up the name there, but even though the spelling was wrong, it knew to go out and find LangSmith and pull information around it. So that's the tracing from our side; now let's go into LangSmith and see the results we got.

Remember, in the example we have this custom language model project. If we go into LangSmith, we have our project here, with multiple run counts from some of the testing. What's interesting is that in these executions it tells you the emotions associated with the input as well as the output. Here's all of the information going to OpenAI; here's where it runs the chain and the JSON agent and parses it; that would then go into search, which is where it gets information about what it thinks LangSmith is; and then this is the actual response we're getting. Again, it shows the emotion in the responses. As it's speaking, it's not saying "slightly interested" out loud; we're mashing it together, so it's just associating that in the transcript to display emotion. That's something I thought was really interesting: through this agent, right here is where it joins all of the emotions detected in the information coming back from Hume into our transcript. It's really interesting because, if you wanted to annotate any of that information, or build a dataset of your findings, since you are using a custom LLM this would be a really interesting way to associate that with it.
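For context on that joining step, the sketch below shows one way per-message emotion scores could be folded into a plain-text transcript, similar in spirit to what the example's helper functions do. The payload shape here (`message`, `models.prosody.scores`) is an assumption based on Hume's docs, not copied from the repo.

```python
# Sketch: fold Hume's per-message prosody (emotion) scores into a
# transcript string for the prompt. Payload shape is an assumption.

def top_emotions(scores: dict, n: int = 2) -> str:
    """Return the n highest-scoring emotion labels, e.g. 'interest, calmness'."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ", ".join(label for label, _ in ranked[:n])

def to_transcript(messages: list[dict]) -> str:
    """Render Hume messages as 'role: text {emotions}' lines."""
    lines = []
    for m in messages:
        msg = m.get("message", {})
        scores = m.get("models", {}).get("prosody", {}).get("scores", {})
        suffix = f" {{{top_emotions(scores)}}}" if scores else ""
        lines.append(f"{msg.get('role', 'user')}: {msg.get('content', '')}{suffix}")
    return "\n".join(lines)

# Example:
# to_transcript([{"message": {"role": "user", "content": "hi"},
#                 "models": {"prosody": {"scores": {"interest": 0.8,
#                                                   "calmness": 0.5}}}}])
# -> "user: hi {interest, calmness}"
```

Annotating traces or building datasets from these joined transcripts in LangSmith, as suggested above, would then work on ordinary strings.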
Cool. That was the biggest thing about being able to use a custom model: it connects through a websocket, so whatever you deploy this to needs to be a websocket instance.

All right, that's it for us today, everyone. What we learned was how to build and connect a config to a custom model; in this instance it's running locally and connecting through websockets and ngrok. We were also able to trace that information back through LangChain and LangSmith. And with that, happy nerding!

---

*Generated for LLM consumption from nerding.io video library*