The future is still human driven. I think there's going to be a realization, especially with a lot of companies who think agentic AI and all this stuff is going to solve every problem in the world. Um, I think those people are probably going to be sorely disappointed.

Hello and welcome back to another episode of Insta Blinks. I'm your host, Brian Graf, and today we have David von Thienen, who is an AI/ML engineer at NetApp, with me. David, welcome to the show.

Yeah. Hey, Brian, thanks for having me.

Hey, we're really excited about this episode today. There's a lot to cover. We will see how much we can get through in the time that we have, but we want to talk all things AI. So basically, let's level set. We've all heard the term AI for, like, decades, but over the last few years that's really, I feel like, accelerated a lot. And we're starting to hear a lot of other terms that I'm hoping, David, you can help kind of debunk, or at least unpack for us, what they mean. Um, so, you know, maybe starting with gen AI, which I feel like is kind of the big one that everybody talks about, or at least they think that they're talking about, and what that is compared to some of these other ones that we'll mention here shortly.

Yeah, definitely. So gen AI, people use that term frequently, and it's kind of this weird catch-all bucket for everything, when in reality there are different classifications and different buckets that, you know, your AI solutions can fall into. So gen AI, or generative AI, the way I like to think about it is it's basically an AI system that generates, right? You give it a prompt, um, and it generates a result, whether that's text, image, or video. Um, so it's an AI system that generates something. Typically these days, the parallel people draw to gen AI is it's basically your LLM.
Um, it's, you know, give it a prompt, it generates something, and that's kind of done.

And the other term that you typically hear quite frequently is RAG, retrieval-augmented generation. So that's where you take an LLM, an existing large language model, and you want to augment it with a certain knowledge set to make it a domain or knowledge expert. So you want to train it in, like, medical topics, or you want to train it in legal-related topics. That's basically imparting knowledge, or augmenting an LLM with that kind of knowledge.

And the last one you hear, which is especially hot right now, is agentic AI. These systems are really where you're turning over the keys to the thought process, uh, for a large language model to be able to make its own decisions, to formulate a plan of attack for answering a user's question, and then having that chain of thought, that multi-step process, of: I need to do research, here are the tools that I can access, and then out on the other end, you know, you get your answer for the user. And so those are the three different buckets that I like to use when you're trying to classify these AI solutions.

Yeah. So I think when we talk about gen AI, you've got, like, Midjourney. I mean, I think Midjourney is a great one because you're basically prompting it and it's creating a visual for you, right? Um, but then it also covers other things like text, right? So reports, um, creating content based off of a prompt. My son, for example, asked me to ask Grok to tell us a story, right? Create a story about Star Wars and tell it to us. So would that be a pretty good gen AI example right there?

Yeah, definitely. So usually it's, you know, something being generated based on a user's request.
And it's more taking the initial context and, without the reasoning mechanism (although the lines are getting blurred a little bit), generating something, whether it's video, a story, or some other output for the user.

Yeah. Okay. And so then, I guess, next: you talked about RAG for a moment there, and that feels kind of like a sub-version of AI, where you go and take something that potentially was already generated, or you're adding in content. If I wanted to scan a bunch of books, or, let's say, prescription papers, and say, hey, can you tell me a summary of what these different prescriptions do, or are there any adverse reactions between these? That's more of RAG, where it's working off of a pre-trained model, but I'm also adding additional context in for it to work off of, correct?

Yes. Yeah, definitely. So another way to think about it is: if you were to take an LLM that hasn't been trained on any kind of medical knowledge or legal expertise, and you were to ask, like, Llama 2 or Llama 3 any kind of medical-related stuff, yes, the LLM might have been trained on some medical information, but it's highly doubtful that large language model has been trained on very specific details like medicine and drug interactions and that kind of thing. And so that's the reason why, when you ask an LLM hyper-specific or hyper-focused questions, you tend to get hallucinations, plausible answers that might seem correct. Um, if you aren't in that field, you'll be like, oh, that totally sounds reasonable. But in reality, it's really just hallucinating an answer, because that large language model hasn't been trained on the specific details about medical drug interactions, medical-related terms, and that entire domain set.
So, um, that's where RAG comes in, right? It's imparting that knowledge about that specific field to a large language model so it can get hyper-specific. I always think of it as, like, a subject matter expert: you're making an LLM a subject matter expert in a particular area.

So then, for that, I mean, would you have to be constantly retraining your model to be up to speed on everything that's going on around you for RAG to work properly?

It depends, right? Um, it depends on the domain or the knowledge set that you're working with. So a good example might be, hey, if you're training an LLM, whatever type of large language model that is, about, since we're in this medical field, specific chemistry, like how drugs interact with each other: those rules typically don't change all that frequently. We're not inventing new forms of chemistry weekly, monthly, or even yearly, maybe not even on a decade level, right? But if you were to ask an LLM, hey, tell me what's currently happening in the AI space, like, what's the hot new thing? Those large language models probably have a certain cutoff date where their knowledge or their training data has been cut off, like six months or nine months or a year ago. And it's going to be missing the latest trends in what's happening in AI. So it might give you some information that seems plausible, but it will be, you know, potentially terribly out of date, considering how fast everything's moving in the AI space.

Right. So what are some ways that people can get around that? I know we talked yesterday, actually, about some of this, so I'm hoping you'll bring it up here as well. But is there a way to bridge the gap between stale data or having to retrain constantly for RAG to work? Is there a happy medium in there to still be able to get, like, latest, and potentially even near-real-time, results?

Yeah, definitely.
So there are a couple of different mechanisms. One of the hot topics coming around these days is MCP, model context protocol. So that's a great way to augment. Basically, MCP is a protocol: you have an MCP client, which is connected or attached to an LLM, which contacts an MCP server, an endpoint that provides some sort of capability to the large language model. So just think of it as a tool, right, that the large language model can use. And behind that MCP server, that endpoint, you can augment it and provide additional capabilities, or, what I think is probably going to be the most widely used use case for MCP, provide additional knowledge, real, current, relevant, happening-today kinds of data sets, to these large language models.

And then the other mechanism, something that I've been really interested in, is using graph-type RAG implementations that augment, or play nicely together with, vector-based RAG implementations, vector-based RAG agents. So those are kind of the two mechanisms that I see becoming really popular for bringing some of this newer knowledge in quicker without having to do the retrain.

So you've just brought up a few different topics that I want to jump into. But first, going back to MCP: just to clarify, MCP really gives the LLM the ability to understand how to interact with data for a specific tool, correct? This is, like, basically laying out the framework of saying, here's how you can talk to, or here are the commands you would use to interact with, another system.

Yeah, definitely. So the way I like to think about MCP is, like, USB, right? So USB hit the computer scene many decades ago at this point, and USB allowed us to plug in different capabilities and augment our laptop to do different things, right, whether it's a monitor or a storage device or whatever.
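To make the client/server/tool shape of MCP concrete, here is a rough, illustrative sketch, not the real MCP SDK or wire protocol. The tool name and the tiny "news" lookup table are invented for the example; a real deployment would use an actual MCP implementation and transport.

```python
# Hand-rolled stand-in for the MCP idea: a "server" advertises named tools,
# and an LLM-side "client" can list and invoke them to pull in fresh data
# that sits past the model's training cutoff.
from typing import Callable, Dict, List

class ToolServer:
    """Plays the role of an MCP server: a registry of named capabilities."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        self._tools[name] = fn

    def list_tools(self) -> List[str]:
        # The client discovers what the server can do.
        return sorted(self._tools)

    def call(self, name: str, argument: str) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](argument)

# Pretend "current events" data: exactly the kind of fresh knowledge a
# frozen training set would be missing.
NEWS = {"mcp": "MCP adoption is growing across agent frameworks."}

server = ToolServer()
server.register("latest_news", lambda topic: NEWS.get(topic.lower(), "no recent items"))

print(server.list_tools())
print(server.call("latest_news", "MCP"))
```

The point of the sketch is the indirection: the model only sees tool names and results, never the API behind the endpoint.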
So I think of MCP as kind of like a USB bus, where you can hook in different capabilities and provide different tools to your large language models so that they can do anything from get to set operations. So, like, read and obtain more information, augment its data set, or even do set operations, like, hey, if this is an e-commerce thing, go purchase this particular product on an e-commerce site. Um, but the key takeaway for MCP is that it's a natural language interface for an LLM to connect into and access those capabilities, which is super cool.

So basically, you know, six months ago, for example, when I would type something into, let's say, Claude, it would tell me what I would need to do with some of these systems. With MCP now, rather than it telling me what I need to go do step by step, I can actually integrate those systems with Claude and have it basically just do it for me, correct?

Yeah. And there was even, like, an intermediate step between the two, right? About six months ago-ish, large language models and these frameworks came up with function calling. Even then, in your AI solution, you had to directly import, like, SDKs and these programming interfaces, where they became very bespoke to the LLM and very tightly knitted with the LLM. So you couldn't move the functionality somewhere else unless you were doing, like, a copy-paste. Whereas MCP is this separate process, a thing that exists outside the LLM, where the LLM can access that information in those APIs without being exposed to those APIs.

Yeah. Great. So taking that next to the other two words that you used, vector and graph. I think maybe a lot of our listeners are unfamiliar with what these are. So let's break down vectors, vector embeddings, vector databases, and how all that ties into LLMs with semantic search, um, so that people can understand, because I think, you know, we use this every day.
You know, if you go to Amazon.com and you type in, you know, running shoes, for example, um, you know, 15 years ago, even ten years ago, you would only get results of products that had "running shoes" in the title or in the metadata, right? So how has that transitioned over the years? And how do vectors and vector embeddings change that with the way the AI works here?

Yeah, definitely. So, uh, vector embeddings are one of the methodologies, the processes, behind, going back to our RAG agents, our retrieval-augmented generation agents that become subject matter experts. So that's the process where you generate vector embeddings, and typically these days the default is using a vector database to store this additional domain knowledge. So generate the embeddings, put them into a vector database. That vector database becomes like a layer that sits with the LLM, and that's what gives the specialized knowledge to the LLM so that it can, you know, talk about sneakers and the different options and varieties that are available, right? Um, so that's typically what we mean by a RAG agent: that LLM plus that vector database with the domain-set vector embeddings combine to create a RAG agent that's a subject matter expert.

And if we wanted to talk about graph-based RAG, graph-based RAG is slightly different. Um, and there's a reason why you'd want to combine both. Graph-based RAG is where you take all of the data, the data set for your particular domain, whether it's sneakers or medical-related stuff, and you create structured ontologies. In graph world, that's basically just structured relationships between two pieces of data. And you put that into a graph database.
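The vector-embedding retrieval step just described can be sketched in miniature. Real systems use a learned embedding model and a vector database; here hand-made word-count vectors and cosine similarity stand in, and the product "documents" are invented, purely to show why "shoes for running" matches a document that never repeats the exact query.

```python
# Toy semantic search: embed documents and a query, rank by cosine
# similarity, return the top match -- the retrieval half of vector RAG.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# The "vector database": documents stored alongside their embeddings.
docs = [
    "lightweight running shoes for marathon training",
    "waterproof hiking boots with ankle support",
    "classic leather dress shoes for the office",
]
index = [(d, embed(d)) for d in docs]

def search(query: str, k: int = 1):
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

print(search("shoes for running"))  # ranks the running-shoe document first
```

With a real embedding model the same pipeline also matches "sneakers" to "running shoes", which word counting cannot do; the data flow is identical.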
And what that does is, when you ask a question about a particular area, whether it's sneakers, in a graph-based RAG implementation you retrieve all of the information that's related to sneakers, or particular sneakers, maybe they're Nikes or Jordans or whatever they are. It grabs all of that information and uses it, with your initial query, to go sifting through that data to find the answer. So it's very different from vector, which is, hey, we've augmented the LLM to know this stuff. Whereas graph-based RAG is like, here's all the data that's associated with your query, the answer lies here, go sift through this information. So it's very data-relational oriented.

Okay, so let's go to the next piece of the discussion. That kind of wraps up a little bit more around RAG, and we'll probably end up talking about it more this episode, but there's another term that we're hearing pop up, and I'm seeing it at conferences, I'm seeing it online and in all the forums. What is agentic AI?

Yeah. So agentic AI is basically large systems where you have turned over the decision-making process to an AI system on how to attack a problem. And usually these days, instead of just using simple large language models, they're paired with reasoning-based large language models. So these are models that take, you know, a minute, two minutes plus, to churn and think about a particular answer before answering. So it allows the large language model to think about the answer for an extended period of time, to get a more precise answer. And in agentic AI, we use these reasoning models to decompose a problem, just like humans do: decompose a problem, figure out how to attack it, here are the steps that are involved, in order to arrive at a final answer.
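The graph-based retrieval David describes a moment earlier, pull everything related to the entities in the query and hand it over as context, can be sketched as a breadth-first walk over relationship triples. The miniature sneaker "ontology" below is invented for the example.

```python
# Graph RAG retrieval in miniature: follow explicit relationships out from a
# query entity and collect every fact within a hop limit, instead of ranking
# by vector similarity.
from collections import deque

# (subject, relation, object) triples -- a tiny knowledge graph.
TRIPLES = [
    ("jordans", "is_a", "sneaker"),
    ("nikes", "is_a", "sneaker"),
    ("sneaker", "used_for", "running"),
    ("sneaker", "sold_by", "retailers"),
]

def neighbors(entity):
    # A triple is a neighbor if the entity appears on either end.
    for s, r, o in TRIPLES:
        if s == entity:
            yield o, (s, r, o)
        elif o == entity:
            yield s, (s, r, o)

def retrieve(entity: str, depth: int = 2):
    """Breadth-first expansion: gather every triple within `depth` hops."""
    seen, facts, queue = {entity}, [], deque([(entity, 0)])
    while queue:
        node, d = queue.popleft()
        if d == depth:
            continue
        for nxt, triple in neighbors(node):
            if triple not in facts:
                facts.append(triple)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return facts

for fact in retrieve("jordans"):
    print(fact)
```

Note how asking about "jordans" also surfaces the "nikes" triple via the shared "sneaker" node, relationship structure doing work that pure similarity would miss.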
And it basically goes through each one of those steps, researches, thinks about the problem, and maybe it might need to refine those steps, right, based on the research that it's finding. And that's the very interesting part about agentic AI. So these are systems that are non-deterministic in how they get to their answer, because you're literally turning over the decision-making process to these AI systems to arrive at that answer.

So, I mean, at that point, I've seen this with several different AI tools that I've used, where I put it into, like, deep research mode, for example, and I say, give me the outlook or, you know, the market sentiment about something. And if you expand what is going on while it's processing, because this could take, you know, five, ten, 20 minutes to run, I'm actually seeing some of that decision-making going on, where it pops up a question and then answers itself, saying, you know, should I check this out? Should I look at this aspect? Okay. And it goes through and sifts through even more. Um, and I feel like it's saved me hundreds of hours of potential work in research over the last few months alone, or, potentially, you know, a bunch of interns looking something up for you. Um, when we have something like agentic AI, is there even a need for RAG or gen AI? Or is everything going to be agentic?

No, I think there's definitely a good use case for having both, right. And it kind of comes down to how detailed and how specific you want an answer to be in a particular problem set or domain. A lot of these large language models, like Llama and, you know, Claude, they're trained for general-purpose question and answer. The data sets are very large.
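The plan-execute-refine loop described above can be sketched in a few lines. Everything here is a hard-coded stand-in: a real agent would have a reasoning model produce the plan, tools (search, MCP servers) execute steps, and the model decide when to add follow-up steps; the goal string and step wording are invented for illustration.

```python
# Skeleton of an agentic loop: decompose a goal into steps, execute each,
# and let results inject new steps into the plan -- which is exactly why the
# path to the answer is non-deterministic in real systems.
from typing import List, Tuple

def plan(goal: str) -> List[str]:
    # A reasoning LLM would produce this decomposition; we hard-code one.
    return [f"research {goal}", f"summarize findings on {goal}"]

def execute(step: str) -> str:
    # Stand-in for a tool call (web search, MCP server, etc.).
    return f"result of <{step}>"

def refine(step: str, result: str) -> List[str]:
    # A real agent might add follow-up work based on what it just learned.
    if step.startswith("research") and "market" in step:
        return ["check recent news for market shifts"]
    return []

def run_agent(goal: str) -> List[Tuple[str, str]]:
    transcript, queue = [], plan(goal)
    while queue:
        step = queue.pop(0)
        result = execute(step)
        transcript.append((step, result))
        queue = refine(step, result) + queue  # new steps jump the queue
    return transcript

for step, result in run_agent("market sentiment"):
    print(step, "->", result)
```

Even in this toy, the executed step list differs from the original plan, the refinement inserted a step, which is the behavior Brian observes when deep research "asks itself" follow-up questions.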
But if you were to compare the data sets they were trained on for, like, medical-related stuff to just general use, general use is probably a far greater subset of the training data. And so there's definitely a good reason to use these RAG agents with hyper-specialized data sets, saying, I want you to become a subject matter expert in shoes or in medical-related topics. Um, so there's definitely going to be a need, and I think you're probably going to see this explode quite a bit. Uh, especially right now, there's this big movement to connect multiple agents together to form these agent systems. And the architecture of that is, hey, when an agent reaches the limits of its knowledge, it says: I've reached the limits of my knowledge, so I'm going to go contact a different agent that's advertised its capabilities, that's saying, I know about shoes, or I know about medical-related stuff. I'm going to go query that agent instead to pick up that information. And when I get the response, the super-knowledgeable medical response, back, I can use that information and continue on with what I was processing in the original request. And so I think you're going to see a lot of these systems break apart into little subject matter experts that are very knowledgeable in one particular area, but able to collaborate. When they hit a topic they don't really know anything about, because they're more general purpose: let's go ask this other thing that knows way more about it than I do.

So I'm sitting here thinking about this, going, okay, so agentic AI, I hear, is very iterative. You know, it's not like I can just type something in and it just does one thing for me. Um, we see this a lot with coding, too, and hearing the term vibe coding. People will put in a prompt and it starts to build things out. And it's not just a one-and-done.
And I mean, sometimes they'll try to do it with, like, a single prompt and, uh, one-shot it. But, uh, at the end of the day, you are tying all these things together. You're tying in MCP. For example, when I was working on a project over the weekend, I was using Cursor, I was using Supabase, um, and I was using the MCP server for that as well. And, uh, I started off with Bolt.new, um, where I basically said, this is what I want to try and create, and that created a framework for me. And then I pulled that into Cursor and basically said, build out the database side, the schema side. And it started working through all of this for me. It would find that something didn't work, because it would do a test, and then it would fix itself and keep going. My question here is: how much trust do we give these right now? And how much trust should we give them? How much should we be holding back a little bit? Or is this just, like, full throttle right now? It's a hot topic, and everyone wants to be leading in this. What are your thoughts on that?

Oh, my. Um, yeah, I actually have a ton of opinions on this area. I was speaking at RenderATL, and someone wanted to meet with me before my session and talk about this subject. Um, and it's very interesting. You know, the vibe coding thing, I think it's actually great, uh, because you can empower people who might not know how to code to get through an idea that they have and see it through to the final: hey, this is working, it can do what I set out for it to do. However, I think, in the long run, at least for right now, um, it's very problematic to put those kinds of systems into production without having a really great understanding of how the different components work and what it's actually doing under the covers.
And, you know, a lot of these AI systems haven't been trained on, like, what do I do when a bad actor comes in and tries to do certain things? And so I think it's great for demo and POC purposes, but in the long run we still need engineers to understand how these components, these interactions between the components, work, and to really think about what's happening under the covers: to understand security, and how do I fend off bad actors, that kind of thing. Um, and a lot of it, if we want to do a deep dive into the technical aspects, a lot of it is because of the context window, right? Like, with small projects, not a big deal. Um, but when you get to more real-world, useful products, you know, there's only so much information you can put into context windows. Even though, like, Google Gemini has, quote unquote, a 2-million-token context window, you can't fit these massive code bases for storage systems or, you know, DevOps-related code into these context windows and expect it to understand everything that's going on with the system, and then produce code and update it, whether for patches, security releases, or whatever, to reflect that. And at least for the time being, um, you definitely are going to need people to understand those larger, higher-level concepts, understand security, and, especially now, kind of like what I was talking about with that person at RenderATL, keep LLMs honest with the code that they produce. Um, so, you know, we need to lean into unit tests and integration tests and those types of validation, so that we can keep these LLMs and these AI systems honest, and make sure that what they're producing is actually what we want, right?
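The context-window limit David points to can be shown in miniature. The 4-characters-per-token rule of thumb, the tiny budget, and the chat history below are all assumptions for illustration; real systems use proper tokenizers and far larger windows, but the trimming policy, and the way older context silently falls off, is the same shape.

```python
# Crude token budgeting: estimate token cost per message and keep only the
# most recent messages that fit -- which is why a large project eventually
# "forgets" code it helped write earlier.
from typing import List

def rough_tokens(text: str) -> int:
    # Rule of thumb: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def fit_to_window(messages: List[str], budget_tokens: int) -> List[str]:
    """Keep the most recent messages that fit inside the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):       # newest first
        cost = rough_tokens(msg)
        if used + cost > budget_tokens:
            break                        # everything older is dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order

history = [
    "system: you are a coding assistant",
    "user: build the schema",
    "assistant: here is a first pass at the schema ...",
    "user: now add row-level security",
]
print(fit_to_window(history, budget_tokens=20))
```

With a 20-token budget, the system prompt and the original schema request are the first things dropped, exactly the "you helped me write this piece, but you forgot it" failure mode from the conversation above, and one reason external tests matter more than the model's own memory.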
I think one of the struggles that I have sometimes, when I'm trying something new like that with vibe coding, is the LLMs come off being, um, so confident in what they're saying. And I'm like, sure, this sounds good. Copy, paste, let's run this, you know. And then something breaks and I say, hey, this is broken. It's like, oh, you're right, you know. And then it will, uh, tweak itself and try to fix it. And I find that it's really sharp at the beginning of a project, right? The context window is pretty small. You're probably at anywhere between a couple thousand and a couple tens of thousands of tokens. But once you start growing this, and it's a project with, you know, 30, 40 files and the complexity increases, I start to see that it's losing some of that context. It's like: I told you about this. You actually helped me write the code for this piece here, but you just completely messed up interacting with it. So I definitely see some of that. And I think my feeling is the same. Like, I would not be running that in production quite yet. If it has access to systems, if you're telling it that it can control firewall ports and other things, you know, um, I don't know that my confidence is quite there yet.

Yeah, I had one mentor, and one of his famous things that he would always tell me, way back in the day, was, you know, it's okay to not know. Like, I'm a lifelong learner. I love learning about new things that I don't know about, whether it's in medical or this, that, or the other. But one of the biggest mistakes that you can make is being confidently incorrect. Um, because you're not challenging your own assumptions, and, you know, you're not challenging yourself to make sure that what you built is actually what you set out to do. And like you're saying, when those LLMs are confidently incorrect, it can be disastrous, for sure. Especially when you're trying to learn the technologies, right?
Like, I'm sitting here and I'm reading through the code, and I'm like, okay, this makes sense, you know, to the extent that I understand it. Um, but, you know, all of a sudden it left out anything around security because it just wanted to make it work. So all of a sudden I'm not using row-level security in my database, and, you know, potentially a bad actor or another tenant could come in and read that data, all because the LLM was so confident that it gave me what I needed, when at first I even said, make sure this is secure, right? So I think that it's great. It's helped me learn to code so much faster than I could have done, you know, even a couple years ago, just cruising through Stack Overflow looking for answers. Um, now I'm learning full concepts, or I get to something and it tells me something I don't quite understand, and so then I just start chatting with it, right? I say, help me understand this. So I think it's a great learning tool, um, on top of productivity. But, you know, we're still not quite where we need to be, I think.

Yeah. And I think it's an amazing learning tool. Especially these days, at least for the past nine months, it's the first place I go. Like you said, I use deep research: hey, go research this topic for me. Save me two days of pulling code examples together. It's definitely not going to be 100% correct. It might even be outdated code, but it's a good place to start, to break down the concepts. And then you, as an engineer, go figure out, oh, this is what it's trying to do, and let me apply all those best practices, the things we need to worry about in the real world, like security, to the research that it's given me. Right.

Yeah, exactly. So let's jump topics here really quick. You are presenting at a lot of conferences. I've seen your conference list.
I've seen some of the sessions that you've done that have been posted on YouTube. Um, you're everywhere. You're deep into AI and ML. What are some of the hottest things that you're seeing, either currently or on the horizon, that would be interesting for our listeners?

Yeah, definitely. Um, so my last talk was at RenderATL, and the title of the talk was, literally, just-in-time user interfaces and user experiences. And the talk was probably not what people were expecting, but I think it was very well received. It was making the case for what I think user interfaces are going to be, and it was laying out the case of what's happening. I was literally pulling out news articles of what's happening today. So as an example, you know, OpenAI has pioneered making large language models and chatbots super accessible to everyone. And it's very interesting to see some of the things that they're looking at. So the talk was breaking down the things that OpenAI was doing. As an example, um, several releases ago, like six-plus months ago, they started to really put in effort to making sure that all of your previous chat history was referenceable, so that you can use it going forward in all of your future chat conversations with ChatGPT. And that, in turn, uh, makes the AI system understand you a lot better. And, uh, OpenAI also purchased the company that Jony Ive had, called io, um, and now there's legal trouble about the name, but the purchase is still there, where they want to introduce a hardware device to, and they've come out and said it, effectively have us use our phones less.
And the idea is that they're trying to make this device more context-aware and more aware of you as an individual, so that instead of having to search for things, it will, uh, asynchronously ping you and say, hey, you might be interested in purchasing these concert tickets. Which was actually the demo that I presented, um, just based on searches that I was doing on LCD Soundsystem, which is my favorite group, by the way. Um, but so I think that's a lot of the cool, interesting stuff that's happening, where things are going to get more contextual, more focused. And, like I said at the talk, they're super ahead and super advanced in where they're at, because they're one of the leaders in the AI space. But now we, as small and medium businesses, because that was a large part of the group that was there, we have those tools, we have those capabilities, and we have the knowledge now to do that. So how do we make it so we lower the barrier of entry for our users to get to what they want, the answers, the tickets, whatever it is, quicker? And I think that's going to be the future of a lot of this stuff. We're going to see more data science. That's the short answer on that.

That's fantastic. Well, hey, thank you again for coming. We've talked about gen AI, we've talked about RAG, we've talked about vectors and graph databases, MCP, agentic AI. This has been a full 30 minutes. So really appreciate you coming along. Where can people find you if they want to learn more from you?

Yeah, for sure. All my socials are David von Thienen. So LinkedIn, uh, GitHub, all my socials: David von Thienen, one word, no spaces. You can reach me there. YouTube, all the fun stuff. Um, yeah, looking forward to chatting more about everything AI and ML.

Awesome. Thank you very much, and we'll talk to you all next episode. See ya. Bye.
If you've been curious about how AI tools are reshaping industries and tackling complex problems, learn about cutting-edge advancements in AI, from breaking down the buzz around generative AI and RAG, to the potential of Agentic AI.