Contextual AI CEO Douwe Kiela on RAG 2.0: active retrieval making AI more dynamic and context-aware

Jul 17, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Douwe Kiela

It's a vacation home, but better. And next, we have the founder of Contextual AI. Welcome to the studio. Bring him in. What's happening?

Also, if I have this correct, the inventor of RAG, correct?

One of the authors of the RAG paper.

Okay, okay.

So yeah, that was a team effort. Lots of folks, and there's a long history of research that has gone into that.

Yeah. No, man who invented RAG by himself in a cave of scraps. Give me the state of the union on RAG. Some people are saying, "Oh, just use a bigger context window." What are companies actually using RAG for on a day-to-day basis? What's the state? And then what's the shape of the industry that's popped up around the technology?

Yeah. So RAG is really a very simple idea, right? It's about having gen AI work on your data.

You do that through retrieval. That's the R. And you then use your retrieval results to augment, that's the A, your generative AI, that's the G. Yeah. So it's a very simple idea. How people do RAG right now is radically different from what we did in the paper originally.
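The retrieve, augment, generate loop described here can be sketched in a few lines of Python. This is a toy illustration, not the method from the RAG paper or anything Contextual AI ships: the bag-of-words cosine scorer stands in for a real retriever, and `generate` is a hypothetical placeholder for a call to a language model.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper tokenizer.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    # R: rank documents by similarity to the query and keep the top k.
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def augment(query, passages):
    # A: put the retrieved passages into the prompt as context.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def generate(prompt):
    # G: hypothetical stand-in for a language model call.
    return f"[model output for a prompt of {len(prompt)} characters]"


def rag_answer(query, docs, k=2):
    return generate(augment(query, retrieve(query, docs, k)))
```

Swapping the toy scorer for an embedding model and `generate` for a real model call keeps the same three-step shape.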

I think the buzzword these days is context engineering, right?

So how do you actually give language models the right context so that they can do their job? As it turns out, all the language models are pretty good these days, and there isn't that much of a difference between your Claude and your OpenAI models or Gemini.

If you give it the right context, then it can solve the problem. If you don't give it the right context, then you can have an amazing language model, but it's going to fail. And so that, I think, is an opportunity that a lot of companies are looking at now.

How can we make sure that this context layer really works?

Do you feel like there's a RAG step or layer in the typical deep research products that I'm using on a day-to-day basis?

I feel like it's mostly going out to the web, searching, and then kind of just summing all this stuff up. But I can't tell if it's actually, like, the basis of RAG is embed all of that into weights and then search over it.

But is that happening at the state of the art right now?

I think so. But you're right, a lot of it is web search. Yeah. And if you want to do web search efficiently at scale, then you probably use simpler algorithms, things that aren't that involved.

But even web search very often uses embeddings and vector search there. So you could argue that web search is also just RAG.

Interesting. Why is search so bad right now?

I feel like I can't search my email for anything, and Google has frontier models. I feel like search has just become really hard, but at the same time, I'm having my mind blown by LLMs and deep research products. But I don't want to wait 15 minutes just to search my email.

But maybe that's what I need to do. Maybe that's what the future looks like. What is going on there?

Yeah, I mean, that's a hard problem, right? But AI should be able to search your inbox for you and just give the right answer, right? That's the goal. I think it's happening.

It's just that search is a really hard problem. You don't really want to do it in a single step. The way you do proper retrieval is multi-stage, sort of cascading, with smarter and smarter models that look at what might be a relevant result.
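A multi-stage cascade of this kind can be sketched as a list of (scorer, keep) stages, where each stage re-scores only what the previous one kept. This is a minimal sketch under stated assumptions: both scorers are hypothetical stand-ins. In practice the first stage might be keyword or embedding search over the whole corpus, and later stages would be progressively more expensive reranking models.

```python
def cheap_score(query, doc):
    # Stage 1 stand-in: count of distinct query terms that appear in the doc.
    return len(set(query.lower().split()) & set(doc.lower().split()))


def expensive_score(query, doc):
    # Stage 2 stand-in for a "smarter" reranker: rewards query terms that
    # appear adjacent and in order (bigram overlap), with term overlap as a
    # small tie-breaker.
    q = query.lower().split()
    d = doc.lower().split()
    shared_bigrams = set(zip(q, q[1:])) & set(zip(d, d[1:]))
    return len(shared_bigrams) + 0.1 * cheap_score(query, doc)


def cascade_retrieve(query, docs, stages):
    # Each stage scores only the survivors of the previous stage, so the
    # costlier scorers run on fewer and fewer candidates.
    candidates = list(docs)
    for scorer, keep in stages:
        candidates = sorted(
            candidates, key=lambda doc: scorer(query, doc), reverse=True
        )[:keep]
    return candidates


docs = [
    "root cause analysis of the outage",
    "analysis of root vegetables and their cause",
    "quarterly sales report",
    "the outage report",
]
stages = [(cheap_score, 2), (expensive_score, 1)]
best = cascade_retrieve("root cause analysis", docs, stages)
```

Here the cheap stage cannot tell the first two documents apart because both contain all three query terms; only the second stage, which looks at word order, separates them.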

But you're right, if you retrieve the wrong things, then you can never give the right answer. So that's a big, big open problem.

Where are you guys focused today? A number of different use cases?

Yeah. So the use cases we're looking at are really your bread-and-butter use cases for RAG.

So answering complex questions on complex documents. In our case, we scale to millions of documents, which is unusual. One of the most common misconceptions about RAG is that people think it's easy, which is actually probably true if you have two or three documents and you understand the use case. It's not that complicated.

But when you go to the real world and you talk to some of our enterprise customers, they have very difficult problems. The data is all over the place. It's very complex data.

There are millions of documents that it needs to work on top of, and then search breaks down. So you can't give the right answer even if you have a great language model.

Yeah. I mean, the scale of data at some enterprises is probably, what, seven orders.

I mean, I imagine the number of emails sitting in Gmail inboxes across the entire network is way beyond millions. Fascinating.

That is also to your earlier point about long-context models. You can't fit all of that in the context of a language model. You need to retrieve.

No way.
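The point about long context is mostly arithmetic, which can be sketched as below. All numbers in the usage lines are illustrative assumptions (one million documents, 500 tokens per document on average, a one-million-token context window); none of them come from the conversation.

```python
def fits_in_context(num_docs, avg_tokens_per_doc, context_window_tokens):
    # Returns (fits, total_tokens) for stuffing every document into one prompt.
    total_tokens = num_docs * avg_tokens_per_doc
    return total_tokens <= context_window_tokens, total_tokens


def max_docs_in_context(avg_tokens_per_doc, context_window_tokens, reserved_tokens=0):
    # How many average-sized documents fit once some tokens are reserved
    # for the question, instructions, and the model's answer.
    usable = max(context_window_tokens - reserved_tokens, 0)
    return usable // avg_tokens_per_doc


# Illustrative numbers only.
fits, total = fits_in_context(1_000_000, 500, 1_000_000)
budget = max_docs_in_context(500, 1_000_000, reserved_tokens=2_000)
```

Under these assumed numbers the corpus is hundreds of times larger than the window, so a retrieval step has to choose which documents get the limited space.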

What's the state of the business today? How big are you? What are the new challenges? What are the next milestones?

Yeah. So we're 70 to 80 people, working on a lot of interesting problems kind of across the board, trying to expand into different use cases as well.

So beyond the traditional RAG use cases, looking at things like root cause analysis and codegen, because codegen is very hard and also often requires technical documentation that you'd want to incorporate in your codegen. So yeah, making a lot of good progress on very interesting problems.

Very cool. Well, thank you so much for stopping by. Love to have you back for a longer conversation.

Yeah, let's do it again soon.

Hope you have a great rest of your day. We'll talk to you soon.

All right. Thanks, B. Great to meet you.

Let me tell you about Bezel. Go to getbezel.com.

Your Bezel concierge is available now to source you any watch on the planet. If you like enterprise agentic workflows, you might like fine watches.