Richard Socher on You.com, enterprise AI search, and the path from Stanford PhD to Chief Scientist at Salesforce

Jul 9, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Richard Socher

Uh, get a Pod 5. Five-year warranty, 30-night risk-free trial, free returns, free shipping at eightsleep.com. I'm basically in a hole; my household is in sleep shambles. Well, let's bring in Richard and talk to him. The fact that we're doing this on five hours of sleep consistently says it all.

How did you sleep last night, Richard? How'd you sleep? Good to meet you. Hey guys, nice to be here. I slept all right. That's good. Can't complain. You would have slept better on an Eight Sleep. So, we'll work on that offline. But great. Actually, I did buy it. I, uh, retired it.

I just think it's better if my body sets its own temperature. Oh wow, interesting. Would you mind kicking us off with an introduction of yourself and the company? Happy to. Yeah. I'm Richard. I did my PhD at Stanford and brought neural networks into the field of natural language processing.

I laid a lot of the groundwork for what is now ChatGPT. I was also a professor on the side at Stanford for a couple of years, because no one was teaching neural nets like transformers and so on to students back then, and this was 2014 to 2018. But my main job was starting MetaMind.

MetaMind made it very easy to train neural networks for other companies. We got acquired by Salesforce, where I became the chief scientist and later executive vice president, running most of the AI efforts, starting the Einstein suite of products and so on, and building out the research team there.

In that research team, we invented prompt engineering, in a paper cited by the early GPT papers from OpenAI and others. And in 2020 I decided to start You.com to bring better answers to the world, to change what I initially thought was search but now think is something else. I also started AIX Ventures.

It's a relatively small venture firm, about half a billion AUM, that invests in early-stage AI companies. It's not that small. Half a billion AUM is pretty solid. Congratulations. A humble 500 million of AUM. Humble. Yes. On You.com, where is the business today? What are the biggest challenges?

I feel like we've been hearing more and more about the data wars and how high the walls are around the beautiful gardens we've all tended with our Slack installations and our Google Drives.

And this feels like the logical thing: yes, I own my data until I want to give it to you, or literally to You.com. That's right. Actually, you know, as a startup founder, you have to remind yourself every crisis is an opportunity.

And the opportunity here is actually that a lot of data is in silos and those companies don't want to give it out, but they do need to make it useful.

And so one of the many things we've learned as we've shifted our focus deeper and deeper into the enterprise is that doing good internal search is actually quite hard and quite useful.

And so we're partnering with a lot of companies to do search over their entire archives going back decades, and also over really up-to-date content, for publishers and insurance companies: pretty gnarly, complex problems.

And we're combining that with web data, which has its own complexities. A lot of folks are in some ways trying to pay people in news, which makes a lot of sense, but it also potentially threatens the entire free, open internet. If you have to pay for everything you read or crawl, then only Google can really afford that, and then you have an even bigger monopoly. So I do think we need to keep the web open and free. Merging all of that together to make companies more productive is what we now focus on.

What's kind of the best practice in the modern enterprise these days? Is it to be really diligent about sucking data out of every platform you use into some sort of data lake, like a Snowflake installation, and then dropping You.com on top?

Or is it figuring out a way to actually deal with the sharp elbows of direct API integrations into the databases that are managed by the other companies I'm purchasing SaaS from? That is a great question. I wish there was a simple silver bullet, an "always do X" and just have it all in a data lake in one place.

But the truth is it's kind of messy and usually there's some data that's just so large you don't want to have another copy somewhere else.

And then there's some data where, you know, we have the whole internet, we have an internet index, and you can't bring that into your virtual private cloud; it's just too expensive for each company. But in some cases it does make sense: if you want really deep understanding and reasoning over complex structured and unstructured data inside an enterprise, then you often have to copy it over and bring it into a new setup.

So one of the big things we just announced actually is a big partnership with Databricks, where we are sitting on top of Databricks and we can actually answer questions over data that is in Databricks. It's been a very exciting partnership already. That makes a ton of sense.
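[Editor's note: to make the pattern he's describing concrete, here is a minimal sketch of one way to answer a natural-language question over data sitting in Databricks: pull a relevant slice through the databricks-sql-connector, then hand it to a language model. The hostname, warehouse path, token, table name, and the answer_with_llm helper are all hypothetical placeholders; this is not You.com's actual integration.]

```python
# Minimal sketch of "answering questions over data that lives in Databricks":
# fetch a relevant slice of a table with the databricks-sql-connector, then
# hand it to a language model. All identifiers below are hypothetical.
from databricks import sql


def answer_with_llm(prompt: str) -> str:
    """Stand-in for whatever model endpoint you use; swap in a real client."""
    raise NotImplementedError("plug in your LLM client here")


QUESTION = "Which product line had the highest churn last quarter?"

with sql.connect(
    server_hostname="example.cloud.databricks.com",  # hypothetical workspace
    http_path="/sql/1.0/warehouses/abc123",          # hypothetical SQL warehouse
    access_token="dapi-REDACTED",                    # hypothetical token
) as connection:
    with connection.cursor() as cursor:
        # Retrieve only a small, relevant slice of structured data for the model.
        cursor.execute(
            "SELECT product_line, churn_rate, quarter "
            "FROM analytics.churn_by_product "
            "ORDER BY quarter DESC LIMIT 100"
        )
        rows = cursor.fetchall()

# Ground the model's answer in the retrieved rows rather than its own memory.
context = "\n".join(str(row) for row in rows)
print(answer_with_llm(f"Using only this data:\n{context}\n\nQuestion: {QUESTION}"))
```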

They enable all their LLMs to have access to a web index. Oh sure, okay, yeah, that makes a ton of sense. How are you feeling on acceleration versus deceleration? It feels like the vibes have shifted recently. We had Dwarkesh Patel on the show on Monday. He's kind of pushed out his AGI timelines.

Folks are talking about reinforcement learning not scaling as fast as people thought it would, and the problem of continual learning. If you take a step back and maybe put on more of your academic hat, how are you feeling about the current state of AI?

Obviously, there's also the take, and I don't know if you agree with this, that even if the models plateau, there's still so much enterprise value and so many problems to go solve. But I'll let you answer: where do you stand? Yeah, I'll try to keep it short because I could talk about that for hours.

Please. I think it's true that there are so many simple jobs that can actually be done already with the technology that's there, assuming you have good data access, you have all the recent information, and you know the company context and all of that stuff. So there is already a lot of low-hanging fruit.

At the same time, it's actually quite exciting for the researcher in me, which hasn't fully died after many years as an entrepreneur and CEO, that it's time again for research. In many ways, we've known that you need large neural nets with a lot of data on GPUs and highly parallelizable training.

And you want the whole thing to ideally be end-to-end trainable in some fashion. Those have been the known ingredients for over a decade now. And indeed, we've crossed thresholds by scaling all three: data, compute, and model size. We scaled those three things up, and it worked better and better and better.
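[Editor's note: the scaling he describes is often summarized in the literature with a Chinchilla-style power law (Hoffmann et al., 2022), which he does not cite here but which captures the same intuition that loss keeps falling as parameters and data grow.]

```latex
% Chinchilla-style scaling law (Hoffmann et al., 2022), shown for context only.
% L = pretraining loss, N = parameter count, D = training tokens;
% E is the irreducible loss, and A, B, \alpha, \beta are fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```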

And it created these emergent properties, similar to how a smaller brain, like a monkey's, just can't do certain things, even though that brain kind of looks similar and has neurons too. But then you cross a certain threshold, and you get these emergent intelligent properties.

And so what that means is that now is the time again for actual research. Not just engineering and scaling things up and throwing more data and better data at it and bigger GPUs and all of that, but to actually go back and say, okay, what is true intelligence? How can we get to superintelligence?

What does it mean for intelligence to increase exponentially for a certain amount of time? I think there are actually different dimensions of intelligence, too, that you have to look at separately, and some do have upper bounds.

And sometimes these upper bounds are astronomically far away, grounded in physics, and in other cases the bounds are not that hard to reach: classifying every object on the planet in computer vision is actually not that hard in comparison to having all knowledge about the universe.

You know, an intelligence should have a lot of knowledge, and it will take us a while to get to that much knowledge, and the bounds on how much knowledge you can collect are rooted in physics and the light cones around all the sensors you can have. So it's basically a time for research again, and that's exciting.

So yeah, it feels like we're kind of paradoxically in an AI bull-market summer in the Stripe-dashboard or ARR sense: we've never been adding more EV, never been doing bigger contracts, everything's good on the business side. But maybe we're counterintuitively in a little bit of an AI winter on the academic side.

My question is whether you agree with that or not. But also, it feels like the top AI researchers are getting poached into Meta or going into OpenAI.

Do we think that the next transformer, the next major research breakthrough, comes from a foundation lab or a big tech company? Or is there a role for academia to step up and do some longer-timeline, unbounded research and go explore, even without an economic model in mind?

I do think you cannot build another OpenAI by just building an LLM.

The LLM was the one thing that worked out for OpenAI after they spent hundreds of millions on robotic hands and on Dota-style computer games and reinforcement learning and all these other projects, and one of them actually worked. So if you want to replicate that kind of success and do research again, which I think now is the time for,

it does make sense to have that kind of entity, and I think it can be done. And so I do think academia has a role to play in that. Thanks to open-source models, academia can be relevant again because they have access to them.

They just couldn't have afforded it before top open-source models were available. And I do think a lot of folks are chasing the latest employees of the top labs like OpenAI, Anthropic, and so on.

But you can also go a step further and look at who actually trained those people, how they learned to do research. And then you get to folks like Chris Manning, who is one of my PhD advisers too, and who just recently joined AIX, our venture fund, in a much larger capacity as a GP.

And so those are the kinds of folks that I think we'll need to rely on more again. And many of those are also moving out of academia into labs that push frontier research forward. I have to ask because it's so current: what's your thesis on what happened yesterday with Grok?

How do you have that big of a general, I don't know, alignment oopsie? An oopsy-daisy in prod? You know, I think when you ship very fast, these things are bound to happen, right? People can push them, in conversations, into certain directions.

You sample from the same model multiple times, you get different answers.

And I think if you try to be sort of a free-speech maximalist, which on many levels makes sense, it turns out there's a lot of funky speech out there, and with no guardrails whatsoever it will go into those very dark places. Yeah, makes sense. What are you expecting?

Since it's in six-ish hours, I believe, and hopefully they're still announcing and launching Grok 4 tonight, what are you expecting out of Grok 4? Maybe benchmarks aren't even the right way of thinking about it, but in terms of progress? My hunch is it'll be more of everything, but nothing like a wow, binary, novel thing, right?

It'll be more multimodal, with better understanding of images and videos and maybe sound. It'll have larger memory. It'll have slightly better reasoning. It'll probably, I mean we won't know for most of them, have more parameters and so on.

But maybe not completely novel research that yields a capability no one else has. Sure, yeah, that makes sense. Last question and we'll let you go: what are you finding most exciting on the investing side?

For a $500 million fund, it feels like the foundation model labs, the big training runs, the multi-billion-dollar rounds, that ship has kind of sailed. So maybe the time is for application-layer investing, but what are you seeing? What are you excited about?

Yeah, we've actually been very fortunate at AIX Ventures to have avoided some of these massive rounds. The way I describe it is that not every company that raises a ton of money at a very early stage is bound to fail.

At the same time, you basically combine seed-stage risk with a late-stage return in many cases, and that expected value just doesn't work out very well. And so there are a handful of foundational companies, like Hugging Face, that we invested in at the seed round, and Windsurf. Congratulations.

Those are great companies. Yeah. These are all companies we invested in at the seed round. We invested in Perplexity and Flow and Whisper and Ambience, a bunch of really amazing companies.

And so I think there are only one or two dozen foundation model companies that the world needs, and then there are thousands of application companies. So the short answer is yes, you're right. I think you will see a lot more new applications.

I also believe that all the stars are aligning for biology uh and hence medicine and hence health to have a major moment thanks to AI.

Now, software is faster than hardware, and hardware is faster than wetware, people and biology, right? So it takes longer, the cycles are longer. But you have enough data, you have the right compute, and we can eventually simulate more and more of biology and turn it into not just memorizing what nature has kind of hackily evolved towards, but into an engineering discipline.

Think about how we can actually change a specific antibody, target a specific gene, change our epigenetics, improve aging, and cure cancer. All of these things I think are within our grasp and reach over the next few decades, and I think that will be a massive group of applications. Amazing.

Well, thank you so much for stopping by. This was a really great conversation. We'd love to have you back. I'd love to come back again soon.