Stem Player founder Alex Klein on reverse-engineering stems from any song, interactive streaming deals, and why generative music sounds like Muzak
Oct 17, 2025 · Full transcript · This transcript is auto-generated and may contain errors.
Featuring Alex Klein
[Music] How you doing? Welcome to the show. Yes. Stem FM mixes. Stem FM mixes. All right, let's turn it down just a tad. For the first time. Amazing to have you. Introduce yourself. My name's Alex Klein.
Yeah, I'm the inventor and founder of Stem Player, which was the original AI music offering in the consumer space. It did a ton of revenue, made a lot of noise, and we started a small research lab around discriminative AI for AI music about three and a half years ago. Okay. And we've been quietly, quietly...
So, this all started in 2021. I've actually been building with my team of friends and colleagues in London since 2013. Okay. We launched computers that kids could build and code themselves, like Lego. We took those to retail. We did over $100 million of revenue on that business. No way.
Yeah, ran into someone who became my closest creative collaborator, one of my closest friends, and one of history's foremost antisemites, Kanye West. Yeah.
Truly the number one guy, but a good-vibes guy ultimately, who I will always have love for. Worked with him for a very long time, launched the Stem Player, and then... He's no longer a part of it? He's no longer a part. He kind of came on to help us distribute the product. We were working at the time with Ghostface Killah. Sure. We've since had, like, Quavo, Justin Bieber, all around it. Kanye was of course an amazing partner for a time, but for the past four years we've been Kanye-less, and glad, honestly.
So, I mean, I remember seeing the value with the Kanye album, because Kanye's famous for finding incredible samples and then reconstituting those into iconic songs.
And so the value to an artist who's working with you is that they can still exert the curatorial artistic vision, but then allow the user, the consumer, to kind of interact with their album in a more programmatic way and reconstitute the samples that have been selected, that all have the feel of that particular artist, but it's more interactive.
That's a really good way of putting it. It literally is their music. It's just their music, distributed. And for the first time since the original interactive streaming deals were done,
we have in place now, and signed, interactive streaming deals for full catalog accessibility on Stem FM from the major labels. Wild. Which means, you know, if... Wait, break down the business in the most simple way. You guys make hardware devices. You have this speaker on the table.
You have these headphones that are wild and very cool. I got to try them on before this. But then what is happening at the actual software layer, and then at the licensing layer? Absolutely. So, how nerdy should we get? As nerdy as you want. Extremely nerdy. Okay.
So the primitive of an LLM is a token, right? A part of a word, right? And the very first technology that enabled this incredible boom of, like, multiple multi-billion dollar outcomes in three to five years in the text domain, the speech domain, was that next token prediction.
We, at the very outset, partly because of that collaboration, went so deep into what's called source separation, which basically is a form of discriminative AI where we are using masks and unmasks to reveal the vocals, bass, drums, and instrumental: the stems of a song.
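Editor's sketch: the mask idea described above can be illustrated in a few lines of Python. This is not Stem's model, which the conversation does not describe. It assumes some estimator (in practice usually a neural network) has already produced a rough magnitude spectrogram per stem, and it shows only the masking step: each stem gets a soft ratio mask, and applying the masks to the mixture splits its energy between stems. The function names `soft_masks` and `apply_masks` and the toy numbers are hypothetical.

```python
def soft_masks(stem_estimates, eps=1e-9):
    """Build one soft ratio mask per stem.

    stem_estimates: dict mapping stem name -> 2D list (freq bins x frames)
    of non-negative magnitude estimates. Hypothetical input; a real system
    would get these from a trained separation model.
    """
    names = list(stem_estimates)
    rows = len(stem_estimates[names[0]])
    cols = len(stem_estimates[names[0]][0])
    masks = {n: [[0.0] * cols for _ in range(rows)] for n in names}
    for r in range(rows):
        for c in range(cols):
            # eps avoids division by zero in silent bins
            total = sum(stem_estimates[n][r][c] for n in names) + eps
            for n in names:
                masks[n][r][c] = stem_estimates[n][r][c] / total
    return masks


def apply_masks(mixture, masks):
    """Multiply the mixture spectrogram by each mask, bin by bin."""
    return {
        name: [[m * x for m, x in zip(mask_row, mix_row)]
               for mask_row, mix_row in zip(mask, mixture)]
        for name, mask in masks.items()
    }


# Toy example: one frequency bin, two time frames.
estimates = {"vocals": [[3.0, 1.0]], "drums": [[1.0, 1.0]]}
mixture = [[8.0, 4.0]]
stems = apply_masks(mixture, soft_masks(estimates))
print(stems)
```

Because the masks in each bin sum to roughly one, the separated stems add back up to the original mixture, which is what lets a player mute or solo a stem without changing the rest of the song.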
Oh, so you can reverse engineer the stems from just a full MP3 that's already mixed out. Exactly. Allow me to demonstrate. So, like, we got some, like, kaani, Lil Yachty, and that's going to transition now.
So that's like an AI transition to some Lub, and I'll bring in [Music] so the part of Kendrick and Lube are in now. Right. So, and then the original primitive, the tokenizer. Yeah. Oh, you can go back to the pure vocal. And now we're going to get some Mac Miller drums. Pure Mac Miller drums.
So like the tokens of words, stems are the tokens of songs. And by doing a form of next token prediction, we can give you the songs that you love, but in a new way. Yeah. No, you could just go to the artists and say, "Hey, like, upload all of your stems to this." Correct. No. No.
We get a feed from the major labels or the artists exactly as they would release to Spotify, Apple Music, Amazon, Tidal. We ingest, and then you create this format, the Stem FM format.
The full song is transmitted from the server to the client, but in this new format, where on their end this really... Does the broader public have a general misconception that Stem makes cool-looking speakers? No, because we do. No, no, I know.
I know you make cool... I think it's like... but I'm saying it's a misconception. No, no, no. I'm saying it's a misconception because it's much more than that.
The thing that's interesting is that you're turning music into a game that feels like somebody can, you know... what you're doing here, I'm looking at it. It's not like you're doing... This is software. This is the most advanced music intelligence in the world.
The reason I think the general public hasn't used it yet, and may have used applications that made music that sounds bad, is because the ones that launched earlier and raised huge amounts of venture capital were based on a really antiquated technology paradigm for generating music.
It's better to generate music, as the genius of Kanye proved, as you put it, John, through recompositing samples, stems and existing elements than it is to diffuse a song frame by frame. That's why all that [ __ ] sounds like Muzak from Suno and Udio. Doesn't sound like music, you know?
So, this is a world of music that is giving a new value to music, which I think is the kind of key goal, because streaming is already growing so much, but we want to share the pie because more people are participating. Anyone can upload, mix their music with Mac Miller, mix their music with anyone legally, make money, participate.
How do you think about the continuum of hardware devices?
Like, on one side you have the Beats Pill, no interactivity, and then you have your product, and then I think one notch more customizable you have something like a Teenage Engineering, and then you might have a Lemur or some sort of control surface for Ableton, essentially, where... Exactly.
Yeah. Where it's more professional. Is this prosumer? Is this purely consumer? This is a consumer product at the end of the day. We started really by making it the world's best portable Bluetooth speaker. It's the loudest portable Bluetooth speaker made. Um, gosh, speaking mic here.
And then it's also way smarter. It's got compute. It's got storage. But, I mean, we were asking Sam Altman about this. Like, with Sora, it is a different app. Like on Instagram, probably less than 1% of my time on Instagram, most people's time on Instagram, is actually creating, uploading, filtering photos.
99% of the time is scrolling. On Sora, it feels like a lot of people are spending 20, 30, 40, 50% of their time creating versus consuming. And that's an interesting new paradigm. Maybe it's better, maybe it's worse. I don't know.
But like with this, how much time do you think is spent just, hey, just play the normal song versus I actually want to be interacting with the song. I want to be creating and and remixing. Like, how do you think it splits down? A great analogy. I I'll I'll turn back on the music.
This time it will be coming not from my phone, but actually from the device itself. So, um, and we'll just use it like background for now. Just Steely Dan. And then when the moment strikes us, we might create, we might customize. Yeah. But what's most interesting to me is taking something that everyone loves, right?
I mean, not everyone. You can find people out there that say, "I don't like music." It's like, okay, you should get an award for that. But I think there's something to tapping into this DJ culture.
Everybody wants to be a DJ, but then turning it into a form factor that anybody can use, almost like a simple drum, right? Yeah. Yeah. Like... Yeah. Exactly. So when you look at the... I saw some reporting on Suno's revenue growth recently.
Where is that revenue coming from from what from what you know? I'll say what I said to somebody the other day. I I think um you know I mentioned a lot of big artists who we work with and the reason we work with a lot of big artists is cuz like we care about music as well.
And some of the statements that were made by certain companies, if you look at the legal filings around the, I think, $5 billion damages or more that is being claimed against Suno at the moment by every major label and every major publisher together, some of those statements, you could see that they weren't true. And so that's all I'll say on the matter, which is basically that the statements that the record labels... no, that the companies make.
Oh, interesting. Yeah.
Like, they were basically trying to say that you couldn't prompt the thing with names of artists, and you couldn't prompt the thing with real melodies like Smoke on the Water. And if you look at the outputs and use stuff like our stuff, which detects chords, melodies, beats, you could see that it's copyrighted outputs.
Now will it get... I like cool new things as well, but that was my question. My question is, is the number true? Yeah. My question also was, is it prosumers? Is it creators using it? Like, where's the actual revenue? Who's going and spending money with? Yeah.
I mean, if I compare it to ChatGPT, I imagine that generative audio apps are not used like... very few people go to ChatGPT and say, like, actually just write me a full book. It's more like, get me some recommendations of books I might like at this point.
And I feel like in the musical context, you might ask one of these AI music generators to come up with a few different melodies that could inspire you to ultimately write what you wind up writing, or take me through a bunch of different chord progressions, or what does it look like if this is arpeggiated, or what does it look like if it's in a different key.
Like there's all these different ways that AI as a tool can just be used to kind of like add twists and explore the creative sound like if Kanye and Taylor got in the studio rather than fighting. What would it sound like if Drake and Kendrick got in the studio and made made it together?
You know, like, that to me is the thing that people can... Stem can do that. Yeah, absolutely. Do you want to prompt something right now? Like, I don't know what's... I mean, we got to do Metallica, of course. Yeah, I was going to say, uh, without Benioff, Metallica. Shout out Benioff as well. Yeah, shout out Marc Benioff.
That's my guy. He's massive. Yeah. Um what uh uh yeah, I mean what uh the there was OpenAI news just yesterday that What about uh how about Metallica and Pop Smoke? All right, there we go. Let's give it a try. We're gonna add Metallica.
What was the news from OpenAI about the... the Martin Luther King Foundation said, "Hey, stop generating Soras of Martin Luther King's I Have a Dream speech." No, it was just, I think, that but a bunch of other, like, kind of scenes, or... I did see that going viral on Sora. [Music] Let's go.
We got to send Mr. Benioff... At the next Agentforce, we got to get Pop Smoke and Metallica on stage together. This is the way. It just sounds sick. It's like people are already doing this in remix. It's like the top 15 songs are all remix. 2007, 2008, the whole mashup culture was really huge.
There were a couple DJs whose whole thing was mashups, and they became extremely popular. This is great, cuz it's kind of a bop, but it's the most unlikely combination. Well, you prompted it, dog. This is your prompt. Like, yeah.
It's the Harry Potter Balenciaga of music, potentially. But yeah, I mean... just getting lost in the music over here. Yeah, it's funny. But John, you put it really well. What is this creation-versus-consumption percentage in the future like? Yeah. What are your design inspirations?
I feel like those headphones... we've talked to Carl from Nothing. When Carl Pei from Nothing came on, he had a pair of headphones that were also very counterpositioned against what Apple's doing.
Uh I think it's very dangerous if you're starting a consumer tech company or building a consumer product to try and uh steal from Apple's design language because you're just going to get steamrolled. Uh you've gone a very different direction. He went a very different direction. You two are on opposite ends.
I feel like that looks like human flesh almost, whereas his looks like complete clanker-coded. But what are your inspirations? How do you land on this particular material? My chief inspiration as a designer is God. My chief inspiration is God.
But to put that in more tech-friendly terms: nature and science. Nature and science, mathematics, principles of mathematics. And then it feels almost kind of UAP too, like that was alien technology. Yeah, that's the... Well, the Apple campus looks kind of like a UFO in Cupertino.
That new huge like UFO circle, but it's it's it's very glass. It's titanium. And this is very like fleshy almost. Exactly. Yeah. A couple questions on my side. So, uh, being able to put this experience together legally seems monumental. That's right. Yes. Exactly.
So, the music industry is clearly very afraid of AI. They're sort of reluctantly engaging with it, and excited about it in certain instances. But what in particular about this instantiation of AI music were they excited about?
Was it that this is able to basically tap into DJ culture stuff that's already happening and give people this? I think the reality is it's cuz it's hot like the music sounds better so the music people like it more.
You know, we've all had fun doing the older generation of music AI, which is like text-to-song, but because it makes us laugh, and there's a business there, some producers. I was having lunch with my friend Smino in the, uh, Chaa Chicken. He's an amazing artist. He's got 3.5 monthly listeners.
Fantastic storyteller, and... 3.5 humans? 3.5 million monthly. Million. Okay. Half human, you know, guy's got no one. Three adults and one child. Let's say I'm not saying that's equal. You're so full human.
But he was saying, like, they bring the old generation AI into the studio sometimes, like, people bring it in to pitch to him as the musician. And he's like, oh, okay, oh wow, the tech is cool. But I was like, did you use it? And he was like, no, artists aren't really using it.
And also this the business model here uh one seems a lot better for you than just selling like a smart hardware device. [ __ ] yeah.
Because you can sell a a subscription that's actually aligned and that it's a novel product and that you have a software AI layer on top of what probably is like pretty standard streaming platform. Is that right?
What's fascinating, that was what I thought when we started really building this as a streaming service, an all catalog streaming service 3 years ago.
But to do what we do, those to to hear those parts of the audio separate, to have them blend into each other seamlessly like a perfect DJ, to have the ensemble of models that detect these musical features, that rich metadata be uh usable almost like uh sheet music for the the front-end experience running on all these platforms.
We basically had to change what music streaming is.
We built a new audio codec on Stem FM, which uses a new concept in music processing, well, audio processing, called music-aware processing, where basically the core logic of how the file is chunked and dechunked, as well as unwoven, time-multiplexed out of its interleaved format,
that logic is aware at all times of the beat and key of the song. So if you think about everything on Spotify you listen to, all the tech going into it isn't listening to the song. It just sees it all as audio: audio frames, frequency bands.
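Editor's sketch: the codec itself is not specified in the conversation, so the following only illustrates the contrast being drawn between fixed-size audio frames and chunks whose boundaries follow the beat. It assumes a constant tempo and a known position for the first beat, both of which a real system would have to detect. The function `beat_aligned_chunks` and its parameters are hypothetical.

```python
def beat_aligned_chunks(total_samples, sample_rate, bpm,
                        beats_per_chunk=4, first_beat_sample=0):
    """Return (start, end) sample ranges whose boundaries land on beats.

    Assumes a constant tempo (bpm) and a known first beat position;
    both are simplifications made for this illustration.
    """
    samples_per_beat = sample_rate * 60.0 / bpm
    chunk_len = beats_per_chunk * samples_per_beat
    chunks = []
    # Any audio before the first beat becomes its own pickup chunk.
    if first_beat_sample > 0:
        chunks.append((0, min(first_beat_sample, total_samples)))
    k = 0
    while True:
        start = first_beat_sample + round(k * chunk_len)
        if start >= total_samples:
            break
        end = min(first_beat_sample + round((k + 1) * chunk_len),
                  total_samples)
        chunks.append((start, end))
        k += 1
    return chunks


# Five seconds of 48 kHz audio at 120 BPM, chunked one bar (4 beats) at a time.
for start, end in beat_aligned_chunks(240_000, 48_000, 120):
    print(start, end)
```

With boundaries on bar lines, a player could in principle swap or crossfade material at chunk edges without cutting across a beat, which is the kind of behavior a fixed-size frame layout does not give for free.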
So we've effectively encoded the intelligence of a DJ in the streaming codec, which is, like, that's been the hardest thing we've ever worked on. And shout out to the team, like, big [ __ ] shout out to the Stem FM team. Like, holy [ __ ]. Is licensing, like, purely commodity at this point?
Like Apple Music, Spotify, Tidal? Do they all pay the same royalties to musicians, or are musicians thinking, if I partner with you I will get twice as much revenue per stream or something like that? Effectively, there were two very new things we did.
The first is that music aware processing that more interactive music stem format.
The second was our licensing and revenue model, where we've come up with an idea that we think can really level the playing field, create an even bigger music industry, and get artists the fair understanding of their payouts, and the fair payouts, in a way that can stick and break the tie and break the bad vibes in the industry around business right now.
And it's called the time-based artist compensation system. TBACS for short, or "time backs" if you like. Get your time back with TBACS. So it's very simple.
All jokes aside, you're a subscriber to Stem FM, you pay, say, $20 a month. The listening time you spend with a particular artist or their label, as a percentage of your overall listening time, will be the percentage of your subscription fee that goes to them. There's no pay per stream, which is really a hangup of the CD age. Sure.
It's really like detritus from when people were selling these physical mechanical. So, you're sort of creating like a creator revenue share pool that then is divided up based on listening time.
That's more aligned with with artists who might have three and a half million of listeners that that listen to them for like 30% of all their listening time, but they're getting a tiny fraction because Yeah.
And so, if I'm listening to Metallica and Pop Smoke at the same time for my entire month straight because I'm just obsessed with that mashup, half my revenue share goes to Metallica, half my share goes to Pop Smoke, in theory. Yeah.
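Editor's sketch: the time-based split described above is simple arithmetic, shown here in Python. Two things are assumptions rather than statements from the conversation: that artists heard simultaneously in a mashup share that listening time equally (the "half and half, in theory" example), and the `pool_fraction` parameter, since the share of the subscription fee that actually reaches the payout pool is not given. The function name and the session format are hypothetical.

```python
from collections import defaultdict


def time_based_payouts(sessions, subscription_fee, pool_fraction=1.0):
    """Split one subscriber's fee across artists by listening time.

    sessions: iterable of (artists, seconds), where artists is a list of
    the artists heard at the same time during that session.
    pool_fraction: hypothetical share of the fee that is paid out.
    """
    credited = defaultdict(float)
    for artists, seconds in sessions:
        if not artists or seconds <= 0:
            continue
        # Assumption: simultaneous artists split the session time equally.
        share = seconds / len(artists)
        for artist in artists:
            credited[artist] += share
    total = sum(credited.values())
    if total == 0:
        return {}
    pool = subscription_fee * pool_fraction
    return {artist: pool * t / total for artist, t in credited.items()}


# A month spent entirely on one two-artist mashup, on a $20 subscription.
month = [(["Metallica", "Pop Smoke"], 30 * 3600)]
print(time_based_payouts(month, 20.0))
```

Unlike a per-stream rate, the payout here depends only on how a subscriber's own time is divided, so an artist who takes up 30% of someone's listening gets 30% of that person's pool regardless of how many individual plays that was.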
And and the the the business uh, as you were referring to earlier, the the unit economics of streaming are tight. So basically you have to make sure that you provide like the major labels with a deal that gets them involved but we also open up to verified lens source.
They can come on our platform and upload and and participate. What happens if somebody's generating AI next token music and they try to bring it on stem? Yeah, we were thinking about converting our entire three-hour podcast into a song, three-hour musical, technology and business musical.
A musical, and then uploading that as a three-hour song. We'll do it on the platform. I'll send it to you guys. TBPN the musical. The daily musical. Daily musical. I feel like it would be great. It's like an odd couple story, like, word. Yeah. Yeah. Let's turn... Let's turn our benny off.
How much AI music is getting uploaded to streaming platforms broadly every day? I don't see AI generated music recommended in my feed. I don't know about you.
Like I've only seen screenshots of like this AI artist went viral, but I haven't actually seen it surfaced in my like you know recommendations or anything like that. A few a few a few people I know have like songs have like gone viral and the headline was like it's an AI song.
But the truth is, as a human producer, and that's what Smino was saying as well: a great musician can make music out of anything, you know, including these more Muzak-y things. And of course, it's like any other tool. I mean, it's the same thing with EDM.
Like, for a long time, people were like, you know, oh, Ableton is not like, you know, the same as having a whole orchestra or band or real samples. We thought about getting a live band instead of the soundboard. Soundboard is, uh, you know, technology 2.0.
I actually just realized I could even try prompting this TBPN um musical right now. Yeah, cuz you can just prompt stuff with YouTube videos. Oh, interesting. So, we pull that in. Yeah, it has to be a short one though, like a clip. We we we we have shorts on there. We we have shorts on YouTube for sure. Yeah.
But how do all the legal debacles between the generated music platforms and the labels actually get resolved?
Well, I think what really happens in the end is that, you know, the labels and the artists, ultimately they're providing the product, and the technology platforms are the licensors and sub-aggregators.
And so I think it ideally gets resolved with, like, a deal where the power of advanced technology and the beauty and purity of music, which itself is a technology, can coexist. And it feels like with YouTube, initially it was like, if you uploaded copyrighted content, they just take it down.
Then it was like, "Hey, if you use a Metallica song in this video, we're going to send the money that you made from that video to Metallica," which was fine. But then what happens when you use 5 seconds of a Metallica song and then 5 seconds of a Justin Bieber song?
You need to parcel up and split out all the revenue used by our... Yeah, by our video today. For sure. For sure. Well, that was the amazing thing. And is it fair use? It needs to determine fair use. But I think LLMs can do a lot of that already.
I don't want to be, like, selling too hard, but basically, as we got into the research here with Queen Mary University, we've been in a deep research partnership with their Centre for Digital Music for three years now.
You know, submitted papers, um accepted papers, literally changing the paradigm of beat structure and segment detection in music information retrieval by using a more structural approach that can actually detect music better than the previous generation approach. So, these are early experiments for us.
And how many of these do you want to sell over the next year? Cuz I imagine this is like the perfect product to go viral on TikTok. And suddenly... I mean, we sold out and did about $20 million of revenue on the Stem. That's a lot. Oh, we need the numbers. Yeah.
And we still like we haven't launched this in like 2 years, a new stem. So, but and we really put the technology into this one. Like this is where we went hard on the technology for like a long time. The first one was like almost the art house film of the tech like oh look you can decompose the stems.
This one is like the blockbuster. So how many do we hope to sell? God willing, this will be... God willing, you know, my voice to God. We won't be getting these JBLs or whatever. Get this. It's a smarter speaker. It sounds better. It's 299. You know, let's go.
Like, uh, Jack Dorsey was hit that gong again. What the [ __ ] Thanks. No, I just think I I I uh it's rare that I see a new consumer electronic device outside of like a new MacBook Pro, and I think I really want that. Last question for me. Thank you so much. I'm excited. I'm excited to get it.
It feels like something that's like, I can imagine my Saturday mornings with the family. My son's like, I want dinosaur music and I want dragon music. And I could be like, well, I want Imagine Dragons. I want Pop Smoke. So we're getting Pop Smoke dinosaur music.
First thousand units, founders edition, available now on stemplayer.com. Sorry, uh, yeah, last question for me. Jack Dorsey was taking a victory lap about Tidal not having ads. Okay. What's your take on ads? Are you going to put ads in there?
We have no plans to put in ads, and we've actually, in our signed deals with the majors, agreed not to do ads for the first two terms, and it was never close to my heart. So, and I think, you know, the goal is to increase the value of the music. So, look at this. I just bought it.
Took me two seconds. Thank you so much. Yeah. Kudos. I'm excited to get one. I really appreciate that. And if you know, no, it's just like it's it's um it's rare to see a consumer hardware device that feels like it is genuinely a novel experience. Yep.
It's very rare, but not so novel that it's so different that I'm like, ah, this like like gathering around and enjoying music with a group and this like DJing is like a Lindy thing. And then also have a sustainable business model, right? Where it's like you're selling this hardware device.
I'm sure you'll have some margin in there, but hopefully I'm entertained by it for a long time and, you know, my LTV is infinity. So, be great. Well, thank you so much. Thank you so much for coming on. This was super great hanging. We have Hemant from General Catalyst coming on in just a few minutes.
First, let me tell you about Profound. Get your brand mentioned in ChatGPT. Reach millions of consumers who are using AI to discover new products and brands. Uh, Noah Smith says solar power is good. Batteries are good. Air conditioning is good. AI is good. Nuclear power is good.
Stop thinking like a medieval peasant, people. He's putting on a clinic of techno optimism. He says, "I will say though that smartphone enabled social media is not good." He posts on X. I am a bit of a, what's the word? Eliezer, in the reply: Solar power is good. Batteries are good. AC is good.
Nuclear power is good. Gene therapy is good. Building a new alien species of superhuman intelligence with very little ability to understand or direct it is not like other technologies and will kill you and your kids. Rough. Well, you know what won't? Uh, linear.
Linear is a purpose-built tool for planning and building products. Meet the system for modern software development. Streamline issues, projects, and product road maps. Our next guest is Hemant from... he's the CEO of General Catalyst. He has