Interview

Hey Clicky founder Farza Majeed built a voice-controlled AI desktop agent in 8 weeks — now using Claude 4 by default

Jun 10, 2026 with Farza Majeed

Key Points

  • Hey Clicky founder Farza Majeed built a voice-controlled desktop agent in eight weeks that routes requests across four AI models, using Claude 4 by default for pixel-understanding tasks.
  • The $20/month product caps interactions at 150 per user because agentic work costs up to 25 cents per action at the API level, forcing unit-economics discipline before fundraising.
  • Majeed treats content creation as a distribution channel for the core business and focuses on integration depth, with users already connecting 15+ services to automate workflows across applications.

Hey Clicky

Farza Majeed built Hey Clicky in eight weeks, starting from a personal frustration: hour-long YouTube tutorials for DaVinci Resolve weren't cutting it, so he built an AI that could watch his screen and teach him the software while he used it. A friend posted it, it went viral, and the emergent behavior from early users pushed the product in a direction he hadn't planned — people were talking to their screens to learn Blender, watch anime, and control applications well beyond video editing.

The product today is a voice-controlled desktop agent. Users talk to their computer; Hey Clicky handles the task. It only takes a screenshot when the user presses a button — detection of the active application runs passively, which lets the agent prompt users proactively ("Hey, you've been in Notion for ten minutes — can I help?"). Majeed describes the experience as having a new-grad intern always watching over your shoulder, spotting patterns and offering to take things off your hands.

It started as an AI teacher and I guess now it's an AI that where you can essentially talk to your computer and it does whatever you wanted to do... We use GPT real time upfront to give you like the really quick answer. But then if you want like a deeper kind of like a thought process over the image, we now use Fable five actually — by default.

Under the hood

The architecture routes across four models. GPT Realtime handles the first layer and does the routing, which Majeed says OpenAI's own team hadn't fully anticipated as a use case — it turns out GPT Realtime's strength in tool calling makes it an effective request router. Heavier pixel-understanding tasks route to Claude 4 (referred to in the transcript as "Fable five") by default. Agentic work routes to GPT-4.5 ("GBD 5.5" in the transcript). Majeed packaged the Rust binary from Codex directly into the app, so agent calls are effectively spawning a Codex subprocess. The router was custom-built; nothing off the shelf fit the use case.
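The routing idea can be sketched in a few lines. This is only an illustration of tool-call dispatch, not Hey Clicky's router: the tool names (`analyze_screen`, `run_agent`), the shape of the first-layer response, and the stub handlers are all assumptions made for the example. It assumes the fast first-layer model either answers directly or emits a tool call naming the heavier backend to hand off to.

```python
from typing import Callable

# Hypothetical tool names and stub handlers; none of these come from the interview.


def analyze_screen(args: dict) -> str:
    # Stand-in for the heavier pixel-understanding model.
    return f"vision:{args['question']}"


def run_agent(args: dict) -> str:
    # Stand-in for handing the task to an agent subprocess.
    return f"agent:{args['task']}"


HANDLERS: dict[str, Callable[[dict], str]] = {
    "analyze_screen": analyze_screen,
    "run_agent": run_agent,
}


def dispatch(model_output: dict) -> str:
    """Route one first-layer response: answer directly, or hand off via a tool call."""
    tool_call = model_output.get("tool_call")
    if tool_call is None:
        # No tool call means the fast model answered on its own.
        return f"quick:{model_output['text']}"
    handler = HANDLERS.get(tool_call["name"])
    if handler is None:
        raise ValueError(f"unknown tool: {tool_call['name']}")
    return handler(tool_call["arguments"])
```

In this sketch, adding a new backend means registering one more handler and exposing one more tool to the first-layer model; the dispatch function itself does not change.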

Economics

Pricing is $20/month, with agent interactions capped at 150 per user, the point past which Hey Clicky starts losing money. The cost structure is lopsided: calling Sonnet or Opus for a simple query is cheap, but agentic work is expensive. Majeed estimates that telling Codex to click "Add to Cart" on Amazon costs up to 25 cents per action at the API level. That math concentrates the risk in heavy agentic usage, which is why the interaction cap exists.
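The arithmetic behind the cap can be checked with a rough sketch. The price, the cap, and the 25-cent figure are from the interview; the assumption that API spend is the only variable cost, and the helper names, are illustrative. The interview does not say what mix of cheap and expensive calls a typical user makes.

```python
PRICE_PER_MONTH = 20.00       # subscription price, from the interview
INTERACTION_CAP = 150         # monthly cap, from the interview
AGENT_COST_PER_ACTION = 0.25  # Majeed's estimate for one agentic action


def monthly_margin(interactions: int, avg_cost: float) -> float:
    """Revenue minus API spend for one subscriber (assumes no other variable costs)."""
    return PRICE_PER_MONTH - interactions * avg_cost


def break_even_avg_cost(interactions: int = INTERACTION_CAP) -> float:
    """Average cost per interaction at which the subscription exactly covers API spend."""
    return PRICE_PER_MONTH / interactions


def break_even_interactions(avg_cost: float) -> int:
    """Whole interactions the subscription covers at a given average cost."""
    return int(PRICE_PER_MONTH // avg_cost)
```

Under these assumptions, a subscriber whose every interaction cost the full 25 cents would be covered for only 80 interactions, and 150 such interactions would cost $37.50 against $20 of revenue. For 150 to be the break-even point, the average interaction has to cost about 13 cents, which suggests the cap assumes a blend of cheap queries and expensive agentic actions.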

Hey Clicky is pre-funding — Majeed says he's in the process of raising. Running lean without a war chest has forced unit-economics discipline earlier than most funded startups would face it.

Distribution

Majeed has been making videos for fifteen-plus years. He treats content as a distribution channel, not a career — the media ability drives attention to the product, not the other way around. He's clear-eyed that video and music are poor businesses on their own; what changes is having a real business engine underneath.

On platform strategy, he's uninterested in becoming an OS-level controller competing with Apple. The near-term focus is integration depth: users are already connecting around 15 services, among them G Suite, Notion, and Dropbox, to Hey Clicky and running workflows across them. His pitch is that the computer increasingly asks "can I just do that for you?" rather than waiting to be told.

The fundraise is in progress. No amount or lead has been disclosed.

TBPN Digest delivers summaries of the latest fundraises, interviews and tech news from TBPN, every weekday.