Integral raises $18M Series A to sanitize proprietary data so AI builders can access sensitive datasets
Jul 1, 2026 with Shubh Sinha
Key Points
- Integral raises $18M Series A to strip identifying information from proprietary datasets like medical records and financial transactions, letting AI builders access behavioral signals without privacy or regulatory exposure.
- The company targets mid-market hospitals with rare-disease datasets and fitness apps seeking new revenue streams by licensing anonymized user behavioral data.
- Integral's edge over commodity de-identification tools stems from healthcare-grade privacy engineering developed under the strictest compliance requirements.
Summary
Integral raises $18M Series A
Integral, a New York-based data privacy engineering company, has closed an $18 million Series A. Co-founder and CEO Shubh Sinha describes the business as sitting between data holders who want privacy and revenue, and AI builders who want signal without exposure to sensitive raw information.
The core problem Integral is solving: proprietary real-world datasets — medical records, financial transactions, fitness app logs — contain behavioral patterns that AI builders want, but the data holders can't hand them over without violating privacy regulations or contractual obligations. Integral's pitch is that it can strip out the sensitive identifiers while preserving the underlying signal, making the data both usable and compliant.
“What Integral does is we sanitize proprietary real world data sets such that AI builders can get very bespoke, very sensitive data sets. Data holders can also make sure that privacy and compliance is adhered to. This looks like medical records, financial transactions — a lot of this contains the real world human behavior patterns that people like you and me have.”
Who holds the data
Sinha points to mid-market hospitals as one target segment. Hospitals facing compressed revenues may be sitting on highly specific datasets — centered on a rare disease or a specialized procedure — that aren't available anywhere else. Fitness apps are another example: an app monetizing only paid subscribers could open up an entirely new revenue stream by licensing the behavioral data its free users generate, while keeping the app free.
The feedback loop Sinha is betting on is that the companies monetizing their data today will eventually become buyers of AI products trained partly on that data.
Where Integral fits in the stack
Basic sanitization — regex patterns that flag and mask phone numbers, for instance — is already a commodity. Integral's differentiation, according to Sinha, is deeper privacy engineering developed through healthcare, where the compliance bar is highest. Healthcare data is almost entirely proprietary, which forced Integral to build more sophisticated de-identification than a rules-based layer can handle.
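To make the "commodity" layer concrete, here is a minimal sketch of regex-based phone-number masking. It is an illustration only, not Integral's method: the pattern, the `mask_phones` name, and the `[PHONE]` placeholder are all assumptions, and the pattern covers only common US-style formats.

```python
import re

# Hypothetical pattern for common US-style phone numbers (assumption:
# optional +1 prefix, 3-3-4 digit groups, separated by space, dot, or dash).
# The lookarounds keep it from firing inside longer digit runs such as IDs.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
)


def mask_phones(text: str, placeholder: str = "[PHONE]") -> str:
    """Replace every substring matching PHONE_RE with a placeholder."""
    return PHONE_RE.sub(placeholder, text)


if __name__ == "__main__":
    note = "Call (212) 555-0147 or 212.555.0199."
    print(mask_phones(note))

    # A rule like this only sees surface patterns. Free text that could
    # identify someone without any phone-shaped token passes through as-is.
    print(mask_phones("Seen at the only dialysis clinic in town."))
```

The second call shows the limit of a rules-based layer: text with no phone-shaped token is returned unchanged, even when its content could narrow down who a record belongs to. That gap is the kind of problem the deeper de-identification work described above is meant to address.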
On the question of whether frontier models pose a re-exposure risk when used for sanitization, Sinha acknowledges the concern but doesn't address it directly. The implication is that Integral's value proposition includes keeping that risk off the table for both sides — the lab doesn't want to see raw PII any more than the data holder wants to share it.