Noetik trains transformers to match cancer patients to treatments
Noetik, an AI biotech startup, is building transformer-based models to identify which patients will respond to which cancer treatments — arguing that the 95% clinical trial failure rate is a patient-selection problem, not a drug-discovery problem — and recently closed a $50M deal with GSK.
Developers and practitioners building AI for life sciences should note that Noetik's platform-licensing deal with GSK signals that pharma companies are beginning to pay for biotech AI as software infrastructure, not just as a path to drug co-development — validating a pure-tools business model in the space.
Noetik was founded on a contrarian thesis: the reason 95% of cancer drugs fail in clinical trials is not that the drugs themselves are ineffective, but that they are tested on the wrong patients. Co-founder and CEO Ron Alfa, a physician-scientist by training, and VP of AI Dan Bear, who has a background in neuroscience, computational neuroscience, and self-supervised learning, appeared on the Latent Space podcast to explain how the company is building AI models to solve this patient-matching problem. Their core argument is that some patients do respond to drugs that fail in broad trials — and that identifying the biological subtypes of patients who respond is the key to dramatically improving clinical success rates using treatments that already exist.
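The dilution argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the responder fraction and response rates are hypothetical numbers chosen for the example, not figures from Noetik or the podcast, and `overall_response_rate` is a helper name introduced here.

```python
def overall_response_rate(responder_fraction, rate_in_responders, rate_in_others):
    """Expected response rate for a trial population that mixes two subtypes."""
    return (responder_fraction * rate_in_responders
            + (1 - responder_fraction) * rate_in_others)

# Hypothetical values, for illustration only.
responder_fraction = 0.10   # share of patients in the responsive subtype
rate_in_responders = 0.60   # response rate inside that subtype
rate_in_others = 0.05       # response rate in everyone else

# Unselected trial: the responsive subtype is diluted by the other patients.
broad = overall_response_rate(responder_fraction, rate_in_responders, rate_in_others)

# Selected trial: only patients from the responsive subtype are enrolled.
enriched = overall_response_rate(1.0, rate_in_responders, rate_in_others)

print(f"Unselected trial response rate: {broad:.1%}")
print(f"Subtype-selected trial response rate: {enriched:.1%}")
```

Under these assumed numbers the same drug looks weak in the unselected population and strong in the selected one, which is the effect the patient-matching thesis relies on.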
To build these models, Noetik generated its own high-quality, purpose-built datasets from scratch, sourcing human tumor samples and building custom processing pipelines for roughly 18 months before it could begin training models. Its core data modalities are H&E pathology slides, spatial transcriptomics, and genomic alterations. The team employs self-supervised learning and deliberately avoids electronic health records to reduce bias. It also uses a tool called PerturbMap and in-vivo mouse models to validate predictions made by its human-data AI models.
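As a rough illustration of what self-supervised learning over several modalities can look like, the sketch below builds a generic masked-reconstruction training example with NumPy. It is not Noetik's architecture or objective, which the source does not describe. The feature arrays are random stand-ins, the shapes and the treatment of every modality as per-patch features are simplifying assumptions, and `make_masked_example` and `masked_mse` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for per-patch features from three modalities (assumed shapes).
n_patches = 8
features = {
    "he_pathology": rng.normal(size=(n_patches, 4)),
    "spatial_transcriptomics": rng.normal(size=(n_patches, 6)),
    "genomic_alterations": rng.normal(size=(n_patches, 3)),
}

def make_masked_example(features, mask_fraction, rng):
    """Concatenate modalities and hide a random subset of entries.

    Returns (inputs, targets, mask); mask is True where a value was hidden.
    """
    targets = np.concatenate(list(features.values()), axis=1)
    mask = rng.random(targets.shape) < mask_fraction
    inputs = np.where(mask, 0.0, targets)  # hidden entries are replaced by zero
    return inputs, targets, mask

def masked_mse(predictions, targets, mask):
    """Mean squared reconstruction error, computed only on hidden entries."""
    return float(np.mean((predictions[mask] - targets[mask]) ** 2))

inputs, targets, mask = make_masked_example(features, mask_fraction=0.3, rng=rng)

# Trivial baseline: "predict" the zero-filled inputs themselves.
baseline_loss = masked_mse(inputs, targets, mask)
print(f"inputs shape: {inputs.shape}, baseline masked MSE: {baseline_loss:.3f}")
```

The training signal comes from the data itself (reconstructing the hidden values) rather than from outcome labels, which is why this family of objectives does not need annotations from clinical records.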
GSK recently signed a $50M deal for Noetik's technology, which also includes a long-term licensing agreement for its models on undisclosed terms. The deal is structured as a software platform license, not a drug co-development partnership, which distinguishes it from most big AI plays in biotech, where tool companies typically end up becoming drug companies. The Latent Space hosts note that this signals a growing appetite among large pharma companies for biotech AI tools as a software category, alongside other developments in the space such as Boltz and Isomorphic.
Key facts
- 95% of cancer drugs fail in clinical trials, which Noetik attributes to poor patient selection rather than poor drug development.
- Noetik uses three core data modalities: pathology (H&E), spatial transcriptomics, and genomic alterations.
- The company built its own data pipelines and tumor-processing infrastructure from scratch, spending roughly 18 months generating data before training models.
- Noetik employs self-supervised learning and avoids using electronic health records to reduce bias.
- A tool called PerturbMap and in-vivo mouse models are used to validate the company's AI predictions.