AIM Media House

MIT Built an AI That Hunts for the Next CRISPR. It Found Over 600 Candidates.

The AI found a structural pattern that human researchers were not looking for because they did not know it existed.

The tool that gave scientists the ability to edit the human genome with precision came from bacteria. CRISPR was discovered in bacterial immune systems and repurposed into a gene therapy revolution that took decades to develop.

MIT scientists have now used AI to ask what else is hiding in those same systems, and the answer is considerably more than anyone expected.

DefensePredictor, an AI built by MIT researchers and reported by Singularity Hub in April 2026, scans bacterial genomes for undiscovered immune defense proteins in five minutes. The same search using conventional methods takes weeks or months.

The speed difference is not incremental. It is the difference between a research program that can systematically explore the bacterial world and one that can only sample it.

DefensePredictor is built on ESM-2, a protein language model that learns the structure and function of proteins from sequence data alone. It uses the same architectural approach that underlies large language models, applied to the molecular alphabet of life rather than human text.

The model was trained on roughly 15,000 known antiphage proteins and 186,000 proteins unrelated to defense, drawn from approximately 17,000 microbial genomes.
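Those figures describe a lopsided training set: defense proteins make up only about 7% of the examples. The report does not say how the team handled that imbalance. As a rough illustration only, the sketch below computes the positive fraction and the inverse-frequency class weights that are one common way to compensate. The function name and the weighting scheme are assumptions, not details of DefensePredictor.

```python
def class_weights(counts):
    """Inverse-frequency weights: total / (number_of_classes * class_count).

    A common heuristic for imbalanced data; assumed here for illustration,
    not taken from the DefensePredictor work.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Approximate counts quoted in the article.
counts = {"defense": 15_000, "non_defense": 186_000}

positive_fraction = counts["defense"] / sum(counts.values())
weights = class_weights(counts)

print(f"defense share of training set: {positive_fraction:.1%}")
print(f"class weights: {weights}")
```

Under this scheme, each defense example would count roughly 12 times as much as a non-defense example during training.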

From that training data, DefensePredictor learned general characteristics that make a protein likely to be part of a bacterial immune system. It then applied that knowledge at scale across genomes it had never seen.
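The general recipe (turn each protein sequence into a numeric vector, then learn what separates defense from non-defense) can be sketched in a few lines. This is a toy under loud assumptions: amino-acid frequencies stand in for ESM-2 embeddings, a nearest-centroid rule stands in for whatever classifier the team actually trained, and the sequences are invented for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed(seq):
    """Toy stand-in for a protein language model embedding:
    the frequency of each of the 20 standard amino acids."""
    counts = Counter(seq)
    return [counts.get(a, 0) / len(seq) for a in AMINO_ACIDS]


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def train(labelled):
    """Compute one centroid per label from (sequence, label) pairs."""
    by_label = {}
    for seq, label in labelled:
        by_label.setdefault(label, []).append(embed(seq))
    return {label: centroid(vs) for label, vs in by_label.items()}


def predict(model, seq):
    """Return the label whose centroid is closest to the sequence."""
    v = embed(seq)
    return min(model, key=lambda label: distance(model[label], v))


# Hypothetical sequences, invented purely for illustration.
training = [
    ("KKRKDEKKRE", "defense"),
    ("RKKDEERKKD", "defense"),
    ("AVLIGAVLIG", "non_defense"),
    ("GALVIAGLVI", "non_defense"),
]
model = train(training)
prediction_a = predict(model, "KRDEKKRDEK")
prediction_b = predict(model, "LLAVGGIVAL")
print(prediction_a, prediction_b)
```

The point of the sketch is the shape of the pipeline: once trained, the model needs nothing but a sequence to make a call, which is what lets it run across genomes it has never seen.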

Across 69 strains of E. coli, the AI identified over 600 proteins not previously linked to immune defense. More than 100 were unlike anything previously discovered.

Critically, nearly half were not clustered together in the genome the way scientists expected; they were scattered across it. That scattering is precisely why conventional detection methods had missed them.
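To see why scattering defeats a location-based search, consider a simplified version of the neighborhood heuristic: flag a gene as a defense candidate only if it sits within a few genes of a known defense gene. The sketch below is an assumption-laden illustration of that idea, not the exact conventional method; the window size and gene positions are hypothetical.

```python
def near_known_defense(gene_index, known_defense, window=5):
    """True if the gene lies within `window` gene positions of any
    known defense gene (a simplified neighborhood heuristic)."""
    return any(abs(gene_index - k) <= window for k in known_defense)


# Hypothetical positions, counted in genes along one chromosome.
known = {100, 102, 105}
candidates = {"clustered": 103, "scattered": 2500}

flagged = {name: near_known_defense(i, known) for name, i in candidates.items()}
print(flagged)
```

The clustered gene is flagged and the scattered one is not, however strongly its sequence resembles a defense protein. A sequence-based model has no such blind spot because it never looks at where the gene sits.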

The AI found a structural pattern that human researchers were not looking for because they did not know it existed. To validate the results, the team engineered a vulnerable E. coli strain to express candidate proteins and exposed it to two dozen aggressive phages.

Nearly 45% of the tested candidates offered protection against at least one phage. The search was then expanded to 1,000 additional microorganisms, surfacing thousands more potential defense proteins with no known equivalent.

The same week, a Pasteur Institute team using a parallel AI approach predicted nearly 2.4 million antiphage proteins across 32,000 bacterial genomes, the vast majority previously unknown.

Both teams published independently and reached the same conclusion: the diversity of bacterial defense systems is far larger than the scientific community had mapped.

The MIT team described E. coli alone as harbouring "a much broader landscape of antiphage defense than previously realized, expanding the likely number of systems by multiple orders of magnitude."

Every new bacterial genome sequenced is potentially hiding the next CRISPR. With AI doing the searching, the wait to find it is measured in minutes, not decades.