What Is AlphaGenome?
AlphaGenome is a large-scale deep learning model from Google DeepMind that predicts how segments of DNA regulate gene expression — and how genetic variants may disrupt that regulation.
When the human genome was first sequenced, around 98% of it — everything that doesn't encode proteins — seemed inert. It isn't. This "non-coding" DNA contains enhancers, silencers, and other regulatory elements that control where, when, and how much each gene is expressed. Deciphering this regulatory code is one of biology's central unsolved problems, and one with major implications for understanding disease.
AlphaGenome is designed to crack that code. Given up to 1 million base pairs of raw DNA sequence, the model outputs thousands of functional genomic tracks — predictions of gene expression levels, chromatin accessibility, histone modifications, transcription factor binding patterns, splice site usage, and more — at single-nucleotide resolution.
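To make "raw DNA sequence" concrete: sequence-to-function models generally take DNA as a one-hot matrix with one row per base. The sketch below shows that encoding in plain Python. It is not AlphaGenome's code; the function name `one_hot_encode` and the A, C, G, T channel order are assumptions chosen for illustration.

```python
# Illustrative only: a common way to turn DNA text into numeric model input.
# The channel order (A, C, G, T) is an assumption, not a documented layout.
BASES = "ACGT"


def one_hot_encode(sequence: str) -> list[list[int]]:
    """Return one 4-element row per base; unknown bases such as 'N' become all zeros."""
    rows = []
    for base in sequence.upper():
        row = [0, 0, 0, 0]
        index = BASES.find(base)
        if index >= 0:
            row[index] = 1
        rows.append(row)
    return rows


encoded = one_hot_encode("ACGTN")
for base, row in zip("ACGTN", encoded):
    print(base, row)
```

A full-length input would simply be the same matrix with roughly one million rows.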
How It Works
Prior regulatory genomics models faced a fundamental trade-off: short-context models (like SpliceAI or BPNet) achieved base-pair resolution but missed distal regulatory elements beyond ~10 kb. Long-context models (like Enformer or Borzoi) captured broader sequence context — up to 500 kb — but at reduced output resolution (32–128 bp bins), blurring fine-scale features like splice sites and transcription factor footprints.
AlphaGenome removes this trade-off in a single architecture, combining long sequence context (up to 1 Mb) with single-base-pair output resolution.
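To see why coarse output bins blur fine-scale features, consider a toy signal with one sharp peak. The sketch below mean-pools a base-resolution signal into 128 bp bins. The pooling rule and the helper name `bin_signal` are illustrative assumptions, not a description of how any particular model aggregates its outputs.

```python
# Toy illustration: average a base-resolution signal into fixed-width bins.
def bin_signal(signal: list[float], bin_size: int) -> list[float]:
    """Mean-pool `signal` into consecutive bins of `bin_size` positions."""
    if len(signal) % bin_size != 0:
        raise ValueError("signal length must be a multiple of bin_size")
    return [
        sum(signal[start:start + bin_size]) / bin_size
        for start in range(0, len(signal), bin_size)
    ]


# A 256 bp window with a single sharp peak (for example a splice site) at position 100.
signal = [0.0] * 256
signal[100] = 1.0

binned = bin_signal(signal, bin_size=128)
print(binned)  # [0.0078125, 0.0]
```

After binning, the peak survives only as a faint average over the first 128 bp, and its exact position can no longer be recovered.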
The model produces two types of learned representations: one-dimensional embeddings at 1 bp and 128 bp resolutions, which represent the linear genome, and two-dimensional pairwise embeddings at 2,048 bp resolution, which represent interactions between genomic segments and underpin chromatin contact map prediction. These feed into 11 sets of task-specific output heads.
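The resolutions above imply the following embedding sizes for a full-length input. This is back-of-the-envelope arithmetic, assuming the "1 million base pair" input is exactly 2^20 = 1,048,576 bp (an assumption; the text only says "up to 1 million"). Channel dimensions are left out because the text does not state them.

```python
# Back-of-the-envelope sizes; SEQUENCE_LENGTH is an assumed exact input length.
SEQUENCE_LENGTH = 2 ** 20  # 1,048,576 bp


def num_positions(length_bp: int, resolution_bp: int) -> int:
    """Number of embedding positions when each one covers `resolution_bp` bases."""
    return length_bp // resolution_bp


positions_1bp = num_positions(SEQUENCE_LENGTH, 1)       # 1,048,576
positions_128bp = num_positions(SEQUENCE_LENGTH, 128)   # 8,192
side_2048bp = num_positions(SEQUENCE_LENGTH, 2048)      # 512
pairwise_cells = side_2048bp * side_2048bp              # 262,144

print(f"1D @ 1 bp:     {positions_1bp:,} positions")
print(f"1D @ 128 bp:   {positions_128bp:,} positions")
print(f"2D @ 2,048 bp: {side_2048bp} x {side_2048bp} = {pairwise_cells:,} pairs")
```

The pairwise grid stays small (512 by 512) precisely because it is computed at coarse resolution; a pairwise map at 1 bp resolution would have on the order of 10^12 cells.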
Training used a two-stage process. First, an ensemble of "teacher" models was trained under cross-validation, each with a different fold of the genome held out. Then a single "student" model was distilled from the ensemble, which improves robustness and variant effect prediction accuracy while requiring only a single device call per variant.
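The distillation idea can be sketched in a few lines. This is a toy, not the published training recipe: averaging teacher predictions into a soft target and using mean squared error as the student loss are both assumptions for illustration, and every name below is hypothetical.

```python
# Toy sketch of ensemble distillation; all names and the MSE loss are assumptions.
from statistics import mean


def ensemble_target(teacher_predictions: list[list[float]]) -> list[float]:
    """Average per-position predictions across teachers to form the student's target."""
    return [mean(values) for values in zip(*teacher_predictions)]


def mse(prediction: list[float], target: list[float]) -> float:
    """Mean squared error between the student's prediction and the ensemble target."""
    return mean((p - t) ** 2 for p, t in zip(prediction, target))


# Three hypothetical teachers predicting the same three-position track.
teacher_predictions = [
    [0.9, 0.1, 0.4],
    [1.1, 0.3, 0.6],
    [1.0, 0.2, 0.5],
]
target = ensemble_target(teacher_predictions)

student_prediction = [1.0, 0.0, 0.5]
loss = mse(student_prediction, target)
print(target, loss)
```

Once trained this way, the student stands in for the whole ensemble, so scoring a variant needs one forward pass rather than one per teacher.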
AlphaGenome was trained on human and mouse genomes. The final model predicts 5,930 human or 1,128 mouse genome tracks across diverse cell types and tissues.
Key Functions & Capabilities
AlphaGenome's core function is predicting how a DNA sequence drives regulatory biology. Its 11 predicted modalities span the major layers of gene regulation:
Gene expression (RNA-seq, CAGE, PRO-cap)
Chromatin accessibility (DNase, ATAC-seq)
Histone modifications
Transcription factor binding
Chromatin contact maps (3D genome)
Splice site usage
Splice junction coordinates & strength
Transcription initiation
Polyadenylation signals
TF footprints
Variant effect scoring (all modalities)
Beyond genome track prediction, AlphaGenome is specifically designed for variant effect prediction (VEP): comparing model outputs for a reference sequence versus a mutated sequence to predict the functional consequence of a genetic variant. A single variant can be scored across all modalities in under a second.
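The reference-versus-alternate comparison can be sketched with a stand-in model. Everything below is hypothetical: `toy_model` is a trivial motif detector rather than AlphaGenome, the motif is made up, and summarizing the effect as the difference of summed tracks is just one simple choice of aggregation.

```python
# Toy sketch of variant effect scoring; toy_model is a stand-in, NOT AlphaGenome.
MOTIF = "AACG"  # arbitrary made-up motif


def toy_model(sequence: str) -> list[float]:
    """Predict a per-base track: 1.0 where the motif starts, else 0.0."""
    return [
        1.0 if sequence[i:i + len(MOTIF)] == MOTIF else 0.0
        for i in range(len(sequence))
    ]


def apply_snv(sequence: str, position: int, ref: str, alt: str) -> str:
    """Return the sequence with a single-base substitution at a 0-based position."""
    if sequence[position] != ref:
        raise ValueError(f"expected {ref!r} at position {position}, found {sequence[position]!r}")
    return sequence[:position] + alt + sequence[position + 1:]


def score_variant(sequence: str, position: int, ref: str, alt: str) -> float:
    """Alternate-minus-reference difference in total predicted signal."""
    ref_track = toy_model(sequence)
    alt_track = toy_model(apply_snv(sequence, position, ref, alt))
    return sum(alt_track) - sum(ref_track)


reference = "GGAATGGGCC"
effect = score_variant(reference, position=4, ref="T", alt="C")
print(effect)  # 1.0
```

Here the T-to-C substitution creates the motif, so the score is positive, loosely echoing how a single non-coding change can create a new binding site.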
The model includes a genome interpretation suite with contribution scores from in silico mutagenesis (ISM) experiments, allowing researchers to identify which nucleotides in a sequence are most critical for a predicted regulatory output.
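In silico mutagenesis itself is straightforward to sketch: mutate each position to every alternative base, re-score, and record how much the output changes. The scoring function below is a toy motif counter, not AlphaGenome, and averaging the change over the three alternative bases is one common convention assumed here for illustration.

```python
# Toy sketch of in silico mutagenesis (ISM); toy_score is a stand-in scorer.
BASES = "ACGT"
MOTIF = "AACG"  # arbitrary made-up motif


def toy_score(sequence: str) -> float:
    """Scalar 'regulatory output': the number of motif occurrences."""
    return float(sequence.count(MOTIF))


def ism_scores(sequence: str) -> list[float]:
    """Per-position mean change in score over all alternative bases."""
    baseline = toy_score(sequence)
    contributions = []
    for i, ref in enumerate(sequence):
        deltas = []
        for alt in BASES:
            if alt == ref:
                continue
            mutated = sequence[:i] + alt + sequence[i + 1:]
            deltas.append(toy_score(mutated) - baseline)
        contributions.append(sum(deltas) / len(deltas))
    return contributions


sequence = "GGAACGTT"
scores = ism_scores(sequence)
for position, (base, score) in enumerate(zip(sequence, scores)):
    print(position, base, score)
```

Positions 2 through 5, which make up the motif, receive strongly negative scores because any substitution there destroys the predicted signal; flanking positions score zero.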
In a key validation experiment, AlphaGenome successfully recapitulated the known disease mechanism in T-cell acute lymphoblastic leukemia (T-ALL): a non-coding mutation activates the TAL1 oncogene by creating a MYB transcription factor binding motif. The model predicted this mechanism without being told the gene or disease context — from DNA sequence alone.
Who Uses AlphaGenome?
Since its June 2025 launch, nearly 3,000 researchers from 160 countries have used AlphaGenome, generating approximately 1 million API calls per day by January 2026. Its primary audience is academic and translational researchers working on regulatory genomics problems.
Typical applications include:
- Identifying non-coding mutations in cancer genomes that drive tumor proliferation by activating or silencing regulatory elements, without requiring protein-coding impact.
- Predicting variant effects on RNA splicing for diseases like spinal muscular atrophy (SMA) and cystic fibrosis, where splicing mutations in non-coding regions are pathogenic.
- Mapping and characterizing the regulatory architecture of the genome (enhancers, promoters, silencers) across cell types and developmental contexts.
- Understanding how non-coding variants associated with disease (from GWAS) affect regulatory elements and gene expression, prioritizing mechanistic hypotheses for therapeutic intervention.
- Investigating regulatory variation contributing to Alzheimer's, Parkinson's, and related disorders, where non-coding loci constitute a large fraction of GWAS signals.
- Fine-tuning and adapting the model for domain-specific tasks, using AlphaGenome's learned sequence representations as a general-purpose genomic foundation.
How to Access & Use AlphaGenome
AlphaGenome is available for non-commercial research use at no cost. As of January 2026, source code and model weights have been made publicly available.
1. Review terms of use. Access is free for non-commercial research. Review DeepMind's terms at deepmind.google.com/science/alphagenome before proceeding.
2. API access (server-side inference). Use the Python SDK to query DeepMind's hosted API. The SDK handles sequence formatting, request batching, and result parsing. Suitable for most research workflows without local GPU requirements.
3. Source code & weights (local deployment). Since January 2026, model weights and source code are available at github.com/google-deepmind/alphagenome_research. This enables fine-tuning on domain-specific datasets and offline inference. A TPU or high-memory GPU is recommended for the full 1 Mb context.
4. Genome interpretation suite. An accompanying analysis toolkit provides streamlined variant scoring with quantile calibration, ISM-based contribution score computation, and genome track visualization. Useful for interpreting model outputs in a biological context.
5. Hugging Face model hub. The google/alphagenome-all-folds model is also hosted on Hugging Face, providing an alternative access path compatible with the transformers ecosystem.
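The quantile calibration mentioned in step 4 can be illustrated generically: a raw variant score is easier to interpret once it is expressed as a rank within a background distribution of scores. The sketch below is an assumption about the general technique, not the toolkit's implementation; the helper name `quantile_of` and the background values are made up.

```python
# Generic sketch of quantile calibration; the background scores are invented.
from bisect import bisect_right


def quantile_of(raw_score: float, background_sorted: list[float]) -> float:
    """Fraction of background scores less than or equal to `raw_score`."""
    if not background_sorted:
        raise ValueError("background must not be empty")
    return bisect_right(background_sorted, raw_score) / len(background_sorted)


# Hypothetical raw scores from a set of background variants, sorted ascending.
background = sorted([-0.8, -0.3, -0.1, 0.0, 0.0, 0.1, 0.2, 0.4, 0.9, 1.5])

print(quantile_of(0.4, background))   # 0.8
print(quantile_of(2.0, background))   # 1.0
print(quantile_of(-1.0, background))  # 0.0
```

Expressed this way, scores from different output types and tissues land on a shared 0 to 1 scale, which makes them easier to compare than raw differences.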
Recent Advances
AlphaGenome moved from preprint to open-source availability in roughly seven months.
- Jun 2025: Public preview and preprint. DeepMind released AlphaGenome in preview for non-commercial research alongside a bioRxiv preprint (DOI: 10.1101/2025.06.25.661532). Free API access launched for researchers globally.
- Jun 2025 to Jan 2026: Rapid adoption. Within seven months of launch, nearly 3,000 researchers from 160 countries began using the model, with API call volume reaching approximately 1 million requests per day.
- Jan 2026: Nature publication. The peer-reviewed paper, "Advancing regulatory variant effect prediction with AlphaGenome", was published in Nature (DOI: 10.1038/s41586-025-10014-0), reporting that AlphaGenome matched or outperformed the strongest external models on 25 of 26 variant effect prediction benchmarks.
- Jan 2026: Open-source release. DeepMind released model source code, weights, and variant scoring implementations publicly at github.com/google-deepmind/alphagenome_research, expanding access beyond the hosted API.
- Roadmap: Future directions. The DeepMind team has indicated planned work on expanded tissue-specific prediction capacity, training on additional species beyond human and mouse, and community fine-tuning for domain-specific tasks.