Protein-Ligand Co-Design Benchmark¤
Level: Advanced | Runtime: ~3-5 minutes (CPU) / ~1-2 minutes (GPU) | Format: Python + Jupyter
Prerequisites: Understanding of protein-ligand interactions, drug discovery, and molecular modeling | Target Audience: Computational chemists and drug discovery researchers
Overview¤
This example demonstrates a comprehensive benchmark suite for evaluating protein-ligand co-design models. Learn how to use the CrossDocked2020 dataset, compute binding affinity predictions, assess molecular validity, evaluate drug-likeness, and systematically compare model architectures for computational drug discovery.
What You'll Learn¤
- Molecular Modality: Domain-specific framework for chemical structure representation
- CrossDocked2020: Large-scale protein-ligand binding dataset with 22.5M complexes
- Binding Affinity: Predict and evaluate protein-ligand binding energies (kcal/mol)
- Molecular Validity: Assess chemical plausibility of generated structures
- Drug-likeness (QED): Quantify pharmaceutical potential using QED scores
- Benchmark Suite: Systematically evaluate and compare model architectures
Files¤
This example is available in two formats:
- Python Script: protein_ligand_benchmark_demo.py
- Jupyter Notebook: protein_ligand_benchmark_demo.ipynb
Quick Start¤
Run the Python Script¤
# Activate environment
source activate.sh
# Run the complete demo
python examples/generative_models/protein/protein_ligand_benchmark_demo.py
Run the Jupyter Notebook¤
# Activate environment
source activate.sh
# Launch Jupyter
jupyter lab examples/generative_models/protein/protein_ligand_benchmark_demo.ipynb
Key Concepts¤
1. Protein-Ligand Co-Design¤
Co-design optimizes the protein binding site and the ligand molecule at the same time, aiming for strong, specific binding:
Protein Pocket  +  Ligand     →  Protein-Ligand Complex
      ↓              ↓                    ↓
 Flexibility     Chemistry         Binding Affinity
 Specificity     Drug-like         Stability
Applications:
- De novo drug design
- Lead optimization
- Binding site engineering
- Personalized medicine
2. CrossDocked2020 Dataset¤
Large-scale protein-ligand binding dataset:
from flax import nnx
from workshop.benchmarks.datasets.crossdocked import CrossDockedDataset

# Assumed: rngs is an nnx.Rngs container; the seed value is arbitrary
rngs = nnx.Rngs(42)

dataset = CrossDockedDataset(
    num_samples=50,
    max_protein_atoms=200,
    max_ligand_atoms=30,
    pocket_radius=8.0,
    rngs=rngs,
)

# Get a sample
sample = dataset[0]
# sample = {
#     "protein_coords": (200, 3),       # Protein atom coordinates
#     "protein_types": (200,),          # Atom types (C, N, O, S, etc.)
#     "ligand_coords": (30, 3),         # Ligand atom coordinates
#     "ligand_types": (30,),            # Ligand atom types
#     "binding_affinity": -8.5,         # In kcal/mol (more negative = stronger binding)
#     "pocket_indices": [12, 45, ...],  # Binding pocket atom indices
# }
Dataset Statistics:
- Total complexes: 22.5 million docked protein-ligand poses
- Protein size: ~50-500 atoms (binding pocket)
- Ligand size: ~10-50 atoms (drug-like)
- Binding affinity range: -15 to 0 kcal/mol
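The pocket_radius argument and the pocket_indices field suggest a distance-based pocket selection. The sketch below is one common way to do that, assuming a pocket is the set of protein atoms within pocket_radius of any ligand atom. The function extract_pocket is hypothetical and written in plain Python; how CrossDockedDataset actually selects pocket atoms is not shown in this example.

```python
import math

def extract_pocket(protein_coords, ligand_coords, pocket_radius=8.0):
    """Return indices of protein atoms within pocket_radius of any ligand atom.

    Hypothetical helper for illustration; not the dataset's own implementation.
    """
    pocket = []
    for i, atom in enumerate(protein_coords):
        # math.dist computes the Euclidean distance between two points
        if any(math.dist(atom, lig) <= pocket_radius for lig in ligand_coords):
            pocket.append(i)
    return pocket

# Toy coordinates, assumed to share the same length unit as pocket_radius
protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]

pocket_indices = extract_pocket(protein, ligand, pocket_radius=8.0)
print(pocket_indices)
```

A larger radius keeps more protein context at the cost of more atoms per sample, which is why max_protein_atoms and pocket_radius are usually tuned together.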
3. Molecular Modality Framework¤
Domain-specific functionality for chemical structures:
from workshop.generative_models.modalities.molecular import MolecularModality
# ModalityConfiguration must also be imported; its module path is not shown in this example.

modality = MolecularModality(rngs=rngs)

# Chemical constraints
config = ModalityConfiguration(
    name="molecular_config",
    modality_name="molecular",
    metadata={
        "use_chemical_constraints": True,
        "bond_length_weight": 1.0,  # Enforce realistic bond lengths
        "bond_angle_weight": 0.5,   # Enforce bond angles
        "use_pharmacophore_features": True,
        "pharmacophore_types": [
            "donor",        # H-bond donors
            "acceptor",     # H-bond acceptors
            "hydrophobic",  # Hydrophobic regions
        ],
    },
)

extensions = modality.get_extensions(config, rngs=rngs)
# extensions = {
#     "chemical_constraints": <ConstraintModule>,
#     "pharmacophore_features": <PharmacophoreModule>,
# }
4. Binding Affinity Metric¤
Evaluates binding affinity prediction accuracy:
import jax.numpy as jnp
from workshop.benchmarks.metrics.protein_ligand import BindingAffinityMetric

metric = BindingAffinityMetric(rngs=rngs)

# True binding affinities (in kcal/mol)
true_affinities = jnp.array([-8.2, -6.5, -9.1, -7.8])
# Model predictions
predictions = jnp.array([-8.5, -6.2, -8.9, -7.5])

results = metric.compute(predictions, true_affinities)
# Approximate values for the inputs above, using the standard definitions:
# results = {
#     "rmse": 0.28,       # Root Mean Square Error (kcal/mol)
#     "pearson_r": 0.97,  # Correlation coefficient
#     "mae": 0.28,        # Mean Absolute Error (kcal/mol)
# }
Performance Targets:
- Excellent: RMSE < 0.5 kcal/mol
- Good: RMSE < 1.0 kcal/mol
- Acceptable: RMSE < 1.5 kcal/mol
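To make the three reported quantities concrete, the sketch below computes RMSE, MAE, and the Pearson correlation from their standard definitions and maps the RMSE onto the tiers listed above. It is written in plain Python for illustration; regression_metrics and rmse_tier are hypothetical names, and BindingAffinityMetric may be implemented differently internally.

```python
import math

def regression_metrics(predictions, targets):
    """RMSE, MAE, and Pearson r from their textbook definitions."""
    n = len(targets)
    errors = [p - t for p, t in zip(predictions, targets)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    mean_p = sum(predictions) / n
    mean_t = sum(targets) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predictions, targets))
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    var_t = sum((t - mean_t) ** 2 for t in targets)
    pearson_r = cov / math.sqrt(var_p * var_t)
    return {"rmse": rmse, "mae": mae, "pearson_r": pearson_r}

def rmse_tier(rmse):
    """Map an RMSE in kcal/mol onto the performance tiers listed above."""
    if rmse < 0.5:
        return "excellent"
    if rmse < 1.0:
        return "good"
    if rmse < 1.5:
        return "acceptable"
    return "below target"

true_affinities = [-8.2, -6.5, -9.1, -7.8]
predictions = [-8.5, -6.2, -8.9, -7.5]

metrics = regression_metrics(predictions, true_affinities)
tier = rmse_tier(metrics["rmse"])
print(metrics, tier)
```

For these four pairs the RMSE and MAE both come out near 0.28 kcal/mol and the correlation near 0.97, which falls in the "excellent" tier.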
5. Molecular Validity Metric¤
Checks chemical plausibility of generated molecules:
from workshop.benchmarks.metrics.protein_ligand import MolecularValidityMetric

metric = MolecularValidityMetric(rngs=rngs)

results = metric.compute(
    coordinates=ligand_coords,  # (batch, num_atoms, 3)
    atom_types=atom_types,      # (batch, num_atoms)
    masks=atom_masks,           # (batch, num_atoms)
)
# results = {
#     "validity_rate": 0.96,  # Overall validity (target: >0.95)
#     "bond_validity": 0.98,  # Valid bond lengths
#     "clash_free": 0.94,     # No atomic clashes
#     "connectivity": 0.97,   # Proper atom connectivity
# }
Validity Checks:
- Bond lengths: 1.2-2.0 Å for most bonds
- No clashes: Atoms >1.0 Å apart (except bonded)
- Connectivity: All atoms form connected graph
- Valence: Atoms respect valence rules
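Two of these checks, clashes and connectivity, can be sketched from coordinates alone. The example below assumes that any atom pair within the 1.2-2.0 Å window counts as bonded and that any pair closer than 1.0 Å is a clash. check_geometry is a hypothetical helper in plain Python, not the logic inside MolecularValidityMetric, and it omits the valence check, which needs atom types and bond orders.

```python
import math
from collections import deque

BOND_MIN, BOND_MAX = 1.2, 2.0  # bond-length window from the list above (Å)
CLASH_DIST = 1.0               # pairs closer than this are treated as clashes (Å)

def check_geometry(coords):
    """Flag clashes and test whether distance-inferred bonds connect all atoms."""
    n = len(coords)
    neighbors = {i: [] for i in range(n)}
    clash_free = True
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d < CLASH_DIST:
                clash_free = False
            elif BOND_MIN <= d <= BOND_MAX:
                # Assumption: bonds are inferred purely from distance
                neighbors[i].append(j)
                neighbors[j].append(i)

    # Breadth-first search from atom 0 to test graph connectivity
    seen = {0} if n else set()
    queue = deque(seen)
    while queue:
        i = queue.popleft()
        for j in neighbors[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return {"clash_free": clash_free, "connected": len(seen) == n}

chain = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
clashing = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
fragmented = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (10.0, 0.0, 0.0)]

for name, coords in [("chain", chain), ("clashing", clashing), ("fragmented", fragmented)]:
    print(name, check_geometry(coords))
```

The pairwise loop is quadratic in the number of atoms, which is fine for ligands of 10-50 atoms; a batched array implementation would be the natural choice inside a real metric.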
6. Drug-likeness Metric (QED)¤
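The Quantitative Estimate of Drug-likeness (QED) combines several property desirabilities into a single score between 0 and 1. In its unweighted form (Bickerton et al., 2012) it is their geometric mean:
\[\text{QED} = \exp\left(\frac{1}{8}\sum_{i=1}^{8} \ln p_i\right)\]
where \(p_i\) are desirability functions for 8 molecular properties.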
from workshop.benchmarks.metrics.protein_ligand import DrugLikenessMetric

metric = DrugLikenessMetric(rngs=rngs)

results = metric.compute(
    coordinates=ligand_coords,
    atom_types=atom_types,
    masks=atom_masks,
)
# results = {
#     "qed_score": 0.75,            # Overall drug-likeness (target: >0.7)
#     "lipinski_compliance": 0.85,  # Lipinski's Rule of Five
#     "molecular_weight": 385.4,    # Daltons (target: 180-500)
#     "logp": 2.3,                  # Lipophilicity (target: 0-5)
#     "h_bond_donors": 2,           # Target: ≤5
#     "h_bond_acceptors": 4,        # Target: ≤10
# }
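As a rough illustration of how a QED score aggregates its inputs, the sketch below takes eight desirability values and returns their geometric mean, which is the unweighted QED of Bickerton et al. (2012). The input values are made up, and the fitted desirability functions themselves are not reproduced here; whether DrugLikenessMetric uses the weighted or unweighted variant is not stated in this example.

```python
import math

def qed_from_desirabilities(desirabilities):
    """Unweighted QED: geometric mean of desirability values in (0, 1]."""
    if not desirabilities:
        raise ValueError("at least one desirability value is required")
    if any(d <= 0.0 for d in desirabilities):
        raise ValueError("desirability values must be positive")
    # exp(mean(log(d))) is the geometric mean, computed stably in log space
    log_mean = sum(math.log(d) for d in desirabilities) / len(desirabilities)
    return math.exp(log_mean)

# Hypothetical desirabilities for the eight properties (illustrative only)
example = [0.9, 0.8, 0.7, 0.85, 0.95, 0.6, 0.75, 0.8]
qed = qed_from_desirabilities(example)
print(f"QED = {qed:.3f}")
```

Because the geometric mean penalizes low values more than an arithmetic mean does, a single poor property pulls the score down noticeably, which is the intended behavior of the metric.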
Lipinski's Rule of Five:
- Molecular weight ≤ 500 Da
- LogP ≤ 5
- H-bond donors ≤ 5
- H-bond acceptors ≤ 10
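The four thresholds translate directly into a small check. The sketch below lists which rules a molecule violates and computes a batch-level compliance fraction, assuming that "compliant" means zero violations. The function name and the second molecule are hypothetical, and the benchmark's own lipinski_compliance may be defined differently.

```python
def lipinski_violations(molecular_weight, logp, h_bond_donors, h_bond_acceptors):
    """Return the names of the Rule of Five criteria that a molecule violates."""
    checks = {
        "molecular_weight": molecular_weight <= 500,  # Da
        "logp": logp <= 5,
        "h_bond_donors": h_bond_donors <= 5,
        "h_bond_acceptors": h_bond_acceptors <= 10,
    }
    return [name for name, ok in checks.items() if not ok]

molecules = [
    # Property values from the example results above
    {"molecular_weight": 385.4, "logp": 2.3, "h_bond_donors": 2, "h_bond_acceptors": 4},
    # Hypothetical molecule that is too heavy and too lipophilic
    {"molecular_weight": 612.0, "logp": 5.8, "h_bond_donors": 3, "h_bond_acceptors": 9},
]

violations = [lipinski_violations(**m) for m in molecules]
# Assumption: a molecule is compliant only if it violates no rule
compliance = sum(1 for v in violations if not v) / len(molecules)
print(violations, compliance)
```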
7. Benchmark Suite¤
Comprehensive evaluation across all metrics:
from workshop.benchmarks.suites.protein_ligand_suite import ProteinLigandBenchmarkSuite

suite = ProteinLigandBenchmarkSuite(
    dataset_config={
        "num_samples": 50,
        "max_protein_atoms": 200,
        "max_ligand_atoms": 30,
    },
    benchmark_config={
        "num_samples": 20,
        "batch_size": 4,
    },
    rngs=rngs,
)

# Run evaluation
results = suite.run_all(model)
# results = {
#     "binding_affinity": {
#         "rmse": 0.45,
#         "pearson_r": 0.92,
#     },
#     "molecular_validity": {
#         "validity_rate": 0.97,
#         "bond_validity": 0.98,
#     },
#     "drug_likeness": {
#         "qed_score": 0.78,
#         "lipinski_compliance": 0.89,
#     },
# }
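A nested results dictionary like the one above can be checked against the targets quoted in this guide (RMSE below 1.0 kcal/mol, validity rate above 0.95, QED above 0.7). The sketch below is one way to do that. TARGETS and assess_targets are hypothetical names introduced for illustration, not part of ProteinLigandBenchmarkSuite.

```python
# Hypothetical target table: (section, metric) -> (kind, threshold).
# "max" means the value must stay below the threshold, "min" means above it.
TARGETS = {
    ("binding_affinity", "rmse"): ("max", 1.0),
    ("molecular_validity", "validity_rate"): ("min", 0.95),
    ("drug_likeness", "qed_score"): ("min", 0.7),
}

def assess_targets(results):
    """Return a pass/fail flag per targeted metric."""
    report = {}
    for (section, metric), (kind, threshold) in TARGETS.items():
        value = results[section][metric]
        passed = value < threshold if kind == "max" else value > threshold
        report[f"{section}.{metric}"] = passed
    return report

# Values copied from the example output above
results = {
    "binding_affinity": {"rmse": 0.45, "pearson_r": 0.92},
    "molecular_validity": {"validity_rate": 0.97, "bond_validity": 0.98},
    "drug_likeness": {"qed_score": 0.78, "lipinski_compliance": 0.89},
}

report = assess_targets(results)
for name, passed in report.items():
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
```

Keeping the thresholds in one table makes it easy to compare several models against the same bar, or to tighten a target without touching the evaluation loop.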
Code Structure¤
The example demonstrates six main components:
- Molecular Modality Framework - Chemical constraints and pharmacophore features
- CrossDocked2020 Dataset - Protein-ligand complex loading and statistics
- Binding Affinity Metric - RMSE evaluation for binding predictions
- Molecular Validity Metric - Chemical plausibility assessment
- Drug-likeness Metric - QED and Lipinski compliance
- Benchmark Suite - Comprehensive evaluation and model comparison
Features Demonstrated¤
- ✅ Molecular modality with chemical constraints
- ✅ CrossDocked2020 dataset with pocket extraction
- ✅ Binding affinity prediction (RMSE, correlation)
- ✅ Molecular validity checks (bonds, clashes, connectivity)
- ✅ Drug-likeness evaluation (QED, Lipinski)
- ✅ Complete benchmark suite execution
- ✅ Model comparison across quality levels
- ✅ Performance target assessment
Experiments to Try¤
- Adjust Model Quality
model = ExampleProteinLigandModel(rngs)
model.model_quality = "excellent" # Try "poor", "good", or "excellent"
results = suite.run_all(model)
- Increase Dataset Size
dataset_config = {
    "num_samples": 100,  # More samples
    "max_protein_atoms": 300,
    "max_ligand_atoms": 40,
}
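- Custom Pocket Radius: pass a different pocket_radius to CrossDockedDataset (the dataset example above uses 8.0) to widen or narrow the extracted binding pocket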
- Add Custom Metrics
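import jax.numpy as jnp
from flax import nnx

class CustomMetric(nnx.Module):
    def compute(self, predictions, targets):
        # Replace with your own evaluation logic; mean absolute error is a placeholder
        score = jnp.mean(jnp.abs(predictions - targets))
        return {"custom_score": score}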
Next Steps¤
- Molecular Generation: Generate novel drug-like molecules
- Protein Folding: Predict protein structures
- Advanced Docking: Learn molecular docking methods
- Framework Features: Understand modality system
Troubleshooting¤
ImportError for Molecular Modality¤
Symptom: Cannot import molecular modality classes
Solution: Install molecular extras
Dataset Loading Too Slow¤
Symptom: Long wait times for dataset initialization
Solution: Reduce number of samples
CUDA Out of Memory¤
Symptom: GPU memory error during evaluation
Solution: Reduce batch size
Low Molecular Validity Rates¤
Symptom: Most generated molecules are invalid
Cause: Incorrect coordinate scaling or atom types
Solution: Check coordinate normalization
# Ensure coordinates are in angstroms
coordinates = coordinates * coordinate_scale
# Use realistic atom types (1-6 for C, N, O, S, P, F)
atom_types = jax.random.randint(key, (batch, num_atoms), 1, 7)
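A quick way to spot a unit mismatch is to look at nearest-neighbor distances: for coordinates in ångströms they should sit near typical bond lengths. The sketch below assumes a plausible window of 1.0-2.0 Å for the median nearest-neighbor distance. Both that window and the helper names are hypothetical, chosen only to illustrate the check.

```python
import math
import statistics

def median_nearest_neighbor_distance(coords):
    """Median over atoms of the distance to each atom's closest neighbor."""
    nearest = [
        min(math.dist(a, b) for j, b in enumerate(coords) if j != i)
        for i, a in enumerate(coords)
    ]
    return statistics.median(nearest)

def looks_like_angstroms(coords, low=1.0, high=2.0):
    """Heuristic: bonded neighbors should be roughly 1-2 Å apart (assumed window)."""
    return low <= median_nearest_neighbor_distance(coords) <= high

coords_angstrom = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
coords_nanometer = [(0.0, 0.0, 0.0), (0.15, 0.0, 0.0), (0.30, 0.0, 0.0)]

print(looks_like_angstroms(coords_angstrom))   # spacing of 1.5 falls inside the window
print(looks_like_angstroms(coords_nanometer))  # spacing of 0.15 falls outside it
```

If the check fails with a median near 0.1-0.2, the coordinates are probably in nanometers and a coordinate_scale of 10 would bring them into ångströms.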
Additional Resources¤
Documentation¤
- Molecular Modality Guide - Chemical structure representation
- Protein-Ligand Benchmarks - Complete benchmarking guide
- CrossDocked2020 Dataset - Dataset API reference
Related Examples¤
- Geometric Benchmark Demo - 3D generation
- Loss Examples - Loss functions
Papers and Resources¤
- CrossDocked2020: "Three-Dimensional Convolutional Neural Networks and a Cross-Docked Data Set for Structure-Based Drug Design" (Francoeur et al., 2020)
- QED: "Quantifying the chemical beauty of drugs" (Bickerton et al., 2012)
- Lipinski's Rule: "Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings" (Lipinski et al., 2001)
- AutoDock Vina: Popular molecular docking software
External Tools¤
- RDKit: Open-source cheminformatics library
- Open Babel: Chemical toolbox for file conversion
- PyMOL: Molecular visualization
- Protein Data Bank (PDB): Protein structure database