Made GNN-PCNA: a graph neural network that scores which residues on PCNA — a notoriously hard cancer drug target — are worth
investigating as hidden ("cryptic") pockets. It reads a protein as two graphs at once (3D residue contacts + backbone chain) and fuses
them with a learned attention gate. That GATv2-based design is one the standard baseline (PocketMiner) doesn't use.
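A minimal sketch of the two-graph idea, not the actual GNN-PCNA code. It builds a contact graph and a backbone graph for the same residues, then mixes two per-residue embeddings with a sigmoid gate. The GATv2 message-passing layers that would produce those embeddings are left out, and the 8 Å cutoff, the function names, and the gate parameters `W` and `b` are all assumptions made for illustration.

```python
import numpy as np

def contact_edges(coords, cutoff=8.0):
    """Directed pairs (i, j), i != j, closer than `cutoff` angstroms (assumed value)."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    not_self = ~np.eye(len(coords), dtype=bool)
    i, j = np.nonzero((dist < cutoff) & not_self)
    return np.stack([i, j])

def backbone_edges(n_residues):
    """Edges in both directions between sequence neighbours i and i+1."""
    i = np.arange(n_residues - 1)
    forward = np.stack([i, i + 1])
    backward = np.stack([i + 1, i])
    return np.concatenate([forward, backward], axis=1)

def gated_fusion(h_spatial, h_backbone, W, b):
    """Mix two (N, d) embeddings with a learned per-residue, per-channel gate.

    W has shape (2d, d) and b has shape (d,); both would be trained.
    """
    z = np.concatenate([h_spatial, h_backbone], axis=-1) @ W + b
    gate = 1.0 / (1.0 + np.exp(-z))          # sigmoid, values in (0, 1)
    fused = gate * h_spatial + (1.0 - gate) * h_backbone
    return fused, gate

# Toy usage with random stand-ins for the two graph encoders' outputs.
rng = np.random.default_rng(0)
n, d = 6, 4
coords = rng.normal(scale=5.0, size=(n, 3))   # fake C-alpha coordinates
spatial_graph = contact_edges(coords)
chain_graph = backbone_edges(n)
h_s, h_b = rng.normal(size=(n, d)), rng.normal(size=(n, d))
fused, gate = gated_fusion(h_s, h_b, rng.normal(size=(2 * d, d)), np.zeros(d))
```

A gate near 1 means a residue leans on its 3D neighbourhood, and a gate near 0 means it leans on its chain neighbours, so the gate values themselves are worth inspecting.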
Hardest part wasn't the model — it was catching homology leakage in my own data. My first benchmark looked great until I ran MMseqs2
and found near-identical proteins (99% identity!) sitting across the train/test split. So I deleted my own headline numbers and
rebuilt everything on a clean split at a 30% sequence-identity threshold. The honest score came out lower, and that's the whole point.
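One common way to run this kind of check, sketched under assumptions rather than copied from the repo: cluster all sequences with MMseqs2 at the chosen identity threshold, then confirm that no cluster has members on both sides of the split. The code reads the two-column (representative, member) TSV that `mmseqs easy-cluster` writes. The protein IDs, the function names, and the hash-based split are hypothetical.

```python
import hashlib

# Clusters would come from something like:
#   mmseqs easy-cluster seqs.fasta clu tmp --min-seq-id 0.3
# which writes clu_cluster.tsv with lines "representative<TAB>member".

def read_clusters(tsv_text):
    """Map each member ID to its cluster representative."""
    member_to_rep = {}
    for line in tsv_text.strip().splitlines():
        rep, member = line.split("\t")
        member_to_rep[member] = rep
    return member_to_rep

def leaking_clusters(member_to_rep, train_ids, test_ids):
    """Clusters that have at least one member in train AND one in test."""
    train_reps = {member_to_rep[p] for p in train_ids}
    test_reps = {member_to_rep[p] for p in test_ids}
    return train_reps & test_reps

def cluster_split(member_to_rep, test_fraction=0.2):
    """Assign whole clusters to one side, using a stable hash of the representative."""
    train, test = [], []
    for member, rep in member_to_rep.items():
        bucket = int(hashlib.sha256(rep.encode()).hexdigest(), 16) % 100
        (test if bucket < test_fraction * 100 else train).append(member)
    return train, test

# Hypothetical IDs: protA and protB fall in the same cluster.
clusters = read_clusters("protA\tprotA\nprotA\tprotB\nprotC\tprotC\n")
leaky = leaking_clusters(clusters, train_ids=["protA", "protC"], test_ids=["protB"])
train, test = cluster_split(clusters)
clean = leaking_clusters(clusters, train, test)
```

Here `leaky` flags the cluster that straddles the naive split, while `clean` is empty by construction because every cluster lands wholly in train or wholly in test.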
Proud of: it's engineered to not overclaim. Every output is framed as "hypothesis generation," an automated verifier catches my own
mistakes, and an ablation measures how much of the performance comes from sequence vs. structure.
To test it: open the demo, pick a PCNA structure (try 8GLA or 1W60), and watch it score every residue in ~2s with a 3D viewer +
heatmap. Heads-up: the live demo runs the lightweight model; the headline AUPRC/AUROC come from a heavier ESM2 model (too big to host
for free). Full story in the repo's DEVLOG.