Degron Variation: Interesting Haplotypes

Reevaluated degron class-change haplotypes ranked by structural impact score (from 23 genes)

Help — Degron Variation Viewer

Overview

This page lists haplotypes from the JoGo haplotype-resolved proteome in which amino acid substitutions cause degron class changes, that is, transitions between the stable (high abundance) and strong degron (high degradation) classes. In these cases, single nucleotide polymorphisms are predicted to alter protein degradation propensity via the ubiquitin-proteasome system.

Haplotypes are ranked by a composite impact score that integrates the magnitude of the degron score change with structural context from AlphaFold predictions.

Data Source

  • Haplotype sequences: JoGo a-level haplotypes (174,376 sequences from 19,193 gene regions)
  • Degron scoring: PAP (Peptide Abundance Predictor) CNN2w1 model — 30-residue sliding window, full-coverage scoring
  • Structural features: AlphaFold per-residue pLDDT, rASA, secondary structure, MobiDB disorder from mane_protein_features.sqlite3
  • Population frequencies: JoGo actg-haplotype database (5 super-populations + 24 sub-populations)
  • Gene annotations: MANE Select v1.2 (GRCh38), 19,316 genes

Degron Classes

Class | Score Range | Description
Strong degron | < 0.04 | High degradation propensity: protein targeted for rapid proteasomal degradation
Intermediate | 0.04 – 0.22 | Moderate degradation propensity
Stable | > 0.22 | Low degradation propensity: protein is stable

Reference: Voutsinos et al., Science Advances (2025). Un-tagged GFP baseline ~0.6.
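
The cutoffs above can be expressed as a small lookup. This is a minimal sketch, not the viewer's own code: the function name degron_class is hypothetical, and treating the boundary values 0.04 and 0.22 as intermediate is an assumption read off the table's ranges.

```python
def degron_class(score: float) -> str:
    """Map a PAP score to a degron class using the cutoffs in the table."""
    if score < 0.04:
        return "strong degron"
    if score <= 0.22:
        # Assumption: the boundary values 0.04 and 0.22 count as intermediate.
        return "intermediate"
    return "stable"


print(degron_class(0.01))  # strong degron
print(degron_class(0.10))  # intermediate
print(degron_class(0.60))  # stable
```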

Direction Types

Direction | Badge | Description
Degron gain | gain | stable → strong: SNPs create a new degron, potentially increasing protein degradation
Degron loss | loss | strong → stable: SNPs destroy an existing degron, potentially stabilizing the protein
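
As an illustration, the direction badge can be derived from the reference and haplotype classes. The sketch below is hypothetical (the function name and the class strings are assumptions). It returns None for transitions involving the intermediate class, since only the two transitions in the table are described.

```python
from typing import Optional


def direction(ref_class: str, hap_class: str) -> Optional[str]:
    """Return 'gain' or 'loss' for the two class changes described above."""
    if (ref_class, hap_class) == ("stable", "strong"):
        return "gain"  # a new degron is created
    if (ref_class, hap_class) == ("strong", "stable"):
        return "loss"  # an existing degron is destroyed
    # Other transitions are not covered by the table.
    return None


print(direction("stable", "strong"))  # gain
print(direction("strong", "stable"))  # loss
```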

Structural Context

The structural context at the worst-scoring variant position is classified using AlphaFold predictions:

Context | Badge | Criteria | Biological Meaning
Functional | functional | rASA ≥ 0.25 AND (disorder=1 OR pLDDT < 70) | Exposed and disordered/flexible; most likely to be an active degron in vivo
Exposed | exposed | rASA ≥ 0.25 AND structured | Surface-accessible but structured; degron may become active upon unfolding
Buried | buried | rASA < 0.25 | Interior of protein; degron likely inactive unless protein unfolds
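
The criteria above amount to a three-way decision per residue. The following sketch uses a hypothetical function name and argument names. It assumes that "structured" means the residue is neither flagged as disordered nor below pLDDT 70, which is the complement of the functional criterion.

```python
def structural_context(rasa: float, disorder: int, plddt: float) -> str:
    """Classify a residue as functional, exposed, or buried."""
    if rasa < 0.25:
        return "buried"
    if disorder == 1 or plddt < 70:
        return "functional"
    # Assumption: "structured" = not disordered and pLDDT >= 70.
    return "exposed"


print(structural_context(rasa=0.60, disorder=1, plddt=85.0))  # functional
print(structural_context(rasa=0.60, disorder=0, plddt=92.0))  # exposed
print(structural_context(rasa=0.10, disorder=1, plddt=50.0))  # buried
```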

Table Columns

Column | Description
Impact | Composite impact score (0–1) with star rating. Higher = more biologically significant
Gene | Gene symbol. Click to open the per-gene degron viewer page
Region | Genomic region name (GENE_chrN_start_end). Click to open JoGo browser
Dir | Direction of degron class change: gain or loss
HapID | Haplotype identifier (a-level, e.g., a0003)
ProtLen | Protein length in amino acids
SNPs | Number of class-changing SNPs in this haplotype
Worst Variant | The variant with the largest degron score change (e.g., T580C)
Delta | Score change at worst variant. Negative = towards degron (gain); Positive = away from degron (loss)
Effect | Absolute magnitude of the worst delta (effect_mag = |worst_delta|)
Class Change | Degron class transition (e.g., stable → strong)
Context | Structural context at worst variant position (functional/exposed/buried)
Win%Func | Fraction of the 30-residue PAP scoring window at functional positions
pLDDT | Mean AlphaFold pLDDT for the protein. <70 low, 70–90 medium, >90 high confidence
Freq | Global haplotype frequency across all populations
Obs/Denom | Observed count / total denominator (e.g., 1/340)
MaxPop | Population with highest frequency for this haplotype (e.g., EAS, AFR)
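
Several columns hold compact strings that are easier to work with once parsed. The helpers below are a hedged sketch, and all three function names are hypothetical. The example region name GENE1_chr1_1000_2000 is made up to match the GENE_chrN_start_end pattern, and the parsers assume the formats shown in the examples above (a single letter, a position, a single letter; and count/denominator).

```python
import re


def parse_variant(variant: str):
    """Split a variant such as 'T580C' into (reference, position, alternate)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", variant)
    if m is None:
        raise ValueError(f"unexpected variant format: {variant!r}")
    ref, pos, alt = m.groups()
    return ref, int(pos), alt


def parse_region(region: str):
    """Split 'GENE_chrN_start_end' from the right, so the gene part stays whole."""
    gene, chrom, start, end = region.rsplit("_", 3)
    return gene, chrom, int(start), int(end)


def parse_obs_denom(cell: str):
    """Split an Obs/Denom cell such as '1/340' into two integers."""
    obs, denom = cell.split("/")
    return int(obs), int(denom)


print(parse_variant("T580C"))                # ('T', 580, 'C')
print(parse_region("GENE1_chr1_1000_2000"))  # ('GENE1', 'chr1', 1000, 2000)
print(parse_obs_denom("1/340"))              # (1, 340)
```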

Impact Score

Haplotypes are ranked by a composite impact score:

impact = 0.35 × effect_norm + 0.35 × window_frac_functional + 0.15 × prot_frac_disorder + 0.15 × is_c_terminal

Where:

  • effect_norm = min(|worst_delta| / 0.6, 1.0) — normalized effect magnitude
  • window_frac_functional = fraction of 30-aa PAP window at functional positions
  • prot_frac_disorder = whole-protein MobiDB disorder fraction
  • is_c_terminal = 1 if the worst position is within the last 5 residues of the protein, otherwise 0

Higher scores indicate degron changes in exposed, disordered regions with strong effect magnitude — most likely to affect protein abundance in vivo.
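
A worked example may help when reading the Impact column. The sketch below implements the formula as written. The function name and signature are hypothetical, and it assumes 1-based residue positions, so "within the last 5 residues" means worst_pos > prot_len - 5.

```python
def impact_score(worst_delta: float,
                 window_frac_functional: float,
                 prot_frac_disorder: float,
                 worst_pos: int,
                 prot_len: int) -> float:
    """Composite impact score from the four weighted terms in the formula."""
    effect_norm = min(abs(worst_delta) / 0.6, 1.0)
    # Assumption: positions are 1-based.
    is_c_terminal = 1 if worst_pos > prot_len - 5 else 0
    return (0.35 * effect_norm
            + 0.35 * window_frac_functional
            + 0.15 * prot_frac_disorder
            + 0.15 * is_c_terminal)


# Illustrative inputs only, not taken from the database:
# effect_norm = 0.3 / 0.6 = 0.5, so 0.175 + 0.175 + 0.03 + 0.15 = 0.53
score = impact_score(worst_delta=-0.3, window_frac_functional=0.5,
                     prot_frac_disorder=0.2, worst_pos=598, prot_len=600)
print(round(score, 2))  # 0.53
```

Because the four weights sum to 1.0 and every term lies in the range 0 to 1, the score is itself bounded between 0 and 1.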

Star Ratings

Stars | Impact Range | Interpretation
★★★★★ | ≥ 0.70 | Very high impact: strong effect in functional context
★★★★ | ≥ 0.55 | High impact
★★★ | ≥ 0.40 | Moderate impact
★★ | ≥ 0.25 | Low-moderate impact
 | < 0.25 | Low impact: buried or weak effect
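
The tiers can be read as a descending list of cutoffs where the first match wins. This sketch uses hypothetical names (TIERS, impact_tier) and returns the interpretation label rather than a star count.

```python
# (minimum impact, label), checked from the highest cutoff down.
TIERS = [
    (0.70, "Very high impact"),
    (0.55, "High impact"),
    (0.40, "Moderate impact"),
    (0.25, "Low-moderate impact"),
]


def impact_tier(impact: float) -> str:
    """Return the interpretation label for an impact score."""
    for cutoff, label in TIERS:
        if impact >= cutoff:
            return label
    return "Low impact"


print(impact_tier(0.53))  # Moderate impact
print(impact_tier(0.10))  # Low impact
```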

Filters

Filter | Description
Search | Filter by gene name (case-insensitive substring match)
Direction | Show only degron gain or loss haplotypes
Context | Filter by structural context at worst variant (functional/exposed/buried)
Min Impact | Only show haplotypes with impact score ≥ threshold
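
The same filters can be applied offline to an exported copy of the table. The sketch below assumes each row is a dict with the hypothetical keys gene, dir, context and impact. The gene symbols in the example are made up, and this is not the page's own filtering code.

```python
from typing import Optional


def filter_rows(rows, search: str = "", direction: Optional[str] = None,
                context: Optional[str] = None, min_impact: float = 0.0):
    """Apply the four filters; a filter left at its default is not applied."""
    needle = search.lower()
    return [
        r for r in rows
        if needle in r["gene"].lower()                     # substring match
        and (direction is None or r["dir"] == direction)
        and (context is None or r["context"] == context)
        and r["impact"] >= min_impact
    ]


# Made-up rows for illustration only.
rows = [
    {"gene": "GENEA", "dir": "gain", "context": "functional", "impact": 0.72},
    {"gene": "GENEB", "dir": "loss", "context": "buried", "impact": 0.18},
    {"gene": "geneA2", "dir": "gain", "context": "exposed", "impact": 0.41},
]

hits = filter_rows(rows, search="genea", direction="gain", min_impact=0.4)
print([r["gene"] for r in hits])  # ['GENEA', 'geneA2']
```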

Per-Gene Viewer

Clicking a gene name opens the detailed degron viewer page ({REGION}_degron_viewer.html) which shows:

  • JoGo Integration — Haplotype Explorer and 3D Protein Viewer via Togostanza web components
  • Protein Structural Summary — UniProt accession, pLDDT, SS donut, disorder fraction
  • Abundance Score Chart — Interactive SVG line chart with brush zoom, disorder shading, SS overlays
  • Structural Annotation Track — Context ribbon, SS ribbon, pLDDT bars below chart
  • Multiple Alignment — Amino acid alignment with degron score coloring, structural annotations
  • Variant Impact Table — Per-variant score changes with structural columns

Methods

The pipeline consists of the following steps:

  • Extract amino acid sequences from JoGo haplotype TSV
  • Run PAP CNN2w1 full-coverage scoring (30-residue sliding window)
  • Parse scores, identify class-change variants, enrich with structural features from AlphaFold
  • Generate per-gene HTML viewers (19,055 files) with interactive charts and structural panels
  • Build SQLite database, annotate with structural context, join with population frequencies
  • Re-evaluate rankings with window-level and protein-level structural features
  • Generate this summary page from the re-evaluated gain/loss tables
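
To make the scoring step concrete, the sketch below enumerates the 30-residue windows that a sliding-window predictor would see. It is an assumption-laden illustration: the function name is hypothetical, the step size of 1 is assumed, and the pipeline's handling of protein termini and of sequences shorter than 30 residues ("full-coverage scoring") is not described here, so the sketch yields only complete windows.

```python
from typing import Iterator, Tuple


def sliding_windows(seq: str, size: int = 30) -> Iterator[Tuple[int, str]]:
    """Yield (0-based start, window) for every complete window of `size` residues."""
    # If the sequence is shorter than `size`, the range is empty and nothing is yielded.
    for start in range(len(seq) - size + 1):
        yield start, seq[start:start + size]


# A made-up 35-residue sequence gives 35 - 30 + 1 = 6 windows.
windows = list(sliding_windows("A" * 35))
print(len(windows))  # 6
```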

References

Voutsinos V, et al. Systematic identification of protein degradation signals in human cytosolic proteins. Science Advances, 2025. doi:10.1126/sciadv.adz3483

Nagasaki M, et al. JoGo 1.0: the ACTG hierarchical nomenclature and database covering 4.7 million haplotypes across 19,194 human genes. Nucleic Acids Research, 2026. doi:10.1093/nar/gkaf1232
