Hypometric genetics: Improved power in genetic discovery by incorporating quality control flags

Tanigawa and Kellis. Am J Hum Genet. (2024).


Phenotype: BLQ: PLs to Tot. Lipids in CMs and XXL VLDL %


Predictive performance of iPGS models

We evaluated the predictive performance of the inclusive polygenic score models using the held-out test set individuals.

Population | Model | PGS trait type | Metric | Predictive Performance | 95% CI | P-value
white British | Covariate-only model | BLQ (derived) | AUROC | 0.637 | [0.630, 0.644] | 1.9x10^-288
white British | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.515 | [0.508, 0.523] | 6.5x10^-05
white British | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.590 | [0.583, 0.597] | 4.1x10^-124
white British | Genotype-only model | BLQ (derived) | AUROC | 0.626 | [0.618, 0.633] | 4.5x10^-238
white British | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.617 | [0.609, 0.624] | 9.8x10^-198
white British | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.651 | [0.644, 0.658] | <1.0x10^-300
white British | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.687 | [0.680, 0.694] | <1.0x10^-300
Non-British white | Covariate-only model | BLQ (derived) | AUROC | 0.675 | [0.642, 0.708] | 6.3x10^-23
Non-British white | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.523 | [0.488, 0.558] | 1.9x10^-01
Non-British white | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.592 | [0.558, 0.626] | 1.2x10^-07
Non-British white | Genotype-only model | BLQ (derived) | AUROC | 0.622 | [0.589, 0.654] | 2.9x10^-11
Non-British white | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.654 | [0.621, 0.688] | 7.5x10^-18
Non-British white | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.674 | [0.642, 0.707] | 8.3x10^-23
Non-British white | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.718 | [0.687, 0.748] | 3.2x10^-31
South Asian | Covariate-only model | BLQ (derived) | AUROC | 0.599 | [0.536, 0.662] | 8.5x10^-04
South Asian | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.532 | [0.468, 0.596] | 2.7x10^-01
South Asian | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.612 | [0.553, 0.672] | 2.8x10^-04
South Asian | Genotype-only model | BLQ (derived) | AUROC | 0.657 | [0.596, 0.718] | 7.9x10^-07
South Asian | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.614 | [0.553, 0.675] | 7.2x10^-04
South Asian | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.657 | [0.597, 0.717] | 1.2x10^-06
South Asian | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.673 | [0.611, 0.735] | 2.8x10^-08
African | Covariate-only model | BLQ (derived) | AUROC | 0.576 | [0.529, 0.623] | 3.0x10^-03
African | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.516 | [0.468, 0.563] | 4.5x10^-01
African | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.553 | [0.505, 0.600] | 3.2x10^-02
African | Genotype-only model | BLQ (derived) | AUROC | 0.573 | [0.526, 0.620] | 1.3x10^-02
African | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.568 | [0.521, 0.615] | 4.6x10^-03
African | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.587 | [0.541, 0.634] | 5.8x10^-04
African | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.594 | [0.548, 0.641] | 1.5x10^-04
Others | Covariate-only model | BLQ (derived) | AUROC | 0.661 | [0.641, 0.681] | 3.8x10^-51
Others | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.517 | [0.496, 0.538] | 3.5x10^-02
Others | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.586 | [0.565, 0.607] | 4.7x10^-16
Others | Genotype-only model | BLQ (derived) | AUROC | 0.610 | [0.590, 0.631] | 6.2x10^-24
Others | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.651 | [0.631, 0.671] | 1.8x10^-42
Others | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.669 | [0.649, 0.689] | 1.0x10^-53
Others | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.695 | [0.676, 0.714] | 9.1x10^-68

The predictive performance (AUROC), its 95% confidence interval (CI), and statistical significance (P-value) are shown for each population in UK Biobank, evaluated in the held-out test set. The "Model" column indicates whether the predictive performance comes from the covariate terms alone (covariate-only model), the PGS terms alone (genotype-only model), or the full model containing both PGS and covariate terms. We used the following covariates in our analysis: age, sex, age^2, age*sex, Townsend deprivation index, and genotype PCs (PC1-PC18). Please refer to our publication for a more detailed description of the methods.
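As a reading aid for the AUROC column: AUROC can be interpreted as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The Python sketch below computes it that way on toy data. The function name `auroc` and the example inputs are illustrative assumptions; this is not the evaluation code used in the publication, and it does not reproduce the confidence intervals or P-values.

```python
def auroc(scores, labels):
    """AUROC as P(case score > control score), counting ties as 1/2.

    scores: predicted scores (e.g., a polygenic score), one per individual
    labels: 1 for cases, 0 for controls
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    # Pairwise comparison is quadratic in sample size; fine for a small sketch.
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical): two controls followed by two cases.
toy_auroc = auroc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

A value of 0.5 corresponds to no discrimination between cases and controls, which is why genotype-only rows with AUROC close to 0.5 also have large P-values.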


Coefficients (BETA) of PGS models

[Figure: plot of PGS model coefficients (BETA); source image: /static/data/tanigawakellis2024/per_trait/BIN_FC10023879/pgscoeffs.png]

We show the coefficients (BETA) of the PGS model. Our iPGS model selected 3584 variants with non-zero coefficients. The genetic variants with the largest absolute coefficient values are annotated in the plot. There is no guarantee that our iPGS model selects causal variants. We use the GRCh37/hg19 reference genome.

The top 100 genetic variants with the largest absolute value of coefficients

CHROM | POS | Variant | Variant ID | Effect Allele | Consequence | Gene symbol | Effect Weight
6 | 160961137 | 6:160961137:T:C | rs3798220 | C | PAVs | LPA | 0.267543665616649
6 | 161010118 | 6:161010118:A:G | rs10455872 | G | Intronic | LPA | 0.260889064549462
11 | 116648917 | 11:116648917:G:C | rs964184 | C | UTR | ZPR1 | 0.253766399978882
8 | 19813529 | 8:19813529:A:G | rs268 | G | PAVs | LPL | -0.252349308829295
19 | 8429323 | 19:8429323:G:A | rs116843064 | A | PAVs | ANGPTL4 | 0.195171514590724
8 | 19819724 | 8:19819724:C:G | rs328 | G | PTVs | LPL | 0.1798830076797
19 | 45415640 | 19:45415640:G:A | rs445925 | A | Others | APOC1 | -0.16711641924687
2 | 21231524 | 2:21231524:G:A | rs676210 | A | PAVs | APOB | 0.143189817685514
2 | 27730940 | 2:27730940:T:C | rs1260326 | C | PAVs | GCKR | 0.123540576252722
17 | 41926126 | 17:41926126:C:T | rs72836561 | T | PAVs | CD300LG | -0.0849418072371098
1 | 63118196 | 1:63118196:A:C | rs10889353 | C | Intronic | DOCK7 | 0.069966181570363
7 | 73020337 | 7:73020337:C:G | rs3812316 | G | PAVs | MLXIPL | 0.0597310928028423
11 | 116692293 | 11:116692293:C:A | rs12721043 | A | PAVs | APOA4 | 0.0579541874311493
8 | 19824667 | 8:19824667:C:T | rs15285 | T | UTR | LPL | 0.057193378432485
20 | 44554015 | 20:44554015:T:C | rs6065906 | C | Others |  | -0.0571638975119981
19 | 45414451 | 19:45414451:T:C | rs439401 | C | Others | APOC1 | -0.0570564627263489
16 | 56993324 | 16:56993324:C:A | rs3764261 | A | Others | CETP | 0.0564250207177209
10 | 64927823 | 10:64927823:C:G | rs1935 | G | PAVs | JMJD1C | 0.049974790906134
5 | 55861894 | 5:55861894:G:A | rs9687846 | A | Intronic |  | -0.0490284208912885
2 | 227101411 | 2:227101411:A:G | rs2972144 | G | Others |  | -0.0488805678457319
1 | 230299949 | 1:230299949:T:C | rs10779835 | C | Intronic | GALNT2 | 0.0488308616725306
2 | 165528624 | 2:165528624:G:T | rs1128249 | T | Intronic | COBLL1 | 0.0473840981312839
11 | 116657561 | 11:116657561:C:T | rs3741298 | T | Intronic | ZPR1 | 0.0471045601446934
8 | 126507389 | 8:126507389:C:A | rs2954038 | A | Intronic |  | 0.0467202700279208
7 | 73012042 | 7:73012042:G:A | rs35332062 | A | PAVs | MLXIPL | 0.0446239580927961
6 | 161107018 | 6:161107018:G:A | rs9457997 | A | Others |  | 0.0443971336622366
15 | 43820717 | 15:43820717:C:T | rs55707100 | T | PAVs | MAP1A | -0.0434695017597161
6 | 161018174 | 6:161018174:C:T | rs7770628 | T | Intronic | LPA | -0.042977164000354
7 | 130438214 | 7:130438214:G:A | rs13234407 | A | Others |  | 0.0429118610507504
11 | 116633947 | 11:116633947:G:A | rs10488698 | A | PAVs | BUD13 | 0.0389866353135638
11 | 61571382 | 11:61571382:G:A | rs174549 | A | UTR | FADS1 | -0.0388879123785724
8 | 126481747 | 8:126481747:A:G | rs2980875 | G | Intronic |  | 0.0379038900494411
11 | 61603510 | 11:61603510:C:A | rs174576 | A | Intronic | FADS2 | -0.0356475605722811
8 | 19819439 | 8:19819439:A:G | rs326 | G | Intronic | LPL | 0.0353100781633204
8 | 19943027 | 8:19943027:G:A | rs13265868 | A | Intronic |  | 0.0340714406212693
8 | 126488250 | 8:126488250:C:T | rs2980869 | T | Intronic |  | 0.032295035894021
6 | 32631702 | HLA-DQB1*0201 | HLA-DQB1*0201 | + | PAVs | HLA-DQB1 | 0.0303449593858499
19 | 7199939 | 19:7199939:A:G | rs1035941 | G | Intronic | INSR | 0.0298939417295855
16 | 56997233 | 16:56997233:G:A | rs1864163 | A | Intronic | CETP | -0.0295505683826192
7 | 73017005 | 7:73017005:A:G | rs13226650 | G | Intronic | MLXIPL | 0.0288858878882262
12 | 124440110 | 12:124440110:C:A | rs4765219 | A | Intronic | CCDC92 | 0.0286604677423889
20 | 44545048 | 20:44545048:C:T | rs4810479 | T | Others | PLTP | 0.0283369463460214
11 | 111749349 | 11:111749349:A:T | rs611010 | T | PAVs | ALG9, FDXACB1 | 0.0283306039585698
4 | 26126838 | 4:26126838:A:G | rs7673206 | G | Others |  | -0.0278284156791296
1 | 62940097 | 1:62940097:G:A | rs1979722 | A | Intronic | DOCK7 | 0.0277962542841061
16 | 57015091 | 16:57015091:G:C | rs5880 | C | PAVs | CETP | -0.027389847056835
12 | 124504283 | 12:124504283:T:C | rs825508 | C | Intronic | RFLNA | -0.0271308010750202
19 | 8615589 | 19:8615589:A:G | rs4804311 | G | PAVs | MYO1F | 0.0263474685190707
5 | 156391628 | 5:156391628:T:C | rs6874202 | C | Others | TIMD4 | -0.0256243568346538
15 | 101910550 | 15:101910550:G:A | rs20543 | A | PAVs | PCSK6 | -0.0248346694704683
16 | 72144174 | 16:72144174:T:C | rs9302635 | C | Intronic | DHX38 | 0.024502836702553
2 | 242322680 | 2:242322680:T:C | rs3771555 | C | Intronic | FARP2 | 0.0244975819808881
3 | 12393125 | 3:12393125:C:G | rs1801282 | G | PAVs | PPARG | 0.0242395557300247
6 | 139834012 | 6:139834012:T:G | rs632057 | G | Others |  | 0.0242365536540326
16 | 81534790 | 16:81534790:T:C | rs2925979 | C | Intronic | CMIP | 0.0240004646010112
9 | 86617265 | 9:86617265:A:G | rs1982151 | G | PAVs | RMI1 | -0.0238319158073456
10 | 63841130 | 10:63841130:G:A | rs4948296 | A | Intronic | ARID5B | 0.0234431085531264
4 | 89668859 | 4:89668859:C:T | rs7657817 | T | PAVs | FAM13A | 0.0234056791250016
1 | 27239920 | 1:27239920:C:G | rs6659176 | G | PAVs | NR0B2 | -0.0230532911739718
8 | 19941448 | 8:19941448:C:T | rs6989064 | T | Intronic |  | 0.0229825432056551
6 | 160766321 | 6:160766321:C:T | rs540713 | T | Others | SLC22A3 | -0.0229622308832103
4 | 88052219 | 4:88052219:T:C | rs342467 | C | PAVs | AFF1 | -0.0226135327708438
7 | 106642123 | 7:106642123:C:T | rs7786720 | T | Others |  | 0.0223975515567
11 | 61592362 | 11:61592362:A:G | rs174566 | G | Intronic | FADS1, FADS2 | -0.0222576323066466
16 | 65821593 | 16:65821593:G:A | rs459950 | A | Others |  | 0.0216930304734614
16 | 15148646 | 16:15148646:C:A | rs11075253 | A | Intronic | NTAN1, PDXDC1 | 0.0214540029176294
19 | 19391402 | 19:19391402:C:T | rs6511026 | T | Others | SUGP1 | 0.0214016963043704
12 | 124566841 | 12:124566841:A:G | rs10846600 | G | Intronic | RFLNA | 0.0212066389306182
19 | 57488423 | 19:57488423:C:T | rs8102873 | T | Others |  | -0.0211489556461408
10 | 115381747 | 10:115381747:G:A | rs868738 | A | PAVs | NRAP | -0.0210418259406792
13 | 95034749 | 13:95034749:G:A | rs1535692 | A | PAVs | GPC6 | -0.0209731323271908
6 | 43765533 | 6:43765533:A:G | rs1885659 | G | Others |  | -0.0208151592865078
3 | 186337713 | 3:186337713:T:C | rs4917 | C | PAVs | AHSG | -0.0202603387420634
3 | 52359678 | 3:52359678:T:C | rs6796333 | C | Intronic | DNAH1 | 0.020218714573702
13 | 112230114 | 13:112230114:T:C | rs2774440 | C | Others |  | -0.0202038668177041
11 | 116598988 | 11:116598988:A:G | rs180360 | G | Others |  | 0.020044473461756
17 | 41931375 | 17:41931375:A:G | rs12453522 | G | PAVs | CD300LG | -0.0198383811583418
12 | 57843711 | 12:57843711:G:A | rs2229357 | A | PAVs | INHBC | 0.0197023749493042
12 | 20473758 | 12:20473758:C:A | rs7134375 | A | Others |  | 0.019686325631182
5 | 131008194 | 5:131008194:T:C | rs26008 | C | PAVs | FNIP1 | 0.01962260642378
11 | 116662407 | 11:116662407:G:C | rs3135506 | C | PAVs | APOA5 | -0.0193305800912549
7 | 25991826 | 7:25991826:T:C | rs4722551 | C | Others | MIR148A | 0.0193241399678198
14 | 76383219 | 14:76383219:A:G | rs6574257 | G | Intronic | IFT43, TTLL5 | 0.0193056247872435
6 | 160472032 | 6:160472032:T:G | rs4709393 | G | Intronic | IGF2R | 0.0192731619506836
11 | 61597212 | 11:61597212:C:T | rs174570 | T | Intronic | FADS2 | -0.0192195906621729
3 | 150128392 | 3:150128392:G:A | rs879634 | A | PAVs | TSC22D2 | -0.0190371462155053
14 | 94844843 | 14:94844843:T:G | rs1303 | G | PAVs | SERPINA1 | 0.0188275170086147
6 | 8534715 | 6:8534715:A:G | rs230245 | G | Others |  | 0.0187066063701972
22 | 38877461 | 22:38877461:T:G | rs12004 | G | PAVs | KDELR3 | -0.0186244372809979
21 | 40670460 | 21:40670460:G:C | rs2056844 | C | PAVs | BRWD1 | -0.0185908572478044
7 | 26403801 | 7:26403801:A:G | rs2698717 | G | Intronic | SNX10 | 0.018580431803413
2 | 99226172 | 2:99226172:AT:A | rs66468243 | A | PTVs | UNC50 | -0.0185493233899736
6 | 133789728 | 6:133789728:G:A | rs9493627 | A | PAVs | EYA4 | 0.0183991752992543
4 | 87769929 | 4:87769929:T:C | rs13106574 | C | PAVs | SLC10A6 | -0.0183540487784942
6 | 160560845 | 6:160560845:A:G | rs628031 | G | PAVs | SLC22A1 | 0.0183487802940291
1 | 11579470 | 1:11579470:G:A | rs2072994 | A | PAVs | DISP3 | -0.0183224976880094
3 | 135988412 | 3:135988412:G:A | rs4678428 | A | Intronic | PCCB | -0.0182837455507849
11 | 116908283 | 11:116908283:A:G | rs7115242 | G | Intronic | SIK3 | 0.0181635610550212
5 | 180686958 | 5:180686958:C:T | rs943957 | T | Intronic | TRIM52 | 0.0179897890279879
22 | 37469590 | 22:37469590:C:T | rs387907018 | T | PAVs | TMPRSS6 | -0.0179156766760858

There is no guarantee that our iPGS model selects causal variants. We show the 100 variants with the largest absolute effect size (BETA). To see all 3584 variants included in our iPGS model, please download the iPGS coefficients by clicking the download button. We use the GRCh37/hg19 reference genome.
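If you work from the downloaded coefficients, a ranking by absolute effect weight like the one in this table can be sketched in a few lines of Python. The function name `top_variants` and the in-memory dictionary are assumptions for illustration; the downloadable file's actual format is described on the FAQ page and is not assumed here. The example weights are copied from the table.

```python
def top_variants(weights, n=100):
    """Return the n (variant, weight) pairs with the largest |weight|, largest first."""
    return sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]


# A few effect weights copied from the table, keyed by variant ID.
weights = {
    "rs1260326": 0.123540576252722,
    "rs268": -0.252349308829295,
    "rs445925": -0.16711641924687,
    "rs3798220": 0.267543665616649,
}

top2 = top_variants(weights, n=2)
```

Ranking by absolute value keeps strongly negative weights, such as rs268 in LPL, alongside strongly positive ones.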


Follow-up analysis

There are several ways to use this resource in your research. First, you may use our iPGS coefficients to compute individual-level polygenic scores for your cohort. Second, because our iPGS models are sparse, you may investigate the genetic variants with non-zero coefficients and their annotated genes to learn more about the underlying biology. For your convenience, we suggest several resources here as examples of follow-up analysis. We do not intend to cover all relevant follow-up analyses.

Using iPGS coefficients

You may download the iPGS coefficients by clicking the download button above. Our FAQ page describes the file format and how you may use the iPGS coefficients in your research.
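As a rough illustration of what computing an individual-level score involves, a polygenic score is commonly calculated as the sum, over variants, of the effect weight multiplied by the individual's effect-allele dosage (0, 1, or 2 copies). The Python sketch below assumes the coefficients have already been loaded into a dictionary keyed by variant; the function name `polygenic_score`, the dictionary layout, and the example dosages are hypothetical, and the actual file format is the one described on the FAQ page.

```python
def polygenic_score(weights, dosages):
    """Weighted sum of effect-allele dosages.

    weights: {variant: effect weight (BETA)}
    dosages: {variant: copies of the effect allele carried, 0 to 2}
    Returns (score, number of variants used). Variants missing from
    `dosages` are skipped, so check the count before comparing scores.
    """
    score = 0.0
    used = 0
    for variant, beta in weights.items():
        if variant in dosages:
            score += beta * dosages[variant]
            used += 1
    return score, used


# Effect weights copied from the table (effect alleles C and G, respectively).
weights = {
    "6:160961137:T:C": 0.267543665616649,
    "8:19813529:A:G": -0.252349308829295,
}
# Hypothetical individual: one copy of C at the first site, two copies of G at the second.
dosages = {"6:160961137:T:C": 1, "8:19813529:A:G": 2}

score, n_used = polygenic_score(weights, dosages)
```

In practice the dosage must count the same effect allele and use the same genome build (GRCh37/hg19) as the coefficient file; mismatched alleles or builds would silently distort the score.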

HaploReg

HaploReg is a tool for exploring annotations of the non-coding genome at variants on haplotype blocks. The button above submits the 100 genetic variants with the largest absolute coefficients as a query to HaploReg, using the default parameters in HaploReg v4.2 (LD threshold r^2 >= 1, ChromHMM 15-state model, SiPhy-omega, and GENCODE genes). HaploReg's ability to browse haplotypes is useful here because there is no guarantee that our iPGS model selects causal variants. The 'top 100 variants' cutoff is arbitrary; we aim to demonstrate how one may investigate the selected variants. See Ward and Kellis. Nucleic Acids Res. 2012 and Ward and Kellis. Nucleic Acids Res. 2016 for more information on HaploReg.
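Not every entry in the top-100 list is a dbSNP rsID; the HLA allele HLA-DQB1*0201 is one example. If you assemble a query list yourself for a tool that expects rsIDs, a filter along the following lines may help. This Python sketch does not call HaploReg; the function name `rsids_only` and the rsID pattern ("rs" followed by digits) are assumptions for illustration.

```python
import re

# Assumed pattern for a dbSNP rsID: the letters "rs" followed by one or more digits.
RSID_PATTERN = re.compile(r"rs\d+")


def rsids_only(variant_ids):
    """Keep identifiers that look like rsIDs, preserving their original order."""
    return [v for v in variant_ids if RSID_PATTERN.fullmatch(v)]


# Variant IDs copied from the table, including the HLA allele entry.
ids = ["rs3798220", "HLA-DQB1*0201", "rs268", "rs1035941"]
query_ids = rsids_only(ids)
```

Entries that are filtered out, such as HLA alleles, would need to be looked up separately.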


References