Hypometric genetics: Improved power in genetic discovery by incorporating quality control flags

Tanigawa and Kellis. Am J Hum Genet. (2024).


Phenotype: Trunc.: Free Chol. in CMs and XXL VLDL

  • Estimated h² in the white British population in UKB: 0.088 (95% CI: [0.070, 0.106]).

Predictive performance of iPGS models

We evaluated the predictive performance of the inclusive polygenic score models using the held-out test set individuals.

Population | Model | PGS trait type | Metric | Predictive Performance | 95% CI | P-value
white British | Covariate-only model | Truncated (excl. BLQ measurements) | R² | 0.030 | [0.027, 0.033] | 1.3x10^-194
white British | Genotype-only model | BLQ (binarized at BLQ threshold) | R² | 0.055 | [0.050, 0.059] | <1.0x10^-300
white British | Genotype-only model | Original (incl. BLQ measurements) | R² | 0.071 | [0.066, 0.076] | <1.0x10^-300
white British | Genotype-only model | Truncated (excl. BLQ measurements) | R² | 0.060 | [0.055, 0.065] | <1.0x10^-300
white British | Full model (covariates and genotypes) | BLQ (binarized at BLQ threshold) | R² | 0.055 | [0.051, 0.060] | <1.0x10^-300
white British | Full model (covariates and genotypes) | Original (incl. BLQ measurements) | R² | 0.105 | [0.100, 0.111] | <1.0x10^-300
white British | Full model (covariates and genotypes) | Truncated (excl. BLQ measurements) | R² | 0.094 | [0.089, 0.100] | <1.0x10^-300
Non-British white | Covariate-only model | Truncated (excl. BLQ measurements) | R² | 0.020 | [0.006, 0.034] | 2.1x10^-06
Non-British white | Genotype-only model | BLQ (binarized at BLQ threshold) | R² | 0.052 | [0.031, 0.074] | 1.4x10^-14
Non-British white | Genotype-only model | Original (incl. BLQ measurements) | R² | 0.079 | [0.053, 0.104] | 2.3x10^-21
Non-British white | Genotype-only model | Truncated (excl. BLQ measurements) | R² | 0.068 | [0.044, 0.092] | 1.6x10^-18
Non-British white | Full model (covariates and genotypes) | BLQ (binarized at BLQ threshold) | R² | 0.053 | [0.031, 0.075] | 1.0x10^-14
Non-British white | Full model (covariates and genotypes) | Original (incl. BLQ measurements) | R² | 0.104 | [0.075, 0.133] | 5.4x10^-28
Non-British white | Full model (covariates and genotypes) | Truncated (excl. BLQ measurements) | R² | 0.090 | [0.063, 0.117] | 2.8x10^-24
South Asian | Covariate-only model | Truncated (excl. BLQ measurements) | R² | 0.010 | [-0.004, 0.024] | 1.3x10^-02
South Asian | Genotype-only model | BLQ (binarized at BLQ threshold) | R² | 0.052 | [0.021, 0.083] | 9.5x10^-09
South Asian | Genotype-only model | Original (incl. BLQ measurements) | R² | 0.078 | [0.041, 0.115] | 1.5x10^-12
South Asian | Genotype-only model | Truncated (excl. BLQ measurements) | R² | 0.066 | [0.031, 0.100] | 1.1x10^-10
South Asian | Full model (covariates and genotypes) | BLQ (binarized at BLQ threshold) | R² | 0.053 | [0.022, 0.084] | 8.2x10^-09
South Asian | Full model (covariates and genotypes) | Original (incl. BLQ measurements) | R² | 0.088 | [0.049, 0.126] | 5.8x10^-14
South Asian | Full model (covariates and genotypes) | Truncated (excl. BLQ measurements) | R² | 0.067 | [0.033, 0.102] | 6.2x10^-11
African | Covariate-only model | Truncated (excl. BLQ measurements) | R² | 0.003 | [-0.006, 0.011] | 4.9x10^-01
African | Genotype-only model | BLQ (binarized at BLQ threshold) | R² | 0.004 | [-0.006, 0.013] | 4.3x10^-01
African | Genotype-only model | Original (incl. BLQ measurements) | R² | 0.023 | [-0.001, 0.047] | 4.2x10^-02
African | Genotype-only model | Truncated (excl. BLQ measurements) | R² | 0.012 | [-0.006, 0.030] | 1.4x10^-01
African | Full model (covariates and genotypes) | BLQ (binarized at BLQ threshold) | R² | 0.003 | [-0.006, 0.013] | 4.4x10^-01
African | Full model (covariates and genotypes) | Original (incl. BLQ measurements) | R² | 0.022 | [-0.002, 0.046] | 4.7x10^-02
African | Full model (covariates and genotypes) | Truncated (excl. BLQ measurements) | R² | 0.011 | [-0.006, 0.028] | 1.6x10^-01
Others | Covariate-only model | Truncated (excl. BLQ measurements) | R² | 0.044 | [0.032, 0.056] | 8.7x10^-33
Others | Genotype-only model | BLQ (binarized at BLQ threshold) | R² | 0.041 | [0.029, 0.052] | 2.8x10^-30
Others | Genotype-only model | Original (incl. BLQ measurements) | R² | 0.058 | [0.044, 0.071] | 1.6x10^-42
Others | Genotype-only model | Truncated (excl. BLQ measurements) | R² | 0.049 | [0.036, 0.061] | 6.2x10^-36
Others | Full model (covariates and genotypes) | BLQ (binarized at BLQ threshold) | R² | 0.042 | [0.030, 0.053] | 9.7x10^-31
Others | Full model (covariates and genotypes) | Original (incl. BLQ measurements) | R² | 0.105 | [0.088, 0.123] | 6.8x10^-78
Others | Full model (covariates and genotypes) | Truncated (excl. BLQ measurements) | R² | 0.098 | [0.081, 0.115] | 2.1x10^-72

The predictive performance (R²), its 95% confidence interval (CI), and its statistical significance (P-value) are shown for each UK Biobank population in the held-out test set. The "Model" column indicates whether the predictive performance comes from the covariate terms alone (Covariate-only model), the PGS terms alone (Genotype-only model), or the full model containing both PGS and covariate terms. We used the following covariates in our analysis: age, sex, age², age*sex, Townsend deprivation index, and genotype PCs (PC1-PC18). Please refer to our publication for a more detailed description of the methods.
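The distinction between the three model types can be illustrated with a minimal sketch. It assumes ordinary least squares fitted and scored on the same individuals, which is a simplification chosen for illustration and not necessarily the procedure used in the publication. The function names, the synthetic data, and the use of NumPy are all assumptions introduced here.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return 1.0 - residuals.var() / y.var()

def evaluate_models(y, pgs, covariates):
    """Compare covariate-only, genotype-only, and full models (hypothetical helper)."""
    return {
        "covariate_only": r_squared(y, covariates),
        "genotype_only": r_squared(y, pgs.reshape(-1, 1)),
        "full": r_squared(y, np.column_stack([covariates, pgs])),
    }

# Synthetic stand-ins for a phenotype, a polygenic score, and three covariates.
rng = np.random.default_rng(0)
n = 500
covariates = rng.normal(size=(n, 3))
pgs = rng.normal(size=n)
y = covariates @ np.array([0.2, -0.1, 0.05]) + 0.3 * pgs + rng.normal(size=n)

for name, value in evaluate_models(y, pgs, covariates).items():
    print(f"{name}: R^2 = {value:.3f}")
```

Because the covariate-only and genotype-only models are nested within the full model, the full-model R² in this in-sample sketch can never be lower than either of the other two. On a held-out test set, as in the table above, that ordering is typical but not guaranteed.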


Coefficients (BETA) of PGS models

[Figure: coefficients (BETA) of the PGS model (pgscoeffs.png)]

We show the coefficients (BETA) of the PGS model. Our iPGS model selected 4551 variants with non-zero coefficients. The genetic variants with the largest absolute coefficient values are annotated in the plot. There is no guarantee that our iPGS model selects causal variants. We use the GRCh37/hg19 reference genome.

The top 100 genetic variants with the largest absolute value of coefficients

CHROM | POS | Variant | Variant ID | Effect Allele | Consequence | Gene symbol | Effect Weight
6 | 161010118 | 6:161010118:A:G | rs10455872 | G | Intronic | LPA | -0.0026592269385286
11 | 116648917 | 11:116648917:G:C | rs964184 | C | UTR | ZPR1 | -0.0021027056400256
8 | 19819724 | 8:19819724:C:G | rs328 | G | PTVs | LPL | -0.0019448894155628
19 | 45415640 | 19:45415640:G:A | rs445925 | A | Others | APOC1 | 0.0018035585956146
8 | 19813529 | 8:19813529:A:G | rs268 | G | PAVs | LPL | 0.0016548006086517
6 | 160961137 | 6:160961137:T:C | rs3798220 | C | PAVs | LPA | -0.0016041859293876
2 | 27730940 | 2:27730940:T:C | rs1260326 | C | PAVs | GCKR | -0.0015618553924319
19 | 8429323 | 19:8429323:G:A | rs116843064 | A | PAVs | ANGPTL4 | -0.0011944555873145
2 | 21231524 | 2:21231524:G:A | rs676210 | A | PAVs | APOB | -0.001115343828001
19 | 19379549 | 19:19379549:C:T | rs58542926 | T | PAVs | TM6SF2 | -0.0010518363169133
7 | 73020337 | 7:73020337:C:G | rs3812316 | G | PAVs | MLXIPL | -0.0010251208952596
17 | 41926126 | 17:41926126:C:T | rs72836561 | T | PAVs | CD300LG | 0.0008100942926654
19 | 45414451 | 19:45414451:T:C | rs439401 | C | Others | APOC1 | 0.0007677742035256
11 | 116657561 | 11:116657561:C:T | rs3741298 | T | Intronic | ZPR1 | -0.0007349923731355
15 | 43820717 | 15:43820717:C:T | rs55707100 | T | PAVs | MAP1A | 0.0006139086650501
8 | 126507389 | 8:126507389:C:A | rs2954038 | A | Intronic |  | -0.0005767138610534
11 | 116705568 | 11:116705568:G:T | rs10750098 | T | Others | APOA1-AS | -0.0005450738571889
6 | 161017363 | 6:161017363:G:A | rs73596816 | A | Intronic | LPA | -0.0005284676741893
17 | 64210580 | 17:64210580:A:C | rs1801689 | C | PAVs | APOH | -0.0005283648759885
8 | 126488250 | 8:126488250:C:T | rs2980869 | T | Intronic |  | -0.000528270907347
7 | 73035857 | 7:73035857:T:C | rs7800944 | C | Intronic | MLXIPL | -0.0004833125216033
8 | 18272438 | 8:18272438:C:T | rs4921914 | T | Others |  | -0.0004663063268441
6 | 32424101 | HLA-DRB3*0301 | HLA-DRB3*0301 | + | PAVs | HLA-DRB3 | 0.0004530332784288
19 | 45432557 | 19:45432557:G:C | rs7259004 | C | Intronic | APOC1P1 | 0.0004530219292308
6 | 161099503 | 6:161099503:G:A | rs5014650 | A | Others |  | -0.0004471014071737
1 | 62957030 | 1:62957030:G:A | rs10889333 | A | Intronic | DOCK7 | -0.0004343039419804
10 | 114169276 | 10:114169276:A:G | rs3736946 | G | PAVs | ACSL5 | -0.0004177959763628
11 | 61592362 | 11:61592362:A:G | rs174566 | G | Intronic | FADS1, FADS2 | 0.0004046326095295
22 | 44324727 | 22:44324727:C:G | rs738409 | G | PAVs | PNPLA3 | -0.0003816827759839
6 | 160557643 | 6:160557643:C:T | rs2282143 | T | PAVs | SLC22A1 | -0.0003740810235637
11 | 64031241 | 11:64031241:C:T | rs35169799 | T | PAVs | PLCB3 | 0.0003692060009306
16 | 56990716 | 16:56990716:C:A | rs247617 | A | Others |  | -0.0003652045610397
5 | 55861894 | 5:55861894:G:A | rs9687846 | A | Intronic |  | 0.0003496301747129
8 | 19819439 | 8:19819439:A:G | rs326 | G | Intronic | LPL | -0.000339653865704
1 | 62967747 | 1:62967747:A:G | rs1168032 | G | Intronic | DOCK7 | 0.0003392612763967
2 | 227099854 | 2:227099854:T:C | rs2972147 | C | Others |  | 0.0003360865746143
1 | 62982891 | 1:62982891:C:T | rs1168045 | T | Intronic | DOCK7 | 0.0003324231416647
8 | 19914598 | 8:19914598:C:A | rs6586891 | A | Others |  | -0.0003233789708496
6 | 139834012 | 6:139834012:T:G | rs632057 | G | Others |  | -0.0002909989036733
8 | 126479362 | 8:126479362:C:T | rs6982502 | T | Intronic |  | -0.000287897805279
2 | 165528624 | 2:165528624:G:T | rs1128249 | T | Intronic | COBLL1 | -0.0002789052560188
6 | 160911908 | 6:160911908:C:T | rs9365166 | T | Intronic | LPAL2 | -0.0002762338257858
11 | 61571382 | 11:61571382:G:A | rs174549 | A | UTR | FADS1 | 0.0002629078303584
12 | 57843711 | 12:57843711:G:A | rs2229357 | A | PAVs | INHBC | -0.0002607480594184
11 | 116639941 | 11:116639941:A:G | rs1263149 | G | Intronic | BUD13 | -0.0002523187821666
6 | 160766321 | 6:160766321:C:T | rs540713 | T | Others | SLC22A3 | 0.000249812329956
6 | 161108536 | 6:161108536:C:T | rs6935921 | T | Others |  | -0.0002478045274066
8 | 19852134 | 8:19852134:G:T | rs17411024 | T | Others |  | 0.0002447885777789
13 | 51043788 | 13:51043788:C:T | rs9316497 | T | Intronic | DLEU1 | 0.0002435514978262
17 | 64208285 | 17:64208285:C:G | rs1801690 | G | PAVs | APOH | -0.0002356415650738
10 | 64927823 | 10:64927823:C:G | rs1935 | G | PAVs | JMJD1C | -0.0002348239418012
1 | 149871003 | 1:149871003:C:T | rs1349532 | T | PAVs | BOLA1 | -0.0002330806834518
19 | 45389596 | 19:45389596:G:A | rs7254892 | A | Intronic | NECTIN2 | 0.0002316349882885
5 | 156391628 | 5:156391628:T:C | rs6874202 | C | Others | TIMD4 | 0.0002218910319732
11 | 116624703 | 11:116624703:G:T | rs180326 | T | Intronic | BUD13 | -0.0002204618171647
6 | 43758873 | 6:43758873:G:A | rs6905288 | A | Others | VEGFA | 0.0002192867372129
10 | 81096071 | 10:81096071:T:C | rs7077812 | C | Others |  | 0.0002148551261791
9 | 139368953 | 9:139368953:G:A | rs3812594 | A | PAVs | SEC16A | 0.0002118042886551
7 | 130433384 | 7:130433384:C:T | rs4731702 | T | Others |  | -0.0002112699757577
11 | 117094491 | 11:117094491:C:T | rs11216322 | T | UTR | PCSK7 | 0.0002111203561875
8 | 19943027 | 8:19943027:G:A | rs13265868 | A | Intronic |  | -0.0002110155622671
X | 109689152 | X:109689152:A:G | rs10521528 | G | Intronic | RTL9 | -0.0002103514211044
19 | 45373565 | 19:45373565:G:A | rs395908 | A | Intronic | NECTIN2 | 0.0002089603013178
5 | 55799184 | 5:55799184:C:A | rs157843 | A | Others |  | 0.000206778179486
4 | 88231392 | 4:88231392:T:TA | rs72613567 | TA | PAVs | HSD17B13 | 0.0002000370331292
10 | 5260682 | 10:5260682:C:G | rs17134592 | G | PAVs | AKR1C4 | -0.0001988153583076
13 | 114553134 | 13:114553134:C:T | rs7994900 | T | Intronic | GAS6 | 0.0001985741779729
6 | 161152240 | 6:161152240:G:A | rs4252125 | A | PAVs | PLG | 0.0001914416216379
7 | 73020301 | 7:73020301:T:C | rs799157 | C | PCVs | MLXIPL | -0.0001910577789362
4 | 87972968 | 4:87972968:T:C | rs6531948 | C | Intronic | AFF1 | 0.0001893762135107
15 | 63793238 | 15:63793238:T:G | rs11635675 | G | Others | USP3 | 0.0001850125733404
11 | 61600342 | 11:61600342:A:C | rs174574 | C | Intronic | FADS2 | -0.0001790759479296
11 | 13355770 | 11:13355770:C:T | rs6486121 | T | Intronic | BMAL1 | 0.0001779106873061
22 | 38569006 | 22:38569006:A:G | rs738322 | G | Intronic | PLA2G6 | -0.000172348659059
1 | 24886818 | 1:24886818:G:A | rs12122463 | A | Intronic | NCMAP | 0.0001710409542012
8 | 95272605 | 8:95272605:G:C | rs2170363 | C | PAVs | GEM | 0.0001706999618547
7 | 75615006 | 7:75615006:C:T | rs1057868 | T | PAVs | POR | 0.0001683804530469
1 | 154426970 | 1:154426970:A:C | rs2228145 | C | PAVs | IL6R | 0.0001675031549409
16 | 15172118 | 16:15172118:T:C | rs11644601 | C | Intronic | PDXDC1, RRN3 | -0.0001659190374127
6 | 43757082 | 6:43757082:T:A | rs4711750 | A | Others | VEGFA | 0.0001649797913005
16 | 79704915 | 16:79704915:G:T | rs8047723 | T | Others |  | 0.0001636935151322
1 | 230295691 | 1:230295691:G:A | rs4846914 | A | Intronic | GALNT2 | -0.0001634313334379
13 | 95258944 | 13:95258944:T:C | rs6492721 | C | Intronic | GPR180 | -0.0001626191943105
12 | 20473758 | 12:20473758:C:A | rs7134375 | A | Others |  | -0.0001606444853179
11 | 117266312 | 11:117266312:C:G | rs2305830 | G | PAVs | CEP164 | 0.0001600010698893
20 | 44545048 | 20:44545048:C:T | rs4810479 | T | Others | PLTP | -0.000158794857749
20 | 44576502 | 20:44576502:T:C | rs7679 | C | UTR | PCIF1 | 0.0001581235600597
4 | 3234980 | 4:3234980:G:A | rs362272 | A | PAVs | HTT | -0.0001580906350296
15 | 44093927 | 15:44093927:T:C | rs12702 | C | PAVs | HYPK, SERF2 | 0.0001577684438171
16 | 81534790 | 16:81534790:T:C | rs2925979 | C | Intronic | CMIP | -0.0001571978917608
14 | 71383848 | 14:71383848:T:C | rs2810073 | C | Intronic | PCNX1 | 0.000157107732716
17 | 66519857 | 17:66519857:C:CT | rs3841514 | CT | PTVs | PRKAR1A | 0.000154673839596
8 | 59412066 | 8:59412066:T:G | rs8192870 | G | Intronic | CYP7A1 | -0.0001529543393426
6 | 32491592 | HLA-DRB5*0101 | HLA-DRB5*0101 | + | PAVs | HLA-DRB5 | -0.0001525913300381
2 | 224919 | 2:224919:A:G | rs2290911 | G | PAVs | SH3YL1 | -0.0001520854708983
1 | 40064961 | 1:40064961:G:A | rs12037222 | A | Others |  | 0.0001518041632961
6 | 32796685 | 6:32796685:A:G | rs241448 | G | PTVs | TAP2 | 0.0001502900023065
15 | 66618342 | 15:66618342:A:G | rs3803412 | G | PAVs | DIS3L | -0.0001502654579124
12 | 4303153 | 12:4303153:A:G | rs1861178 | G | Others |  | 0.0001500811430435
3 | 12393125 | 3:12393125:C:G | rs1801282 | G | PAVs | PPARG | -0.0001499050170417

There is no guarantee that our iPGS model selects causal variants. We show the 100 variants with the largest absolute effect size (BETA). To see all 4551 variants included in our iPGS model, please download the iPGS coefficients by clicking the download button. We use the GRCh37/hg19 reference genome.
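Ranking variants by the absolute value of their coefficient, as done for the table above, can be sketched in a few lines. The function name `top_variants` and the in-memory dictionary are assumptions made for illustration; the four weights are copied from the table, and the downloadable coefficient file may be organized differently.

```python
def top_variants(weights, n=100):
    """Return the n (variant ID, weight) pairs with the largest |weight|, largest first."""
    return sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)[:n]

# A handful of effect weights taken from the table, keyed by variant ID.
weights = {
    "rs1801282": -0.0001499050170417,
    "rs445925": 0.0018035585956146,
    "rs10455872": -0.0026592269385286,
    "rs964184": -0.0021027056400256,
}

for variant_id, weight in top_variants(weights, n=3):
    print(variant_id, weight)
```

Sorting on the absolute value keeps strongly negative and strongly positive weights together at the top, which is why variants of either sign appear among the leading rows.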


Follow-up analysis

There are several ways to use this resource in your research. First, you may use our iPGS coefficients to compute individual-level polygenic scores for your cohort. Second, because our iPGS models are sparse, you may investigate the genetic variants with non-zero coefficients and their annotated genes to learn more about the underlying biology. For your convenience, we suggest several resources here as examples of follow-up analysis. We do not intend to cover all relevant follow-up analyses.

Using iPGS coefficients

You may download the iPGS coefficients by clicking the download button above. Our FAQ page describes the file format and how you may use the iPGS coefficients in your research.
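As a rough illustration of what computing an individual-level score involves, the sketch below takes a weighted sum of effect-allele dosages. It assumes dosages count copies of the effect allele (0, 1, or 2), that variants are keyed as in the "Variant" column, and that variants absent from either input are skipped. The function name and the in-memory dictionaries are hypothetical; the actual file format is described on the FAQ page and is not reproduced here.

```python
def polygenic_score(dosages, weights):
    """Sum of effect-allele dosage times effect weight over variants present in both inputs."""
    return sum(weights[variant] * dosage
               for variant, dosage in dosages.items()
               if variant in weights)

# Effect weights for three variants, copied from the table of top variants.
weights = {
    "6:161010118:A:G": -0.0026592269385286,
    "11:116648917:G:C": -0.0021027056400256,
    "8:19819724:C:G": -0.0019448894155628,
}

# Hypothetical effect-allele dosages for one individual.
dosages = {
    "6:161010118:A:G": 1,
    "11:116648917:G:C": 2,
    "8:19819724:C:G": 0,
    "1:1000:A:T": 2,  # not in the model, so it does not contribute
}

print(polygenic_score(dosages, weights))
```

In practice, the cohort's genotypes would need to be on the same reference genome (GRCh37/hg19) and coded against the same effect allele as the coefficients before a sum like this is meaningful.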

HaploReg

HaploReg is a tool for exploring annotations of the non-coding genome at variants on haplotype blocks. The button above submits the top 100 genetic variants with the largest absolute value of coefficients as a query to HaploReg using the default parameters in HaploReg v4.2 (LD threshold r² >= 1, ChromHMM 15-state model, SiPhy-omega, and GENCODE genes). HaploReg's ability to browse haplotypes is useful here because there is no guarantee that our iPGS model selects causal variants. The "top 100 variants" cutoff is an arbitrary threshold; we aim only to demonstrate how one may investigate the selected variants. See Ward and Kellis. Nucleic Acids Res. 2012 and Ward and Kellis. Nucleic Acids Res. 2016 for more information on HaploReg.


References