Hypometric genetics: Improved power in genetic discovery by incorporating quality control flags

Tanigawa and Kellis. Am J Hum Genet. (2024).


Phenotype: BLQ: Free Chol. to Tot. Lipids in XL VLDL %


Predictive performance of iPGS models

We evaluated the predictive performance of the inclusive polygenic score models using the held-out test set individuals.

Population | Model | PGS trait type | Metric | Predictive Performance | 95% CI | P-value
white British | Covariate-only model | BLQ (derived) | AUROC | 0.649 | [0.641, 0.657] | <1.0x10^-300
white British | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.451 | [0.443, 0.459] | 9.5x10^-33
white British | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.441 | [0.433, 0.449] | 1.8x10^-47
white British | Genotype-only model | BLQ (derived) | AUROC | 0.629 | [0.622, 0.637] | 1.2x10^-225
white British | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.608 | [0.600, 0.616] | 1.9x10^-160
white British | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.589 | [0.581, 0.597] | 8.1x10^-108
white British | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.699 | [0.691, 0.706] | <1.0x10^-300
Non-British white | Covariate-only model | BLQ (derived) | AUROC | 0.684 | [0.649, 0.720] | 2.7x10^-23
Non-British white | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.484 | [0.447, 0.522] | 2.2x10^-01
Non-British white | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.458 | [0.421, 0.495] | 5.6x10^-03
Non-British white | Genotype-only model | BLQ (derived) | AUROC | 0.624 | [0.590, 0.658] | 2.1x10^-11
Non-British white | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.654 | [0.617, 0.690] | 7.0x10^-17
Non-British white | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.628 | [0.591, 0.665] | 6.4x10^-12
Non-British white | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.725 | [0.692, 0.758] | 1.9x10^-31
South Asian | Covariate-only model | BLQ (derived) | AUROC | 0.641 | [0.580, 0.701] | 2.5x10^-05
South Asian | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.483 | [0.417, 0.549] | 6.1x10^-01
South Asian | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.456 | [0.391, 0.522] | 1.7x10^-01
South Asian | Genotype-only model | BLQ (derived) | AUROC | 0.629 | [0.566, 0.692] | 4.7x10^-05
South Asian | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.620 | [0.560, 0.680] | 2.4x10^-04
South Asian | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.600 | [0.536, 0.664] | 2.4x10^-03
South Asian | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.677 | [0.614, 0.741] | 2.1x10^-08
African | Covariate-only model | BLQ (derived) | AUROC | 0.600 | [0.554, 0.647] | 4.4x10^-05
African | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.519 | [0.471, 0.566] | 6.3x10^-01
African | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.488 | [0.441, 0.536] | 6.4x10^-01
African | Genotype-only model | BLQ (derived) | AUROC | 0.582 | [0.535, 0.628] | 2.7x10^-04
African | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.592 | [0.546, 0.639] | 7.6x10^-05
African | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.581 | [0.534, 0.628] | 5.2x10^-04
African | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.628 | [0.582, 0.674] | 2.0x10^-07
Others | Covariate-only model | BLQ (derived) | AUROC | 0.678 | [0.657, 0.699] | 6.0x10^-56
Others | Genotype-only model | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.444 | [0.421, 0.467] | 1.5x10^-06
Others | Genotype-only model | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.430 | [0.407, 0.453] | 5.7x10^-10
Others | Genotype-only model | BLQ (derived) | AUROC | 0.604 | [0.582, 0.627] | 3.9x10^-20
Others | Full model (covariates and genotypes) | Derived (percentage traits, excl. BLQ measurements) | AUROC | 0.641 | [0.619, 0.663] | 4.1x10^-36
Others | Full model (covariates and genotypes) | Derived (percentage traits, incl. BLQ measurements) | AUROC | 0.620 | [0.597, 0.642] | 8.6x10^-27
Others | Full model (covariates and genotypes) | BLQ (derived) | AUROC | 0.702 | [0.681, 0.722] | 3.6x10^-68

The predictive performance (here AUROC, as given in the Metric column), its 95% confidence interval (CI), and statistical significance (P-value) are shown for each population in UK Biobank in the held-out test set. The "Model" column indicates whether the predictive performance comes from the covariate terms alone (Covariate-only model), the PGS terms alone (Genotype-only model), or the full model containing both PGS and covariate terms. We used the following covariates in our analysis: age, sex, age^2, age*sex, Townsend deprivation index, and genotype PCs (PC1-PC18). Please refer to our publication for a more detailed description of the methods.
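As a rough illustration of how an AUROC and an approximate 95% CI of the kind reported in the table can be obtained from scores and binary labels, here is a minimal Python sketch. It is not the evaluation code behind this page: the function names `auroc` and `hanley_mcneil_ci` are hypothetical, the toy data are made up, and the Hanley-McNeil standard-error approximation is an assumption made here, since the page does not state how the intervals were derived.

```python
import math


def auroc(scores, labels):
    """AUROC as the probability that a random case (label 1) scores higher
    than a random control (label 0); ties count as one half."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    n_pos = 0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
                n_pos += 1
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI using the Hanley-McNeil (1982) standard error.
    Assumption: the page does not say which CI method was used."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = math.sqrt(var)
    return auc - z * se, auc + z * se


# Toy data (made up): two controls and two cases.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auroc(toy_scores, toy_labels)
toy_ci = hanley_mcneil_ci(toy_auc, n_pos=2, n_neg=2)
```

An AUROC below 0.5 means the score tends to rank cases below controls; negating the score gives one minus that value.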


Coefficients (BETA) of PGS models

[Figure: coefficients (BETA) of PGS models; image at /static/data/tanigawakellis2024/per_trait/BIN_FC10023887/pgscoeffs.png]

We show the coefficients (BETA) of the PGS model. Our iPGS model selected 3395 variants with non-zero coefficients. The genetic variants with the largest absolute coefficients are annotated in the plot. There is no guarantee that our iPGS model selects causal variants. Variant coordinates use the GRCh37/hg19 reference genome.

The top 100 genetic variants with the largest absolute value of coefficients

CHROM | POS | Variant | Variant ID | Effect Allele | Consequence | Gene symbol | Effect Weight
6 | 160961137 | 6:160961137:T:C | rs3798220 | C | PAVs | LPA | 0.262187435939604
8 | 19813529 | 8:19813529:A:G | rs268 | G | PAVs | LPL | -0.246236903888113
11 | 116648917 | 11:116648917:G:C | rs964184 | C | UTR | ZPR1 | 0.242674748514889
8 | 19819724 | 8:19819724:C:G | rs328 | G | PTVs | LPL | 0.224411283710363
6 | 161010118 | 6:161010118:A:G | rs10455872 | G | Intronic | LPA | 0.222745791359101
19 | 8429323 | 19:8429323:G:A | rs116843064 | A | PAVs | ANGPTL4 | 0.173891753595705
2 | 27730940 | 2:27730940:T:C | rs1260326 | C | PAVs | GCKR | 0.120663087871097
16 | 56993324 | 16:56993324:C:A | rs3764261 | A | Others | CETP | 0.110699436867439
2 | 21231524 | 2:21231524:G:A | rs676210 | A | PAVs | APOB | 0.106958148923797
19 | 45414451 | 19:45414451:T:C | rs439401 | C | Others | APOC1 | -0.0848794100133673
7 | 73020337 | 7:73020337:C:G | rs3812316 | G | PAVs | MLXIPL | 0.0824655961397462
8 | 19819439 | 8:19819439:A:G | rs326 | G | Intronic | LPL | 0.0779777284157464
11 | 117042408 | 11:117042408:C:T | rs186808413 | T | PAVs | PAFAH1B2 | 0.073656658908399
19 | 19379549 | 19:19379549:C:T | rs58542926 | T | PAVs | TM6SF2 | 0.0709640850653141
11 | 116692293 | 11:116692293:C:A | rs12721043 | A | PAVs | APOA4 | 0.0657851738961661
11 | 116662407 | 11:116662407:G:C | rs3135506 | C | PAVs | APOA5 | -0.0630500468353082
16 | 57015091 | 16:57015091:G:C | rs5880 | C | PAVs | CETP | -0.0619598614840871
17 | 41926126 | 17:41926126:C:T | rs72836561 | T | PAVs | CD300LG | -0.059468935924382
19 | 45422946 | 19:45422946:A:G | rs4420638 | G | Others | APOC1 | -0.0578313879859862
1 | 63027024 | 1:63027024:C:T | rs4329540 | T | Intronic | DOCK7 | -0.057700164238627
5 | 55861894 | 5:55861894:G:A | rs9687846 | A | Intronic |  | -0.0530235878803515
8 | 126481747 | 8:126481747:A:G | rs2980875 | G | Intronic |  | 0.0499372551172046
5 | 156390297 | 5:156390297:T:C | rs6882076 | C | Others | TIMD4 | -0.0494581129915311
12 | 124427306 | 12:124427306:T:A | rs11057401 | A | PAVs | CCDC92 | 0.0479502876853912
8 | 126507389 | 8:126507389:C:A | rs2954038 | A | Intronic |  | 0.0475942301852416
16 | 57006590 | 16:57006590:C:T | rs7499892 | T | Intronic | CETP | -0.0460689712329151
2 | 165528624 | 2:165528624:G:T | rs1128249 | T | Intronic | COBLL1 | 0.0460006853297454
15 | 43820717 | 15:43820717:C:T | rs55707100 | T | PAVs | MAP1A | -0.0452065586277042
11 | 116663707 | 11:116663707:G:A | rs662799 | A | Others | APOA5 | 0.0439269434344251
7 | 130445574 | 7:130445574:G:A | rs17789506 | A | Others |  | 0.0422634175765411
11 | 61579760 | 11:61579760:T:C | rs174555 | C | Intronic | FADS1, FADS2 | -0.0407315273104276
12 | 57843711 | 12:57843711:G:A | rs2229357 | A | PAVs | INHBC | 0.0401712175854857
8 | 126477978 | 8:126477978:G:C | rs2001945 | C | Others |  | 0.0374987398728629
2 | 21225281 | 2:21225281:C:T | rs1042034 | T | PAVs | APOB | -0.0350530721873482
2 | 21294975 | 2:21294975:G:A | rs541041 | A | Others |  | -0.0340512626240066
16 | 72144174 | 16:72144174:T:C | rs9302635 | C | Intronic | DHX38 | 0.0334701256513979
6 | 161107018 | 6:161107018:G:A | rs9457997 | A | Others |  | 0.032552466439934
6 | 160766321 | 6:160766321:C:T | rs540713 | T | Others | SLC22A3 | -0.0324099221590199
11 | 116657561 | 11:116657561:C:T | rs3741298 | T | Intronic | ZPR1 | 0.0312695257663159
1 | 230297659 | 1:230297659:C:T | rs2281719 | T | Intronic | GALNT2 | 0.029304653827525
17 | 41931375 | 17:41931375:A:G | rs12453522 | G | PAVs | CD300LG | -0.029161840480679
19 | 11202306 | 19:11202306:G:T | rs6511720 | T | Intronic | LDLR | 0.0290321280797989
19 | 8615589 | 19:8615589:A:G | rs4804311 | G | PAVs | MYO1F | 0.0287711952028561
5 | 74656539 | 5:74656539:T:C | rs12916 | C | UTR | HMGCR | -0.0284369901359606
11 | 116907449 | 11:116907449:C:T | rs721783 | T | Intronic | SIK3 | 0.0267875304059368
8 | 19941448 | 8:19941448:C:T | rs6989064 | T | Intronic |  | 0.026726926365468
4 | 88052219 | 4:88052219:T:C | rs342467 | C | PAVs | AFF1 | -0.0266374466113737
20 | 44534651 | 20:44534651:G:A | rs6065904 | A | Intronic | PLTP | -0.0259973330383332
2 | 227116365 | 2:227116365:A:G | rs2972143 | G | Others |  | -0.0259164335540099
6 | 160960359 | 6:160960359:T:C | rs6919346 | C | Intronic | LPA | 0.0254920502760066
1 | 62940097 | 1:62940097:G:A | rs1979722 | A | Intronic | DOCK7 | 0.0254541440699622
8 | 59398461 | 8:59398461:G:A | rs10504255 | A | Others | CYP7A1 | 0.0252339768283138
7 | 73032835 | 7:73032835:T:C | rs7785479 | C | Intronic | MLXIPL | 0.0247772736083811
1 | 27239920 | 1:27239920:C:G | rs6659176 | G | PAVs | NR0B2 | -0.0245831012403745
5 | 118769298 | 5:118769298:A:C | rs154632 | C | Others |  | 0.0243992668443117
2 | 227101411 | 2:227101411:A:G | rs2972144 | G | Others |  | -0.0225310162999186
6 | 127464754 | 6:127464754:C:T | rs2503320 | T | Intronic | RSPO3 | -0.0221396762315472
20 | 44554015 | 20:44554015:T:C | rs6065906 | C | Others |  | -0.0221070644466941
2 | 219668813 | 2:219668813:A:G | rs6436089 | G | Intronic | CYP27A1 | 0.0219707692072942
6 | 161018174 | 6:161018174:C:T | rs7770628 | T | Intronic | LPA | -0.021421217840883
8 | 32453358 | 8:32453358:G:A | rs3924999 | A | PAVs | NRG1 | 0.0212428486834611
18 | 21120444 | 18:21120444:T:C | rs1805082 | C | PAVs | NPC1 | 0.021191625395964
16 | 1252441 | 16:1252441:T:C | rs4984636 | C | PAVs | CACNA1H | -0.0210628598218787
6 | 160507478 | 6:160507478:A:G | rs3798178 | G | Intronic | IGF2R | 0.0205865476111448
6 | 160560845 | 6:160560845:A:G | rs628031 | G | PAVs | SLC22A1 | 0.0205072298376881
2 | 20374286 | 2:20374286:G:A | rs6531216 | A | Others | RN7SL140P, RNU6-961P | -0.0204473189197718
12 | 124265687 | 12:124265687:T:C | rs11057353 | C | PAVs | DNAH10 | 0.0203943055254742
4 | 3446091 | 4:3446091:G:T | rs3748034 | T | PAVs | HGFAC | -0.0203767798770875
21 | 34377594 | 21:34377594:A:G | rs2834063 | G | Intronic |  | 0.0203647747204614
2 | 64928603 | 2:64928603:T:C | rs12471768 | C | Intronic | SERTAD2 | -0.0203599400315003
8 | 107698966 | 8:107698966:A:G | rs776965 | G | Intronic | OXR1 | 0.0203137354515941
6 | 161137990 | 6:161137990:G:A | rs783147 | A | Intronic | PLG | -0.0202945254659763
11 | 47270255 | 11:47270255:C:T | rs2167079 | T | PAVs | ACP2 | 0.020265793200463
16 | 15148646 | 16:15148646:C:A | rs11075253 | A | Intronic | NTAN1, PDXDC1 | 0.0201444682317207
1 | 54887297 | 1:54887297:G:A | rs4927104 | A | Others |  | -0.0198919269238457
11 | 117042377 | 11:117042377:G:A | rs4936367 | A | PAVs | PAFAH1B2 | 0.0198793630039633
1 | 230301811 | 1:230301811:T:G | rs11122450 | G | Intronic | GALNT2 | 0.0198667770039407
1 | 109818306 | 1:109818306:G:T | rs629301 | T | UTR | CELSR2 | -0.019785405764671
19 | 57488423 | 19:57488423:C:T | rs8102873 | T | Others |  | -0.0196924884834516
2 | 242395674 | 2:242395674:G:A | rs4675812 | A | Intronic | FARP2 | 0.0194613711464502
1 | 55136529 | 1:55136529:T:C | rs480963 | C | PAVs | MROH7, MROH7-TTC4 | 0.0193223049320656
11 | 118952173 | 11:118952173:A:G | rs15818 | G | PAVs | VPS11 | -0.0192724538613645
11 | 134819993 | 11:134819993:G:A | rs6590782 | A | Others |  | -0.019107287448714
10 | 94819053 | 10:94819053:C:T | rs8211 | T | UTR | EXOC6 | -0.0190665582848367
8 | 116611902 | 8:116611902:T:C | rs2737206 | C | Intronic | TRPS1 | 0.0188469857802781
11 | 116511522 | 11:116511522:C:T | rs519000 | T | Intronic |  | -0.0187966977512065
18 | 57736270 | 18:57736270:G:A | rs1942867 | A | Others |  | -0.018701759926104
13 | 114549015 | 13:114549015:T:C | rs6602910 | C | Intronic | GAS6 | -0.0186489230173576
4 | 89733882 | 4:89733882:A:G | rs6814344 | G | Intronic | FAM13A | -0.0185420534911622
6 | 32631702 | HLA-DQB1*0201 | HLA-DQB1*0201 | + | PAVs | HLA-DQB1 | 0.018460962663123
6 | 109201129 | 6:109201129:A:C | rs17316116 | C | PTVs | ARMC2 | -0.0184156297718647
1 | 25409607 | 1:25409607:A:G | rs311477 | G | Intronic |  | -0.0183930201711939
8 | 19805708 | 8:19805708:G:A | rs1801177 | A | PAVs | LPL | -0.0183674165397695
20 | 3171337 | 20:3171337:C:T | rs11591 | T | PAVs | DDRGK1 | -0.0181943363694016
6 | 32796685 | 6:32796685:A:G | rs241448 | G | PTVs | TAP2 | -0.0181909216869054
3 | 4785230 | 3:4785230:G:A | rs12638018 | A | Intronic | ITPR1 | -0.0180889595681882
2 | 222183494 | 2:222183494:C:A | rs977984 | A | Others |  | -0.0179784320732363
12 | 20473758 | 12:20473758:C:A | rs7134375 | A | Others |  | 0.0179362825033035
4 | 87996745 | 4:87996745:G:A | rs17605615 | A | Intronic | AFF1 | -0.0178285657554537
5 | 56908982 | 5:56908982:T:C | rs10059916 | C | Intronic |  | -0.0177477880225331

There is no guarantee that our iPGS model selects causal variants. We show the top 100 variants with the largest absolute effect weights (BETA). To see all 3395 variants included in our iPGS model, please download the iPGS coefficients by clicking the download button. We use the GRCh37/hg19 reference genome.
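For readers who download the full coefficient set, a ranking like the one in the table can be sketched in a few lines of Python. This is an illustrative sketch only: `top_variants` is a hypothetical helper, the in-memory dictionary stands in for the downloaded file (whose actual format is described on the FAQ page and is not reproduced here), and the four example weights are copied from the table above.

```python
def top_variants(weights, n=100):
    """Return the n (variant, weight) pairs with the largest absolute weight,
    in descending order of |weight|."""
    ranked = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:n]


# A few effect weights taken from the table, keyed by variant (GRCh37/hg19).
example_weights = {
    "19:45414451:T:C": -0.0848794100133673,
    "8:19813529:A:G": -0.246236903888113,
    "11:116648917:G:C": 0.242674748514889,
    "6:160961137:T:C": 0.262187435939604,
}

top2 = top_variants(example_weights, n=2)
```

Sorting on the absolute value keeps strongly negative weights (such as the LPL variant 8:19813529:A:G) near the top alongside strongly positive ones.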


Follow-up analysis

There are several ways to use the resource in your research. First, you may use our iPGS coefficients to compute individual-level polygenic scores for your cohort. Second, because our iPGS models are sparse, you may investigate the genetic variants with non-zero coefficients and their annotated genes to learn more about the underlying biology. For your convenience, we suggest several resources here as examples of follow-up analyses. We do not intend to cover all the relevant follow-up analyses.

Using iPGS coefficients

By clicking the download button above, you may download the iPGS coefficients. Our FAQ page shows the description of file format and how you may use iPGS coefficients in your research.
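To make the idea of an individual-level score concrete, the following Python sketch computes a weighted sum of effect-allele dosages. It is a simplified illustration under stated assumptions, not the procedure from the FAQ page: `polygenic_score` is a hypothetical function, dosages are assumed to count copies of the effect allele (0 to 2), variants missing from the genotype data are simply skipped, and the individual's dosages are made up. The three weights are taken from the table above.

```python
def polygenic_score(weights, dosages):
    """Sum of effect weight times effect-allele dosage.

    weights: dict mapping variant -> effect weight (BETA)
    dosages: dict mapping variant -> dosage of the effect allele (0..2)
    Returns the score and the list of variants absent from `dosages`.
    Assumption: absent variants contribute nothing to the score.
    """
    score = 0.0
    missing = []
    for variant, beta in weights.items():
        dosage = dosages.get(variant)
        if dosage is None:
            missing.append(variant)
            continue
        score += beta * dosage
    return score, missing


# Effect weights copied from the table (GRCh37/hg19 coordinates).
weights = {
    "6:160961137:T:C": 0.262187435939604,
    "8:19813529:A:G": -0.246236903888113,
    "11:116648917:G:C": 0.242674748514889,
}

# Hypothetical individual: made-up dosages, one variant not genotyped.
dosages = {
    "6:160961137:T:C": 1,
    "11:116648917:G:C": 2,
}

score, missing = polygenic_score(weights, dosages)
```

Before applying weights to a real cohort, it is worth confirming that the counted allele matches the Effect Allele column and that coordinates are on GRCh37/hg19; a flipped allele reverses the sign of that variant's contribution.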

HaploReg

HaploReg is a tool for exploring annotations of the non-coding genome at variants on haplotype blocks. The button above submits the top 100 genetic variants with the largest absolute coefficients as a query to HaploReg using the default parameters in HaploReg v4.2 (LD threshold r^2 >= 1, ChromHMM 15-state model, SiPhy-omega, and GENCODE genes). HaploReg's ability to browse haplotypes is useful here because there is no guarantee that our iPGS model selects causal variants. The 'top 100 variants' cutoff is arbitrary; we aim to demonstrate how one may investigate the selected variants. Please see Ward and Kellis. Nucleic Acids Res. (2012) and Ward and Kellis. Nucleic Acids Res. (2016) for more information on HaploReg.
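If you prefer to assemble a query by hand rather than use the button, the small Python sketch below collects rsIDs from a list of variant IDs. It rests on assumptions: `rsid_query` is a hypothetical helper, the query is assumed to be pasted as a comma-delimited list of rsIDs, and entries that are not rsIDs (such as the HLA allele in the table) are dropped.

```python
def rsid_query(variant_ids):
    """Join the rsIDs in `variant_ids` into a comma-delimited string,
    skipping any identifier that is not of the form rs<digits>."""
    rsids = [v for v in variant_ids if v.startswith("rs") and v[2:].isdigit()]
    return ",".join(rsids)


# Variant IDs copied from the table, including one HLA allele.
example_ids = ["rs3798220", "rs268", "HLA-DQB1*0201", "rs964184"]
query = rsid_query(example_ids)
```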


References