Power of Inclusion: enhancing polygenic prediction with admixed individuals

Tanigawa and Kellis. Am J Hum Genet. (2023).


Phenotype: Neutrophil percentage


Neutrophil % iPGS coefficients

Our FAQ page shows the description of the file format and how you may use iPGS coefficients in your research.


iPGS prediction in the held-out test set individuals

We compared the polygenic prediction from our iPGS model with the phenotype values using the held-out test set individuals in UK Biobank. Note that the number of individuals differs across the five population groups.

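[Figure: iPGS vs. phenotype, white British] /static/data/tanigawakellis2023/per_trait/INI30200/INI30200.WB.PGS_vs_phe.png
[Figure: iPGS vs. phenotype, Non-British white] /static/data/tanigawakellis2023/per_trait/INI30200/INI30200.NBW.PGS_vs_phe.png
[Figure: iPGS vs. phenotype, South Asian] /static/data/tanigawakellis2023/per_trait/INI30200/INI30200.SA.PGS_vs_phe.png
[Figure: iPGS vs. phenotype, African] /static/data/tanigawakellis2023/per_trait/INI30200/INI30200.Afr.PGS_vs_phe.png
[Figure: iPGS vs. phenotype, Others] /static/data/tanigawakellis2023/per_trait/INI30200/INI30200.others.PGS_vs_phe.png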

Predictive performance

| Population | Model | Metric | Predictive Performance | 95% CI | P-value |
|---|---|---|---|---|---|
| white British | Covariate-only model | R^2 | 0.012 | [0.011, 0.014] | 1.1x10^-179 |
| white British | Genotype-only model | R^2 | 0.083 | [0.079, 0.087] | <1.0x10^-300 |
| white British | Full model (covariates and genotypes) | R^2 | 0.095 | [0.091, 0.099] | <1.0x10^-300 |
| Non-British white | Covariate-only model | R^2 | 0.008 | [0.002, 0.015] | 1.1x10^-06 |
| Non-British white | Genotype-only model | R^2 | 0.081 | [0.062, 0.100] | 1.8x10^-53 |
| Non-British white | Full model (covariates and genotypes) | R^2 | 0.090 | [0.070, 0.110] | 2.5x10^-59 |
| South Asian | Covariate-only model | R^2 | 0.012 | [0.001, 0.024] | 2.4x10^-05 |
| South Asian | Genotype-only model | R^2 | 0.047 | [0.026, 0.068] | 1.1x10^-16 |
| South Asian | Full model (covariates and genotypes) | R^2 | 0.060 | [0.037, 0.083] | 6.3x10^-21 |
| African | Covariate-only model | R^2 | 0.031 | [0.012, 0.050] | 2.2x10^-09 |
| African | Genotype-only model | R^2 | 0.043 | [0.020, 0.065] | 1.5x10^-12 |
| African | Full model (covariates and genotypes) | R^2 | 0.067 | [0.040, 0.094] | 4.2x10^-19 |
| Others | Covariate-only model | R^2 | 0.055 | [0.045, 0.064] | 3.2x10^-96 |
| Others | Genotype-only model | R^2 | 0.093 | [0.081, 0.106] | 6.0x10^-167 |
| Others | Full model (covariates and genotypes) | R^2 | 0.121 | [0.108, 0.135] | 6.3x10^-219 |

The predictive performance (R^2), its 95% confidence interval (CI), and statistical significance (P-value) are shown for each population group in the UK Biobank held-out test set. The "Model" column indicates whether the predictive performance comes from the covariate terms alone (covariate-only model), the PGS terms alone (genotype-only model), or the full model containing both PGS and covariate terms. We used the following covariates in our analysis: age, sex, age^2, age*sex, Townsend deprivation index, and genotype PCs (PC1-PC18). Please refer to our publication for a more detailed description of the methods.
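As a rough illustration of the quantities described above, the following Python sketch assembles the listed covariate terms for one individual and computes R^2 as one minus the ratio of the residual to the total sum of squares. The function names and this particular R^2 definition are assumptions made for illustration; the publication describes the exact procedure, including how the confidence intervals and P-values were obtained.

```python
def covariate_row(age, sex, townsend, pcs):
    """Build the covariate terms listed in the text for one individual.

    Hypothetical helper: `sex` is assumed to be coded numerically (e.g. 0/1)
    and `pcs` is assumed to hold the 18 genotype principal components.
    """
    if len(pcs) != 18:
        raise ValueError("expected 18 genotype PCs (PC1-PC18)")
    # age, sex, age^2, age*sex, Townsend deprivation index, PC1-PC18
    return [age, sex, age ** 2, age * sex, townsend, *pcs]


def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot (one common definition, assumed here)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot
```

Under this definition, comparing the R^2 of predictions from the covariate terms alone, the PGS alone, and both together mirrors the three rows reported per population.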


Coefficients (BETA) of PGS models

[Figure: coefficients (BETA) of the iPGS model] /static/data/tanigawakellis2023/per_trait/INI30200/INI30200.BETAs.png

We show the coefficients (BETA) of the PGS model. Our iPGS model selected 16,931 variants with non-zero coefficients. The genetic variants with the largest absolute coefficients are annotated in the plot. There is no guarantee that our iPGS model selects causal variants. Variant positions are given on the GRCh37/hg19 reference genome.

The top 100 genetic variants with the largest absolute value of coefficients

| CHROM | POS | Variant | Variant ID | Effect allele | Consequence | Gene symbol | Beta |
|---|---|---|---|---|---|---|---|
| 1 | 159174683 | 1:159174683:T:C | rs2814778 | C | UTR | DARC | -2.965 |
| 1 | 159175494 | 1:159175494:C:T | rs34599082 | T | PAVs | DARC | -0.745 |
| 2 | 218999982 | 2:218999982:G:A | rs55799208 | A | PAVs | CXCR2 | -0.535 |
| 6 | 32552086 | HLA-DRB1*0103 | HLA-DRB1*0103 | + | PAVs | HLA-DRB1 | 0.414 |
| 8 | 61660163 | 8:61660163:A:G | rs11775560 | G | Intronic | CHD7 | 0.401 |
| 3 | 128316435 | 3:128316435:A:G | rs4328821 | G | Others | | 0.372 |
| 12 | 111884608 | 12:111884608:T:C | rs3184504 | C | PAVs | SH2B3 | 0.328 |
| 7 | 92408370 | 7:92408370:C:T | rs445 | T | Intronic | CDK6 | -0.326 |
| 19 | 44153100 | 19:44153100:A:G | rs4760 | G | PAVs | PLAUR | -0.300 |
| 17 | 56356502 | 17:56356502:A:G | rs56378716 | G | PAVs | MPO | 0.290 |
| 16 | 85938835 | 16:85938835:G:A | rs11646550 | A | Intronic | IRF8 | 0.273 |
| 19 | 16527834 | 19:16527834:T:C | rs4808047 | C | Intronic | EPS15L1 | -0.271 |
| 17 | 38166879 | 17:38166879:T:C | rs8078723 | C | Others | CSF3 | 0.263 |
| 1 | 56906274 | 1:56906274:G:A | rs7537229 | A | Others | | 0.261 |
| 7 | 6502367 | 7:6502367:T:C | rs6796 | C | UTR | KDELR2 | -0.255 |
| 3 | 47045846 | 3:47045846:C:T | rs2305637 | T | PAVs | NBEAL2 | 0.245 |
| 6 | 32552086 | HLA-DRB1*0101 | HLA-DRB1*0101 | + | PAVs | HLA-DRB1 | -0.225 |
| 7 | 28715056 | 7:28715056:A:G | rs16874653 | G | Intronic | CREB5 | 0.217 |
| 12 | 9923113 | 12:9923113:T:C | rs724666 | C | Others | | 0.209 |
| 17 | 38143548 | 17:38143548:C:T | rs4065321 | T | Intronic | PSMD3 | -0.206 |
| 7 | 99270539 | 7:99270539:C:T | rs776746 | T | PTVs | CYP3A5 | -0.202 |
| 4 | 55394172 | 4:55394172:C:T | rs218237 | T | Others | | 0.202 |
| 6 | 144411338 | 6:144411338:G:A | rs73008259 | A | Others | SF3B5 | -0.194 |
| 8 | 79575804 | 8:79575804:T:C | rs1021156 | C | Others | ZC2HC1A | 0.191 |
| 12 | 6502131 | 12:6502131:T:G | rs2364482 | G | Others | RP1-102E24.8 | -0.190 |
| 6 | 31506691 | 6:31506691:G:A | rs2071596 | A | PAVs | DDX39B | 0.187 |
| 15 | 101718927 | 15:101718927:G:A | rs3743193 | A | PAVs | CHSY1 | 0.183 |
| 6 | 24806594 | 6:24806594:C:T | rs9358799 | T | PTVs | FAM65B | 0.181 |
| 19 | 49116359 | 19:49116359:T:C | rs447802 | C | PAVs | FAM83E | -0.180 |
| 17 | 38170845 | 17:38170845:G:A | rs2227319 | A | Others | RP11-387H17.6 | -0.179 |
| 2 | 43610027 | 2:43610027:C:T | rs13408002 | T | Intronic | THADA | 0.178 |
| 1 | 36945559 | 1:36945559:G:A | rs3917925 | A | Intronic | CSF3R | -0.177 |
| 1 | 79358827 | 1:79358827:G:T | rs1968956 | T | PAVs | ELTD1 | 0.175 |
| 11 | 122520253 | 11:122520253:T:C | rs1945390 | C | Others | | -0.172 |
| 6 | 42506099 | 6:42506099:G:A | rs11755487 | A | Others | | 0.170 |
| 1 | 101704577 | 1:101704577:C:G | rs41287280 | G | PAVs | S1PR1 | -0.170 |
| 19 | 1079959 | 19:1079959:G:A | rs36084354 | A | PAVs | HMHA1 | 0.168 |
| 4 | 74963049 | 4:74963049:C:T | rs9131 | T | UTR | CXCL2 | -0.165 |
| 1 | 27035103 | 1:27035103:G:A | rs17262006 | A | Intronic | ARID1A | -0.165 |
| 7 | 50258313 | 7:50258313:C:T | rs1870028 | T | Others | | 0.157 |
| 5 | 35876274 | 5:35876274:A:G | rs3194051 | G | PAVs | IL7R | -0.153 |
| 6 | 87968565 | 6:87968565:A:G | rs9362415 | G | PAVs | ZNF292 | -0.150 |
| 4 | 103557077 | 4:103557077:G:A | rs2866413 | A | PAVs | MANBA | -0.149 |
| 3 | 39307162 | 3:39307162:G:A | rs3732378 | A | PAVs | CX3CR1 | -0.147 |
| 2 | 227291415 | 2:227291415:A:C | rs11686139 | C | Others | | -0.147 |
| 8 | 130586355 | 8:130586355:C:T | rs12677963 | T | Intronic | CCDC26 | 0.147 |
| 7 | 28279243 | 7:28279243:G:A | rs4722771 | A | Intronic | JAZF1-AS1 | 0.147 |
| 15 | 91011262 | 15:91011262:A:G | rs2238325 | G | Intronic | IQGAP1 | -0.144 |
| 7 | 50417632 | 7:50417632:A:G | rs62447197 | G | Intronic | IKZF1 | 0.143 |
| 20 | 1930704 | 20:1930704:C:T | rs4470399 | T | Intronic | RP4-684O24.5 | 0.142 |
| 2 | 160729005 | 2:160729005:C:T | rs1397706 | T | PAVs | LY75, LY75-CD302 | 0.141 |
| 9 | 113915905 | 9:113915905:T:C | rs10980800 | C | Intronic | RP11-202G18.1 | -0.141 |
| 4 | 74709859 | 4:74709859:A:T | rs13110736 | T | Intronic | CXCL6 | -0.139 |
| 19 | 16442782 | 19:16442782:C:T | rs34006614 | T | Others | KLF2 | 0.139 |
| 17 | 44061023 | 17:44061023:G:A | rs62063786 | A | PAVs | MAPT | 0.138 |
| 12 | 9897434 | 12:9897434:G:A | rs10772104 | A | Others | | -0.138 |
| 20 | 57829301 | 20:57829301:T:C | rs259956 | C | PAVs | ZNF831 | 0.137 |
| 7 | 50305863 | 7:50305863:T:G | rs4917014 | G | Others | | 0.136 |
| 17 | 1373518 | 17:1373518:T:C | rs9905106 | C | PAVs | MYO1C | 0.136 |
| 6 | 82463376 | 6:82463376:C:T | rs915125 | T | Others | FAM46A | -0.134 |
| 8 | 41630405 | 8:41630405:G:A | rs4737009 | A | Intronic | ANK1 | -0.134 |
| 2 | 143781731 | 2:143781731:A:C | rs13008636 | C | Intronic | KYNU | -0.132 |
| 5 | 148206473 | 5:148206473:G:C | rs1042714 | C | PAVs | ADRB2 | -0.132 |
| 7 | 28270189 | 7:28270189:A:G | rs493195 | G | Intronic | JAZF1-AS1 | 0.130 |
| 1 | 236107241 | 1:236107241:A:C | rs6429432 | C | Others | RP5-940F7.2 | -0.130 |
| 19 | 7833982 | 19:7833982:C:T | rs1045997 | T | UTR | CLEC4M | 0.129 |
| 20 | 42838550 | 20:42838550:A:G | rs2143606 | G | Intronic | OSER1 | -0.129 |
| 13 | 42869119 | 13:42869119:A:G | rs9594724 | G | Intronic | AKAP11 | 0.127 |
| 3 | 128336221 | 3:128336221:A:C | rs2712429 | C | Others | RPN1 | -0.125 |
| 7 | 50473751 | 7:50473751:G:A | rs6944602 | A | Others | IKZF1 | -0.124 |
| 19 | 10475652 | 19:10475652:C:A | rs2304256 | A | PAVs | TYK2 | -0.124 |
| 22 | 41767135 | 22:41767135:C:T | rs4820438 | T | Intronic | TEF | -0.124 |
| 2 | 43519977 | 2:43519977:C:T | rs35720761 | T | PAVs | THADA | 0.123 |
| 11 | 113958142 | 11:113958142:GT:G | rs35092495 | G | Intronic | ZBTB16 | 0.122 |
| 19 | 18123738 | 19:18123738:T:C | rs7259041 | C | PAVs | ARRDC2 | 0.121 |
| 6 | 29796376 | 6:29796376:C:A | rs12722477 | A | PAVs | HLA-G | 0.121 |
| 19 | 16454110 | 19:16454110:A:G | rs6512102 | G | Others | | 0.120 |
| 9 | 91569248 | 9:91569248:G:T | rs11137467 | T | Others | | 0.119 |
| 19 | 19789528 | 19:19789528:A:G | rs2304130 | G | PAVs | ZNF101 | 0.119 |
| 19 | 14153293 | 19:14153293:T:C | rs35026308 | C | PAVs | IL27RA | -0.119 |
| 14 | 69870944 | 14:69870944:A:G | rs12884741 | G | Intronic | SLC39A9 | 0.118 |
| 1 | 36934805 | 1:36934805:C:G | rs3917991 | G | PAVs | CSF3R | 0.117 |
| 6 | 14715882 | 6:14715882:T:C | rs1267499 | C | Others | | 0.115 |
| 5 | 148205052 | 5:148205052:A:G | rs2400707 | G | Others | ADRB2 | -0.115 |
| 5 | 1287194 | 5:1287194:G:A | rs2853677 | A | Intronic | TERT | -0.115 |
| 9 | 4763176 | 9:4763176:T:C | rs385893 | C | Others | | 0.115 |
| 13 | 74675573 | 13:74675573:T:C | rs2104388 | C | Intronic | KLF12 | 0.114 |
| 1 | 65937065 | 1:65937065:G:T | rs7531110 | T | Intronic | LEPR | 0.114 |
| 7 | 92248076 | 7:92248076:C:T | rs42235 | T | Intronic | CDK6 | 0.114 |
| 6 | 108053364 | 6:108053364:G:A | rs12526696 | A | UTR | SCML4 | -0.114 |
| 1 | 111726213 | 1:111726213:A:G | rs694180 | G | PAVs | CEPT1 | 0.113 |
| 2 | 27730940 | 2:27730940:T:C | rs1260326 | C | PAVs | GCKR | -0.113 |
| 16 | 67680806 | 16:67680806:G:A | rs117556162 | A | PAVs | RLTPR | -0.113 |
| 7 | 28876888 | 7:28876888:T:C | rs2190306 | C | Others | | 0.112 |
| 14 | 75971133 | 14:75971133:A:C | rs175702 | C | Others | | -0.112 |
| 17 | 80995645 | 17:80995645:T:C | rs8065396 | C | Intronic | B3GNTL1 | -0.111 |
| 1 | 212879644 | 1:212879644:C:T | rs4951458 | T | Others | | -0.110 |
| 7 | 27172571 | 7:27172571:G:A | rs4719884 | A | Intronic | HOXA3, RP1-170O19.22, HOXA-AS3, HOXA-AS2 | -0.110 |
| 2 | 25491056 | 2:25491056:A:G | rs7583409 | G | Intronic | DNMT3A | -0.110 |
| 7 | 28723407 | 7:28723407:G:A | rs886816 | A | Intronic | CREB5 | 0.110 |

We show the top 100 variants with the largest absolute coefficients (BETA). There is no guarantee that our iPGS model selects causal variants. To see all 16,931 variants included in our iPGS model, please download the iPGS coefficients by clicking the download button. Variant positions are given on the GRCh37/hg19 reference genome.
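The ranking used for this table can be reproduced from a list of coefficients with a short Python sketch like the one here. The record layout (dictionaries with hypothetical `variant_id` and `beta` keys) is an assumption for illustration; the actual column names of the downloadable file are described on the FAQ page.

```python
def top_variants(coefficients, n=100):
    """Return the n records with the largest absolute coefficient.

    `coefficients` is assumed to be a list of dicts with a numeric "beta"
    entry; the key names are hypothetical, not the downloaded file's headers.
    """
    ranked = sorted(coefficients, key=lambda rec: abs(rec["beta"]), reverse=True)
    return ranked[:n]


# Three rows taken from the table, deliberately out of order.
example = [
    {"variant_id": "rs4328821", "beta": 0.372},
    {"variant_id": "rs2814778", "beta": -2.965},
    {"variant_id": "rs34599082", "beta": -0.745},
]
top_two = [rec["variant_id"] for rec in top_variants(example, n=2)]
print(top_two)
```

Sorting on the absolute value keeps large negative coefficients, such as the one for rs2814778, at the top of the list.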


Follow-up analysis

There are several ways to use this resource in your research. First, you may use our iPGS coefficients to compute individual-level polygenic scores for your cohort. Second, you may investigate the genetic variants with non-zero coefficients and their annotated genes to learn more about the underlying biology, taking advantage of the sparsity of our iPGS models. For your convenience, we suggest several resources here as examples of follow-up analysis; the list is not exhaustive.

Using iPGS coefficients

You may download the iPGS coefficients by clicking the download button above. Our FAQ page describes the file format and how you may use the iPGS coefficients in your research.
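Conceptually, an individual-level polygenic score is the sum, over the variants in the model, of each coefficient multiplied by the individual's dosage of the effect allele. The Python sketch here illustrates that arithmetic only. The dictionary-based inputs and the function name are assumptions for illustration, and the FAQ page remains the reference for the real file format.

```python
def polygenic_score(betas, dosages):
    """Sum beta * effect-allele dosage over variants present in both inputs.

    betas:   dict mapping a variant identifier to its coefficient (BETA)
    dosages: dict mapping a variant identifier to the individual's dosage of
             the effect allele (0 to 2 for a diploid genotype)
    Both structures are hypothetical. Returns (score, number of variants used).
    """
    score = 0.0
    n_used = 0
    for variant_id, beta in betas.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            # Variant not genotyped in this individual: skipped in this sketch.
            continue
        score += beta * dosage
        n_used += 1
    return score, n_used


# Two coefficients from the table and made-up dosages for one individual.
betas = {"rs2814778": -2.965, "rs3184504": 0.328}
dosages = {"rs2814778": 2, "rs3184504": 1}
score, n_used = polygenic_score(betas, dosages)
```

Dosages must count the same effect allele as listed with the coefficients, and variants must be matched on the GRCh37/hg19 coordinates used here. A score counted against the other allele changes the sign of that variant's contribution.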

HaploReg

HaploReg is a tool for exploring annotations of the non-coding genome at variants on haplotype blocks. The button above submits the top 100 genetic variants with the largest absolute value of coefficients as a query to HaploReg using the default parameters in HaploReg v4.2 (LD threshold r^2 >= 1, ChromHMM 15-state model, SiPhy-omega, and GENCODE genes). HaploReg's ability to browse haplotypes is useful here because there is no guarantee that our iPGS model selects causal variants. The 'top 100 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. Please see Ward and Kellis. Nucleic Acids Res. 2012 and Ward and Kellis. Nucleic Acids Res. 2016 for more information on HaploReg.

GREAT

GREAT (Genomic Regions Enrichment of Annotations Tool) evaluates the enrichment of pathway and ontology terms. Its ability to map non-coding genetic variants to their downstream target genes makes it suitable for investigating pathway and ontology enrichment among the genetic variants selected in our sparse iPGS model. The button above submits the top 1000 genetic variants with the largest absolute value of coefficients as a query to GREAT using the default parameters in GREAT v4.0.4. The 'top 1000 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. Please see McLean et al. Nat Biotechnol. 2010 and Tanigawa*, Dyer*, and Bejerano. PLoS Comput Biol. 2022 for more information on GREAT.
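To run a similar query by hand, note that GREAT accepts genomic regions in BED format, which uses 0-based, half-open intervals, whereas the POS column here is 1-based. The Python sketch here converts one variant to a single-base BED line. The function name is hypothetical, and representing each variant as a one-base interval at POS is a simplifying assumption, not necessarily what the button above submits.

```python
def variant_to_bed(chrom, pos, name):
    """Format a 1-based variant position as a tab-separated BED line.

    BED intervals are 0-based and half-open, so a single base at 1-based
    position `pos` becomes start = pos - 1, end = pos.
    """
    if pos < 1:
        raise ValueError("pos must be a 1-based position (>= 1)")
    return f"chr{chrom}\t{pos - 1}\t{pos}\t{name}"


# First row of the top-variant table (GRCh37/hg19 coordinates).
bed_line = variant_to_bed("1", 159174683, "rs2814778")
print(bed_line)
```

Because the coordinates are on GRCh37/hg19, the matching assembly should be selected when submitting such a file.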


References