Power of Inclusion: enhancing polygenic prediction with admixed individuals

Tanigawa and Kellis. Am J Hum Genet. (2023).


Phenotype: Platelet crit


Platelet crit iPGS coefficients

Our FAQ page describes the file format and explains how you may use the iPGS coefficients in your research.


iPGS prediction in the held-out test set individuals

We compared the polygenic predictions from our iPGS model with the observed phenotype values for the held-out test set individuals in UK Biobank. Note that the number of individuals differs across the five population groups.

white British: /static/data/tanigawakellis2023/per_trait/INI30090/INI30090.WB.PGS_vs_phe.png
Non-British white: /static/data/tanigawakellis2023/per_trait/INI30090/INI30090.NBW.PGS_vs_phe.png
South Asian: /static/data/tanigawakellis2023/per_trait/INI30090/INI30090.SA.PGS_vs_phe.png
African: /static/data/tanigawakellis2023/per_trait/INI30090/INI30090.Afr.PGS_vs_phe.png
Others: /static/data/tanigawakellis2023/per_trait/INI30090/INI30090.others.PGS_vs_phe.png

Predictive performance

| Population | Model | Metric | Predictive performance | 95% CI | P-value |
|---|---|---|---|---|---|
| white British | Covariate-only model | R^2 | 0.089 | [0.085, 0.093] | <1.0x10^-300 |
| white British | Genotype-only model | R^2 | 0.163 | [0.158, 0.168] | <1.0x10^-300 |
| white British | Full model (covariates and genotypes) | R^2 | 0.252 | [0.246, 0.257] | <1.0x10^-300 |
| Non-British white | Covariate-only model | R^2 | 0.075 | [0.057, 0.094] | 1.1x10^-49 |
| Non-British white | Genotype-only model | R^2 | 0.177 | [0.152, 0.202] | 7.1x10^-121 |
| Non-British white | Full model (covariates and genotypes) | R^2 | 0.248 | [0.221, 0.276] | 1.9x10^-176 |
| South Asian | Covariate-only model | R^2 | 0.139 | [0.106, 0.171] | 2.1x10^-48 |
| South Asian | Genotype-only model | R^2 | 0.120 | [0.089, 0.151] | 1.2x10^-41 |
| South Asian | Full model (covariates and genotypes) | R^2 | 0.248 | [0.210, 0.286] | 1.8x10^-90 |
| African | Covariate-only model | R^2 | 0.123 | [0.089, 0.158] | 9.0x10^-35 |
| African | Genotype-only model | R^2 | 0.044 | [0.021, 0.066] | 6.3x10^-13 |
| African | Full model (covariates and genotypes) | R^2 | 0.152 | [0.115, 0.189] | 2.8x10^-43 |
| Others | Covariate-only model | R^2 | 0.095 | [0.083, 0.107] | 1.4x10^-169 |
| Others | Genotype-only model | R^2 | 0.138 | [0.124, 0.152] | 1.7x10^-252 |
| Others | Full model (covariates and genotypes) | R^2 | 0.229 | [0.213, 0.245] | <1.0x10^-300 |

The predictive performance (R^2), its 95% confidence interval (CI), and its statistical significance (P-value) are shown for each population in UK Biobank, evaluated in the held-out test set. The "Model" column indicates whether the predictive performance comes from the covariate terms alone (Covariate-only model), the PGS terms alone (Genotype-only model), or the full model containing both PGS and covariate terms. We used the following covariates in our analysis: age, sex, age^2, age*sex, Townsend deprivation index, and genotype PCs (PC1-PC18). Please refer to our publication for a more detailed description of the methods.
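As a rough illustration of how the three rows per population relate, the sketch below scores covariate terms, PGS terms, and their sum against a phenotype. It assumes R^2 is computed as the squared Pearson correlation between predicted and observed values, which this page does not state; the data are simulated, and all names (`pearson_r2`, `cov_term`, `pgs_term`) are hypothetical. The publication remains the authoritative description of the evaluation.

```python
import random

def pearson_r2(pred, obs):
    """Squared Pearson correlation between predictions and observations."""
    n = len(pred)
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov * cov / (var_p * var_o)

# Simulated per-individual contributions (assumption: not UK Biobank data).
random.seed(0)
n = 2000
cov_term = [random.gauss(0, 0.3) for _ in range(n)]  # covariate contribution
pgs_term = [random.gauss(0, 0.4) for _ in range(n)]  # PGS contribution
phe = [c + g + random.gauss(0, 1) for c, g in zip(cov_term, pgs_term)]
full = [c + g for c, g in zip(cov_term, pgs_term)]   # full model = both terms

r2_by_model = {
    "Covariate-only model": pearson_r2(cov_term, phe),
    "Genotype-only model": pearson_r2(pgs_term, phe),
    "Full model": pearson_r2(full, phe),
}
for model, r2 in r2_by_model.items():
    print(f"{model}: R^2 = {r2:.3f}")
```

In a real analysis the covariate and PGS terms would come from a model fitted on training individuals and applied to the held-out test set, rather than from simulation.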


Coefficients (BETA) of PGS models

/static/data/tanigawakellis2023/per_trait/INI30090/INI30090.BETAs.png

We show the coefficients (BETA) of the PGS models. Our iPGS model selected 27,034 variants with non-zero coefficients. The genetic variants with the largest absolute coefficient values are annotated in the plot. There is no guarantee that our iPGS model selects causal variants. We use the GRCh37/hg19 reference genome.

The top 100 genetic variants with the largest absolute value of coefficients

| CHROM | POS | Variant | Variant ID | Effect allele | Consequence | Gene symbol | Beta |
|---|---|---|---|---|---|---|---|
| 12 | 111885310 | 12:111885310:G:A | rs72650673 | A | PAVs | SH2B3 | 0.022 |
| 17 | 16852187 | 17:16852187:A:G | rs34557412 | G | PAVs | TNFRSF13B | -0.007 |
| 20 | 57597970 | 20:57597970:A:C | rs463312 | C | PAVs | TUBB1 | -0.007 |
| 12 | 111884608 | 12:111884608:T:C | rs3184504 | C | PAVs | SH2B3 | -0.007 |
| 19 | 19765499 | 19:19765499:C:T | rs45522544 | T | PAVs | ATP13A1 | 0.004 |
| 9 | 5126343 | 9:5126343:G:A | rs41316003 | A | PAVs | JAK2 | 0.004 |
| 17 | 64210580 | 17:64210580:A:C | rs1801689 | C | PAVs | APOH | 0.004 |
| 22 | 50656109 | 22:50656109:A:C | rs55955211 | C | Others | SELO, TUBGCP6 | 0.004 |
| 9 | 4763176 | 9:4763176:T:C | rs385893 | C | Others | | 0.004 |
| 15 | 43820717 | 15:43820717:C:T | rs55707100 | T | PAVs | MAP1A | 0.004 |
| 1 | 247719769 | 1:247719769:G:A | rs56043070 | A | PTVs | GCSAML | -0.003 |
| 6 | 33540209 | 6:33540209:A:G | rs210134 | G | Others | BAK1 | 0.003 |
| 18 | 60920854 | 18:60920854:C:T | rs17758695 | T | Intronic | BCL2 | -0.003 |
| 17 | 55465771 | 17:55465771:C:T | rs17834140 | T | Intronic | MSI2 | -0.003 |
| 19 | 16185559 | 19:16185559:G:A | rs8109288 | A | Intronic | TPM4 | -0.003 |
| 12 | 122365583 | 12:122365583:C:T | rs7961894 | T | Intronic | WDR66 | 0.003 |
| 9 | 4766022 | 9:4766022:C:T | rs10815074 | T | Others | RP11-307I14.4 | -0.003 |
| 20 | 30294682 | 20:30294682:T:C | rs80054178 | C | Intronic | BCL2L1, RP11-243J16.7 | 0.003 |
| 6 | 135418635 | 6:135418635:C:T | rs7775698 | T | Intronic | HBS1L | 0.003 |
| 3 | 184090266 | 3:184090266:C:T | rs6141 | T | UTR | THPO | 0.003 |
| 9 | 4767341 | 9:4767341:T:C | rs420470 | C | Others | RP11-307I14.4 | 0.003 |
| 12 | 111882485 | 12:111882485:C:A | rs2238154 | A | Intronic | SH2B3 | 0.002 |
| 12 | 122247114 | 12:122247114:G:A | rs77408535 | A | Intronic | SETD1B | 0.002 |
| 14 | 101170436 | 14:101170436:G:A | rs12888043 | A | Others | | -0.002 |
| 12 | 54649978 | 12:54649978:C:T | rs79880068 | T | Intronic | CBX5 | 0.002 |
| 2 | 160690656 | 2:160690656:G:A | rs78446341 | A | PAVs | LY75, LY75-CD302 | 0.002 |
| 8 | 106581528 | 8:106581528:A:T | rs6993770 | T | Intronic | ZFPM2 | -0.002 |
| 3 | 184091383 | 3:184091383:T:TGGAA | rs55827759 | TGGAA | PAVs | THPO | -0.002 |
| 11 | 119075323 | 11:119075323:G:A | rs4938637 | A | Others | CBL | 0.002 |
| 9 | 21986847 | 9:21986847:T:A | rs3731211 | A | Intronic | RP11-145E5.5, CDKN2A | 0.002 |
| 3 | 184093040 | 3:184093040:G:A | rs34623301 | A | Intronic | EIF2B5, THPO | 0.002 |
| 9 | 4840380 | 9:4840380:A:G | rs10974808 | G | Intronic | RCL1 | 0.002 |
| 13 | 110489152 | 13:110489152:C:T | rs11618989 | T | Others | | -0.002 |
| 18 | 42041063 | 18:42041063:A:C | rs12606438 | C | Intronic | CTC-782O7.1 | 0.002 |
| 11 | 119088883 | 11:119088883:G:A | rs1893032 | A | Intronic | CBL | 0.002 |
| 12 | 109490296 | 12:109490296:G:T | rs12426673 | T | Others | USP30-AS1 | -0.002 |
| 1 | 25583610 | 1:25583610:C:G | rs72660908 | G | Intronic | C1orf63 | -0.002 |
| 2 | 160604514 | 2:160604514:C:T | rs76774368 | T | PAVs | MARCH7 | 0.002 |
| 1 | 12042261 | 1:12042261:A:G | rs2236055 | G | Intronic | MFN2 | -0.002 |
| 9 | 4759930 | 9:4759930:C:T | rs12343429 | T | Others | | 0.002 |
| 2 | 227291415 | 2:227291415:A:C | rs11686139 | C | Others | | 0.002 |
| 1 | 43805240 | 1:43805240:A:G | rs16830693 | G | PAVs | MPL | 0.002 |
| 7 | 123408842 | 7:123408842:G:A | rs9886090 | A | Others | | 0.002 |
| 10 | 96039597 | 10:96039597:G:C | rs2274224 | C | PAVs | PLCE1 | -0.002 |
| 5 | 1282319 | 5:1282319:C:A | rs7726159 | A | Intronic | TERT | 0.002 |
| 1 | 88104648 | 1:88104648:A:G | rs1826164 | G | Others | | -0.002 |
| 9 | 4856234 | 9:4856234:G:A | rs1887430 | A | Intronic | RCL1 | 0.002 |
| 7 | 44870933 | 7:44870933:T:C | rs10267576 | C | Intronic | H2AFV | -0.001 |
| 9 | 22142907 | 9:22142907:G:A | rs10811664 | A | Others | | -0.001 |
| 6 | 135419018 | 6:135419018:T:C | rs9399137 | C | Intronic | HBS1L | 0.001 |
| 6 | 33546498 | 6:33546498:C:T | rs5745582 | T | Intronic | BAK1 | 0.001 |
| 2 | 27730940 | 2:27730940:T:C | rs1260326 | C | PAVs | GCKR | -0.001 |
| 14 | 93516465 | 14:93516465:T:C | rs2180369 | C | Intronic | ITPK1 | 0.001 |
| 5 | 1287194 | 5:1287194:G:A | rs2853677 | A | Intronic | TERT | -0.001 |
| 16 | 28506428 | 16:28506428:C:T | rs151233 | T | PCVs | APOBR | 0.001 |
| 9 | 91569248 | 9:91569248:G:T | rs11137467 | T | Others | | -0.001 |
| 5 | 177610700 | 5:177610700:A:G | rs10074768 | G | Others | GMCL1P1 | -0.001 |
| 1 | 198974904 | 1:198974904:C:T | rs10919615 | T | Others | RP11-16L9.3 | 0.001 |
| 2 | 68615546 | 2:68615546:A:C | rs34338164 | C | PAVs | PLEK | -0.001 |
| 1 | 2994717 | 1:2994717:C:A | rs4648451 | A | Intronic | PRDM16 | -0.001 |
| 2 | 160729005 | 2:160729005:C:T | rs1397706 | T | PAVs | LY75, LY75-CD302 | 0.001 |
| 18 | 20720973 | 18:20720973:G:T | rs11082304 | T | Intronic | CABLES1 | -0.001 |
| 17 | 33884804 | 17:33884804:T:C | rs10512472 | C | PAVs | SLFN14 | 0.001 |
| 6 | 33548394 | 6:33548394:G:T | rs5745568 | T | Others | GGNBP1, BAK1 | 0.001 |
| 5 | 111066697 | 5:111066697:T:C | rs11559 | C | PAVs | NREP | -0.001 |
| 3 | 184087417 | 3:184087417:C:T | rs9849502 | T | Intronic | EIF2B5 | 0.001 |
| 2 | 31464829 | 2:31464829:A:G | rs647316 | G | Intronic | EHD3 | 0.001 |
| 2 | 112167931 | 2:112167931:T:C | rs62160676 | C | Intronic | MIR4435-1HG | 0.001 |
| 15 | 99248132 | 15:99248132:G:T | rs4966015 | T | Intronic | IGF1R | -0.001 |
| 16 | 85415838 | 16:85415838:T:C | rs4783187 | C | Others | | -0.001 |
| 1 | 27239920 | 1:27239920:C:G | rs6659176 | G | PAVs | NR0B2 | 0.001 |
| 2 | 160714958 | 2:160714958:C:T | rs114821641 | T | PTVs | LY75, LY75-CD302 | 0.001 |
| 6 | 135381351 | 6:135381351:A:G | rs11759077 | G | PTVs | CTA-212D2.2 | -0.001 |
| 20 | 30392884 | 20:30392884:A:G | rs6089075 | G | Others | TPX2, RP11-243J16.8 | -0.001 |
| 10 | 81164146 | 10:81164146:C:T | rs116052829 | T | Intronic | RP11-342M3.5, ZCCHC24 | 0.001 |
| 16 | 88573347 | 16:88573347:G:T | rs17700789 | T | Intronic | ZFPM1 | 0.001 |
| 8 | 30280833 | 8:30280833:G:A | rs2979489 | A | Intronic | RBPMS | 0.001 |
| 13 | 95897758 | 13:95897758:T:C | rs4148442 | C | Intronic | ABCC4 | 0.001 |
| 9 | 91401893 | 9:91401893:C:T | rs9410344 | T | Others | | -0.001 |
| 19 | 45422946 | 19:45422946:A:G | rs4420638 | G | Others | APOC1 | -0.001 |
| 13 | 95898207 | 13:95898207:A:G | rs4148441 | G | Intronic | ABCC4 | 0.001 |
| 11 | 113953622 | 11:113953622:G:A | rs73000929 | A | Intronic | ZBTB16 | -0.001 |
| 2 | 160687061 | 2:160687061:C:T | rs2556097 | T | Intronic | LY75, LY75-CD302 | -0.001 |
| 6 | 135427144 | 6:135427144:C:A | rs9376092 | A | Others | HBS1L | 0.001 |
| 9 | 99091009 | 9:99091009:C:T | rs10990535 | T | Intronic | SLC35D2 | 0.001 |
| 1 | 8918313 | 1:8918313:T:C | rs10864368 | C | Others | ENO1 | 0.001 |
| 20 | 8604593 | 20:8604593:G:A | rs6077396 | A | Intronic | PLCB1 | -0.001 |
| 20 | 54989326 | 20:54989326:T:G | rs11905250 | G | Intronic | CASS4 | 0.001 |
| 12 | 48215749 | 12:48215749:G:A | rs1859281 | A | Intronic | HDAC7 | 0.001 |
| 7 | 130761235 | 7:130761235:G:A | rs10954300 | A | Intronic | LINC-PINT | -0.001 |
| 13 | 95896321 | 13:95896321:T:C | rs6492772 | C | Intronic | ABCC4 | 0.001 |
| 6 | 31322367 | 6:31322367:T:G | rs3819299 | G | Others | HLA-B | 0.001 |
| 20 | 1923623 | 20:1923623:T:G | rs4144201 | G | Others | RP4-684O24.5 | -0.001 |
| 11 | 119207928 | 11:119207928:C:T | rs6649 | T | UTR | RNF26 | 0.001 |
| 1 | 226555302 | 1:226555302:A:G | rs1136410 | G | PAVs | PARP1 | -0.001 |
| 6 | 135036003 | 6:135036003:A:G | rs6902798 | G | Others | | 0.001 |
| 3 | 12267648 | 3:12267648:A:G | rs7616006 | G | Others | | -0.001 |
| 11 | 47431703 | 11:47431703:A:G | rs61897432 | G | PAVs | SLC39A13 | 0.001 |
| 19 | 51728477 | 19:51728477:C:T | rs12459419 | T | PAVs | CD33 | -0.001 |
| 8 | 144996029 | 8:144996029:A:G | rs7833924 | G | PAVs | PLEC | 0.001 |

There is no guarantee that our iPGS model selects causal variants. We show the top 100 variants with the largest absolute effect size (BETA). To see all 27,034 variants included in our iPGS model, please download the iPGS coefficients by clicking the download button. We use the GRCh37/hg19 reference genome.
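The ranking behind a "top N" list can be reproduced from any set of coefficients. The following is a minimal sketch, assuming the coefficients have already been loaded into a dictionary keyed by the CHROM:POS:REF:ALT identifier used in the Variant column; the helper names are hypothetical, and the four example entries are copied from the table on this page, where values are rounded to three decimals.

```python
def parse_variant(variant):
    """Split a 'CHROM:POS:REF:ALT' identifier (GRCh37/hg19) into typed fields."""
    chrom, pos, ref, alt = variant.split(":")
    return chrom, int(pos), ref, alt

def top_by_abs_beta(betas, n):
    """Return the n (variant, beta) pairs with the largest absolute beta."""
    ranked = sorted(betas.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]

# A few rows from the table on this page.
betas = {
    "19:19765499:C:T": 0.004,
    "12:111885310:G:A": 0.022,
    "1:247719769:G:A": -0.003,
    "17:16852187:A:G": -0.007,
}

top2 = top_by_abs_beta(betas, 2)
for variant, beta in top2:
    chrom, pos, ref, alt = parse_variant(variant)
    print(chrom, pos, ref, alt, beta)
```

Sorting on the absolute value keeps strongly negative coefficients, such as the one at 17:16852187:A:G, near the top of the list.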


Follow-up analysis

There are several ways to use this resource in your research. First, you may use our iPGS coefficients to compute individual-level polygenic scores for your cohort. Second, because our iPGS models are sparse, you may investigate the genetic variants with non-zero coefficients and their annotated genes to learn more about the underlying biology. For your convenience, we suggest several resources here as examples of follow-up analysis. We do not intend to cover all the relevant follow-up analyses.

Using iPGS coefficients

By clicking the download button above, you may download the iPGS coefficients. Our FAQ page describes the file format and explains how you may use the iPGS coefficients in your research.
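At its core, an individual-level polygenic score is a weighted sum of effect-allele dosages. The sketch below illustrates that sum under stated assumptions: coefficients are held in a dictionary keyed by variant identifier, dosages count copies of the effect allele (0 to 2), and variants missing from the genotype data are skipped. The function name and the example dosages are hypothetical, and the actual file format is the one described on the FAQ page.

```python
def polygenic_score(betas, dosages):
    """Sum beta * effect-allele dosage over variants present in both inputs.

    Variants absent from `dosages` are skipped (an assumption of this sketch;
    other missing-data policies, such as mean imputation, are also common).
    """
    return sum(beta * dosages[v] for v, beta in betas.items() if v in dosages)

# Coefficients copied from the table on this page (rounded to three decimals).
betas = {
    "12:111885310:G:A": 0.022,
    "17:16852187:A:G": -0.007,
    "19:19765499:C:T": 0.004,
}

# Hypothetical effect-allele dosages for one individual.
dosages = {
    "12:111885310:G:A": 1,
    "17:16852187:A:G": 2,
    # "19:19765499:C:T" is not genotyped in this example
}

score = polygenic_score(betas, dosages)
print(f"PGS = {score:.3f}")
```

For a full cohort, established tools that handle allele matching and genotype file formats (for example, the `--score` function in PLINK) are usually a safer choice than hand-written code.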

HaploReg

HaploReg is a tool for exploring annotations of the non-coding genome at variants on haplotype blocks. The button above submits the top 100 genetic variants with the largest absolute value of coefficients as a query to HaploReg using the default parameters in HaploReg v4.2 (LD threshold r^2 >= 1, ChromHMM 15-state model, SiPhy-omega, and GENCODE genes). HaploReg's ability to browse haplotypes is useful here because there is no guarantee that our iPGS model selects causal variants. The 'top 100 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. Please see Ward and Kellis. Nucleic Acids Res. 2012 and Ward and Kellis. Nucleic Acids Res. 2016 for more information on HaploReg.

GREAT

GREAT (Genomic Regions Enrichment of Annotations Tool) evaluates enrichment of pathway and ontology terms. Because GREAT maps non-coding genetic variants to their putative downstream target genes, it is well suited to investigating pathway and ontology enrichment among the genetic variants selected in our sparse iPGS model. The button above submits the top 1000 genetic variants with the largest absolute value of coefficients as a query to GREAT using the default parameters in GREAT v4.0.4. The 'top 1000 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. Please see McLean et al. Nat Biotechnol. 2010 and Tanigawa*, Dyer*, and Bejerano. PLoS Comput Biol. 2022 for more information on GREAT.
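If you prefer to prepare a GREAT query yourself, for instance with a different cutoff, the variants can be written as BED intervals. The sketch below is one way to do that, not a description of how the button on this page builds its query. It assumes CHROM:POS:REF:ALT identifiers with 1-based GRCh37/hg19 positions, UCSC-style "chr" chromosome names, and an interval spanning the reference allele; `variant_to_bed` is a hypothetical helper.

```python
def variant_to_bed(variant):
    """Convert 'CHROM:POS:REF:ALT' (1-based POS) into a tab-separated BED line."""
    chrom, pos, ref, alt = variant.split(":")
    start = int(pos) - 1    # BED start coordinates are 0-based
    end = start + len(ref)  # BED end is exclusive; span the reference allele
    return f"chr{chrom}\t{start}\t{end}\t{variant}"

# Two variants from the table on this page: a SNV and an insertion.
variants = ["12:111885310:G:A", "3:184091383:T:TGGAA"]
bed_text = "\n".join(variant_to_bed(v) for v in variants)
print(bed_text)
```

Keeping the original identifier in the fourth (name) column makes it easier to trace regions in the results back to rows in the coefficient table.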


References