Power of Inclusion: enhancing polygenic prediction with admixed individuals

Tanigawa and Kellis. Am J Hum Genet. (2023).


Phenotype: Sitting height


Sitting height iPGS coefficients

Our FAQ page describes the file format and how you may use iPGS coefficients in your research.


iPGS prediction in the held-out test set individuals

We compared the polygenic prediction from our iPGS model with the phenotype values using the held-out test set individuals in UK Biobank. Note that the number of individuals differs across the five population groups.

[Figures: PGS vs. phenotype plots for the held-out test set, one per population group: white British (WB), Non-British white (NBW), South Asian (SA), African (Afr), and others.]

Predictive performance

Population | Model | Metric | Predictive Performance | 95% CI | P-value
white British | Covariate-only model | R^2 | 0.435 | [0.430, 0.441] | <1.0x10^-300
white British | Genotype-only model | R^2 | 0.133 | [0.129, 0.138] | <1.0x10^-300
white British | Full model (covariates and genotypes) | R^2 | 0.567 | [0.562, 0.572] | <1.0x10^-300
Non-British white | Covariate-only model | R^2 | 0.435 | [0.408, 0.462] | <1.0x10^-300
Non-British white | Genotype-only model | R^2 | 0.148 | [0.124, 0.172] | 1.7x10^-102
Non-British white | Full model (covariates and genotypes) | R^2 | 0.563 | [0.539, 0.587] | <1.0x10^-300
South Asian | Covariate-only model | R^2 | 0.426 | [0.388, 0.464] | 1.2x10^-176
South Asian | Genotype-only model | R^2 | 0.068 | [0.043, 0.093] | 5.6x10^-24
South Asian | Full model (covariates and genotypes) | R^2 | 0.477 | [0.440, 0.513] | 2.3x10^-205
African | Covariate-only model | R^2 | 0.356 | [0.312, 0.399] | 2.9x10^-115
African | Genotype-only model | R^2 | 0.019 | [0.004, 0.034] | 2.1x10^-06
African | Full model (covariates and genotypes) | R^2 | 0.362 | [0.319, 0.405] | 8.3x10^-118
Others | Covariate-only model | R^2 | 0.457 | [0.441, 0.473] | <1.0x10^-300
Others | Genotype-only model | R^2 | 0.146 | [0.132, 0.160] | 1.0x10^-275
Others | Full model (covariates and genotypes) | R^2 | 0.563 | [0.548, 0.577] | <1.0x10^-300

The predictive performance (R^2), its 95% confidence interval (CI), and statistical significance (P-value) are shown for each population in the UK Biobank held-out test set. The "Model" column indicates whether the predictive performance is from the covariate terms alone (covariate-only model), the PGS terms alone (genotype-only model), or the full model containing both PGS and covariate terms. We used the following covariates in our analysis: age, sex, age^2, age*sex, Townsend deprivation index, and genotype PCs (PC1-PC18). Please refer to our publication for a more detailed description of the methods.
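As a rough illustration of how the three rows per population relate to one another, the sketch below computes R^2 for a covariate-only score, a genotype-only score (PGS), and their sum. It assumes that R^2 is the squared Pearson correlation between a predictor and the phenotype and that the full-model prediction is the sum of the covariate and PGS terms. Both are simplifying assumptions, the function name is hypothetical, and the toy numbers are made up; please see the publication for the actual evaluation procedure.

```python
def r_squared(pred, obs):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(pred)
    if n != len(obs) or n < 2:
        raise ValueError("pred and obs must have the same length (>= 2)")
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    sxx = sum((p - mean_p) ** 2 for p in pred)
    syy = sum((o - mean_o) ** 2 for o in obs)
    if sxx == 0 or syy == 0:
        raise ValueError("pred and obs must not be constant")
    return (sxy * sxy) / (sxx * syy)


# Toy values for five hypothetical individuals (not real data).
covariate_score = [0.1, 0.4, -0.2, 0.3, -0.5]   # covariate terms only
pgs = [0.2, -0.1, 0.0, 0.3, -0.2]               # genotype (PGS) terms only
phenotype = [0.35, 0.25, -0.15, 0.7, -0.8]

# Assumption: the full model prediction is the sum of the two terms.
full_score = [c + g for c, g in zip(covariate_score, pgs)]

for name, pred in [
    ("Covariate-only model", covariate_score),
    ("Genotype-only model", pgs),
    ("Full model", full_score),
]:
    print(name, round(r_squared(pred, phenotype), 3))
```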


Coefficients (BETA) of PGS models

[Figure: coefficients (BETA) of the PGS models.]

We show the coefficients (BETA) of PGS models. Our iPGS model selected 44543 variants with non-zero coefficients. The genetic variants with large absolute coefficient values are annotated in the plot. There is no guarantee that our iPGS model selects causal variants. We use the GRCh37/hg19 reference genome.

The top 100 genetic variants with the largest absolute value of coefficients

CHROM | POS | Variant | Variant ID | Effect allele | Consequence | Gene symbol | Beta
14 | 94844947 | 14:94844947:C:T | rs28929474 | T | PAVs | SERPINA1 | 0.286
8 | 135614553 | 8:135614553:G:C | rs112892337 | C | PAVs | ZFAT | 0.230
15 | 89400680 | 15:89400680:A:G | rs28407189 | G | PAVs | ACAN | -0.219
6 | 19839415 | 6:19839415:C:T | rs41271299 | T | Intronic | ID4 | 0.218
16 | 1828030 | 16:1828030:G:A | rs35816944 | A | PAVs | SPSB3 | -0.216
20 | 34025756 | 20:34025756:A:G | rs143384 | G | UTR | GDF5 | 0.204
15 | 101718239 | 15:101718239:C:G | rs62621400 | G | PAVs | CHSY1 | -0.197
3 | 185548683 | 3:185548683:G:A | rs720390 | A | Others |  | 0.195
2 | 233094868 | 2:233094868:C:T | rs78198962 | T | Intronic | DIS3L2 | 0.191
22 | 28501414 | 22:28501414:C:T | rs77885044 | T | PAVs | TTC28 | -0.185
12 | 93978504 | 12:93978504:G:T | rs11107116 | T | Others | SOCS2 | 0.182
3 | 38051211 | 3:38051211:G:A | rs75495843 | A | PAVs | PLCD1 | 0.181
1 | 41618297 | 1:41618297:G:A | rs114233776 | A | PAVs | SCMH1 | -0.170
15 | 100692953 | 15:100692953:G:A | rs72755233 | A | PAVs | ADAMTS17 | -0.165
19 | 55879672 | 19:55879672:C:T | rs4252548 | T | PAVs | IL11 | -0.161
12 | 103140344 | 12:103140344:C:T | rs79747671 | T | Others |  | -0.152
6 | 34214322 | 6:34214322:C:G | rs1150781 | G | PAVs | C6orf1 | -0.152
12 | 4033760 | 12:4033760:G:A | rs74431741 | A | Others |  | 0.142
4 | 87730980 | 4:87730980:C:T | rs61730641 | T | PAVs | PTPN13 | -0.142
4 | 145982625 | 4:145982625:G:A | rs116662954 | A | Intronic | ANAPC10 | -0.134
13 | 50722895 | 13:50722895:C:A | rs1326122 | A | Intronic | DLEU1 | 0.129
19 | 55993436 | 19:55993436:G:T | rs147110934 | T | PAVs | ZNF628 | -0.128
15 | 89386652 | 15:89386652:G:A | rs34949187 | A | PAVs | ACAN | -0.121
15 | 72462255 | 15:72462255:C:T | rs34815962 | T | PAVs | GRAMD2 | 0.120
12 | 66359752 | 12:66359752:C:A | rs8756 | A | UTR | HMGA2 | -0.120
14 | 60976537 | 14:60976537:C:A | rs33912345 | A | PAVs | SIX6 | -0.117
8 | 57138676 | 8:57138676:T:G | rs36112366 | G | Others | RP11-140I16.3 | -0.117
8 | 120596023 | 8:120596023:A:G | rs10283100 | G | PAVs | ENPP2 | 0.115
20 | 7864284 | 20:7864284:T:C | rs34249643 | C | PAVs | HAO1 | 0.113
6 | 81399464 | 6:81399464:A:G | rs34570868 | G | Others |  | 0.112
12 | 108618630 | 12:108618630:C:T | rs3764002 | T | PAVs | WSCD2 | 0.111
4 | 81547184 | 4:81547184:A:G | rs13119535 | G | Intronic | C4orf22 | -0.110
4 | 145569692 | 4:145569692:C:T | rs13146972 | T | Others | HHIP | 0.108
4 | 145659064 | 4:145659064:T:C | rs11727676 | C | PCVs | HHIP | 0.107
11 | 2810731 | 11:2810731:C:T | rs2237886 | T | Intronic | KCNQ1 | 0.106
6 | 34163292 | 6:34163292:G:A | rs74841643 | A | Others | KRT18P9 | 0.106
4 | 145822014 | 4:145822014:T:G | rs11100870 | G | Others |  | -0.104
X | 78649193 | X:78649193:C:T | rs1474563 | T | Others |  | 0.104
5 | 129059497 | 5:129059497:A:G | rs246242 | G | Intronic | ADAMTS19, CTC-575N7.1 | -0.102
6 | 142669338 | 6:142669338:A:G | rs9496346 | G | Intronic | GPR126 | -0.101
15 | 67457698 | 15:67457698:A:G | rs35874463 | G | PAVs | SMAD3 | 0.101
5 | 170864021 | 5:170864021:G:T | rs4073717 | T | Intronic | FGF18 | -0.101
22 | 42070374 | 22:42070374:A:C | rs41311445 | C | UTR | NHP2L1 | -0.098
1 | 184023529 | 1:184023529:A:C | rs1046934 | C | PAVs | TSEN15 | 0.098
15 | 89257734 | 15:89257734:G:C | rs72765638 | C | Others |  | -0.098
3 | 11643465 | 3:11643465:T:C | rs2276749 | C | PAVs | VGLL4 | 0.097
15 | 99194896 | 15:99194896:C:G | rs2871865 | G | Intronic | IGF1R | -0.096
1 | 103379918 | 1:103379918:G:A | rs3753841 | A | PAVs | COL11A1 | -0.094
4 | 103188709 | 4:103188709:C:T | rs13107325 | T | PAVs | SLC39A8 | 0.094
6 | 34618893 | 6:34618893:G:A | rs2814993 | A | Intronic | C6orf106 | 0.093
15 | 84582124 | 15:84582124:G:T | rs4842838 | T | PAVs | ADAMTSL3 | 0.093
14 | 50923249 | 14:50923249:C:T | rs12881869 | T | PAVs | MAP4K5 | -0.092
5 | 77441299 | 5:77441299:C:A | rs10755299 | A | Intronic | AP3B1 | -0.091
6 | 117490664 | 6:117490664:T:C | rs1405212 | C | Others |  | 0.091
2 | 27730940 | 2:27730940:T:C | rs1260326 | C | PAVs | GCKR | 0.091
3 | 129020778 | 3:129020778:A:G | rs6765930 | G | PAVs | HMCES | 0.091
X | 75887014 | X:75887014:C:T | rs6607798 | T | Others |  | 0.090
12 | 670595 | 12:670595:G:A | rs36078145 | A | PAVs | B4GALNT3 | 0.090
5 | 172983457 | 5:172983457:G:T | rs4867721 | T | Others |  | -0.090
8 | 130723728 | 8:130723728:A:G | rs4733724 | G | Others |  | -0.089
5 | 168256240 | 5:168256240:G:A | rs4282339 | A | Intronic | SLIT3 | -0.089
6 | 130341235 | 6:130341235:T:C | rs113898003 | C | Intronic | L3MBTL3 | -0.088
15 | 89356832 | 15:89356832:G:T | rs11633371 | T | Intronic | ACAN | 0.088
6 | 169463997 | 6:169463997:C:A | rs12526105 | A | Others | RP3-495K2.1 | 0.086
7 | 37939840 | 7:37939840:T:C | rs10276139 | C | PAVs | NME8 | 0.086
2 | 218583945 | 2:218583945:C:T | rs7562385 | T | Intronic | DIRC3 | 0.086
1 | 176816437 | 1:176816437:T:C | rs10753141 | C | Others | PAPPA2 | 0.085
6 | 142694049 | 6:142694049:G:A | rs9403383 | A | Intronic | GPR126 | -0.085
3 | 141106063 | 3:141106063:T:C | rs7632381 | C | Others | ZBTB38 | 0.085
19 | 12224172 | 19:12224172:T:C | rs8109273 | C | PTVs | ZNF788 | 0.084
4 | 82165790 | 4:82165790:T:C | rs994014 | C | Others |  | 0.084
1 | 26756856 | 1:26756856:T:C | rs17261915 | C | Others | DHDDS | -0.083
19 | 34706203 | 19:34706203:G:A | rs36006556 | A | PAVs | LSM14A | 0.083
11 | 68562328 | 11:68562328:C:T | rs2229738 | T | PAVs | CPT1A | -0.082
6 | 7797591 | 6:7797591:T:C | rs10498671 | C | Intronic | BMP6 | 0.082
4 | 145565116 | 4:145565116:G:A | rs7680661 | A | Intronic | HHIP-AS1 | 0.082
2 | 56096892 | 2:56096892:A:G | rs3791679 | G | Intronic | EFEMP1 | -0.081
11 | 75276178 | 11:75276178:A:C | rs606452 | C | Intronic | SERPINH1 | -0.080
10 | 79580976 | 10:79580976:G:A | rs41274586 | A | PAVs | DLG5 | -0.079
2 | 218310340 | 2:218310340:C:T | rs966423 | T | Intronic | DIRC3 | 0.079
6 | 139829917 | 6:139829917:G:A | rs17513809 | A | Others |  | -0.079
1 | 51440093 | 1:51440093:C:T | rs12855 | T | UTR | CDKN2C | 0.079
1 | 150551327 | 1:150551327:G:A | rs11580946 | A | PAVs | MCL1 | 0.078
1 | 155640115 | 1:155640115:C:T | rs41264945 | T | PAVs | YY1AP1 | -0.078
1 | 26728849 | 1:26728849:G:A | rs79954464 | A | Others |  | 0.078
6 | 81040867 | 6:81040867:C:T | rs10455370 | T | Intronic | BCKDHB | -0.078
12 | 69827658 | 12:69827658:G:T | rs10748128 | T | Others |  | 0.078
7 | 19530219 | 7:19530219:T:C | rs12540801 | C | Intronic | AC007091.1 | 0.078
6 | 7720059 | 6:7720059:G:A | rs12198986 | A | Others |  | 0.077
6 | 7240876 | 6:7240876:G:A | rs41302867 | A | Intronic | RREB1 | -0.075
2 | 33359565 | 2:33359565:G:A | rs116713089 | A | UTR | LTBP1 | -0.075
2 | 68359897 | 2:68359897:C:T | rs4078978 | T | Intronic | WDR92, RP11-474G23.1 | -0.075
7 | 46577056 | 7:46577056:A:G | rs723149 | G | Others |  | -0.075
6 | 81800395 | 6:81800395:T:G | rs310404 | G | Others |  | 0.074
14 | 70633411 | 14:70633411:C:T | rs41286548 | T | PAVs | SLC8A3 | -0.074
1 | 218596461 | 1:218596461:A:G | rs6657275 | G | Intronic | TGFB2 | 0.074
12 | 94121314 | 12:94121314:T:C | rs12817549 | C | Intronic | RP11-887P2.5, CRADD | -0.073
12 | 66241372 | 12:66241372:C:T | rs2583929 | T | Intronic | HMGA2 | 0.073
3 | 67426281 | 3:67426281:T:C | rs35494829 | C | PAVs | SUCLG2 | -0.073
11 | 68201295 | 11:68201295:C:T | rs3736228 | T | PAVs | LRP5 | -0.073

There is no guarantee that our iPGS model selects causal variants. We show the top 100 variants with the largest absolute effect size (BETA). To see all 44543 variants included in our iPGS model, please download the iPGS coefficients by clicking the download button. We use the GRCh37/hg19 reference genome.
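For readers who want to reproduce a "top N by absolute BETA" selection from a downloaded coefficient table, a minimal sketch follows. The column names "ID" and "BETA", the tab-delimited layout, and the function name are assumptions made for illustration; the actual file format is described on the FAQ page. The four example rows reuse values from the table above.

```python
import csv
import io


def top_variants(rows, n):
    """Return the n rows with the largest absolute BETA, largest first."""
    return sorted(rows, key=lambda r: abs(float(r["BETA"])), reverse=True)[:n]


# Hypothetical tab-delimited excerpt; column names are assumed, not confirmed.
example_tsv = (
    "ID\tBETA\n"
    "rs41271299\t0.218\n"
    "rs28407189\t-0.219\n"
    "rs28929474\t0.286\n"
    "rs112892337\t0.230\n"
)

rows = list(csv.DictReader(io.StringIO(example_tsv), delimiter="\t"))

for row in top_variants(rows, 3):
    print(row["ID"], row["BETA"])
```

Sorting on the absolute value keeps variants with large negative coefficients, such as rs28407189, in the selection.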


Follow-up analysis

There are several ways to use the resource in your research. First, you may use our iPGS coefficients and compute individual-level polygenic scores for your cohort. Second, you may also investigate the genetic variants with non-zero coefficients and their annotated genes to learn more about biology by taking advantage of the sparsity of our iPGS models. For your convenience, here we suggest several resources as an example of follow-up analysis. We do not intend to cover all the relevant follow-up analyses.

Using iPGS coefficients

By clicking the download button above, you may download the iPGS coefficients. Our FAQ page describes the file format and how you may use iPGS coefficients in your research.
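As a minimal sketch of the genotype part of such a score, an individual-level PGS can be written as the sum over variants of BETA times the dosage of the effect allele. The dictionary layout, the function name, and the choice to skip variants missing from the genotype data are assumptions made for illustration, and the dosages are invented; the three coefficients are taken from the table above. In practice, established tools handle allele matching and missing data more carefully.

```python
def polygenic_score(betas, dosages):
    """Weighted sum of effect-allele dosages.

    betas:   dict mapping variant -> coefficient (BETA)
    dosages: dict mapping variant -> effect-allele dosage (0 to 2)
    Variants absent from `dosages` are skipped (an assumption of this sketch).
    """
    return sum(beta * dosages[v] for v, beta in betas.items() if v in dosages)


# Coefficients for three variants from the table above.
betas = {
    "14:94844947:C:T": 0.286,
    "8:135614553:G:C": 0.230,
    "15:89400680:A:G": -0.219,
}

# Invented dosages for one hypothetical individual. Each value must count
# copies of the effect allele listed in the table, not the other allele.
dosages = {
    "14:94844947:C:T": 1,
    "8:135614553:G:C": 2,
    "15:89400680:A:G": 0,
}

print(polygenic_score(betas, dosages))
```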

HaploReg

HaploReg is a tool for exploring annotations of the non-coding genome at variants on haplotype blocks. The button above submits the top 100 genetic variants with the largest absolute value of coefficients as a query to HaploReg using the default parameters in HaploReg v4.2 (LD threshold r^2 >= 1, ChromHMM 15-state model, SiPhy-omega, and GENCODE genes). HaploReg's ability to browse haplotypes is useful here as there is no guarantee that our iPGS model selects causal variants. The 'top 100 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. Please check Ward and Kellis. Nucleic Acids Res. 2012 and Ward and Kellis. Nucleic Acids Res. 2016 for more information on HaploReg.

GREAT

GREAT (Genomic Regions Enrichment of Annotations Tool) evaluates enrichment of pathway and ontology terms. Because GREAT maps non-coding genetic variants to their downstream target genes, it is suitable for investigating pathway and ontology enrichment of the genetic variants selected in our sparse iPGS model. The button above submits the top 1000 genetic variants with the largest absolute value of coefficients as a query to GREAT using the default parameters in GREAT v4.0.4. The 'top 1000 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. Please check McLean et al. Nat Biotechnol. 2010 and Tanigawa*, Dyer*, and Bejerano. PLoS Comput Biol. 2022 for more information on GREAT.
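If you would like to prepare a GREAT query by hand with a different cutoff, GREAT accepts genomic regions in BED format, which uses 0-based, half-open coordinates. The sketch below converts variant strings of the form CHROM:POS:REF:ALT (1-based GRCh37/hg19 positions, as in the table above) into single-base BED lines. The function name, the "chr" prefix, and the use of the variant string as the name field are assumptions of this sketch, and it is intended for single-nucleotide variants only.

```python
def variant_to_bed(variant):
    """Convert 'CHROM:POS:REF:ALT' (1-based POS) to a single-base BED line."""
    chrom, pos, _ref, _alt = variant.split(":")
    pos = int(pos)
    # BED intervals are 0-based and half-open: [pos - 1, pos).
    return f"chr{chrom}\t{pos - 1}\t{pos}\t{variant}"


# The first three variants from the table above.
variants = [
    "14:94844947:C:T",
    "8:135614553:G:C",
    "15:89400680:A:G",
]

for v in variants:
    print(variant_to_bed(v))
```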

Single-cell RNA-seq

For anthropometric traits, it may be relevant to investigate the single-cell expression profiling data in adipose-muscle tissues. Please check Single Cell Metab Browser from Yang*, Vamvini*, Nigro* et al. Cell Metab. 2022 as an example of such resources.


References