Power of Inclusion: enhancing polygenic prediction with admixed individuals

Tanigawa and Kellis. Am J Hum Genet. (2023).


Phenotype: Standing height


Height iPGS coefficients

Our FAQ page shows the description of the file format and how you may use iPGS coefficients in your research.


iPGS prediction in the held-out test set individuals

We compared the polygenic prediction from our iPGS model against the observed phenotype values using the held-out test set individuals in UK Biobank. Note that the number of individuals differs across the five population groups.

/static/data/tanigawakellis2023/per_trait/INI50/INI50.WB.PGS_vs_phe.png
/static/data/tanigawakellis2023/per_trait/INI50/INI50.NBW.PGS_vs_phe.png
/static/data/tanigawakellis2023/per_trait/INI50/INI50.SA.PGS_vs_phe.png
/static/data/tanigawakellis2023/per_trait/INI50/INI50.Afr.PGS_vs_phe.png
/static/data/tanigawakellis2023/per_trait/INI50/INI50.others.PGS_vs_phe.png

Predictive performance

Population | Model | Metric | Predictive Performance | 95% CI | P-value
white British | Covariate-only model | R^2 | 0.539 | [0.534, 0.544] | <1.0x10^-300
white British | Genotype-only model | R^2 | 0.180 | [0.175, 0.186] | <1.0x10^-300
white British | Full model (covariates and genotypes) | R^2 | 0.719 | [0.715, 0.723] | <1.0x10^-300
Non-British white | Covariate-only model | R^2 | 0.540 | [0.516, 0.565] | <1.0x10^-300
Non-British white | Genotype-only model | R^2 | 0.210 | [0.184, 0.237] | 6.5x10^-150
Non-British white | Full model (covariates and genotypes) | R^2 | 0.728 | [0.712, 0.745] | <1.0x10^-300
South Asian | Covariate-only model | R^2 | 0.580 | [0.548, 0.613] | 8.2x10^-275
South Asian | Genotype-only model | R^2 | 0.083 | [0.056, 0.110] | 3.6x10^-29
South Asian | Full model (covariates and genotypes) | R^2 | 0.669 | [0.642, 0.697] | <1.0x10^-300
African | Covariate-only model | R^2 | 0.489 | [0.449, 0.529] | 2.7x10^-175
African | Genotype-only model | R^2 | 0.028 | [0.010, 0.046] | 6.8x10^-09
African | Full model (covariates and genotypes) | R^2 | 0.503 | [0.464, 0.543] | 2.6x10^-182
Others | Covariate-only model | R^2 | 0.552 | [0.537, 0.566] | <1.0x10^-300
Others | Genotype-only model | R^2 | 0.151 | [0.137, 0.165] | 1.6x10^-285
Others | Full model (covariates and genotypes) | R^2 | 0.699 | [0.688, 0.710] | <1.0x10^-300

The predictive performance (R^2), its 95% confidence interval (CI), and statistical significance (P-value) are shown for each UK Biobank population group in the held-out test set. The "Model" column indicates whether the predictive performance comes from the covariate terms alone (covariate-only model), the PGS terms alone (genotype-only model), or the full model containing both PGS and covariate terms. We used the following covariates in our analysis: age, sex, age^2, age*sex, Townsend deprivation index, and genotype PCs (PC1-PC18). Please refer to our publication for a more detailed description of the methods.
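As a rough illustration of how an R^2 value like those above could be computed for one model, the sketch below uses the squared Pearson correlation between predicted and observed phenotype values. This is an assumption made for illustration: the exact definition used in the study, and the procedure for the confidence intervals and P-values, are described in the publication and are not reproduced here. The function name `r_squared` and the toy numbers are hypothetical.

```python
def r_squared(predicted, observed):
    """Squared Pearson correlation between predictions and observed values.

    Assumption: R^2 is taken as the squared correlation; the study may
    define it differently (see the publication).
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("R^2 is undefined when either input is constant")
    return (cov * cov) / (var_p * var_o)


# Toy values only (not from UK Biobank): heights in cm and two score components.
observed_height = [160.0, 172.0, 168.0, 181.0, 175.0]
covariate_score = [162.0, 170.0, 166.0, 178.0, 176.0]  # hypothetical covariate-only prediction
pgs = [-0.8, 0.3, 0.1, 1.2, -0.2]                       # hypothetical genotype-only score
full_score = [c + g for c, g in zip(covariate_score, pgs)]

for label, score in [("covariate-only", covariate_score),
                     ("genotype-only", pgs),
                     ("full", full_score)]:
    print(label, round(r_squared(score, observed_height), 3))
```

Evaluating the three score variants separately mirrors the three rows reported per population group in the table.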


Coefficients (BETA) of PGS models

/static/data/tanigawakellis2023/per_trait/INI50/INI50.BETAs.png

We show the coefficients (BETA) of the PGS models. Our iPGS model selected 62419 variants with non-zero coefficients. Genetic variants with large absolute coefficient values are annotated in the plot. There is no guarantee that our iPGS model selects causal variants. We use the GRCh37/hg19 reference genome.

The top 100 genetic variants with the largest absolute value of coefficients

CHROM | POS | Variant | Variant ID | Effect allele | Consequence | Gene symbol | Beta
14 | 94844947 | 14:94844947:C:T | rs28929474 | T | PAVs | SERPINA1 | 0.790
8 | 135614553 | 8:135614553:G:C | rs112892337 | C | PAVs | ZFAT | 0.768
6 | 19839415 | 6:19839415:C:T | rs41271299 | T | Intronic | ID4 | 0.698
6 | 34163292 | 6:34163292:G:A | rs74841643 | A | Others | KRT18P9 | 0.611
15 | 100692953 | 15:100692953:G:A | rs72755233 | A | PAVs | ADAMTS17 | -0.576
11 | 67024534 | 11:67024534:C:T | rs7952436 | T | UTR | KDM2A | -0.576
19 | 55879672 | 19:55879672:C:T | rs4252548 | T | PAVs | IL11 | -0.527
2 | 233094868 | 2:233094868:C:T | rs78198962 | T | Intronic | DIS3L2 | 0.498
15 | 89400680 | 15:89400680:A:G | rs28407189 | G | PAVs | ACAN | -0.486
5 | 127668685 | 5:127668685:G:T | rs78727187 | T | PAVs | FBN2 | 0.476
6 | 34214322 | 6:34214322:C:G | rs1150781 | G | PAVs | C6orf1 | -0.472
13 | 50722895 | 13:50722895:C:A | rs1326122 | A | Intronic | DLEU1 | 0.460
20 | 34025756 | 20:34025756:A:G | rs143384 | G | UTR | GDF5 | 0.425
4 | 145982625 | 4:145982625:G:A | rs116662954 | A | Intronic | ANAPC10 | -0.415
3 | 185548683 | 3:185548683:G:A | rs720390 | A | Others |  | 0.410
19 | 55993436 | 19:55993436:G:T | rs147110934 | T | PAVs | ZNF628 | -0.397
1 | 176521655 | 1:176521655:G:A | rs10913200 | A | Intronic | PAPPA2 | -0.393
2 | 20205541 | 2:20205541:C:T | rs52826764 | T | PAVs | MATN3 | -0.386
15 | 67457698 | 15:67457698:A:G | rs35874463 | G | PAVs | SMAD3 | 0.380
4 | 87730980 | 4:87730980:C:T | rs61730641 | T | PAVs | PTPN13 | -0.372
2 | 56096892 | 2:56096892:A:G | rs3791679 | G | Intronic | EFEMP1 | -0.358
8 | 57138676 | 8:57138676:T:G | rs36112366 | G | Others | RP11-140I16.3 | -0.331
15 | 101718239 | 15:101718239:C:G | rs62621400 | G | PAVs | CHSY1 | -0.316
4 | 145659064 | 4:145659064:T:C | rs11727676 | C | PCVs | HHIP | 0.311
1 | 149902342 | 1:149902342:C:T | rs145659444 | T | PAVs | MTMR11 | 0.305
15 | 84582124 | 15:84582124:G:T | rs4842838 | T | PAVs | ADAMTSL3 | 0.289
4 | 1341553 | 4:1341553:A:G | rs111391498 | G | PAVs | UVSSA | -0.283
9 | 78515195 | 9:78515195:A:G | rs35650604 | G | Intronic | PCSK5 | -0.283
1 | 51440093 | 1:51440093:C:T | rs12855 | T | UTR | CDKN2C | 0.281
1 | 149906413 | 1:149906413:T:C | rs11205303 | C | PAVs | MTMR11 | 0.276
12 | 66359752 | 12:66359752:C:A | rs8756 | A | UTR | HMGA2 | -0.276
18 | 20735408 | 18:20735408:T:C | rs4369779 | C | Intronic | CABLES1 | 0.275
1 | 41618297 | 1:41618297:G:A | rs114233776 | A | PAVs | SCMH1 | -0.275
16 | 1828030 | 16:1828030:G:A | rs35816944 | A | PAVs | SPSB3 | -0.265
12 | 93978504 | 12:93978504:G:T | rs11107116 | T | Others | SOCS2 | 0.260
12 | 46764254 | 12:46764254:A:G | rs74385174 | G | Intronic | SLC38A2 | -0.256
3 | 148714249 | 3:148714249:G:C | rs143137713 | C | PAVs | GYG1 | -0.255
11 | 2810731 | 11:2810731:C:T | rs2237886 | T | Intronic | KCNQ1 | 0.253
20 | 57459868 | 20:57459868:T:C | rs76602912 | C | Others | RP1-309F20.3 | -0.252
4 | 145572046 | 4:145572046:T:C | rs7675744 | C | Others | HHIP | 0.251
1 | 150551327 | 1:150551327:G:A | rs11580946 | A | PAVs | MCL1 | 0.249
4 | 17952208 | 4:17952208:C:T | rs6845078 | T | Intronic | LCORL | -0.247
12 | 77439823 | 12:77439823:G:C | rs61754233 | C | PAVs | E2F7 | 0.246
19 | 41117300 | 19:41117300:G:A | rs34093919 | A | PAVs | LTBP4 | -0.244
4 | 73515313 | 4:73515313:T:C | rs7697556 | C | Others |  | -0.241
11 | 75276178 | 11:75276178:A:C | rs606452 | C | Intronic | SERPINH1 | -0.239
1 | 184023529 | 1:184023529:A:C | rs1046934 | C | PAVs | TSEN15 | 0.235
4 | 145565116 | 4:145565116:G:A | rs7680661 | A | Intronic | HHIP-AS1 | 0.233
5 | 176516631 | 5:176516631:G:A | rs1966265 | A | PAVs | FGFR4 | 0.233
3 | 51192126 | 3:51192126:G:A | rs4256170 | A | Intronic | DOCK3 | 0.230
12 | 103140344 | 12:103140344:C:T | rs79747671 | T | Others |  | -0.230
6 | 142669338 | 6:142669338:A:G | rs9496346 | G | Intronic | GPR126 | -0.228
4 | 82165790 | 4:82165790:T:C | rs994014 | C | Others |  | 0.227
12 | 121756084 | 12:121756084:G:A | rs13141 | A | PAVs | ANAPC5 | -0.224
4 | 5016883 | 4:5016883:G:A | rs11722554 | A | PAVs | CYTL1 | -0.222
3 | 11643465 | 3:11643465:T:C | rs2276749 | C | PAVs | VGLL4 | 0.222
5 | 32766822 | 5:32766822:T:C | rs3811966 | C | Intronic | NPR3 | 0.220
6 | 41877671 | 6:41877671:G:A | rs114056237 | A | Intronic | MED20 | -0.220
9 | 96948863 | 9:96948863:A:G | rs2768641 | G | Intronic | RP11-2B6.3, MIRLET7DHG | 0.220
2 | 232984200 | 2:232984200:G:A | rs16828632 | A | Intronic | DIS3L2 | -0.219
2 | 33359565 | 2:33359565:G:A | rs116713089 | A | UTR | LTBP1 | -0.218
17 | 59497277 | 17:59497277:A:G | rs757608 | G | Others |  | -0.218
3 | 129020778 | 3:129020778:A:G | rs6765930 | G | PAVs | HMCES | 0.217
15 | 99194896 | 15:99194896:C:G | rs2871865 | G | Intronic | IGF1R | -0.212
10 | 79580976 | 10:79580976:G:A | rs41274586 | A | PAVs | DLG5 | -0.208
9 | 98380222 | 9:98380222:G:A | rs817300 | A | Others |  | -0.205
5 | 168256240 | 5:168256240:G:A | rs4282339 | A | Intronic | SLIT3 | -0.203
12 | 69827658 | 12:69827658:G:T | rs10748128 | T | Others |  | 0.202
X | 38009121 | X:38009121:G:A | rs35318931 | A | PAVs | SRPX | -0.201
10 | 5019320 | 10:5019320:C:T | rs6650153 | T | Intronic | AKR1C1 | -0.199
6 | 76164589 | 6:76164589:C:A | rs12209223 | A | Intronic | FILIP1 | 0.198
2 | 56040099 | 2:56040099:T:C | rs10199082 | C | Others |  | 0.198
3 | 141125186 | 3:141125186:A:G | rs1344674 | G | Intronic | ZBTB38 | 0.198
5 | 176637471 | 5:176637471:G:A | rs28932177 | A | PAVs | NSD1 | 0.197
17 | 54694724 | 17:54694724:A:G | rs117223734 | G | Others |  | 0.195
4 | 57797414 | 4:57797414:C:T | rs3796529 | T | PAVs | REST | 0.194
3 | 171969077 | 3:171969077:C:G | rs7652177 | G | PAVs | FNDC3B | 0.192
X | 78649193 | X:78649193:C:T | rs1474563 | T | Others |  | 0.190
12 | 66364509 | 12:66364509:T:C | rs12424086 | C | Others | HMGA2 | -0.190
8 | 120596023 | 8:120596023:A:G | rs10283100 | G | PAVs | ENPP2 | 0.189
10 | 12943973 | 10:12943973:C:T | rs12779328 | T | Intronic | CCDC3 | -0.187
19 | 8429323 | 19:8429323:G:A | rs116843064 | A | PAVs | ANGPTL4 | 0.186
6 | 130341235 | 6:130341235:T:C | rs113898003 | C | Intronic | L3MBTL3 | -0.185
2 | 97566051 | 2:97566051:C:T | rs114707893 | T | Intronic | FAM178B | -0.184
12 | 66241898 | 12:66241898:G:A | rs7961706 | A | Intronic | HMGA2 | -0.182
1 | 218879783 | 1:218879783:C:T | rs72742475 | T | Others |  | 0.178
5 | 172755066 | 5:172755066:C:A | rs148833559 | A | PAVs | STC2 | 0.177
15 | 41989091 | 15:41989091:C:A | rs61736074 | A | PAVs | MGA | -0.175
15 | 72462255 | 15:72462255:C:T | rs34815962 | T | PAVs | GRAMD2 | 0.174
15 | 48684958 | 15:48684958:C:T | rs12592845 | T | Others |  | -0.174
14 | 60976537 | 14:60976537:C:A | rs33912345 | A | PAVs | SIX6 | -0.172
11 | 2787804 | 11:2787804:A:G | rs67004488 | G | Intronic | KCNQ1 | -0.172
13 | 50545357 | 13:50545357:C:A | rs2490637 | A | Others |  | -0.172
3 | 141121814 | 3:141121814:A:C | rs2871960 | C | UTR | ZBTB38 | 0.171
1 | 118864602 | 1:118864602:T:G | rs7536458 | G | Others |  | -0.171
6 | 158910698 | 6:158910698:G:A | rs12206717 | A | PAVs | TULP4 | -0.171
14 | 70633411 | 14:70633411:C:T | rs41286548 | T | PAVs | SLC8A3 | -0.170
15 | 89405052 | 15:89405052:C:A | rs16942383 | A | Intronic | ACAN | -0.170
8 | 135649848 | 8:135649848:G:A | rs12541381 | A | PAVs | ZFAT | -0.169
15 | 74336633 | 15:74336633:T:C | rs5742915 | C | PAVs | PML | 0.169

We show the top 100 variants with the largest absolute coefficients (BETA). To see all 62419 variants included in our iPGS model, please download the iPGS coefficients by clicking the download button. There is no guarantee that our iPGS model selects causal variants. We use the GRCh37/hg19 reference genome.
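The ranking behind a "top N" list like this one can be sketched as a sort on the absolute value of BETA. The snippet below is a minimal illustration, assuming the coefficients have already been loaded as (variant ID, BETA) pairs; the function name `top_variants` is hypothetical, and the file parsing step is omitted because the file format is described on the FAQ page rather than here.

```python
def top_variants(coefficients, n=100):
    """Return the n (variant_id, beta) pairs with the largest |beta|.

    `coefficients` is any iterable of (variant_id, beta) pairs; pairs with
    a zero coefficient are dropped before ranking.
    """
    nonzero = [(variant_id, beta) for variant_id, beta in coefficients if beta != 0.0]
    return sorted(nonzero, key=lambda pair: abs(pair[1]), reverse=True)[:n]


# A few rows taken from the table above, plus one made-up zero-coefficient entry.
example = [
    ("rs143384", 0.425),
    ("rs72755233", -0.576),
    ("rs28929474", 0.790),
    ("hypothetical_variant", 0.0),
]
print(top_variants(example, n=2))
```

Sorting on the absolute value keeps strongly negative coefficients (such as rs72755233) ahead of weaker positive ones.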


Follow-up analysis

There are several ways to use the resource in your research. First, you may use our iPGS coefficients and compute individual-level polygenic scores for your cohort. Second, you may also investigate the genetic variants with non-zero coefficients and their annotated genes to learn more about biology by taking advantage of the sparsity of our iPGS models. For your convenience, here we suggest several resources as an example of follow-up analysis. We do not intend to cover all the relevant follow-up analyses.

Using iPGS coefficients

You may download the iPGS coefficients by clicking the download button above. Our FAQ page describes the file format and how you may use the iPGS coefficients in your research.
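In general terms, an individual-level polygenic score is a weighted sum of effect-allele dosages, with the model coefficients (BETA) as weights. The sketch below illustrates that sum under stated assumptions: coefficients and dosages are held in plain dictionaries keyed by the variant string (for example "14:94844947:C:T"), dosages count copies of the effect allele, and variants missing from an individual's data are skipped. The function name `compute_pgs` and this data layout are hypothetical; the actual file format and recommended workflow are described on the FAQ page.

```python
def compute_pgs(betas, dosages):
    """Weighted sum of effect-allele dosages for one individual.

    betas:   dict mapping variant -> coefficient (BETA)
    dosages: dict mapping variant -> effect-allele dosage (0 to 2)
    Variants absent from `dosages` contribute nothing (an assumption;
    other missing-data strategies, such as mean imputation, exist).
    """
    score = 0.0
    for variant, beta in betas.items():
        dosage = dosages.get(variant)
        if dosage is None:
            continue  # variant not genotyped or imputed for this individual
        score += beta * dosage
    return score


# Two coefficients from the table above; the dosages are made up.
betas = {"14:94844947:C:T": 0.790, "15:100692953:G:A": -0.576}
dosages = {"14:94844947:C:T": 1, "15:100692953:G:A": 2}
print(compute_pgs(betas, dosages))
```

Before scoring a real cohort, it is worth confirming that the cohort's genotypes use the same reference genome (GRCh37/hg19) and that dosages are counted for the same effect allele as listed in the coefficient file.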

HaploReg

HaploReg is a tool for exploring annotations of the non-coding genome at variants on haplotype blocks. The button above submits the top 100 genetic variants with the largest absolute value of coefficients as a query to HaploReg using the default parameters in HaploReg v4.2 (LD threshold r^2 >= 1, ChromHMM 15-state model, SiPhy-omega, and GENCODE genes). HaploReg's ability to browse haplotypes is useful here because there is no guarantee that our iPGS model selects causal variants. The 'top 100 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. For more information on HaploReg, please see Ward and Kellis. Nucleic Acids Res. 2012 and Ward and Kellis. Nucleic Acids Res. 2016.

GREAT

GREAT (Genomic Regions Enrichment of Annotations Tool) evaluates the enrichment of pathway and ontology terms. GREAT's ability to map non-coding genetic variants to their downstream target genes makes it suitable for investigating pathway and ontology enrichment of the genetic variants selected in our sparse iPGS model. The button above submits the top 1000 genetic variants with the largest absolute value of coefficients as a query to GREAT using the default parameters in GREAT v4.0.4. The 'top 1000 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. For more information on GREAT, please see McLean et al. Nat Biotechnol. 2010 and Tanigawa*, Dyer*, and Bejerano. PLoS Comput Biol. 2022.
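If you prefer to prepare a query yourself, for example with a different cutoff, region-based tools such as GREAT commonly take genomic regions in BED format. The sketch below converts a variant's CHROM and POS into a single-base BED line. It assumes that POS is a 1-based coordinate, so the BED start is POS - 1 (BED intervals are 0-based and half-open), and that chromosome names need a "chr" prefix. The function name `to_bed_line` is hypothetical, and this is not necessarily how the button above builds its query.

```python
def to_bed_line(chrom, pos, name):
    """Format one variant as a tab-separated, single-base BED line.

    Assumes `pos` is 1-based; BED uses 0-based, half-open intervals,
    so a single base at `pos` spans [pos - 1, pos).
    """
    if pos < 1:
        raise ValueError("pos must be a positive 1-based coordinate")
    chrom_name = chrom if str(chrom).startswith("chr") else f"chr{chrom}"
    return f"{chrom_name}\t{pos - 1}\t{pos}\t{name}"


# Three variants from the table above (GRCh37/hg19 coordinates).
variants = [
    ("14", 94844947, "rs28929474"),
    ("8", 135614553, "rs112892337"),
    ("X", 38009121, "rs35318931"),
]
bed_lines = [to_bed_line(chrom, pos, name) for chrom, pos, name in variants]
print("\n".join(bed_lines))
```

Because the coordinates on this page are GRCh37/hg19, the matching genome assembly should be selected in whichever tool receives the BED lines.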

Single-cell RNA-seq

For anthropometric traits, it may be relevant to investigate the single-cell expression profiling data in adipose-muscle tissues. Please check Single Cell Metab Browser from Yang*, Vamvini*, Nigro* et al. Cell Metab. 2022 as an example of such resources.


References