Power of Inclusion: enhancing polygenic prediction with admixed individuals

Tanigawa and Kellis. Am J Hum Genet. (2023).


Phenotype: Hematocrit percentage


Hematocrit % iPGS coefficients

Our FAQ page describes the file format and how you may use the iPGS coefficients in your research.


iPGS prediction in the held-out test set individuals

We compared the polygenic prediction from our iPGS model with the phenotype values in the held-out test set individuals in UK Biobank. Note that the number of individuals differs across the five population groups.

Figure (white British): /static/data/tanigawakellis2023/per_trait/INI30030/INI30030.WB.PGS_vs_phe.png
Figure (Non-British white): /static/data/tanigawakellis2023/per_trait/INI30030/INI30030.NBW.PGS_vs_phe.png
Figure (South Asian): /static/data/tanigawakellis2023/per_trait/INI30030/INI30030.SA.PGS_vs_phe.png
Figure (African): /static/data/tanigawakellis2023/per_trait/INI30030/INI30030.Afr.PGS_vs_phe.png
Figure (Others): /static/data/tanigawakellis2023/per_trait/INI30030/INI30030.others.PGS_vs_phe.png

Predictive performance

Population | Model | Metric | Predictive performance | 95% CI | P-value
white British | Covariate-only model | R^2 | 0.333 | [0.327, 0.339] | <1.0x10^-300
white British | Genotype-only model | R^2 | 0.067 | [0.064, 0.071] | <1.0x10^-300
white British | Full model (covariates and genotypes) | R^2 | 0.400 | [0.394, 0.405] | <1.0x10^-300
Non-British white | Covariate-only model | R^2 | 0.367 | [0.339, 0.395] | 1.3x10^-281
Non-British white | Genotype-only model | R^2 | 0.058 | [0.041, 0.074] | 2.3x10^-38
Non-British white | Full model (covariates and genotypes) | R^2 | 0.426 | [0.398, 0.453] | <1.0x10^-300
South Asian | Covariate-only model | R^2 | 0.390 | [0.351, 0.429] | 1.4x10^-155
South Asian | Genotype-only model | R^2 | 0.033 | [0.015, 0.051] | 3.9x10^-12
South Asian | Full model (covariates and genotypes) | R^2 | 0.426 | [0.388, 0.464] | 2.5x10^-174
African | Covariate-only model | R^2 | 0.406 | [0.363, 0.448] | 2.3x10^-132
African | Genotype-only model | R^2 | 0.016 | [0.002, 0.030] | 1.5x10^-05
African | Full model (covariates and genotypes) | R^2 | 0.413 | [0.370, 0.455] | 2.4x10^-135
Others | Covariate-only model | R^2 | 0.359 | [0.342, 0.375] | <1.0x10^-300
Others | Genotype-only model | R^2 | 0.049 | [0.039, 0.058] | 6.3x10^-86
Others | Full model (covariates and genotypes) | R^2 | 0.404 | [0.387, 0.420] | <1.0x10^-300

The predictive performance (R^2), its 95% confidence interval (CI), and statistical significance (P-value) are shown for each population in the UK Biobank held-out test set. The "Model" column indicates whether the predictive performance comes from the covariate terms alone (covariate-only model), the PGS terms alone (genotype-only model), or the full model containing both PGS and covariate terms. We used the following covariates in our analysis: age, sex, age^2, age*sex, Townsend deprivation index, and genotype PCs (PC1-PC18). Please refer to our publication for a more detailed description of the methods.
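As a rough illustration of the metric, the sketch below computes R^2 as the squared Pearson correlation between predicted and observed phenotype values. This page does not state the exact estimator, so that definition is an assumption, and the function name `r_squared` and the toy numbers are hypothetical. Please refer to the publication for the procedure actually used, including how the confidence intervals and P-values were obtained.

```python
def r_squared(predicted, observed):
    """Squared Pearson correlation between predictions and observed values.

    Assumption: R^2 is defined as the squared correlation. The publication
    may use a different estimator.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("inputs must not be constant")
    return cov * cov / (var_p * var_o)


# Toy example (made-up numbers, not UK Biobank data).
toy_prediction = [1.0, 2.0, 3.0, 4.0]
toy_phenotype = [1.0, 3.0, 2.0, 4.0]
toy_r2 = r_squared(toy_prediction, toy_phenotype)
```

The same function could be applied separately to the covariate-only, genotype-only, and full-model predictions within each population group.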


Coefficients (BETA) of PGS models

Figure (coefficients of the PGS model): /static/data/tanigawakellis2023/per_trait/INI30030/INI30030.BETAs.png

We show the coefficients (BETA) of the PGS model. Our iPGS model selected 20,106 variants with non-zero coefficients. The genetic variants with the largest absolute coefficient values are annotated in the plot. There is no guarantee that our iPGS model selects causal variants. Genomic coordinates use the GRCh37/hg19 reference genome.

The top 100 genetic variants with the largest absolute value of coefficients

CHROM | POS | Variant | Variant ID | Effect allele | Consequence | Gene symbol | Beta
7 | 100275444 | 7:100275444:G:A | rs62482253 | A | PAVs | GNB2 | -0.284
10 | 71093392 | 10:71093392:C:T | rs16926246 | T | Intronic | HK1 | 0.212
6 | 26091179 | 6:26091179:C:G | rs1799945 | G | PAVs | HFE | 0.188
6 | 135419018 | 6:135419018:T:C | rs9399137 | C | Intronic | HBS1L | -0.186
7 | 151415041 | 7:151415041:A:G | rs10224002 | G | Intronic | PRKAG2 | -0.158
22 | 37469590 | 22:37469590:C:T | rs387907018 | T | PAVs | TMPRSS6 | 0.156
6 | 25918225 | 6:25918225:T:C | rs80215559 | C | Intronic | SLC17A2 | 0.150
12 | 111884608 | 12:111884608:T:C | rs3184504 | C | PAVs | SH2B3 | -0.145
2 | 46353166 | 2:46353166:A:G | rs10495928 | G | Intronic | PRKCE | -0.141
19 | 41306650 | 19:41306650:C:T | rs61750953 | T | PAVs | EGLN2 | -0.115
15 | 43820717 | 15:43820717:C:T | rs55707100 | T | PAVs | MAP1A | -0.109
9 | 136131022 | 9:136131022:C:T | rs8176751 | T | Others | ABO | 0.106
10 | 71099913 | 10:71099913:T:C | rs7072268 | C | Intronic | HK1 | 0.104
4 | 103188709 | 4:103188709:C:T | rs13107325 | T | PAVs | SLC39A8 | 0.104
6 | 43941137 | 6:43941137:T:C | rs17287978 | C | Others |  | -0.103
9 | 136153875 | 9:136153875:C:T | rs651007 | T | Others | ABO | -0.099
1 | 155178782 | 1:155178782:A:T | rs760077 | T | PAVs | MTX1 | 0.096
1 | 214180118 | 1:214180118:G:A | rs726334 | A | Intronic | PROX1 | 0.095
10 | 71112843 | 10:71112843:G:T | rs10998738 | T | Intronic | HK1 | -0.091
12 | 48512285 | 12:48512285:C:A | rs4760682 | A | PAVs | PFKM | -0.090
11 | 30754837 | 11:30754837:G:A | rs55733296 | A | Others |  | -0.082
6 | 25842951 | 6:25842951:T:G | rs1408272 | G | Intronic | SLC17A3 | 0.079
20 | 43042364 | 20:43042364:C:T | rs1800961 | T | PAVs | HNF4A | 0.078
1 | 154965113 | 1:154965113:G:T | rs7535144 | T | PAVs | FLAD1 | -0.077
15 | 76298132 | 15:76298132:A:G | rs4886755 | G | PCVs | NRG4 | -0.076
22 | 46364161 | 22:46364161:G:A | rs9330813 | A | Intronic | WNT7B | 0.074
3 | 56771251 | 3:56771251:A:C | rs3772219 | C | PAVs | ARHGEF3 | -0.074
4 | 55394172 | 4:55394172:C:T | rs218237 | T | Others |  | -0.073
16 | 88853729 | 16:88853729:C:T | rs837763 | T | Others | PIEZO1 | -0.071
9 | 130622946 | 9:130622946:T:C | rs4837197 | C | Others |  | -0.071
17 | 59456589 | 17:59456589:C:T | rs9895661 | T | Others | BCAS3 | -0.070
2 | 46315516 | 2:46315516:C:G | rs71422190 | G | Intronic | PRKCE | -0.068
19 | 33754548 | 19:33754548:C:T | rs78744187 | T | Others |  | 0.067
2 | 120848049 | 2:120848049:C:T | rs28930677 | T | PAVs | EPB41L5 | 0.066
15 | 76232982 | 15:76232982:C:T | rs2648437 | T | Intronic | NRG4 | -0.066
6 | 43801582 | 6:43801582:C:T | rs12660375 | T | Others |  | -0.065
7 | 100361675 | 7:100361675:G:A | rs2293767 | A | PAVs | ZAN | 0.064
2 | 113972945 | 2:113972945:A:G | rs752590 | G | Intronic | PAX8-AS1 | 0.063
19 | 807442 | 19:807442:G:C | rs123698 | C | Others | PTBP1 | 0.062
X | 8916646 | X:8916646:A:C | rs17307280 | C | Others |  | 0.062
19 | 41328365 | 19:41328365:C:T | rs11879672 | T | Intronic | CYP2F2P, CTC-490E21.12 | -0.062
19 | 41381683 | 19:41381683:G:T | rs58682606 | T | PTVs | CYP2A7 | -0.060
6 | 43806609 | 6:43806609:G:A | rs881858 | A | Others |  | 0.059
1 | 147282508 | 1:147282508:C:T | rs11240129 | T | Others |  | 0.058
14 | 64703593 | 14:64703593:G:T | rs1256061 | T | Intronic | ESR2 | -0.057
1 | 231488524 | 1:231488524:C:T | rs2437150 | T | PAVs | SPRTN | 0.057
2 | 46359079 | 2:46359079:A:G | rs17034610 | G | Intronic | PRKCE | -0.055
7 | 100315517 | 7:100315517:A:G | rs1734907 | G | Others | EPO | -0.055
19 | 19379549 | 19:19379549:C:T | rs58542926 | T | PAVs | TM6SF2 | 0.054
1 | 46493460 | 1:46493460:T:G | rs1707336 | G | PAVs | MAST2 | -0.054
1 | 48030350 | 1:48030350:G:A | rs11583750 | A | Others |  | -0.053
21 | 16339172 | 21:16339172:G:C | rs2229742 | C | PAVs | NRIP1 | -0.053
4 | 77410318 | 4:77410318:C:A | rs4859682 | A | Intronic | SHROOM3 | 0.053
7 | 1274582 | 7:1274582:G:A | rs4724799 | A | Intronic | UNCX | -0.051
20 | 57597970 | 20:57597970:A:C | rs463312 | C | PAVs | TUBB1 | 0.051
22 | 44324727 | 22:44324727:C:G | rs738409 | G | PAVs | PNPLA3 | 0.051
15 | 58723426 | 15:58723426:A:G | rs1077835 | G | Intronic | LIPC, ALDH1A2 | -0.051
9 | 6356657 | 9:6356657:C:T | rs7021445 | T | Others |  | 0.050
21 | 35358762 | 21:35358762:G:A | rs2834322 | A | Others |  | -0.050
11 | 30749090 | 11:30749090:T:C | rs963837 | C | Others |  | 0.049
16 | 30103160 | 16:30103160:C:A | rs3809627 | A | UTR | TBX6 | 0.049
2 | 121991967 | 2:121991967:C:T | rs11677172 | T | Intronic | TFCP2L1 | 0.048
2 | 44017188 | 2:44017188:T:C | rs3792020 | C | Intronic | DYNC2LI1 | 0.048
17 | 43924073 | 17:43924073:T:C | rs12373123 | C | PAVs | SPPL2C | 0.048
12 | 13046211 | 12:13046211:C:T | rs12811512 | T | Intronic | GPRC5A | -0.048
2 | 46306040 | 2:46306040:G:A | rs113797384 | A | Intronic | PRKCE | -0.048
19 | 2170764 | 19:2170764:T:C | rs2108524 | C | Intronic | DOT1L | 0.047
1 | 10424252 | 1:10424252:C:T | rs11121548 | T | Intronic | KIF1B | 0.047
20 | 61038322 | 20:61038322:C:T | rs6121609 | T | Others | GATA5 | 0.047
7 | 150930363 | 7:150930363:C:T | rs73169668 | T | UTR | CHPF2 | -0.046
6 | 43736389 | 6:43736389:A:C | rs699947 | C | Others | VEGFA | -0.046
3 | 69870595 | 3:69870595:A:G | rs1529585 | G | Intronic | MITF | 0.046
13 | 29230581 | 13:29230581:A:G | rs1340817 | G | Others | POMP | -0.046
1 | 29320013 | 1:29320013:G:A | rs111642750 | A | PAVs | EPB41 | 0.045
11 | 10174447 | 11:10174447:C:T | rs7129531 | T | Intronic | RP11-748C4.1, SBF2 | -0.045
4 | 23736523 | 4:23736523:A:G | rs16874052 | G | Intronic | RP11-380P13.1 | 0.044
17 | 4835895 | 17:4835895:T:C | rs2243093 | C | PAVs | GP1BA | 0.043
16 | 72144174 | 16:72144174:T:C | rs9302635 | C | Intronic | DHX38 | -0.043
16 | 86339661 | 16:86339661:C:T | rs11649143 | T | Others |  | -0.043
6 | 31106501 | 6:31106501:C:CC | Affx-89026413 | CC | PTVs | PSORS1C1 | -0.043
15 | 33324519 | 15:33324519:T:C | rs17816699 | C | Intronic | FMN1 | 0.043
8 | 145112983 | 8:145112983:C:T | rs55916375 | T | PAVs | OPLAH | 0.042
6 | 12295876 | 6:12295876:A:G | rs1629862 | G | Intronic | EDN1 | -0.042
3 | 12359049 | 3:12359049:G:A | rs2067819 | A | Intronic | PPARG | -0.042
12 | 2518352 | 12:2518352:T:C | rs4765929 | C | Intronic | CACNA1C | 0.042
14 | 36158828 | 14:36158828:T:C | rs7155504 | C | Others | RALGAPA1 | -0.042
2 | 177042633 | 2:177042633:A:C | rs2072590 | C | Others | HOXD-AS1, AC009336.24 | 0.042
12 | 57843711 | 12:57843711:G:A | rs2229357 | A | PAVs | INHBC | 0.041
8 | 116559435 | 8:116559435:A:G | rs3808434 | G | Intronic | TRPS1 | 0.040
13 | 114549015 | 13:114549015:T:C | rs6602910 | C | Intronic | GAS6 | 0.040
7 | 129663496 | 7:129663496:C:T | rs11556924 | T | PAVs | ZC3HC1 | -0.040
16 | 16259596 | 16:16259596:G:A | rs41278174 | A | PAVs | ABCC6 | 0.040
1 | 28564279 | 1:28564279:A:G | rs9508 | G | PAVs | ATPIF1 | 0.039
8 | 9185146 | 8:9185146:T:C | rs2126259 | C | Intronic | RP11-115J16.1 | -0.039
3 | 58410554 | 3:58410554:C:T | rs34579268 | T | PAVs | PXK | -0.039
20 | 25282967 | 20:25282967:C:T | rs746748 | T | PAVs | ABHD12 | -0.039
2 | 46372675 | 2:46372675:A:G | rs10495930 | G | Intronic | PRKCE | -0.039
4 | 88052219 | 4:88052219:T:C | rs342467 | C | PAVs | AFF1 | -0.039
6 | 31238217 | HLA-C*0501 | HLA-C*0501 | + | PAVs | HLA-C | -0.039
22 | 38877461 | 22:38877461:T:G | rs12004 | G | PAVs | KDELR3 | -0.039

There is no guarantee that our iPGS model selects causal variants. The table shows the 100 variants with the largest absolute coefficient values (BETA). To see all 20,106 variants included in our iPGS model, please download the iPGS coefficients by clicking the download button. Genomic coordinates use the GRCh37/hg19 reference genome.
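For readers who want a different cutoff than the top 100, the ranking can be sketched as follows. The in-memory record layout (dictionaries with `variant` and `beta` keys) and the function name `top_variants` are hypothetical; the downloadable coefficient file has its own format, described on the FAQ page.

```python
def top_variants(rows, n=100):
    """Return the n rows with the largest absolute coefficient (BETA).

    rows: list of dicts with "variant" and "beta" keys (hypothetical layout).
    Rows with a zero coefficient are dropped before ranking.
    """
    nonzero = [row for row in rows if row["beta"] != 0.0]
    return sorted(nonzero, key=lambda row: abs(row["beta"]), reverse=True)[:n]


# Three coefficients copied from the table above, plus a zero for illustration.
example_rows = [
    {"variant": "6:26091179:C:G", "beta": 0.188},
    {"variant": "7:100275444:G:A", "beta": -0.284},
    {"variant": "hypothetical:unselected", "beta": 0.0},
    {"variant": "10:71093392:C:T", "beta": 0.212},
]
top_two = top_variants(example_rows, n=2)
```

Ranking by absolute value matters because large negative coefficients, such as the GNB2 variant, would otherwise sort last.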


Follow-up analysis

There are several ways to use this resource in your research. First, you may use our iPGS coefficients to compute individual-level polygenic scores for your cohort. Second, because our iPGS models are sparse, you may investigate the genetic variants with non-zero coefficients and their annotated genes to learn more about the underlying biology. For your convenience, we suggest several resources below as examples of follow-up analysis; the list is not exhaustive.

Using iPGS coefficients

You may download the iPGS coefficients by clicking the download button above. Our FAQ page describes the file format and how you may use the iPGS coefficients in your research.
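As a minimal sketch of the first use case, a genotype-only polygenic score is the sum, over variants, of the coefficient (BETA) times the number of copies of the effect allele an individual carries. The data structures, the function name `polygenic_score`, and the sample dosages below are hypothetical; the actual file format is described on the FAQ page, and covariate terms from the full model are not included here.

```python
def polygenic_score(coefficients, dosages):
    """Sum of BETA * effect-allele dosage over the variants in the model.

    coefficients: dict mapping variant -> BETA (hypothetical in-memory form).
    dosages: dict mapping variant -> copies of the effect allele (0 to 2).
    Assumption: variants missing from `dosages` are skipped. Imputing a
    mean dosage instead is another reasonable choice.
    """
    score = 0.0
    for variant, beta in coefficients.items():
        dosage = dosages.get(variant)
        if dosage is None:
            continue
        score += beta * dosage
    return score


# Three coefficients copied from the table above (GRCh37/hg19 coordinates).
coefficients = {
    "7:100275444:G:A": -0.284,
    "10:71093392:C:T": 0.212,
    "6:26091179:C:G": 0.188,
}

# Made-up genotypes for two hypothetical individuals.
cohort = {
    "sample_1": {"7:100275444:G:A": 1, "10:71093392:C:T": 2, "6:26091179:C:G": 0},
    "sample_2": {"10:71093392:C:T": 1},
}

scores = {sample: polygenic_score(coefficients, d) for sample, d in cohort.items()}
```

Dosages must count the effect allele listed for each variant, on the same reference genome build; counting the other allele would flip the sign of that variant's contribution.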

HaploReg

HaploReg is a tool for exploring annotations of the non-coding genome at variants on haplotype blocks. The button above submits the top 100 genetic variants with the largest absolute value of coefficients as a query to HaploReg using the default parameters in HaploReg v4.2 (LD threshold r^2 >= 1, ChromHMM 15-state model, SiPhy-omega, and GENCODE genes). HaploReg's ability to browse haplotypes is useful here as there is no guarantee that our iPGS model selects causal variants. The 'top 100 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. Please check Ward and Kellis. Nucleic Acids Res. 2012 and Ward and Kellis. Nucleic Acids Res. 2016 for more information on HaploReg.

GREAT

GREAT (Genomic Regions Enrichment of Annotations Tool) evaluates enrichment of pathway and ontology terms. Because GREAT maps non-coding genetic variants to their downstream target genes, it is well suited to investigating pathway and ontology enrichment of the genetic variants selected in our sparse iPGS model. The button above submits the top 1000 genetic variants with the largest absolute value of coefficients as a query to GREAT using the default parameters in GREAT v4.0.4. The 'top 1000 variant' cutoff is an arbitrary threshold; we aim to demonstrate how one may investigate the selected variants. Please check McLean et al. Nat Biotechnol. 2010 and Tanigawa*, Dyer*, and Bejerano. PLoS Comput Biol. 2022 for more information on GREAT.
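To run a query with a different cutoff by hand, the variants can be written as BED lines, the region format GREAT accepts. The sketch below converts 1-based GRCh37/hg19 positions into 0-based, half-open BED intervals. The function name `variant_to_bed` is hypothetical, and this is not necessarily how the button on this page builds its query. Treating every variant as a single-base interval is an assumption that suits SNVs but only approximates indels and HLA allele entries.

```python
def variant_to_bed(chrom, pos, name):
    """Format one variant as a tab-separated BED line (chrom, start, end, name).

    BED intervals are 0-based and half-open, so a variant at 1-based
    position `pos` becomes the interval [pos - 1, pos).
    """
    end = int(pos)
    start = end - 1
    return f"chr{chrom}\t{start}\t{end}\t{name}"


# Three variants copied from the table above.
example_variants = [
    ("7", 100275444, "rs62482253"),
    ("10", 71093392, "rs16926246"),
    ("X", 8916646, "rs17307280"),
]
bed_lines = [variant_to_bed(chrom, pos, name) for chrom, pos, name in example_variants]
bed_text = "\n".join(bed_lines)
```

When submitting such a file, select the hg19 assembly so that the coordinates match the reference genome used on this page.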


References