In genome-wide association studies (GWAS) for thousands of phenotypes in large biobanks, most binary traits have substantially fewer cases than controls. Both widely used approaches, the linear mixed model and the recently proposed logistic mixed model, perform poorly in this setting: they produce inflated type I error rates when used to analyze unbalanced case-control phenotypes. SAIGE is a scalable and accurate generalized mixed model association test that uses the saddlepoint approximation to calibrate the distribution of score test statistics. SAIGE provides accurate P values even when case-control ratios are extremely unbalanced. SAIGE also uses state-of-the-art optimization strategies to reduce computational costs; hence, it is applicable to GWAS for thousands of phenotypes in large biobanks. We would like to benchmark SAIGE on the summit node for potentially scaling up the analysis.
The benchmark would use the “gpu_power9_v100_smx2” queue and the hostname “witherspoon00”.
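
To illustrate why the saddlepoint approximation matters for unbalanced phenotypes, here is a minimal Python sketch. It is not SAIGE's implementation. It assumes independent Bernoulli outcomes with known fitted means mu_i (no random effects, no variance adjustment) and a score statistic S = sum_i g_i * (y_i - mu_i). The tail probability comes from the Barndorff-Nielsen form of the saddlepoint formula, with the saddlepoint found by plain bisection. The function names (normal_upper_tail, cgf_terms, spa_upper_tail) and the toy inputs are hypothetical and chosen only for this example.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def cgf_terms(t, g, mu):
    """K(t), K'(t), K''(t) of S = sum_i g_i * (y_i - mu_i), y_i ~ Bernoulli(mu_i)."""
    k0 = k1 = k2 = 0.0
    for gi, mi in zip(g, mu):
        e = math.exp(gi * t)
        denom = 1.0 - mi + mi * e
        k0 += math.log(denom) - t * gi * mi
        k1 += gi * mi * e / denom - gi * mi
        k2 += gi * gi * mi * (1.0 - mi) * e / (denom * denom)
    return k0, k1, k2


def spa_upper_tail(s, g, mu, tol=1e-10):
    """Saddlepoint approximation to P(S > s) (assumed toy model, see lead-in)."""
    centre = sum(gi * mi for gi, mi in zip(g, mu))
    lower = sum(min(gi, 0.0) for gi in g) - centre
    upper = sum(max(gi, 0.0) for gi in g) - centre
    if not lower < s < upper:
        raise ValueError("s must lie strictly inside the support of S")

    # K' is strictly increasing, so bracket the root of K'(t) = s and bisect.
    lo, hi = -1.0, 1.0
    while cgf_terms(lo, g, mu)[1] > s:
        lo *= 2.0
    while cgf_terms(hi, g, mu)[1] < s:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cgf_terms(mid, g, mu)[1] < s:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    t_hat = 0.5 * (lo + hi)

    # Near the mean the formula is numerically unstable (w -> 0),
    # so fall back to the normal approximation there.
    if abs(t_hat) < 1e-4:
        variance = cgf_terms(0.0, g, mu)[2]
        return normal_upper_tail(s / math.sqrt(variance))

    k0, _, k2 = cgf_terms(t_hat, g, mu)
    w = math.copysign(math.sqrt(max(0.0, 2.0 * (t_hat * s - k0))), t_hat)
    v = t_hat * math.sqrt(k2)
    return normal_upper_tail(w + math.log(v / w) / w)


if __name__ == "__main__":
    # Toy unbalanced setting: 20 carriers, each with a 1% fitted case probability.
    g = [1.0] * 20
    mu = [0.01] * 20
    s = 1.8  # two cases among carriers: 2 - 20 * 0.01
    sd = math.sqrt(cgf_terms(0.0, g, mu)[2])
    print("normal approximation:", normal_upper_tail(s / sd))
    print("saddlepoint approximation:", spa_upper_tail(s, g, mu))
```

In this toy setting the normal approximation gives a far smaller tail probability than the saddlepoint approximation, which is the kind of miscalibration that leads to inflated type I error when cases are rare. A benchmark of SAIGE itself would of course run the actual software rather than this sketch.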