Input data

Requirement on #SNPs and #individuals

For the desirable properties of the h2-GRE estimator to hold, h2-GRE requires individual-level genotype and phenotype data in which the number of individuals (N) is larger than the number of genotyped SNPs on each chromosome (m_k for the k-th chromosome).

As a reference point, in our analyses of the UK Biobank genotyped SNPs (UK Biobank Axiom Array), chromosome 1 (the largest chromosome) had about 45-55K SNPs, depending on how we QC-ed the data, while the number of individuals was about 300K.
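The requirement N > m_k can be checked before running anything else. The sketch below is not part of h2-GRE; it only assumes the standard PLINK text formats, where the `.bim` file has one line per SNP with the chromosome code in the first column and the `.fam` file has one line per individual. The function names are hypothetical.

```python
from collections import Counter


def count_snps_per_chrom(bim_path):
    """Count SNPs per chromosome in a PLINK .bim file (chromosome code is column 1)."""
    counts = Counter()
    with open(bim_path) as f:
        for line in f:
            fields = line.split()
            if fields:  # skip blank lines
                counts[fields[0]] += 1
    return counts


def count_individuals(fam_path):
    """Count individuals in a PLINK .fam file (one non-empty line per individual)."""
    with open(fam_path) as f:
        return sum(1 for line in f if line.strip())


def chromosomes_violating_requirement(bim_path, fam_path):
    """Return {chromosome: m_k} for every chromosome where N > m_k does not hold."""
    n = count_individuals(fam_path)
    snp_counts = count_snps_per_chrom(bim_path)
    return {chrom: m_k for chrom, m_k in snp_counts.items() if m_k >= n}
```

An empty dictionary from `chromosomes_violating_requirement("all_bfile.bim", "all_bfile.fam")` would indicate that every chromosome satisfies the requirement; the file names here are placeholders for your own data.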

Prepare the data

The genotype and phenotype data should be in PLINK format. We will also use PLINK to perform the OLS regression.

!!! note
    Here we assume that you know the basics of PLINK, e.g., its basic commands and data formats. You can read more on the PLINK website about covariate files, phenotype files, and association analysis.

We will use the following data.

all_bfile: the genotypes of the unrelated UK Biobank individuals after quality control. This is a single PLINK file set containing genome-wide data (not split by chromosome).
pheno_file: the phenotype file.
covar_file: the covariate file used in the association analysis.
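To make the roles of these three inputs concrete, here is a minimal sketch of how they might be passed to PLINK for an OLS association run. It assumes PLINK 1.9 and its `--bfile`, `--pheno`, `--covar`, `--linear hide-covar`, `--allow-no-sex`, `--chr`, and `--out` options. The helper name, the per-chromosome restriction, and all paths are hypothetical illustrations rather than the exact command used by h2-GRE.

```python
def build_plink_ols_command(all_bfile, pheno_file, covar_file, out_prefix,
                            chrom=None, plink="plink"):
    """Assemble a PLINK 1.9 argument list for linear (OLS) association testing."""
    cmd = [
        plink,
        "--bfile", all_bfile,       # prefix of the .bed/.bim/.fam file set
        "--pheno", pheno_file,      # phenotype file
        "--covar", covar_file,      # covariate file
        "--linear", "hide-covar",   # OLS regression; report only the SNP terms
        "--allow-no-sex",           # keep phenotypes of samples without sex codes
        "--out", out_prefix,        # prefix for PLINK output files
    ]
    if chrom is not None:
        # Optional: restrict the analysis to a single chromosome.
        cmd += ["--chr", str(chrom)]
    return cmd


# Example with placeholder paths; pass the list to subprocess.run(cmd, check=True)
# to actually invoke PLINK.
cmd = build_plink_ols_command("all_bfile", "pheno_file", "covar_file",
                              "assoc_chr1", chrom=1)
```

Building the command as a list, rather than a single shell string, avoids quoting problems when paths contain spaces.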
