R Software Packages

If you are a student or researcher who analyzes genetic and genomic data, or a methodologist developing methods of analysis for such data, please download the software developed by our group. Most methods are implemented as R packages.

Associations in high dimensional data

- Functions and algorithms to identify influential observations in high-dimensional regression settings.
- R package for high-dimensional errors-in-variables regression. It implements the CoCoLasso algorithm for settings with additive error or missing data in the covariates, along with a variant called Block-Descent CoCoLasso (BD-CoCoLasso) for settings where only a small percentage of the features are corrupted (by additive error or missing data).
- Construction of a new instrumental variable that minimizes horizontal pleiotropy in the context of Mendelian randomization | Citation:
- A mixed model in which the fixed effects can be high-dimensional and penalized (L1), and the random-effects covariance may be constructed using some of the features that are also included among the fixed effects. For example, SNP fixed effects can be estimated simultaneously while adjusting for family relationships using a kinship matrix constructed from overlapping SNPs. See
- Sparse additive interaction learning. An efficient penalized model for interactions between one key covariate and a high-dimensional feature space. Sail enforces a strict hierarchy on the interaction terms.
- Principal components of heritability, a dimension-reduction method for a high-dimensional feature space that maximizes the variance explained by covariates | Citation:
- Finding p-values from a double Wishart problem | Citation:
- Tools to model and test the association between multiple genotypes and multiple traits, taking prior biological knowledge into account. The method is based on Generalized Structured Component Analysis (GSCA) | Citation:
- R package for kernel semi-parametric models. Manuscript in preparation.
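
The idea behind lasso regression with additive measurement error can be illustrated with a small sketch. It is written in Python for illustration only and does not use or reproduce the API of any package listed above; the function names, the simulated data, and the assumption that the error standard deviation `tau` is known are all hypothetical. The sketch subtracts the error variance from the diagonal of the Gram matrix before running a coordinate-descent lasso. The published CoCoLasso method additionally replaces the corrected matrix with a nearby positive semidefinite one, a step omitted here because the toy problem is low dimensional.

```python
import random

def soft_threshold(v, lam):
    """Soft-thresholding operator used in lasso coordinate descent."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

def gram_and_rho(Z, y):
    """Return Z'Z/n and Z'y/n for a list-of-rows matrix Z."""
    n, p = len(Z), len(Z[0])
    gram = [[sum(Z[i][j] * Z[i][k] for i in range(n)) / n for k in range(p)]
            for j in range(p)]
    rho = [sum(Z[i][j] * y[i] for i in range(n)) / n for j in range(p)]
    return gram, rho

def lasso_from_moments(gram, rho, lam, n_iter=200):
    """Minimize 0.5*b'Gb - rho'b + lam*|b|_1 by coordinate descent.
    Assumes the diagonal of gram is positive."""
    p = len(rho)
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            r = rho[j] - sum(gram[j][k] * b[k] for k in range(p) if k != j)
            b[j] = soft_threshold(r, lam) / gram[j][j]
    return b

def demo(n=2000, tau=0.5, lam=0.05, seed=1):
    """Simulate covariates observed with additive error (hypothetical data)."""
    rng = random.Random(seed)
    beta = [2.0, 0.0, -1.0]
    X = [[rng.gauss(0, 1) for _ in beta] for _ in range(n)]
    y = [sum(b * x for b, x in zip(beta, row)) + rng.gauss(0, 1) for row in X]
    # Z = X + A: only the error-contaminated covariates are observed.
    Z = [[x + rng.gauss(0, tau) for x in row] for row in X]
    gram, rho = gram_and_rho(Z, y)
    p = len(beta)
    # Subtract the (assumed known) error variance from the diagonal.
    corrected = [[gram[j][k] - (tau ** 2 if j == k else 0.0) for k in range(p)]
                 for j in range(p)]
    naive_fit = lasso_from_moments(gram, rho, lam)
    corrected_fit = lasso_from_moments(corrected, rho, lam)
    return beta, naive_fit, corrected_fit

beta, naive, corrected = demo()
```

In this simulation the naive fit is shrunk toward zero by the measurement error, while the corrected fit lies closer to the simulated coefficients.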

Methods of analysis for DNA methylation data

- Estimating smooth covariate effects on targeted bisulfite sequencing measures of DNA methylation. Manuscript submitted for publication.
- Hidden Markov model for estimating methylation levels and for testing for differentially methylated CpG sites | Citation: Biometrics
- A smoothing method for whole genome bisulfite sequencing data that allows for sequencing errors | Citation:
- Normalization of Illumina beadchip-derived DNA methylation data when data are from multiple tissues or cell types | Citation:
- Functional normalization of 450k methylation array data improves replication in large cancer studies | Citation:
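
As a rough illustration of what smoothing bisulfite sequencing data means, the sketch below computes a kernel-weighted methylation proportion at each CpG site from methylated and total read counts. This is a generic Gaussian kernel smoother chosen for illustration, not the method implemented by any package listed above, and it does not model sequencing errors or covariate effects. The function name, the bandwidth, and the example counts are hypothetical.

```python
import math

def smooth_methylation(positions, methylated, total, bandwidth=100.0):
    """Kernel-weighted methylation proportion at each CpG position.

    Read counts from nearby sites are pooled with Gaussian weights, so
    sites with low coverage borrow strength from their neighbours.
    """
    smoothed = []
    for p0 in positions:
        num = 0.0
        den = 0.0
        for pos, m, n in zip(positions, methylated, total):
            w = math.exp(-0.5 * ((pos - p0) / bandwidth) ** 2)
            num += w * m   # weighted methylated reads
            den += w * n   # weighted total reads
        smoothed.append(num / den if den > 0 else float("nan"))
    return smoothed

# Hypothetical counts for five CpG sites (genomic position in base pairs).
positions = [100, 150, 220, 400, 430]
methylated = [8, 1, 9, 2, 3]
total = [10, 2, 12, 10, 12]

raw = [m / n for m, n in zip(methylated, total)]
smoothed = smooth_methylation(positions, methylated, total)
```

The second site has only two reads, so its raw proportion is unstable; the smoothed value is pulled toward the better-covered neighbouring sites.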

Analysis methods for rare genetic variants

- A method for estimating genome-wide significance thresholds for extremely dense genetic information, such as that obtained from sequencing studies | Citation:
- Multivariate tests of association between rare genetic variants and two or more phenotypes | Citation:
- A suite of tools for rare variant analysis that accommodates non-normal phenotypes and family structure | Citation:
- Now integrated into RVPedigree | Citation:
- Tests for association with rare genetic variants | Citation:

Scripts

- Scripts to prepare PLINK genotype files for imputation on the Sanger imputation server.
- Functions to run a 450K pipeline analysis.
- Statistical analysis and visualization of functional profiles for genes and gene clusters.
- Scripts for performing cell-type mixture adjustments in DNA methylation data | Citation:
- A pipeline to run an analysis with the pcev R package on CBRAIN.

Microbiome Data

- Estimates microbiome OTU co-occurrence networks within two separate groups, where the networks are defined through precision matrices. The difference between the two precision matrices is also estimated, along with corresponding interval estimates. Manuscript submitted for publication.
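
To make the link between precision matrices and networks concrete, the sketch below reads network edges and partial correlations off two given precision matrices and forms their difference. It assumes the precision matrices have already been estimated; it does not perform the estimation or the interval estimation described above, and it is not the API of the listed package. The function names and example matrices are hypothetical.

```python
import math

def partial_correlations(omega):
    """Partial correlation between i and j is -omega_ij / sqrt(omega_ii * omega_jj)."""
    p = len(omega)
    return [[1.0 if i == j
             else -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])
             for j in range(p)] for i in range(p)]

def edges(omega, tol=1e-8):
    """Pairs (i, j), i < j, joined by a nonzero off-diagonal precision entry."""
    p = len(omega)
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(omega[i][j]) > tol}

def precision_difference(omega_a, omega_b):
    """Entrywise difference between two precision matrices."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(omega_a, omega_b)]

# Hypothetical precision matrices for three OTUs in two groups.
omega_a = [[2.0, -1.0, 0.0],
           [-1.0, 2.0, 0.0],
           [0.0, 0.0, 1.0]]
omega_b = [[2.0, 0.0, 0.0],
           [0.0, 2.0, -1.0],
           [0.0, -1.0, 1.0]]

pcor_a = partial_correlations(omega_a)
diff = precision_difference(omega_a, omega_b)
changed = edges(diff)  # pairs whose conditional dependence differs between groups
```

A zero off-diagonal entry in a precision matrix corresponds to no edge between two OTUs given all the others, so nonzero entries of the difference mark pairs whose conditional association differs between the two groups.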

on various useful tools in analysis and research

Presentation by Greg Voisin
Vignette by Greg Voisin
Presentations by Sahir Bhatnagar
by Sahir Bhatnagar

For more information, visit the R project website at:

and


Greenwood Lab

