Regression Models for Count Data and Variable Selection Methods for Metabolomics Data

Presented by: University of Calgary
Category: Other Event
Price: $0
Date: March 14, 2014
Address: 2500 University Drive NW, Calgary, Alberta T2N 1N4
Website: http://www.ucalgary.ca/

This talk has two speakers:

Assessing the impact of using different regression models for count data
Colin Weaver
Departments of Mathematics and Statistics, University of Calgary

Abstract: Various regression models are available for count outcomes, including Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial. In these models, the effect of a variable on the count outcome is summarized by a rate ratio (RR). We are interested in how the choice of regression model affects the estimate of the RR and its standard error. We also wish to test a recently explored method for estimating the overall RR in zero-inflated models. Hospital stay data from dialysis patients and simulated data are used.

Performance of Four Variable Selection Methods for Metabolomics Data
Danny Lu
Departments of Mathematics and Statistics, University of Calgary

Abstract: Metabolomics is a relatively new field that studies small molecules found within biological systems and that can be used to develop better disease diagnostic approaches. The main issue in analyzing these data is that the number of variables measured is often much larger than the number of observations, which can lead to high variability, misidentification, and overfitting. Variable selection is therefore needed to overcome these problems. Selection methods can improve estimation accuracy and model interpretability and reduce the computational cost of analyzing the data. This study compared the performance of four feature selection methods currently used with metabolomics data: Student t-tests, Lasso, Elastic Net, and Variable Importance in Projection (VIP). Simulation studies with varying parameters were run and evaluated using the partial area under the receiver operating characteristic (ROC) curve. The results suggest that the non-regularized methods (t-test and VIP) are the top-performing methods across all parameter settings.
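As a rough illustration of the rate ratio mentioned in the first abstract, the sketch below computes the RR for a single binary covariate under a plain Poisson model. In that special case the maximum likelihood estimate of the RR is the ratio of the two group mean counts, and the standard error of log(RR) is approximately sqrt(1/total_1 + 1/total_0). The function name and the counts are hypothetical, invented for this example. They are not the speakers' code and not the dialysis hospital stay data.

```python
import math


def poisson_rate_ratio(counts_exposed, counts_unexposed):
    """Rate ratio for one binary covariate under a simple Poisson model.

    Hypothetical helper for illustration only. Assumes equal follow-up
    time per subject (no offset) and no other covariates.
    """
    total_1 = sum(counts_exposed)
    total_0 = sum(counts_unexposed)
    if total_1 == 0 or total_0 == 0:
        raise ValueError("each group needs at least one event")

    # With a single binary covariate, the Poisson MLE of each group's
    # rate is simply that group's mean count.
    rate_1 = total_1 / len(counts_exposed)
    rate_0 = total_0 / len(counts_unexposed)
    rr = rate_1 / rate_0

    # Large-sample standard error of log(RR) under the Poisson assumption.
    se_log_rr = math.sqrt(1.0 / total_1 + 1.0 / total_0)

    # 95% Wald interval, built on the log scale and transformed back.
    z = 1.959963984540054
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, se_log_rr, (lower, upper)


# Made-up counts (e.g. number of hospital stays per patient in two groups).
stays_group_a = [0, 2, 1, 4, 3, 2]
stays_group_b = [0, 1, 0, 2, 1, 2]

rr, se_log_rr, ci = poisson_rate_ratio(stays_group_a, stays_group_b)
print(f"RR = {rr:.2f}, SE(log RR) = {se_log_rr:.2f}, "
      f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

This Poisson standard error relies on the variance equalling the mean. When counts are overdispersed or contain excess zeros, that assumption fails, which is one reason the choice among the models listed in the abstract can change the reported standard error.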

Location:

MS 431 Math Science Building

Speaker:

Colin Weaver and Danny Lu, Departments of Mathematics and Statistics, University of Calgary

More information at http://www.ucalgary.ca/events/calendar/regression-models-count-data-and-variable-selection-methods-metabolomics-data

