
Journal of Biometrics & Biostatistics

Volume 4, Issue 1 (2013)

Research Article

Statistical Methods for Estimating Within-Cluster Effects for Clustered Poisson Data

Dexiang Gao, Gary K. Grunwald and Stanley Xu

Clustered Poisson data frequently appear in medical research. Interest often focuses on examination of an exposure effect within clusters. The objective of this paper is to compare the performance of six methods for estimating the exposure effect for clustered Poisson data: 1) independent Poisson; 2) fixed cluster effects Poisson; 3) conditional likelihood Poisson estimation; 4) Generalized Estimating Equations (GEE); 5) random cluster effects Poisson; and 6) random cluster effects Poisson, with separate between- and within-cluster effects. Biases and standard errors of within-cluster exposure effects are compared across the six statistical methods considering constant or varying exposure ratio (number of exposed to unexposed subjects), constant or varying cluster sizes, different within-cluster exposure effects, different cluster variances, and number of clusters. Simulations and theoretical results show that exposure ratio is a key quantity. With constant exposure ratio designs, maximum likelihood estimates and asymptotic standard errors were obtained in closed form. All models, except GEE, give equivalent estimates and standard errors of the within-cluster exposure effect. With varying exposure ratio designs, conditional likelihood and fixed cluster effects methods yield the same estimates and standard errors for the exposure effect. Results from the random cluster effects Poisson model with separate between- and within-cluster effects are very similar to those from fixed cluster effects Poisson and conditional Poisson methods. We applied the above approaches to birth cohort data to analyze the incidence of Respiratory Syncytial Virus (RSV) infection in young children in Indonesia.
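As an illustration of why the exposure ratio matters, the conditional likelihood approach can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' code: it assumes a binary exposure, a log link with a cluster-specific intercept, and equal follow-up per subject, so that conditional on a cluster's total count the exposed count is binomial. The function names and the toy data are hypothetical. With a constant exposure ratio the conditional score equation has a closed-form root; with varying ratios it is solved numerically.

```python
import math

# Each cluster: (n_exposed, n_unexposed, events_exposed, events_unexposed).
# Assumed model (not taken from the paper): y_ij ~ Poisson(exp(a_i + b * x_ij)).

def closed_form_log_rate_ratio(clusters, ratio):
    """Root of the conditional score when n_exposed / n_unexposed == ratio in every cluster."""
    y1 = sum(c[2] for c in clusters)
    y0 = sum(c[3] for c in clusters)
    return math.log((y1 / y0) / ratio)

def conditional_log_rate_ratio(clusters, lo=-20.0, hi=20.0, tol=1e-10):
    """Solve the conditional score equation by bisection (valid for varying ratios too)."""
    def score(beta):
        total = 0.0
        for n1, n0, y1, y0 in clusters:
            # Probability that an event in this cluster falls on an exposed subject.
            p = n1 * math.exp(beta) / (n1 * math.exp(beta) + n0)
            total += y1 - (y1 + y0) * p
        return total
    # The score is decreasing in beta, so bisection brackets its single root.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical data with a constant exposure ratio of 1:2 in every cluster.
clusters = [(2, 4, 6, 3), (1, 2, 4, 5), (3, 6, 10, 8)]
beta_closed = closed_form_log_rate_ratio(clusters, ratio=0.5)
beta_numeric = conditional_log_rate_ratio(clusters)
```

On this toy data both routes agree, which matches the abstract's point that constant exposure ratio designs admit closed-form estimates. The bisection requires at least one event among exposed and one among unexposed subjects so that a finite root exists.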
Research Article

Diagnostic Utility of Gene Expression Profiles

Chengjie Xiong, Yan Yan and Feng Gao

Two crucial problems arise from a microarray experiment in which the primary objective is to locate differentially expressed genes for the diagnosis of diseases such as cancer and Alzheimer’s. The first problem is the detection of a subset of genes that provides optimum discriminatory power between diseased and normal subjects, and the second problem is the statistical estimation of the discriminatory power of that optimum subset of genes between the two groups of subjects. We develop a new method to select an optimum subset of discriminatory genes by searching over possible linear combinations of gene expression profiles and locating the one that provides the maximum discriminatory power between two sources of RNA, as measured by the area under the receiver operating characteristic (ROC) curve. We further provide an estimate of the optimum discriminatory power between the diseased and the healthy subjects over the selected subsets of genes. The proposed stepwise approach takes into account the gene-to-gene correlations in the estimation of discriminating power, as well as the associated variability, and allows the number of genes to be selected based on the increment of the discriminating power. Finally, the proposed methodology is applied to a benchmark microarray experiment and compared to the results obtained through existing approaches in the literature.
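The quantity being maximized, the area under the ROC curve of a linear combination of expression profiles, can be illustrated with the empirical (Mann-Whitney) estimate. The sketch below only evaluates one fixed combination; it is not the authors' stepwise search or their estimator, and the function names, weights, and toy profiles are hypothetical.

```python
def linear_score(profile, weights):
    """Weighted sum of one subject's gene expression values."""
    return sum(w * x for w, x in zip(weights, profile))

def empirical_auc(diseased_scores, healthy_scores):
    """Fraction of (diseased, healthy) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased_scores) * len(healthy_scores))

def auc_of_combination(diseased_profiles, healthy_profiles, weights):
    """Empirical AUC of the linear combination defined by `weights`."""
    d = [linear_score(p, weights) for p in diseased_profiles]
    h = [linear_score(p, weights) for p in healthy_profiles]
    return empirical_auc(d, h)

# Hypothetical two-gene profiles for three diseased and three healthy subjects.
diseased_profiles = [[2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
healthy_profiles = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
auc = auc_of_combination(diseased_profiles, healthy_profiles, weights=(1.0, 1.0))
```

A search over candidate weight vectors or gene subsets would call `auc_of_combination` repeatedly and keep the best value. An AUC maximized this way is optimistically biased on the same data, which is one reason a separate estimate of the discriminatory power is needed.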
Research Article

Contaminated Chi-Square Modeling and Large-Scale ANOVA Testing

Richard Charnigo, Feng Zhou and Hongying Dai

We propose a convenient moment-based procedure for testing the omnibus null hypothesis of no contamination of a central chi-square distribution by a non-central chi-square distribution. In sharp contrast with likelihood ratio tests for mixture models, there is no need for re-sampling or random field theory to obtain critical values. Rather, critical values are available from an asymptotic normal distribution, and there is excellent agreement between nominal and actual significance levels. This procedure may be used to model numerous chi-square statistics, obtained via monotonic transformations of F statistics, from large-scale ANOVA testing, such as that encountered in microarray data analysis. In that context, modeling chi-square statistics instead of p-values may improve detection of differential gene expression, as we demonstrate through simulation studies, while also reducing false declarations of the same, as we illustrate in a case study on aging and cognition. Our procedure may also be incorporated into a gene filtration process, which may reduce Type II errors on genewise null hypotheses by justifying lighter controls for Type I errors.
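The abstract does not spell out the test statistic, so the following is only a simplified first-moment check in the same spirit, not the authors' procedure. It assumes the statistics are independent and that the degrees of freedom `df` are known. Under the null every statistic is central chi-square with mean `df` and variance `2 * df`, so the standardized sample mean is approximately standard normal for large samples, and contamination by a non-central component shifts the mean upward. The function name is hypothetical.

```python
import math

def mean_shift_z_test(chi2_stats, df):
    """One-sided z-test that the mean of the statistics exceeds df.

    Illustrative stand-in only: relies on the central limit theorem and
    on the central chi-square moments (mean df, variance 2 * df).
    """
    n = len(chi2_stats)
    mean = sum(chi2_stats) / n
    z = math.sqrt(n) * (mean - df) / math.sqrt(2.0 * df)
    # Upper-tail probability of a standard normal at z.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value
```

Critical values here come straight from the normal distribution, so no re-sampling is involved. A test that uses only the first moment has less power than one that uses more of the distribution's shape.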
