Statistical Methods

Wilks’ dissimilarity for gene clustering: computational issues


Clustering methods are widely used in the analysis of gene expression data because they can uncover coordinated expression profiles. One important goal of clustering is to discover co-regulated genes, since it has been postulated that co-regulation implies a similar function. In the context of agglomerative hierarchical clustering, a dissimilarity measure based on the Wilks’ Λ statistic, called the Wilks’ dissimilarity, was previously introduced and shown to be useful in the identification of transcription modules. In this paper, we discuss the ability of the Wilks’ dissimilarity to identify clusters of co-expressed genes by providing an example in which the most commonly used dissimilarity measures fail. Furthermore, we carry out a set of simulations to investigate the use of a sparse canonical correlation technique in the estimation of the Wilks’ dissimilarity, and we provide guidelines for its use.
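
The abstract does not state the formula for the Wilks’ dissimilarity. As a rough illustration only, the sketch below computes the classical Wilks’ Λ for independence between two sets of variables, Λ = det(S) / (det(S11) det(S22)), which equals the product of (1 − ρ_i²) over the canonical correlations ρ_i. Treating this Λ directly as the dissimilarity between two gene clusters is an assumption made here, not a definition taken from the paper. The function name `wilks_lambda` and the samples-by-genes layout are likewise hypothetical.

```python
import numpy as np


def wilks_lambda(X, Y):
    """Wilks' Lambda for independence between two variable sets.

    X: array of shape (n_samples, p), e.g. expression of p genes in one cluster.
    Y: array of shape (n_samples, q), e.g. expression of q genes in another.
    Returns a value in [0, 1]: near 0 when the two sets are strongly
    linearly related, near 1 when they are uncorrelated.

    Assumption: n_samples > p + q, so the sample covariance is nonsingular.
    This plain estimator breaks down when that does not hold.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    p = X.shape[1]

    # Joint sample covariance of the stacked variables (columns = variables).
    S = np.cov(np.hstack([X, Y]), rowvar=False)
    S11 = S[:p, :p]
    S22 = S[p:, p:]

    # Log-determinants are used for numerical stability.
    _, logdet_joint = np.linalg.slogdet(S)
    _, logdet_11 = np.linalg.slogdet(S11)
    _, logdet_22 = np.linalg.slogdet(S22)
    return float(np.exp(logdet_joint - logdet_11 - logdet_22))
```

When the number of genes exceeds the number of samples, the determinants above are degenerate, which is presumably the setting in which a sparse canonical correlation estimate, as investigated in the paper, becomes relevant.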

EBPH Epidemiology, Biostatistics and Public Health | ISSN 2282-0930

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.