Open Access

Graph Theoretic Techniques for Clustering and Biclustering gene expression data.

Author(s) - Prangyaparamita Mohapatra, Tripti Swarnkar

Publication year - 2012

Publication title - International Journal of Computer and Communication Technology

Language(s) - English

Resource type - Journals

eISSN - 2231-0371

pISSN - 0975-7449

DOI - 10.47893/ijcct.2012.1136

Subject(s) - cluster analysis, biclustering, data mining, computer science, DNA microarray, consensus clustering, expression (computer science), correlation clustering, clustering high dimensional data, graph, biological data, partition (number theory), computational biology, gene, CURE data clustering algorithm, gene expression, bioinformatics, biology, mathematics, artificial intelligence, genetics, theoretical computer science, combinatorics, programming language

DNA microarray technology has made it possible to simultaneously monitor the expression levels of thousands of genes during biological processes and across collections of related samples. However, the large number of genes and the complexity of biological networks make it much harder to comprehend and interpret the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process for revealing natural structures and identifying interesting patterns in the underlying data. Cluster analysis seeks to partition a given data set into groups based on specified features, so that the data points within a group are more similar to each other than to the points in different groups. Many conventional clustering algorithms have been adapted or directly applied to gene expression data, and new algorithms aimed specifically at gene expression data have recently been proposed. These clustering algorithms have proven useful for identifying biologically relevant groups of genes and samples, and a large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results obtained by applying standard clustering methods to genes are limited. This limitation arises because there are a number of experimental conditions under which the activity of genes is uncorrelated. A similar limitation exists when conditions are clustered. For this reason, a number of algorithms that cluster the row and column dimensions of the gene expression matrix simultaneously have been proposed to date. This simultaneous clustering, usually referred to as biclustering, seeks to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. This type of algorithm has also been proposed and used in other fields, such as information retrieval and data mining. In this paper, we first briefly introduce the concepts of microarray technology and discuss the basic elements of clustering on gene expression data. Then, we present specific challenges pertinent to each clustering category and introduce several representative approaches.
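The motivation for biclustering given in the abstract can be illustrated with a small synthetic example. The sketch below is not taken from the paper: the data, the number of conditions, the choice of Pearson correlation as the similarity measure, and the names `pearson`, `coherent`, `gene_a`, and `gene_b` are all assumptions made for illustration. It builds two genes that follow a common pattern under only a small subset of conditions and compares their correlation over all conditions with their correlation over that subset.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n_conditions = 200
# Hypothetical column set of a bicluster: the first 10 conditions.
coherent = np.arange(10)

# Start from independent noise, so the two genes are unrelated by default.
gene_a = rng.normal(size=n_conditions)
gene_b = rng.normal(size=n_conditions)

# Under the coherent conditions only, both genes follow a shared pattern
# with a small amount of measurement noise.
pattern = np.linspace(-3.0, 3.0, coherent.size)
gene_a[coherent] = pattern + rng.normal(scale=0.1, size=coherent.size)
gene_b[coherent] = pattern + rng.normal(scale=0.1, size=coherent.size)

# Similarity as a standard gene clustering method would see it (all columns)
# versus similarity restricted to the coherent subset of conditions.
r_all = pearson(gene_a, gene_b)
r_sub = pearson(gene_a[coherent], gene_b[coherent])

print(f"correlation over all {n_conditions} conditions: {r_all:.2f}")
print(f"correlation over the {coherent.size} coherent conditions: {r_sub:.2f}")
```

In this toy setting the correlation over the coherent subset is close to 1, while the correlation over all conditions is diluted by the uncorrelated columns. A method that compares genes across every condition may therefore fail to group them, which is the gap that searching jointly over rows and columns is meant to close.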
