Coclustering of Multidimensional Big Data: A Useful Tool for Genomic, Financial, and Other Data Analysis

Hong Yan*

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Abstract

The analysis of a multidimensional data array is necessary in many applications. Although a data set can be very large, it is possible that meaningful and coherent patterns embedded in the data array are much smaller in size. For example, in genomic data, we may want to find a subset of genes that coexpress under a subset of conditions. In this article, I will explain coclustering algorithms for solving the coherent pattern-detection problem. In these methods, a coherent pattern corresponds to a low-rank matrix or tensor and can be represented as an intersection of hyperplanes in a high-dimensional space. We can then extract coherent patterns from the large data array by detecting hyperplanes. Examples will be provided to demonstrate the effectiveness of the coclustering algorithms for solving unsupervised pattern classification problems.
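The low-rank idea in the abstract can be illustrated with a small sketch. This is not the article's coclustering algorithm: it only plants one coherent block in a random matrix and checks, with a singular value decomposition, that the block is close to rank one while an unrelated block of the same size is not. The matrix sizes, the block location, the noise levels, and the helper names planted_bicluster and rank_one_energy are all assumptions made for the illustration.

```python
import numpy as np


def planted_bicluster(n_rows=60, n_cols=40, seed=0):
    """Return a noisy matrix with one coherent (near rank-one) block planted in it.

    Hypothetical helper for illustration; sizes and block position are arbitrary.
    """
    rng = np.random.default_rng(seed)
    data = rng.normal(0.0, 1.0, size=(n_rows, n_cols))  # unstructured background

    rows = np.arange(10, 25)  # e.g. a subset of genes
    cols = np.arange(5, 17)   # e.g. a subset of conditions

    # A multiplicative coherent pattern: each entry is row_effect * col_effect,
    # so the block is an outer product (rank one) plus a little noise.
    row_effect = rng.uniform(1.0, 3.0, size=rows.size)
    col_effect = rng.uniform(1.0, 3.0, size=cols.size)
    noise = rng.normal(0.0, 0.01, size=(rows.size, cols.size))
    data[np.ix_(rows, cols)] = 5.0 * np.outer(row_effect, col_effect) + noise
    return data, rows, cols


def rank_one_energy(block):
    """Fraction of the squared singular values captured by the largest one."""
    s = np.linalg.svd(block, compute_uv=False)
    return float(s[0] ** 2 / np.sum(s ** 2))


data, rows, cols = planted_bicluster()

# Energy of the planted block versus a same-sized block of pure background.
coherent = rank_one_energy(data[np.ix_(rows, cols)])
background = rank_one_energy(data[np.ix_(np.arange(30, 45), np.arange(20, 32))])

print(f"planted block rank-one energy:    {coherent:.4f}")
print(f"background block rank-one energy: {background:.4f}")
```

In this sketch the block location is known in advance. A coclustering method has to search for the row and column subsets, which is the hard part and is not shown here.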
Original language: English
Pages (from-to): 23-30
Journal: IEEE Systems, Man and Cybernetics Magazine
Volume: 3
Issue number: 2
Online published: 18 Apr 2017
Publication status: Published - Apr 2017
