Effect of feature extraction and feature selection on expression data from epithelial ovarian cancer


Turkeli Y., Ercil A., Sezerman O. U.

25th Annual International Conference of the IEEE-Engineering-in-Medicine-and-Biology-Society, Cancun, Mexico, 17 - 21 September 2003, vol.25, pp.3559-3562

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume: 25
  • DOI: 10.1109/iembs.2003.1280921
  • City of Publication: Cancun
  • Country of Publication: Mexico
  • Pages: pp.3559-3562
  • Keywords: feature extraction, feature selection, ovarian epithelial cells, principal component analysis, tree structured classifiers, GENE-EXPRESSION
  • Affiliated with Acıbadem Mehmet Ali Aydınlar University: No

Abstract

Classifying the gene expression levels of normal and cancerous cells and identifying the genes that contribute most to this distinction offer an alternative means of diagnosis. We investigated the effect of feature extraction and feature selection on the clustering of expression data from two different ovarian cancer data sets. One data set consisted of 2176 transcripts from 30 samples, nine from normal ovarian epithelial cells and 21 from cancerous ones. The other data set had 7129 transcripts from 27 tumor and four normal ovarian tissues. Hierarchical clustering algorithms employing complete-link, average-link and Ward's method were implemented for comparative evaluation. Principal component analysis was applied for feature extraction and resulted in 100% segregation. Feature selection was performed with CARTS software to identify the most distinguishing genes. The selected features clustered the data with 100% success. The results suggest that adopting feature extraction and selection enhances the quality of clustering of gene expression data for ovarian cancer. Identifying the distinguishing genes is a more complex problem that requires combining pathway knowledge with statistical and machine learning methods.
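
The pipeline described in the abstract (principal component analysis for feature extraction, followed by hierarchical clustering with complete-link, average-link and Ward's method) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the data are synthetic and merely mimic the shape of the first data set (30 samples, 2176 transcripts, nine normal and 21 cancerous), and the helper names `pca_scores` and `two_cluster_agreement`, the number of shifted transcripts, the shift size and the choice of two components are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def pca_scores(X, n_components):
    """Project samples onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each transcript (column)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def two_cluster_agreement(features, labels, method):
    """Cut a hierarchical clustering into two groups and score it
    against the known labels (cluster numbering is arbitrary)."""
    Z = linkage(features, method=method)  # Euclidean distance by default
    assigned = fcluster(Z, t=2, criterion="maxclust") - 1
    match = np.mean(assigned == labels)
    return max(match, 1.0 - match)


# Synthetic stand-in data (assumption): 9 "normal" and 21 "cancerous"
# samples, with 400 of 2176 transcripts shifted in the cancerous group.
rng = np.random.default_rng(0)
labels = np.array([0] * 9 + [1] * 21)
X = rng.normal(size=(30, 2176))
X[labels == 1, :400] += 4.0

scores = pca_scores(X, n_components=2)
results = {
    method: two_cluster_agreement(scores, labels, method)
    for method in ("complete", "average", "ward")
}
print(results)
```

On real expression data the number of retained components and the preprocessing (for example log transformation or scaling) would need to be chosen with care; the values here only keep the example short.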