Application of automatic mutation-gene pair extraction to diseases


Erdogmus M., Sezerman O. U.

Journal of Bioinformatics and Computational Biology, vol. 5, no. 6, pp. 1261-1275, 2007 (Scopus)

  • Publication Type: Article / Full Article
  • Volume: 5, Issue: 6
  • Publication Date: 2007
  • DOI: 10.1142/s021972000700317x
  • Journal Name: Journal of Bioinformatics and Computational Biology
  • Indexed In: Scopus
  • Pages: pp. 1261-1275
  • Keywords: Disease, Gene, Information extraction, Mutation
  • Affiliated with Acıbadem Mehmet Ali Aydınlar University: No

Abstract

To better understand the mechanisms of disease development, knowledge of mutations and the genes on which they occur is of crucial importance. Information on disease-related mutations can be accessed through public databases or biomedical literature sources. However, information retrieval from such resources can be problematic for two reasons: manually created databases are usually incomplete and out of date, and reading through a vast amount of publicly available biomedical documents is very time-consuming. In this paper, we describe an automated system, MuGeX (Mutation Gene eXtractor), that extracts mutation-gene pairs from Medline abstracts for a disease query. Our system is tested on a corpus that consists of 231 Medline abstracts. For mutation detection alone, recall is 85.9% and precision is 95.9%. For extraction of mutation-gene pairs, we focus on Alzheimer's disease. The recall for mutation-gene pair identification is estimated at 91.3%, and precision is estimated at 88.9%. With automatic extraction techniques, MuGeX overcomes the problems of information retrieval from public resources and reduces the time required to access relevant information, while preserving the accuracy of retrieved information. © 2007 Imperial College Press.
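
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how precision and recall can be computed for extracted mutation-gene pairs. It is not the MuGeX implementation or its evaluation script; the function name precision_recall and the toy pairs below are assumptions made purely for illustration and are not taken from the paper's corpus or results.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of extracted pairs.

    precision = correct extractions / all extractions
    recall    = correct extractions / all pairs in the gold standard
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data (illustrative only, not from the paper).
gold_pairs = {("A692G", "APP"), ("M146L", "PSEN1"), ("N141I", "PSEN2")}
predicted_pairs = {("A692G", "APP"), ("M146L", "PSEN1"), ("M146L", "APP")}

precision, recall = precision_recall(predicted_pairs, gold_pairs)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three extracted pairs are correct and two of the three gold pairs are found, so both metrics equal 2/3. A pair counts as correct only if both the mutation and the gene match, which is why pair-level figures can differ from those for mutation detection alone.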