
Computer Science and Information Technology Vol. 4(1), pp. 1 - 8
DOI: 10.13189/csit.2016.040101


Discovery of Gene-disease Associations from Biomedical Texts


Wen-Juan Hou *, Bo-Yuan Kuo
Department of Computer Science and Information Engineering, National Taiwan Normal University, Taiwan

ABSTRACT

Because biomedical publications are growing so rapidly, biologists must search a vast literature for up-to-date information to avoid overlooking significant publications. Automatic extraction of information from biomedical texts is therefore increasingly important. This paper focuses on automatically identifying relationships between human genetic diseases and genes in the biomedical literature. The experimental data are retrieved from the Mendelian Inheritance in Man (MIM) literature referenced in the morbid file of the Online Mendelian Inheritance in Man (OMIM) database. We propose a hybrid method that combines rule learning with statistical techniques. The corpus is collected in two steps. First, sentences that mention both a human genetic disease and its related gene, as listed in the morbid file, are found and regarded as correct sentences. Second, sentences that mention neither the related human genetic diseases nor the genes listed in the morbid file are randomly selected and regarded as incorrect sentences. Next, the Memory-Based Shallow Parser is used to analyze these sentences and extract the information needed for rule learning. Rules are then learned with the ALEPH rule learner and applied to capture pairs of human genetic diseases and genes that occur within one sentence. The study then proposes a statistical approach, called the Z-score method, to determine whether the captured pairs are valid. Finally, experiments are conducted under several constraints and with different numbers of rules. The evaluation metrics are precision, recall, and F-score.
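The evaluation metrics named above can be illustrated with a minimal sketch. It assumes that extracted and reference associations are represented as sets of (disease, gene) pairs and that the F-score is the balanced F1; the function name `evaluate` and the toy pairs are hypothetical and are not taken from the paper or its results.

```python
from typing import Set, Tuple

# A (disease, gene) association; this representation is an assumption.
Pair = Tuple[str, str]


def evaluate(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted pairs against gold pairs."""
    true_positives = len(predicted & gold)  # pairs found in both sets
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # Balanced F-score (harmonic mean of precision and recall).
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy placeholder data, for illustration only.
gold_pairs = {("disease_a", "GENE1"), ("disease_b", "GENE2"),
              ("disease_c", "GENE3"), ("disease_d", "GENE4")}
predicted_pairs = {("disease_a", "GENE1"), ("disease_b", "GENE2"),
                   ("disease_x", "GENE9")}

p, r, f = evaluate(predicted_pairs, gold_pairs)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Here two of the three predicted pairs appear in the reference set of four, so precision is 2/3 and recall is 1/2.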

KEYWORDS
Gene-disease Association, Biomedical Text Mining, Statistical Method, Rule Learning

Cite This Paper in IEEE or APA Citation Styles
(a). IEEE Format:
[1] Wen-Juan Hou, Bo-Yuan Kuo, "Discovery of Gene-disease Associations from Biomedical Texts," Computer Science and Information Technology, Vol. 4, No. 1, pp. 1 - 8, 2016. DOI: 10.13189/csit.2016.040101.

(b). APA Format:
Wen-Juan Hou, Bo-Yuan Kuo (2016). Discovery of Gene-disease Associations from Biomedical Texts. Computer Science and Information Technology, 4(1), 1 - 8. DOI: 10.13189/csit.2016.040101.