Project author: omerferhatt

Project description:
Multiple machine learning models applied to microarray gene data with a t-test.
Language: Python
Repository: git://github.com/omerferhatt/ml-on-genes.git
Created: 2020-05-31T21:00:16Z
Project community: https://github.com/omerferhatt/ml-on-genes

License: MIT License


Disease class prediction using genetic microarray data

Objective

The purpose of this project is to develop a method that uses genetic data for disease
classification. The data is extracted from a DNA microarray, which measures the
expression levels of a large number of genes simultaneously.
Samples in the datasets represent patients. For each patient, 7070 gene expression
values are measured in order to classify the patient’s disease into one of the
following classes: EPD, JPA, MED, MGL, RHB.

Data

Gene data is stored as comma-separated values in genes-in-rows format (each row is a
gene, each column is a sample). You will find a Zip file named final_project_data.zip
on Moodle. Unzip it to extract the following 3 files:

  • Training dataset: pp5i_train.gr.csv
  • Training data classes: pp5i_train_class.txt
  • Test dataset: pp5i_test.gr.csv
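
Because the files are in genes-in-rows format, they usually need to be transposed before being passed to a classifier that expects one row per sample. The following is a minimal sketch of that step using pandas. It assumes the first column of the CSV holds a gene identifier and every other column holds one patient. The helper name `load_genes_in_rows` and the small in-memory table are hypothetical and only illustrate the layout; they are not taken from the project.

```python
import io

import pandas as pd


def load_genes_in_rows(source):
    """Read a genes-in-rows CSV and return a samples-in-rows DataFrame.

    Assumption: the first column is a gene identifier and each
    remaining column is one sample (patient).
    """
    genes = pd.read_csv(source, index_col=0)
    # Transpose so that each row is a patient and each column is a gene.
    return genes.T


# Made-up stand-in for a genes-in-rows file: 3 genes, 2 patients.
demo_csv = io.StringIO(
    "gene,patient_1,patient_2\n"
    "G1,10,20\n"
    "G2,30,40\n"
    "G3,50,60\n"
)

samples = load_genes_in_rows(demo_csv)
print(samples.shape)
```

The same helper accepts a file path, so under the same layout assumption it could be called as `load_genes_in_rows("pp5i_train.gr.csv")` once the Zip file has been extracted.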