Multiple machine learning models applied to microarray gene data with t-test.
The purpose of this project is to develop a method that uses genetic data for disease
classification. Data is extracted from a DNA microarray, which measures the expression
levels of large numbers of genes simultaneously.
Each sample in the datasets represents a patient. For each patient, 7070 gene expression
values are measured in order to classify the patient’s disease into one of the
following classes: EPD, JPA, MED, MGL, RHB.
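
The title mentions a t-test, but how it is applied is not spelled out here. One common reading, offered only as an assumption, is to rank genes by how strongly their expression separates one disease class from the rest. The sketch below computes Welch's t statistic per gene using only the Python standard library. The function names (welch_t, rank_genes), the gene names, and the toy values are all hypothetical and are not part of the project data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic for two lists of expression values.

    Assumes each list has at least two values and that the two sample
    variances are not both zero (otherwise the division fails).
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes(expr, labels, target):
    """Rank genes by |t| for the `target` class versus all other classes.

    expr:   dict mapping gene name -> list of values, one per patient
    labels: list of class labels, one per patient, in the same order
    """
    scores = {}
    for gene, values in expr.items():
        in_class = [v for v, y in zip(values, labels) if y == target]
        rest = [v for v, y in zip(values, labels) if y != target]
        scores[gene] = abs(welch_t(in_class, rest))
    # Highest |t| first: these genes differ most between the two groups.
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, made up for illustration: 2 genes measured on 6 patients.
labels = ["MED", "MED", "MED", "EPD", "EPD", "EPD"]
expr = {
    "gene_A": [10.0, 11.0, 12.0, 1.0, 2.0, 3.0],  # clearly higher in MED
    "gene_B": [5.0, 6.0, 7.0, 5.5, 6.5, 6.0],     # similar in both groups
}
ranking = rank_genes(expr, labels, "MED")
print(ranking)
```

In this toy case gene_A ranks ahead of gene_B, since its mean differs sharply between the two groups. With the real data, the top-ranked genes for each class could then be passed to the classifiers as a reduced feature set.
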
Gene data is in genes-in-rows format, as comma-separated values. On Moodle you will find
a zip file named final_project_data.zip. Unzip it to extract the following 3 files: