Unsupervised Deep Autoencoders for Feature Extraction with Educational Data
This repository contains the code for the paper (see bosch-dlwed17-camera.pdf) presented at the Deep Learning with Educational Data workshop at the 2017 Educational Data Mining conference:
Bosch, N., & Paquette, L. (2017). Unsupervised deep autoencoders for feature extraction with educational data. In Deep Learning with Educational Data Workshop at the 10th International Conference on Educational Data Mining.
The code was tested with the Keras 2.0.3 and TensorFlow 1.1.0 neural network libraries.
The data came from Betty's Brain. These data are required for the code to run and are not publicly available; however, the code could be adapted to another dataset with relatively little effort.
Model building generally consists of three phases: data preprocessing, autoencoder feature extraction, and supervised learning.
- preprocess_bromp.py - takes raw BROMP files created by the HART application and combines them
- preprocess_timeseries.py - creates time series data (evenly spaced in time) from Betty's Brain logs
- preprocess_seq.py - creates sequences suitable for training RNN models from the time series
- ae_lstm.py - this and similar files (e.g., vae_lstm.py) train the autoencoders
- extract_embeddings.py - takes a trained model, feeds in data sequences, and saves the embeddings
- align_embeddings+labels.py - matches up BROMP affect/behavior labels with the embeddings
- supervised/ae_feats_test.py - trains a decision tree (CART) model with the autoencoder features
- supervised/expert_feats_extract.py - extracts some simple features with the traditional expert-engineered method
- supervised/expert_feats_test.py - builds a model using the expert features to serve as a baseline
- visualize_activations.py - generates images of model activations by feeding a random subset of samples to a trained autoencoder and creating histograms of the activations of every layer in the network; for layers with many neurons (> 15), a subset of neurons is sampled to create a more tractable image
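To illustrate the autoencoder and embedding-extraction steps, here is a minimal sketch of a sequence-to-sequence LSTM autoencoder in the style of ae_lstm.py and extract_embeddings.py. The layer sizes, sequence shape, and random stand-in data are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative shapes (assumptions): 20 timesteps, 8 features per step,
# compressed to a 16-dimensional embedding.
timesteps, n_features, latent_dim = 20, 8, 16

# Encoder: compress each sequence into a fixed-length embedding.
inputs = keras.Input(shape=(timesteps, n_features))
encoded = layers.LSTM(latent_dim, name="embedding")(inputs)

# Decoder: repeat the embedding at each timestep and reconstruct the input.
decoded = layers.RepeatVector(timesteps)(encoded)
decoded = layers.LSTM(latent_dim, return_sequences=True)(decoded)
decoded = layers.TimeDistributed(layers.Dense(n_features))(decoded)

autoencoder = keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")

# Unsupervised training: the sequence is both input and target
# (random data here as a stand-in for the preprocessed sequences).
X = np.random.rand(64, timesteps, n_features).astype("float32")
autoencoder.fit(X, X, epochs=1, batch_size=16, verbose=0)

# Extraction step: reuse the encoder half to turn sequences into
# fixed-length feature vectors for the supervised phase.
encoder = keras.Model(inputs, encoded)
embeddings = encoder.predict(X, verbose=0)
print(embeddings.shape)  # (64, 16)
```

Because the encoder is a sub-model of the trained autoencoder, it shares the learned weights, so extracting embeddings requires no further training.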
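The supervised phase fits a CART decision tree to the extracted features, as in supervised/ae_feats_test.py. A hedged sketch using scikit-learn, with random stand-in embeddings and labels (real labels come from the BROMP coding; the depth limit is an illustrative choice):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Stand-in data: 200 sixteen-dimensional "embeddings" with binary
# affect/behavior labels (assumptions for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = rng.integers(0, 2, size=200)

# CART decision tree; max_depth regularizes against overfitting.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)

# Cross-validated accuracy of the tree on the embedding features.
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```

The same evaluation can be run on the expert features to compare against the baseline model.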
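The activation-histogram idea in visualize_activations.py can be sketched as follows. The layer names and random stand-in activations are assumptions; in the real script the arrays would come from feeding samples through a trained autoencoder.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render to files without a display
import matplotlib.pyplot as plt

# Stand-in per-layer activations: (n_samples, n_neurons) arrays.
rng = np.random.default_rng(0)
layer_activations = {
    "lstm_1": rng.normal(size=(100, 32)),
    "embedding": rng.normal(size=(100, 16)),
}

max_neurons = 15  # sample neurons for wide layers, per the README
fig, axes = plt.subplots(1, len(layer_activations), figsize=(8, 3))
for ax, (name, acts) in zip(axes, layer_activations.items()):
    if acts.shape[1] > max_neurons:
        # Subsample neurons so the histogram stays tractable.
        cols = rng.choice(acts.shape[1], size=max_neurons, replace=False)
        acts = acts[:, cols]
    ax.hist(acts.ravel(), bins=30)
    ax.set_title(name)
fig.tight_layout()
fig.savefig("activations.png")
```

One histogram per layer makes saturated or dead units (activations piled at the extremes or at zero) easy to spot.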
The model structure is also visualized (requires the pydot package).