Use automated feature engineering to predict the remaining useful life of a component from historical sensor observations
The setup for this problem is a common one: we have a single table of sensor observations recorded over time. Now that collecting information is easier than ever, many industries store their data in a way that naturally produces time-series problems, so it is crucial to be able to handle data in this form. Thankfully, Featuretools has built-in functionality for time-varying data.
We’ll demonstrate an end-to-end workflow using the Turbofan Engine Degradation Simulation Data Set from NASA. The notebook shows a rapid way to predict the Remaining Useful Life (RUL) of an engine, starting from a single dataframe of time-series data. The notebook has three sections.
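To make the prediction target concrete, here is a minimal sketch of one common way to label RUL: for an engine whose record runs until failure, the RUL at a given cycle is the engine's last recorded cycle minus the current cycle. The column names `engine_no` and `time_in_cycles`, the helper `add_rul`, and the toy rows are assumptions made for illustration. This is not necessarily how the notebooks build their labels.

```python
from collections import defaultdict


def add_rul(rows):
    """Return copies of `rows` with an added 'RUL' key.

    Assumes every engine is observed until failure, so its last
    recorded cycle marks the end of its useful life.
    """
    # First pass: find the last recorded cycle for each engine.
    last_cycle = defaultdict(int)
    for row in rows:
        engine = row["engine_no"]
        last_cycle[engine] = max(last_cycle[engine], row["time_in_cycles"])

    # Second pass: RUL is the number of cycles left before that last cycle.
    return [
        {**row, "RUL": last_cycle[row["engine_no"]] - row["time_in_cycles"]}
        for row in rows
    ]


# Toy observations (hypothetical values): two engines, a few cycles each.
observations = [
    {"engine_no": 1, "time_in_cycles": 1},
    {"engine_no": 1, "time_in_cycles": 2},
    {"engine_no": 1, "time_in_cycles": 3},
    {"engine_no": 2, "time_in_cycles": 1},
    {"engine_no": 2, "time_in_cycles": 2},
]

labelled = add_rul(observations)
for row in labelled:
    print(row)
```

In practice the same idea is usually expressed as a group-wise operation on a dataframe. The plain-Python version is used here only to keep the sketch free of dependencies.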
To run the notebooks, you need to download the data yourself. Download and unzip the file from https://ti.arc.nasa.gov/c/6/, then create a ‘data’ directory and place the unzipped files in it.
Clone the repo
git clone https://github.com/Featuretools/predict-remaining-useful-life.git
Install the requirements
pip install -r requirements.txt
You will also need to install graphviz for this demo. Please follow the graphviz installation instructions in the Featuretools documentation.
Download the data
The data is from the NASA Turbofan Engine Degradation Simulation Data Set and is available at https://ti.arc.nasa.gov/c/6/.
To run the notebooks, place the following files in the ‘data’ directory: train_FD004.txt, test_FD004.txt, and RUL_FD004.txt.
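Before launching the notebooks, it can help to confirm that the three files are where they are expected. The following is a small sketch that assumes the ‘data’ directory sits in the current working directory. The helper name `missing_data_files` is hypothetical and is not part of the repository's utils.py.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("train_FD004.txt", "test_FD004.txt", "RUL_FD004.txt")


def missing_data_files(data_dir="data"):
    """Return the expected file names that are not present in `data_dir`."""
    data_path = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (data_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from the 'data' directory:", ", ".join(missing))
    else:
        print("All data files found.")
```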
Run the Tutorials notebooks:
jupyter notebook
The utils.py script contains a number of useful helper functions.
Featuretools is an open source project created by Feature Labs. To see the other open source projects we’re working on, visit Feature Labs Open Source. If building impactful data science pipelines is important to you or your business, please get in touch.
Any questions can be directed to help@featurelabs.com.