Project author: drawning0510

Language: R
Project URL: git://github.com/drawning0510/R-Creating_Time-Window_Variables_with_sqldf.git


R-Creating_time-window_variables_with_sqldf

The applications data was analyzed to develop a supervised identity fraud detection model that flags potentially fraudulent applications. To build the model, the fraud label was assessed against the linkage of five personally identifiable fields: SSN, address, phone number, date of birth, and zip code. I created the time-window variables for these fields with the sqldf package in R because it is efficient and easy to understand.
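As a rough illustration of what a time-window variable built with sqldf can look like, here is a minimal sketch. It is not the code from this repository: the toy data, the column names (record, app_date, ssn, day_num), the 3-day window, and the output name ssn_count_3d are all assumptions made for the example. For each application, it counts earlier applications that share the same SSN within the preceding 3 days.

```r
library(sqldf)

# Toy data; column names and values are hypothetical, not from applications.csv.
apps <- data.frame(
  record   = 1:6,
  app_date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03",
                       "2016-01-03", "2016-01-10", "2016-01-11")),
  ssn      = c("111", "111", "222", "111", "111", "222"),
  stringsAsFactors = FALSE
)

# Store dates as day counts so the SQL can use plain numeric arithmetic.
apps$day_num <- as.numeric(apps$app_date)

# Self-join: for each application a, count earlier applications b that
# share the SSN and fall within the preceding 3 days (assumed window).
ssn_3d <- sqldf("
  SELECT a.record,
         COUNT(b.record) AS ssn_count_3d
  FROM apps a
  LEFT JOIN apps b
    ON  a.ssn = b.ssn
    AND b.record < a.record
    AND (a.day_num - b.day_num) BETWEEN 0 AND 3
  GROUP BY a.record
  ORDER BY a.record
")

# Attach the new variable to the original rows.
apps_with_var <- merge(apps, ssn_3d, by = "record")
print(apps_with_var[, c("record", "app_date", "ssn", "ssn_count_3d")])
```

The LEFT JOIN keeps applications with no earlier match, so they receive a count of 0 instead of being dropped. The same pattern could be repeated for the other fields (address, phone number, date of birth, zip code) and for other window lengths by changing the join condition.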

Instructions for running the code

  1. Download the dataset file and the R code into the same folder
  2. Run "Creating time-window variables with sqldf.R" in RStudio
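From an R console, the two steps above might look like the following sketch. It assumes the working directory is already the folder that holds the downloaded files and that the sqldf package is installed; the variable names are made up for the example.

```r
# install.packages("sqldf")  # run once if the package is not installed

# File names as given in the file list; adjust if yours differ.
script_file <- "Creating time-window variables with sqldf.R"
data_file   <- "applications.csv"

# Check that both files sit in the current working directory before running.
needed        <- c(script_file, data_file)
missing_files <- needed[!file.exists(needed)]

if (length(missing_files) > 0) {
  message("Not found in ", getwd(), ": ",
          paste(missing_files, collapse = ", "))
} else {
  source(script_file)
}
```

In RStudio, Session > Set Working Directory > To Source File Location is one way to point the working directory at that folder.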

File list

  1. Dataset file: applications.csv
  2. R code: Creating time-window variables with sqldf.R
  3. Explanations: Variable Creation.pdf

Contributor

  • Ian Chi