Performed an ETL process entirely in the cloud on two Amazon datasets: the first covered beauty products, the second covered watches.
Transformed each dataset to fit the tables defined in the schema file. Ensured the DataFrames matched those tables in both column name and data type, then loaded each DataFrame that corresponded to a table into an RDS instance.
Demonstrated the ability to conduct statistical analyses on the data using PySpark.
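
The check that each DataFrame matches its target table in column name and data type, done before loading into RDS, could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name schema_mismatches, the table definition, and the column names are all hypothetical. It works on plain (column name, type string) pairs, which is the shape of PySpark's DataFrame.dtypes, so it runs without a Spark session.

```python
from typing import Dict, List, Tuple


def schema_mismatches(
    actual_dtypes: List[Tuple[str, str]],
    expected: Dict[str, str],
) -> List[str]:
    """Return readable differences between a DataFrame and a table schema.

    actual_dtypes has the shape of PySpark's DataFrame.dtypes:
    a list of (column name, type string) pairs.
    expected maps each column of the target table to its type string.
    """
    actual = dict(actual_dtypes)
    problems = []
    # Every column the table expects must exist with the right type.
    for name, wanted in expected.items():
        if name not in actual:
            problems.append(f"missing column: {name}")
        elif actual[name] != wanted:
            problems.append(f"type mismatch for {name}: {actual[name]} != {wanted}")
    # Columns the table does not define would make the load fail or drift.
    for name in actual:
        if name not in expected:
            problems.append(f"unexpected column: {name}")
    return problems


# Hypothetical table definition and DataFrame dtypes, for illustration only.
expected_products = {"product_id": "string", "product_title": "string"}
df_dtypes = [("product_id", "string"), ("product_title", "int")]

print(schema_mismatches(df_dtypes, expected_products))
```

With a real PySpark DataFrame, one would pass df.dtypes as the first argument and proceed with the load only when the returned list is empty.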