Due 1/22
(20 Points) Assignment 1: Exploratory Analysis. Perform exploratory analysis on the UCI MPG dataset. Generate reasonable summary statistics, search for outliers, and make plots for each feature of the dataset. Use scatter plots with regression lines where reasonable. Make histograms for discrete data. Compare vehicles from each origin: are there any noticeable differences between the statistics for each origin? Which features are most associated with a good MPG?
Write a short report (a couple of paragraphs plus figures) to answer the above questions. Submit your report on Canvas and commit your code to GitLab.
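As a starting point for the outlier search, here is a minimal sketch of one common rule of thumb, the 1.5 × IQR fence. The function name `iqr_outliers`, the cutoff `k=1.5`, and the sample numbers are illustrative assumptions, not part of the assignment; other outlier criteria are equally acceptable.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper: k=1.5 is a conventional choice, not a requirement.
    """
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up numbers for illustration only; they are not taken from the dataset.
sample = [10, 12, 11, 13, 12, 11, 40]
print(iqr_outliers(sample))
```

Applied to one feature column at a time, this gives a short list of candidate rows to inspect by hand before deciding whether they are errors or just unusual vehicles.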
|
Due 1/29
(40 Points) Assignment 2: Regression. Apply linear regression to the dataset from Assignment 1 to predict MPG. Besides loading the data and fitting the model, make sure to do the following: feature engineering, cross-validation, and error diagnostic plots.
Write a short report on your results. What features did you engineer? Did these engineered features help? Report the mean absolute error (MAE) and mean absolute percentage error (MAPE) for both the training and test sets, along with the size of the train-test split. Include diagnostic plots as well. Speculate on what important features are missing from the data.
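To make the cross-validation step concrete, here is a rough sketch of how k-fold splits can be generated by hand. The name `k_fold_indices` and the defaults `k=5` and `seed=0` are assumptions chosen for illustration; a library splitter is a perfectly reasonable substitute.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: shuffles row indices once, deals them into k
    folds, then holds out each fold in turn as the test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


for train_idx, test_idx in k_fold_indices(10, k=5):
    print(len(train_idx), len(test_idx))
```

Each row lands in exactly one test fold, so averaging the per-fold error gives an estimate that does not depend on a single lucky or unlucky split.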
Submit your report on Canvas and commit your code to GitLab.
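For reference, the two error metrics requested in the report can be computed as in the sketch below. The function names `mae` and `mape` are illustrative, and the MAPE here is expressed in percent and assumes no true value is zero.

```python
def mae(y_true, y_pred):
    """Mean absolute error: the average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes every true value is nonzero; otherwise the ratio is undefined.
    """
    ratios = (abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100 * sum(ratios) / len(y_true)


# Made-up values for illustration only; they are not model results.
y_true = [20.0, 30.0]
y_pred = [22.0, 27.0]
print(mae(y_true, y_pred), mape(y_true, y_pred))
```

Computing both metrics on the training set and on the test set, side by side, makes any gap between the two easy to spot.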
|