Data Science Capstone

Machine Learning
Data Science
Python
SQL

A comprehensive Data Science project that predicts the likelihood of SpaceX's Falcon 9 first stage landing successfully in order to advise a competing company

SpaceX rocket launch in progress

Introduction

This capstone project predicts the likelihood of the Falcon 9 first stage landing successfully. SpaceX advertises Falcon 9 rocket launches on its website at a cost of $62 million; other providers charge upward of $165 million per launch. Much of the savings comes from SpaceX's ability to reuse the first stage of its rockets after launch. If we can determine whether the first stage will land, we can estimate the cost of a launch. This information can be valuable to a competing company that wants to bid against SpaceX for a rocket launch.

This project mimics a real-world Data Science problem and is my first attempt at putting all my Data Science knowledge together in one body of work. I assumed the role of a Data Scientist working for a startup that intends to compete with SpaceX. In the process, I followed the Data Science methodology: API-aided data collection, web scraping, data wrangling, data analysis, exploratory data analysis, static data visualization, interactive data visualization and dashboard development, machine learning model building, and presentation of findings.

Business Problem

SpaceX advertises Falcon 9 rocket launches on its website at a cost of 62 million dollars; other providers charge upward of 165 million dollars per launch. Much of the savings comes from SpaceX's ability to reuse the first stage. Therefore, if the likelihood of the first stage landing successfully can be predicted accurately, the cost of a launch can be estimated. With these Data Science findings and models, the competing startup can make more informed bids against SpaceX for a rocket launch.
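The link between landing probability and launch cost can be sketched as a simple expected-value calculation. This is a minimal illustration and not the project's actual cost model: the function name `expected_launch_cost` and both cost figures in the usage lines are hypothetical placeholders, and the landing probability is assumed to come from a trained classifier.

```python
def expected_launch_cost(p_landing, cost_if_landed, cost_if_lost):
    """Expected launch cost given the probability that the first stage lands.

    All three inputs are assumptions supplied by the caller. The two costs
    are hypothetical and must share a unit (e.g. millions of dollars).
    """
    if not 0.0 <= p_landing <= 1.0:
        raise ValueError("p_landing must be between 0 and 1")
    # Weight the cost of each outcome by how likely that outcome is.
    return p_landing * cost_if_landed + (1.0 - p_landing) * cost_if_lost


# Placeholder figures for illustration only (not real SpaceX costs).
estimate = expected_launch_cost(p_landing=0.75, cost_if_landed=50.0, cost_if_lost=100.0)
print(estimate)  # 62.5
```

A higher predicted landing probability pulls the estimate toward the reused-stage cost, which is the intuition behind using the classifier to inform a bid.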

Objective

  • To apply the Data Science toolkit and machine learning model building to accurately predict the likelihood of the first stage landing successfully, and thereby determine the cost of a launch.
  • To explore the data in order to obtain interesting insights.
  • To relate findings to stakeholders in a clear and concise fashion.

Business metric

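Machine learning model classification accuracy: the number of correct predictions divided by the total number of predictions. In terms of the confusion matrix counts (true positives TP, true negatives TN, false positives FP, and false negatives FN):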

$$ Accuracy = \frac{TP+TN} {TP+FP+TN+FN} $$
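The accuracy formula above can be computed directly from predictions. The following is a minimal sketch in plain Python; the function names and the sample labels are hypothetical and are not results from this project. It assumes binary labels where 1 means a successful landing and 0 means an unsuccessful one.

```python
def accuracy_from_counts(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + FP + TN + FN)."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = landed, 0 = did not land)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


# Made-up labels for illustration only.
y_true = [1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 0, 1, 1, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)                        # 4 2 1 1
print(accuracy_from_counts(tp, tn, fp, fn))  # 0.75
```

Keeping the four counts separate, instead of only the final ratio, makes it easy to see whether errors are mostly false positives (predicted landing, actual failure) or false negatives.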

Deliverables

  • Accurate predictive algorithms
  • Business case report for stakeholders

Thank you

Please feel free to leave a comment on my Google Slides presentation. I always welcome constructive criticism.