
PREMISE
In this assignment, I work through a labeled dataset listing the passengers on board the Titanic, a large civilian ocean liner that sank in the North Atlantic, together with the survival labels from the Kaggle machine learning challenge of the same name. My goal is to:
Train A Model That Can Predict Whether (Yes / No) A Given Passenger Survived, From A Given List Of Passengers.
DATA SOURCES
Test: Test Data Used.
Train: Training Data Used.
Features: Feature lists.
My_submission: The label list, including the “Survival” column for the training data, used to test the accuracy of the prediction models.
METHODOLOGY
Part one: Libraries Used
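The original listing is not reproduced here. The imports below are a sketch of what this write-up plausibly relies on. The four classifiers, make_pipeline and the two scalers are named in the text, while pandas and accuracy_score are assumptions.

```python
# Assumed imports; not the author's exact listing.
import pandas as pd  # assumption: the "dataframe" in Part Two is a pandas DataFrame

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.metrics import accuracy_score  # assumption: used for the accuracy figures
```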
Part Two: Setting Up Dataframe
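A minimal sketch of how the dataframe could be set up, assuming the column layout of Kaggle's train.csv. The rows below are illustrative toy values so the snippet runs without the real file. The choice of features, the median fill for Age and the 0/1 encoding of Sex are all assumptions, not necessarily the steps used here.

```python
import io

import pandas as pd

# Toy rows in the Kaggle train.csv layout (illustrative values, not real data).
# With the real file this would be: df = pd.read_csv("train.csv")
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,SibSp,Parch,Fare
1,0,3,male,22,1,0,7.5
2,1,1,female,38,1,0,70.0
3,1,3,female,,0,0,8.0
4,0,2,male,35,0,0,13.0
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Assumed preprocessing: fill missing ages with the median, encode sex as 0/1.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

# Hypothetical feature selection; "Survived" is the label column.
features = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare"]
X = df[features]
y = df["Survived"]

print(df.head())
```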

Part Three: Train Test Split
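The split can be sketched with scikit-learn's train_test_split. The data here is a small stand-in array rather than the Titanic features, and test_size, random_state and stratify are illustrative choices, not the values used in this assignment.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 10 samples, 2 features, balanced binary labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Hold out 20% of the rows for testing; stratify keeps the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```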
Part Four: Testing Algorithms
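A sketch of fitting the four algorithms compared in this post. It runs on synthetic data from make_classification as a stand-in for the prepared Titanic features, and every hyperparameter is a scikit-learn default or an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared features and labels (an assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# One entry per algorithm named in the results.
models = {
    "Random Forest Classification": RandomForestClassifier(random_state=0),
    "Support Vector Machine": SVC(),
    "Decision Tree Classification": DecisionTreeClassifier(random_state=0),
    "Gradient Booster": GradientBoostingClassifier(random_state=0),
}

# Fit each model on the training split and keep its test-set predictions.
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
```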
Part Five: Accuracy Results
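Accuracy is the fraction of predictions that match the true labels. The sketch below assumes accuracy_score was the metric and uses hand-written labels, not results from this assignment.

```python
from sklearn.metrics import accuracy_score

# Hand-written example labels: 6 of the 8 predictions match.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]

acc = accuracy_score(y_true, y_pred)
print(f"Accuracy: {acc:.0%}")  # Accuracy: 75%
```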
Output:
Result:
Random Forest Classification: 100%
Support Vector Machine: 98%
Decision Tree Classification: 94%
Gradient Booster: 96%
As always, the whole code can be found on my GitHub page.
REVIEW
Not much to add. One important thing to note is that applying standard scaling or min-max scaling to the data through the make_pipeline function actually decreased the accuracy of the models.
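The scaling comparison can be sketched as follows. The classifier (SVC) and the synthetic data are assumptions, so the scores printed will not reproduce the effect described above. The sketch only shows how a scaler is chained in front of a model with make_pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption, not the Titanic features).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Same classifier with no scaling, standard scaling, and min-max scaling.
candidates = {
    "no scaling": SVC(),
    "standard scaling": make_pipeline(StandardScaler(), SVC()),
    "minmax scaling": make_pipeline(MinMaxScaler(), SVC()),
}

# The pipeline fits the scaler on the training split only, then the model.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.2f}")
```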
RESOURCES
https://www.kaggle.com/c/titanic

About Me!
An aspiring data scientist with a great interest in machine learning and its applications. I post my work here in the hope of improving over time.