Titanic Survival Model

PREMISE

In this assignment, I go over a labeled dataset containing a list of passengers on board the Titanic (a large passenger liner that sank in the North Atlantic) and a list of survivors, taken from the Kaggle machine learning challenge of the same name. My goal is:


Train a model that can predict (yes / no) whether a given passenger from the ship’s passenger list survived.

DATA SOURCES

Test: Test Data Used.

Train: Training Data Used.

Features: Feature lists.

My_submission: The label list which includes the “Survival” column for the training data. Used for testing the accuracy of prediction models.

METHODOLOGY

Part One: Libraries Used

Part Two: Setting Up Dataframe
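A minimal sketch of what this step might look like with pandas. The rows below are made up for illustration (the real data comes from the Kaggle CSV files), and both the choice of feature columns and the helper name prepare_features are assumptions, not necessarily what was used for the results in this post.

```python
import pandas as pd

# Toy rows invented for illustration only; column names follow the Kaggle files.
raw = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 2, 3, 1, 2],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [20.0, 40.0, 30.0, 50.0, 60.0, None],
    "SibSp": [0, 1, 0, 2, 1, 0],
    "Parch": [0, 0, 1, 0, 2, 0],
    "Fare": [8.0, 70.0, 15.0, 9.5, 80.0, 13.0],
})


def prepare_features(df: pd.DataFrame) -> pd.DataFrame:
    """Build a numeric feature table (hypothetical column choice)."""
    features = df[["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare"]].copy()
    # Fill missing ages with the median so the models never see NaN.
    features["Age"] = features["Age"].fillna(features["Age"].median())
    # One-hot encode the categorical "Sex" column.
    return pd.get_dummies(features, columns=["Sex"])


X = prepare_features(raw)
y = raw["Survived"]
```

Filling missing ages with the median and one-hot encoding "Sex" are common defaults for this dataset, but other choices are equally reasonable.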

Output:

Part Three: Train Test Split
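As a rough sketch, a split like this can be done with scikit-learn's train_test_split. The data here is a synthetic stand-in generated with make_classification, and the 80/20 ratio, the random_state values and the stratify option are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared Titanic features and labels.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# Hold out 20% of the rows for evaluation; stratify keeps the class
# balance similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
```

Fixing random_state makes the split reproducible, which helps when comparing several models on the same held-out rows.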

Part Four: Testing Algorithms
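A hedged sketch of how the four algorithms could be tried side by side. It assumes the scikit-learn classes RandomForestClassifier, SVC, DecisionTreeClassifier and GradientBoostingClassifier with default settings, and it runs on synthetic data, so its scores are not the ones reported in this post.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared Titanic features and labels.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Default hyperparameters throughout; these are assumptions.
models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Support Vector Machine": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() returns mean accuracy on the held-out rows.
    scores[name] = model.score(X_test, y_test)

for name, score in scores.items():
    print(f"{name}: {score:.2%}")
```

Keeping the models in a dictionary means every algorithm is trained and scored on exactly the same split.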

Part Five: Accuracy Results

Output:

Result:
Random Forest Classification: 100%
Support Vector Machine: 98%
Decision Tree Classification: 94%
Gradient Boosting: 96%
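For reference, accuracy here means the fraction of predictions that match the true labels. The sketch below computes it by hand and with scikit-learn's accuracy_score on two short made-up label lists; the lists are illustrative only and unrelated to the results above.

```python
from sklearn.metrics import accuracy_score

# Made-up labels for illustration: 1 = survived, 0 = did not survive.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Accuracy by hand: matching predictions divided by total predictions.
manual = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# The same quantity from scikit-learn.
library = accuracy_score(y_true, y_pred)

print(manual, library)  # both are 0.75 (6 of 8 predictions match)
```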

As always, the whole code can be found on my GitHub page.

REVIEW

Not much to add. One important thing to note is that applying standard scaling or min-max scaling to the data through the make_pipeline function actually decreased the accuracy of the models.
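A small sketch of how such a comparison can be set up with make_pipeline. Using SVC as the model and synthetic data are assumptions made for illustration; whether scaling helps or hurts depends on the data and the model, so this does not reproduce the drop described above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the prepared Titanic features and labels.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same model with and without a scaling step in front of it.
candidates = {
    "no scaling": SVC(),
    "standard scaling": make_pipeline(StandardScaler(), SVC()),
    "min-max scaling": make_pipeline(MinMaxScaler(), SVC()),
}

results = {
    name: estimator.fit(X_train, y_train).score(X_test, y_test)
    for name, estimator in candidates.items()
}

for name, score in results.items():
    print(f"{name}: {score:.2%}")
```

Wrapping the scaler and the model in one pipeline means the scaler is fitted on the training rows only, which keeps information from the test rows out of the preprocessing step.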

RESOURCES

https://www.kaggle.com/c/titanic


About Me!

An aspiring data scientist with a great interest in machine learning and its applications. I post my work here in the hope of improving over time.