In this assignment, I go over a labeled dataset containing a list of passengers on board the Titanic, a large passenger liner that sank in the North Atlantic, along with a list of survivors, from the Kaggle machine learning challenge of the same name. My goal is to compare how accurately several classification models predict which passengers survived.
Test: The test data.
Train: The training data.
Features: The list of features.
My_submission: The label list, which includes the “Survival” column for the training data. It is used for testing the accuracy of the prediction models.
Part One: Libraries Used
Part Two: Setting Up Dataframe
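A minimal sketch of how the dataframe could be set up. The feature list below is an assumption (these are columns from the Kaggle Titanic files, but not necessarily the ones used here), and the small inline frame is a stand-in for reading train.csv with pd.read_csv so that the snippet runs on its own.

```python
import pandas as pd

# Tiny stand-in for train.csv; the real file uses the same column names
# but has 891 rows. Replace with pd.read_csv("train.csv") on real data.
train_data = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass":   [3, 1, 3, 3, 2, 1],
    "Sex":      ["male", "female", "female", "male", "female", "male"],
    "SibSp":    [1, 1, 0, 0, 0, 0],
    "Parch":    [0, 0, 0, 0, 2, 0],
})

# Assumed feature list, chosen only for illustration.
features = ["Pclass", "Sex", "SibSp", "Parch"]

# One-hot encode the text column "Sex" so every feature is numeric.
X = pd.get_dummies(train_data[features])
y = train_data["Survived"]

print(X.head())
```

The get_dummies call turns "Sex" into two indicator columns, which is what lets the scikit-learn models accept the frame directly.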
Part Three: Train Test Split
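The split can be sketched with scikit-learn's train_test_split. The 80/20 ratio, the random_state value and the synthetic data are all assumptions made so the example is self-contained; they are not taken from the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded passenger features and survival labels.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(100, 4))
y = rng.integers(0, 2, size=100)

# Hold out 20% of the rows for evaluation (assumed ratio).
# stratify=y keeps the survived / not-survived ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print(X_train.shape, X_test.shape)
```

Fixing random_state makes the split reproducible, which matters when comparing several models on the same held-out rows.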
Part Four: Testing Algorithms
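A rough sketch of how the four algorithms could be fitted and compared side by side. The data is synthetic (make_classification) and all hyperparameters are scikit-learn defaults or illustrative assumptions, so the scores it prints will not match the figures reported in this post.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded Titanic features and labels.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The four model families compared in this post (settings are assumptions).
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # For classifiers, score() returns mean accuracy on the given rows.
    scores[name] = model.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name}: {acc:.0%}")
```

Keeping the models in a dictionary means every algorithm is trained and scored on exactly the same split.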
Part Five: Accuracy Results
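Accuracy here means the fraction of predictions that match the true labels. The helper below is a hypothetical plain-Python version written for illustration, with made-up label lists.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty list")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Made-up labels: 1 = survived, 0 = did not survive.
y_true = [0, 1, 1, 0, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 0, 1, 1]

print(f"{accuracy(y_true, y_pred):.0%}")  # 6 of 8 match -> 75%
```

scikit-learn's accuracy_score from sklearn.metrics computes the same quantity, so in practice there is no need to write this by hand.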
Results:
Random Forest Classification: 100%
Support Vector Machine: 98%
Decision Tree Classification: 94%
Gradient Boosting: 96%
As always, the whole code can be found on my GitHub page.
Not much to add. One important thing to note is that applying standard scaling or min-max scaling to the data through the make_pipeline function actually decreased the accuracy of the models.
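A sketch of how such a scaling comparison could be run with make_pipeline. It uses synthetic data and a Support Vector Machine as the example model (both are assumptions), so it shows the mechanics only and does not reproduce the drop in accuracy described above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the encoded Titanic features and labels.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Same classifier three ways: raw features, standardized, and min-max scaled.
candidates = {
    "no scaling": SVC(),
    "standard scaling": make_pipeline(StandardScaler(), SVC()),
    "min-max scaling": make_pipeline(MinMaxScaler(), SVC()),
}

results = {}
for name, model in candidates.items():
    # The pipeline fits the scaler on the training rows only,
    # then applies the same transform to the test rows when scoring.
    model.fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)

for name, acc in results.items():
    print(f"{name}: {acc:.0%}")
```

Whether scaling helps depends on the model: tree-based models split on per-feature thresholds and are largely unaffected by it, while distance-based and margin-based models such as SVMs can move in either direction.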