Predicting the Survival of Titanic Passengers
Abstract
In 1912, the Titanic sank during its maiden voyage, resulting in the loss of more than 1,500 lives. This paper employs several machine learning techniques (Principal Component Analysis, Logistic Regression, Ridge Regression, Lasso Regression, Decision Tree, Random Forest, Conditional Forest, Support Vector Machine, and K-Nearest Neighbors) to predict whether passengers survived based on the available passenger information.
The study confirmed our assumption that women and wealthy individuals were more likely to survive. A simple prediction model based on this assumption achieved an accuracy of 75.60%. Among the machine learning methods, Conditional Forest performed best, achieving an accuracy of 81.34%, followed by K-Nearest Neighbors and Support Vector Machine.
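To make the idea of a simple assumption-based model concrete, the following is a minimal sketch of one possible rule of that kind. The exact rule used in the study is not stated here, so the rule below (predict survival for women and for first-class passengers) is an assumption, as are the field names `sex` and `pclass`, the helper names, and the toy rows, which are made up for illustration and are not taken from the Titanic data.

```python
from typing import Dict, List


def predict_survival(passenger: Dict[str, object]) -> int:
    """Hypothetical rule: predict survival (1) for women and for
    first-class passengers, otherwise predict death (0)."""
    is_female = passenger["sex"] == "female"
    is_first_class = passenger["pclass"] == 1  # class used as a proxy for wealth
    return int(is_female or is_first_class)


def accuracy(passengers: List[Dict[str, object]], labels: List[int]) -> float:
    """Fraction of passengers whose predicted outcome matches the label."""
    predictions = [predict_survival(p) for p in passengers]
    correct = sum(int(pred == label) for pred, label in zip(predictions, labels))
    return correct / len(labels)


# Made-up rows for illustration only; not real passenger records.
toy_passengers = [
    {"sex": "female", "pclass": 3},
    {"sex": "male", "pclass": 1},
    {"sex": "male", "pclass": 3},
    {"sex": "female", "pclass": 1},
]
toy_labels = [1, 0, 0, 1]

toy_accuracy = accuracy(toy_passengers, toy_labels)
```

A rule like this needs no training, which is why it serves as a useful baseline: a learned model is only worth its added complexity if it clearly beats the rule's accuracy on held-out data.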