Solve Data Sprint #12 Challenge | DPhi

What is Phishing?

Phishing is a cybercrime in which a target or targets are contacted by email, telephone or text message by someone posing as a legitimate institution to lure individuals into providing sensitive data such as personally identifiable information, banking and credit card details, and passwords.

Do we have to use only a Random Forest classifier to solve this problem? I ask because the leaderboard shows accuracies of around 95%, but I am stuck at only 77%. Kindly let me know…

Try a gradient-boosting classifier such as XGBoost, CatBoost, or LightGBM. These should get you to around 95% accuracy. After that, tune the hyperparameters with GridSearchCV to push the accuracy higher.
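A rough sketch of the GridSearchCV step, in case it helps. To keep it runnable without extra installs it uses scikit-learn's own GradientBoostingClassifier as a stand-in for XGBoost/CatBoost/LightGBM, and synthetic data from make_classification in place of the challenge dataset. The grid values are illustrative assumptions, not tuned settings, and the scores here say nothing about what you will get on the real data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the challenge data (assumption: binary target).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Small illustrative grid; widen it once the pipeline works end to end.
param_grid = {
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    param_grid,
    scoring="accuracy",
    cv=3,
)
search.fit(X_train, y_train)

# score() evaluates the refitted best model with the search's scoring metric.
test_accuracy = search.score(X_test, y_test)
print("best params:", search.best_params_)
print("held-out accuracy:", round(test_accuracy, 3))
```

If you have XGBoost or LightGBM installed, their scikit-learn style classifiers (XGBClassifier, LGBMClassifier) can be dropped in as the estimator, with the grid keys changed to match that model's parameter names.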

Do feature engineering. I got 2nd rank with feature engineering + feature selection + LightGBM.
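One possible way to wire feature selection in front of a boosted model is sketched below. This is not the exact setup from the post above: it assumes SelectKBest with an ANOVA F-test as the selector, uses scikit-learn's GradientBoostingClassifier in place of LightGBM, and runs on synthetic data. The choice of k=10 is arbitrary and only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in: 30 features, only some of which carry signal.
X, y = make_classification(
    n_samples=400, n_features=30, n_informative=6, n_redundant=4, random_state=0
)

# Putting the selector inside the pipeline means it is refit on each
# training fold, so the validation fold never influences which features are kept.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=10)),
    ("model", GradientBoostingClassifier(n_estimators=50, random_state=0)),
])

scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")

# Fit once on all data to inspect which columns the selector keeps.
pipeline.fit(X, y)
kept = pipeline.named_steps["select"].get_support()
print("mean CV accuracy:", round(scores.mean(), 3))
print("features kept:", int(kept.sum()), "of", kept.shape[0])
```

Any engineered columns would be added to X before this pipeline runs; the selector then decides which of the original and engineered features reach the model.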