What could be the reasons for such low F1 scores in ML Bootcamp Assignment 1 (Advanced Track)?

For those who did not attend, here is the Datathon link: https://dphi.tech/practice/challenge/49

Well, it is likely that we could face similar issues while solving real-world problems too. There are several cases where certain problems do not necessarily need to be solved with ML/AI. Maybe this is one such case.

Other potential reasons that are affecting performance:

  • the data is just not adequate; we could have collected more data and, most importantly, more data elements (features).
  • huge class imbalance
  • and more… open to hearing thoughts.
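
To make the class imbalance point concrete, here is a minimal sketch of how F1 can stay low even when accuracy looks fine. The confusion-matrix counts are hypothetical numbers chosen for illustration, not results from the assignment dataset.

```python
def f1_from_counts(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical test set: 1000 rows, only 50 positives (5%).
tp, fn = 20, 30    # 20 of the 50 positives are caught
fp, tn = 180, 770  # 180 of the 950 negatives are wrongly flagged

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision, recall, f1 = f1_from_counts(tp, fp, fn)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

With these assumed counts, accuracy is 0.79 while F1 is only 0.16, because precision on the rare class is 0.10. A low F1 on imbalanced data often means the model flags many false positives, misses many positives, or both.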

If your performance is bad, don’t conclude that you are not good at problem solving. Instead, it is better to reason about the root cause, fill the gaps, and change the strategy (maybe the training data simply isn’t adequate to solve this problem).


This assignment was quite challenging and I learnt a lot from it. :slight_smile: It would be great if we could have a session/notebook that takes us through the proper steps to solve it, so we can understand where we went wrong.


@chanukya I was solving the Assignment 1 - Advanced quiz, and for every model (i.e. LR, RF, DT) the F1 score is in the range of 8.0 to 8.3%. Only when I did hyperparameter tuning on the DT did it improve to 11.5%. I wanted some help on how I can improve it, or what steps I need to follow in that case. As the data was imbalanced, we opted for the SMOTE technique with sampling_strategy = 1.0 to oversample the minority class in proportion to the majority class. But how should a low F1 score be interpreted, and what can be done next in such scenarios?
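
For readers unfamiliar with what SMOTE with sampling_strategy = 1.0 does, here is a rough pure-Python sketch of the idea: synthetic minority points are created by interpolating between a minority point and one of its nearest minority neighbours, until the minority count equals the majority count. The function names `smote_sketch` and `oversample_to_ratio` and the toy data are hypothetical; this is not the library implementation used in the quiz (presumably imbalanced-learn's `SMOTE`), only an illustration of the technique.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def oversample_to_ratio(majority, minority, ratio=1.0, k=5, seed=0):
    """Grow the minority class until minority/majority == ratio
    (ratio=1.0 mirrors sampling_strategy = 1.0)."""
    target = int(ratio * len(majority))
    n_new = max(0, target - len(minority))
    return minority + smote_sketch(minority, n_new, k=k, seed=seed)


# Hypothetical toy data: 20 majority points, 4 minority points.
majority = [(float(i), float(i % 5)) for i in range(20)]
minority = [(1.0, 9.0), (2.0, 8.0), (3.0, 9.5), (2.5, 8.5)]

balanced_minority = oversample_to_ratio(majority, minority, ratio=1.0, k=3)
print(len(majority), len(balanced_minority))
```

Note that oversampling of this kind should be applied to the training split only. If the synthetic points leak into the validation or test data, the reported F1 no longer reflects performance on the real class distribution.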

@vinrock19 one of the potential reasons is that the data is just not adequate, and we could have collected more data or more features that are relevant for predicting the target. So, we can conclude that this is a failed experiment and rework the data collection strategy from scratch.


Ok. So @chanukya do we need to do any customisation for this quiz?

Hi @vinrock19
For the quiz you don’t need to do any customizations. Just follow the instructions given and do the task you are asked to do in the quiz. Do not make any additional changes to the dataset other than those asked for in the quiz, else you might get wrong answers.

Fine @manish_kc_06. Thanks for the reply.