Red Wine Quality

Wine is an alcoholic drink typically made from fermented grapes. Yeast consumes the sugar in the grapes and converts it to ethanol, carbon dioxide, and heat.


This is a companion discussion topic for the original entry at https://dphi.tech/practice/challenge/10

In the Predict the Quality of Red Wine practice problem, only 3.93 percent of the samples have a quality below 5, whereas 53.44 percent have a quality above 5. How should this imbalance in the dataset be handled? The output variable, quality, can range from 0 to 10, and our dataset has the following distribution:
5 : 477
6 : 446
7 : 139
4 : 37
8 : 13
3 : 7

It has no samples with quality 0, 1, 2, 9, or 10.
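
The percentages quoted in the question can be checked directly from the counts above. The snippet below is a minimal sketch using only the numbers posted in this thread; the helper name `share` is made up for illustration. It also prints what a model that always predicts quality 5 would score, which is a useful floor when judging accuracy on an imbalanced target.

```python
from collections import Counter

# Class counts as posted in the thread (quality score -> number of wines).
counts = Counter({5: 477, 6: 446, 7: 139, 4: 37, 8: 13, 3: 7})
total = sum(counts.values())


def share(predicate):
    """Percentage of samples whose quality score satisfies the predicate."""
    return 100 * sum(n for q, n in counts.items() if predicate(q)) / total


below_5 = share(lambda q: q < 5)
above_5 = share(lambda q: q > 5)
# Accuracy of a trivial model that always predicts the most common class.
majority_baseline = 100 * max(counts.values()) / total

print(f"total samples: {total}")
print(f"quality < 5: {below_5:.2f}%")
print(f"quality > 5: {above_5:.2f}%")
print(f"always predicting 5: {majority_baseline:.2f}% accuracy")
```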

You can use SMOTE. You can also take a look at the following: https://www.kaggle.com/questions-and-answers/93669 and https://datascience.stackexchange.com/questions/24610/smote-and-multi-class-oversampling
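
To show roughly what SMOTE does, here is a simplified sketch in plain Python: it creates synthetic minority samples by interpolating between a sample and one of its nearest neighbours from the same class. It is not the reference implementation; the function name `smote_like` and the feature values are hypothetical. In practice the imbalanced-learn package offers this as `imblearn.over_sampling.SMOTE`, and with classes as small as the 7 wines of quality 3 its `k_neighbors` parameter may need to be lowered so that it stays below the number of samples in the class.

```python
import math
import random


def smote_like(samples, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen sample and one of its k nearest neighbours in the same class."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(samples) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Sort the other samples of the class by Euclidean distance to base.
        others = [s for j, s in enumerate(samples) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature rows standing in for the 7 wines of quality 3.
minority = [[7.4, 0.70], [7.8, 0.88], [7.9, 0.76], [8.3, 0.65],
            [7.1, 0.81], [7.6, 0.92], [8.0, 0.74]]
new_rows = smote_like(minority, n_new=10)
print(len(new_rows), new_rows[0])
```

Oversampling is usually applied to the training split only, so that the validation data keeps the original class distribution.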

What accuracy did you get? I have been getting extremely poor accuracy using SMOTE.

I used xgboost.XGBRFClassifier and got an accuracy of 68.125.
Still, there is room to improve.

How did you handle feature processing, such as outliers?
Did you also apply SMOTE?

I haven’t done any feature preprocessing because all the features are numerical.
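
Numerical features can still contain extreme values, so the outlier question above can be checked quickly before deciding to skip preprocessing. The following is a minimal sketch assuming the common 1.5 x IQR rule; the function name `iqr_outliers` and the sample values are hypothetical and not taken from the challenge data.

```python
import statistics


def iqr_outliers(values, factor=1.5):
    """Return the values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical readings for a single numerical feature.
residual_sugar = [1.9, 2.6, 2.3, 1.9, 1.8, 2.0, 2.2, 15.5, 2.1, 2.4]
print(iqr_outliers(residual_sugar))
```

Tree-based models such as the XGBoost classifier mentioned above tend to be fairly tolerant of outliers, so this check matters more if a scale-sensitive model is tried.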