train_data, validation_data, test_data = np.split(train_df.sample(frac=1), [int(.6*len(train_df)), int(.8*len(train_df))])
Need help. Can someone explain what this code does? How does it split the data into train, validation, and test sets?
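It splits the dataset into three parts. train_df.sample(frac=1) first returns all the rows in random order, then np.split cuts the shuffled rows at two points: the first 60% of the rows become train, the next 20% validation, and the last 20% test.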
The values in the list are row positions, not percentages. If your df has 100 rows, the list evaluates to [60, 80], so positions 0-59 of the shuffled data go to train, 60-79 to validation, and 80-99 to test.
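To make the cut points concrete, here is a minimal sketch of the same 60/20/20 split. The 100-row train_df, the column name x, and random_state=0 are made up for illustration. It applies np.split to an array of row positions and then selects rows with iloc, rather than passing the DataFrame to np.split directly, since that can raise deprecation warnings on some recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical 100-row frame standing in for the real train_df.
train_df = pd.DataFrame({"x": range(100)})

# sample(frac=1) returns every row once, in random order (a shuffle).
# random_state is only here to make the sketch repeatable.
shuffled = train_df.sample(frac=1, random_state=0)

n = len(shuffled)
cut_train = int(.6 * n)  # 60: end of the train slice
cut_val = int(.8 * n)    # 80: end of the validation slice

# np.split cuts at the given positions: [0:60], [60:80], [80:100].
positions = np.arange(n)
train_pos, val_pos, test_pos = np.split(positions, [cut_train, cut_val])

# Select the shuffled rows by position.
train_data = shuffled.iloc[train_pos]
validation_data = shuffled.iloc[val_pos]
test_data = shuffled.iloc[test_pos]

print(len(train_data), len(validation_data), len(test_data))
```

With 100 rows this gives 60, 20, and 20 rows, and every original row lands in exactly one of the three parts.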
Here is the documentation for the numpy.split method