Image recognition is a vital component of robotics applications such as driverless vehicles and domestic robots. It is also important in image search engines such as Google or Bing image search, where rich image content is used to query for similar images. Google Photos, for example, uses image recognition to categorize your images into things like cats, dogs, and people, so that you can quickly search your albums with queries like "give me photos of my cat". That's pretty awesome.
I am not getting much past a 20% prediction rate on the leaderboard. I am doing image augmentation with ImageDataGenerator, have tried 3 to 6 CNN layers, and am currently using 125x125x3 images, but I have also tried 32x32 and 28x28. I also tried a few optimizers (RMSprop, Adam, SGD), but the results are more or less similar.
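For reference, this is roughly my augmentation setup (a sketch; the folder path and the exact augmentation values are just what I happened to try, not a fixed recipe):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings I experimented with (values are just examples)
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,       # scale pixel values to [0, 1]
    rotation_range=20,       # random rotations
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    horizontal_flip=True,    # random horizontal flips
)

# Flow images from the training folder at the input size I am currently using
train_gen = train_datagen.flow_from_directory(
    'train',                 # placeholder path to the training images
    target_size=(125, 125),
    batch_size=32,
    class_mode='categorical',
)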
Any suggestions? I see some folks have got above 90%. I would love to hear your feedback.
Sunny
I would love to know as well. I spent hours trying to get my performance as high as I could, but I barely passed 50% accuracy. I ended up with an MLP with 6-7 hidden layers. They all had a lot of nodes (> 150) except for the final 2. But yeah, it's not really helpful since the performance was not great. I also experimented with CNNs but unfortunately they overfitted. I did not put a lot of effort into trying to reduce the overfitting.
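For what it's worth, the MLP I ended up with looked roughly like this (a sketch; the exact layer widths, the 125x125x3 input, and the number of output classes are assumptions rather than my precise values):

from tensorflow.keras import layers, models

# Rough shape of the MLP: flatten the image, then 7 hidden layers,
# all fairly wide except for the final two.
model = models.Sequential([
    layers.Input(shape=(125, 125, 3)),       # input size is an assumption
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dense(256, activation='relu'),
    layers.Dense(200, activation='relu'),
    layers.Dense(200, activation='relu'),
    layers.Dense(160, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(6, activation='softmax'),   # number of classes is an assumption
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])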
I am getting this error while using image_dataset_from_directory() on the test folder.
Since there are no labels, I have changed label_mode to None, but I am still getting the error.
Error: ValueError: Tensor conversion requested dtype string for Tensor with dtype float32: <tf.Tensor 'args_0:0' shape=() dtype=float32>
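For context, this is roughly how I am calling it (a sketch; the folder path, image size, and batch size are placeholders for my actual values):

import tensorflow as tf

# The test images sit in one folder with no class subdirectories,
# so labels are disabled entirely.
test_ds = tf.keras.utils.image_dataset_from_directory(
    'test',                  # placeholder path to the test folder
    labels=None,             # no labels for the test set
    label_mode=None,
    image_size=(224, 224),
    batch_size=32,
    shuffle=False,           # keep the file order stable for the submission
)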
I used flow_from_directory() instead of image_dataset_from_directory() for the test data and now it doesn't show any error. But now I have another problem. I have read that image_dataset_from_directory() returns a tuple of images and labels, but when I check the length of what it returns it shows 205 instead of 6557 (after removing the validation data). And when I converted it to an array by looping over the images and labels and appending everything, I get the shape
(29, 224, 224, 3), which should be (6557, 224, 224, 3). What am I doing wrong?
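To show what I mean, this is roughly the conversion I am attempting; as far as I can tell the dataset comes back batched, so len() counts batches rather than images (a sketch; the folder path and the default batch size of 32 are assumptions):

import numpy as np
import tensorflow as tf

ds = tf.keras.utils.image_dataset_from_directory(
    'train',                  # placeholder path
    image_size=(224, 224),
    batch_size=32,            # the default batch size
)

print(len(ds))                # number of 32-image batches, not the number of images

# Concatenating every batch gives one array over all images; keeping only the
# final appended batch would leave just its leftover images, e.g. (29, 224, 224, 3).
all_images = np.concatenate([images.numpy() for images, labels in ds], axis=0)
print(all_images.shape)       # (total_images, 224, 224, 3)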
It's very hard to pinpoint what helped, but I see that applying AveragePooling2D in the first few CNN layers does help. Trying different numbers of filters also helps improve the predictions.
I did look at the VGG16 architecture and it does help with choosing the number of CNN layers. I am still struggling, though, to get beyond 60+%.
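Roughly what I mean, as a sketch (the filter counts, input size, and number of classes are just the kind of values I tried, not exact):

from tensorflow.keras import layers, models

# VGG-style stacks of small 3x3 convolutions, but with AveragePooling2D
# in the first couple of blocks instead of max pooling.
model = models.Sequential([
    layers.Input(shape=(125, 125, 3)),
    layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    layers.AveragePooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.AveragePooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(6, activation='softmax'),   # number of classes is an assumption
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])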
For the test set, you basically need to start with the given CSV file for the test data; it contains the names of all the images.
Prepending the path of the image folder to each image name gives you the full path for each image. Then you can easily read all the images one by one using OpenCV's imread.
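Something along these lines (a sketch; the CSV file name, the 'Image' column name, the folder path, and the 224x224 size are placeholders for whatever the project actually uses):

import cv2
import numpy as np
import pandas as pd

# The test CSV lists the image file names; prepend the folder to get full paths.
test_csv = pd.read_csv('test.csv')                # placeholder file name
image_dir = 'test/'                               # placeholder folder
filepaths = [image_dir + name for name in test_csv['Image']]   # placeholder column name

# Read every image with OpenCV and resize it to the model's input size.
test_images = []
for path in filepaths:
    img = cv2.imread(path)                        # returns None if the path is wrong
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)    # OpenCV loads BGR; convert to RGB
    img = cv2.resize(img, (224, 224))
    test_images.append(img)

test_data = np.array(test_images)
print(test_data.shape)                            # (number_of_test_images, 224, 224, 3)

From there, test_data can go into model.predict(), as long as the size matches the model's input and you apply the same rescaling as for training.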
I applied a shuffle to the training data and my prediction rate improved to the upper 70s. Still far away from the 90s, and my runtime is getting affected.
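By "a shuffle" I just mean something like this before training (a sketch, assuming the images and labels are already in NumPy arrays; the shapes are placeholders):

import numpy as np

def shuffle_together(images, labels):
    # One random permutation applied to both arrays so each image keeps its label.
    perm = np.random.permutation(len(images))
    return images[perm], labels[perm]

# Placeholder arrays shaped like the training data
train_images = np.zeros((6557, 125, 125, 3), dtype=np.float32)
train_labels = np.zeros((6557,), dtype=np.int32)
train_images, train_labels = shuffle_together(train_images, train_labels)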
Once you have extracted the zip file into your Colab files, follow the steps in this notebook to load the data from 'Accessing the CSV file' onwards. You just need to provide the path to the files located in your Colab files.
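For example, something like this once the zip has been extracted (the zip name and CSV path are placeholders; use whatever paths your extracted files actually have):

import zipfile
import pandas as pd

# Extract the uploaded zip into the Colab working directory
with zipfile.ZipFile('data.zip', 'r') as zf:    # placeholder zip name
    zf.extractall('data')

# Then point the CSV-loading step at the extracted files
labels_csv = pd.read_csv('data/train.csv')      # placeholder CSV path
print(labels_csv.head())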
What is filepaths in this code?
if len(labels_csv) == len(filepaths):
    print('Number of labels i.e. ', len(labels_csv), 'matches the number of filenames i.e. ', len(filepaths))
else:
    print('Number of labels does not match the number of filenames')
I tried to do it this way, but after using cv2.imread, roughly the last 400 entries in test_data contain [[None]] instead of pixel values. For indices 0:400 the pixel values are present.
It's been two days that I have been trying to do the test set preprocessing. If anyone has done it, please help me with this one.
@manish_kc_06 I think the format in the 'test' folder and in the CSV file is not the same. I created the file path by adding the folder name and the filename from the CSV, but it says that the file is not available.
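One quick way to confirm this is to list which of the constructed paths actually exist on disk; cv2.imread() returns None for exactly the missing ones (a sketch, with placeholder CSV, folder, and column names):

import os
import pandas as pd

# Build the candidate paths the same way as before, then check which
# ones do not point to an existing file.
test_csv = pd.read_csv('test.csv')                           # placeholder CSV name
filepaths = ['test/' + name for name in test_csv['Image']]   # placeholder folder and column
missing = [p for p in filepaths if not os.path.exists(p)]
print(len(missing), 'of', len(filepaths), 'paths do not exist')
print(missing[:5])                                           # inspect a few to spot the naming mismatch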