Introduction
I am researching Explainable Artificial Intelligence (XAI). I chose to restart my experiments on heat-maps, explanations, and neural networks. In a new series of blog posts, I plan to write about my progress on creating and evaluating an XAI heatmap.
Tutorials
I built my emotion recognition neural network using a tutorial titled Emotion Recognition Using Keras. I followed the instructions in the ReadMe.md document included with the linked GitHub repository.
A common problem with any internet tutorial is code becoming out of date. Even though the repository was published in November 2019, it still needed updating. Whenever error messages appeared in the terminal, I searched for them online and was able to update the code.
Once I finished following the tutorial, the neural network only achieved 17% accuracy, which was far below the level I expected. The website Machine Learning Mastery helped me correct this: it suggested that I set the Adam optimizer to the settings recommended for Keras. When I did, the accuracy shot up to 68%.
Adam is an adaptive learning rate optimization algorithm, and the Adam variable in the code holds the hyperparameters the algorithm needs to function. It can be set in Keras to any values, but I found that it performed well when set to the recommended settings.
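To make those recommended settings concrete, here is a minimal pure-Python sketch of a single Adam update step, using the default hyperparameter values documented by Keras (learning rate 0.001, beta_1 0.9, beta_2 0.999, epsilon 1e-07). This is my own illustration of the algorithm on a toy problem, not the Keras implementation.

```python
import math

def adam_update(theta, grad, m, v, t,
                lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam step for a single scalar parameter theta.

    Uses the Keras-recommended default hyperparameters."""
    m = beta1 * m + (1 - beta1) * grad        # update biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # update biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-correct (t is the step count, from 1)
    v_hat = v / (1 - beta2 ** t)
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy example: minimise f(x) = x^2 (gradient 2x), starting at x = 1.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_update(x, 2 * x, m, v, t)
# x is now close to the minimum at 0.
```

Note how the effective step size adapts: the raw gradient magnitude is divided out by the second-moment estimate, so each step moves roughly `lr` in the direction of the (bias-corrected) gradient average.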
Once the neural network reached the expected accuracy, I wanted to examine the algorithm further. I used the ELI5 package to develop a heat-map explaining why the neural network labelled an angry face as a neutral face. Like the Emotion Recognition Using Keras tutorial, the ELI5 tutorial needed updating to work with the neural network.
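ELI5's explanations for Keras image classifiers are based on the Grad-CAM technique. As a rough illustration of what such a heatmap computes, here is a simplified numpy sketch (my own illustration, not ELI5's actual code): the gradients of the target class score with respect to a convolutional layer's activations are average-pooled into per-channel weights, the activations are combined using those weights, and the result is rectified and normalised.

```python
import numpy as np

def grad_cam(activations, grads):
    """Grad-CAM-style heatmap.

    activations: conv layer output for one image, shape (H, W, C).
    grads: gradients of the target class score w.r.t. those
    activations, same shape. Returns a (H, W) map in [0, 1]."""
    weights = grads.mean(axis=(0, 1))                      # global-average-pool the gradients
    cam = np.tensordot(activations, weights, axes=([2], [0]))  # weighted sum over channels
    cam = np.maximum(cam, 0)                               # ReLU: keep only positive evidence
    if cam.max() > 0:
        cam /= cam.max()                                   # normalise to [0, 1]
    return cam
```

In practice the resulting low-resolution map is upsampled to the input image's size and overlaid as a colour heatmap, which is what appears in the result slides.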
Processing the images and testing the NN
The neural network was tasked with labelling images from the Cohn-Kanade plus (CK+) dataset. The dataset consists of many series of images. Each series shows a person displaying a required emotion, going from a neutral face to a face showing the maximum amount of emotion.
Images from the dataset follow a naming convention; take the image name S138_004_00000013_ANG.png as an example. The first part of the name (“S138_004”) identifies the folder where the pictures are stored. The second part (“00000013”) identifies the position of the image in the series, with 00000001 indicating the first image. I attached a three-letter code to the end of the name (“ANG”) to indicate the emotion depicted. The abbreviations were chosen because they all start with different letters, which made programming easier.
| Three-letter abbreviation | Un-abbreviated word | Emotion |
| --- | --- | --- |
| ANG | Angry | Angry |
| FEA | Fear | Scared |
| HAP | Happy | Happy |
| MIS | Misery | Sad |
| SUR | Surprise | Surprised |
The ‘neutral’ emotion has no three-letter code. A neutral emotion is defined, for the purposes of this experiment, as any image taken from the first or second position of each image series.
I only looked at four pictures of each emotion, except for the neutral emotion, for which I took the first picture of a series showing each of the other emotions. As the CK+ dataset does not contain any explicitly labelled ‘neutral’ images, I believed this was the fairest way of testing the NN on ‘neutral’ images.
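The naming convention described above can be handled by a short helper. The sketch below is a hypothetical illustration of the parsing step: the code-to-emotion mapping and the “first or second image counts as neutral” rule are as described above, but the function name and the exact regular expression are my own.

```python
import re

# Three-letter codes attached to the file names, as listed in the table above.
EMOTION_CODES = {"ANG": "Angry", "FEA": "Scared", "HAP": "Happy",
                 "MIS": "Sad", "SUR": "Surprised"}

def parse_image_name(name):
    """Split e.g. 'S138_004_00000013_ANG.png' into (folder, position, emotion).

    The first or second image of a series is treated as 'Neutral'
    regardless of the code attached to its name."""
    m = re.fullmatch(r"(S\d+_\d+)_(\d{8})_([A-Z]{3})\.png", name)
    if m is None:
        raise ValueError(f"unexpected file name: {name}")
    folder, position, code = m.group(1), int(m.group(2)), m.group(3)
    emotion = "Neutral" if position <= 2 else EMOTION_CODES[code]
    return folder, position, emotion

print(parse_image_name("S138_004_00000013_ANG.png"))
# → ('S138_004', 13, 'Angry')
```

Because every code starts with a different letter, even matching on the first character of the code would be unambiguous, which is the programming convenience mentioned earlier.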
Results
The most significant result from this preliminary test is that there is not a strong correlation between the last image of a series and the accuracy or prediction strength of the neural network.
I have attached a PDF of the results of my preliminary experiments. Each emotion (“angry”, “scared”, “happy”, “sad”, “surprised”, and “neutral”) has a number of slides, each showing an image of a person emoting, the image’s label, the image overlaid with a heatmap showing the NN’s areas of interest, and a graph showing the predicted probability of each emotion.
In this slide you can see that the neural network predicted that this image most likely showed a person making a ‘neutral’ face, and second most likely an ‘angry’ face. The heat map shows that the neural network is focusing on the person’s brow, chin, and left ear (from the viewer’s perspective).
The Paul Ekman Group, named after Paul Ekman, who designed the Facial Action Coding methodology used in the CK+ dataset to pose expressions, describes an angry face like so: “In anger the eyebrows come down and together, the eyes glare, and there is a narrowing of the lip corners”. The heatmap shows a focus on the eyebrows, one of the corners of the eye, and one of the corners of the mouth. This shows the neural network is focusing on the expected aspects of the face.
However, the neural network did not correctly identify the person as angry. This may be because of the other areas it is focusing on, as shown in the heatmap. However, more research needs to be done with a broader dataset before coming to any concrete conclusions.
Conclusion
There is a lot more research to be conducted using heatmaps, the CK+ dataset, and neural networks. I think it will be interesting to look further into the correlation between an image’s position in its series and the accuracy or prediction strength of the neural network.
I would also like to determine whether the heatmaps help to give a good explanation of why the NN labelled an image as it did. Further, I would like to know whether such an explanation would help a well-educated non-expert determine if an algorithm was properly designed. I will do this by overfitting and overtraining one NN on the CK+ dataset and comparing it to a properly designed NN.
Footnote
To create the slides showing the four pictures together, I used Microsoft PowerPoint’s photo album function, selected “four pictures” per slide, and arranged the pictures so that they all referred to the same base image.
References
Brownlee, J. (2019, September 19). Machine Learning Mastery. https://machinelearningmastery.com/
Brownlee, J. (2020, August 20). Gentle introduction to the Adam optimization algorithm for deep learning. Machine Learning Mastery. Retrieved October 26, 2020, from https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
Bushaev, V. (2018, October 24). Adam – latest trends in deep learning optimization. Medium. Retrieved October 26, 2020, from https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c
Lucey, P., Cohn, J. F., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010). The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition – Workshops. https://doi.org/10.1109/cvprw.2010.5543262
Mohammad, C. A. (2020, January 10). Emotion recognition using Keras. Medium. Retrieved October 26, 2020, from https://medium.com/@ee18m003/emotion-recognition-using-keras-ad7881e2c3c6
Paul Ekman Group. (2020, January 30). Anger. Retrieved October 26, 2020, from https://www.paulekman.com/universal-emotions/what-is-anger/#:~:text=Facial%20expression%20of%20anger,a%20split%2Dsecond%20micro%20expression