Robot

Emotion Recognition using Keras on the CK+ Dataset

Introduction

I am researching Explainable Artificial Intelligence (XAI). I chose to restart my experiments on heat-maps, explanation, and neural networks. In a new series of blog posts, I plan to write about my progress on creating and evaluating an XAI heat-map.

Tutorials

I built my emotion recognition neural network using a tutorial titled Emotion Recognition Using Keras. I followed the instructions in the ReadMe.md document included in the linked GitHub repository.

A common problem with any internet tutorial is code becoming out of date. Even though the GitHub repository was published in November 2019, it still needed updating. I searched for the error messages when they appeared in the terminal and was able to update the code.

Once I finished following the tutorial, the neural network only had 17% accuracy. This was not the level of accuracy I expected. The website Machine Learning Mastery was very helpful in correcting this. It suggested that I set the Adam optimizer to the settings recommended for Keras. When I did that, the accuracy shot up to 68%.

The Adam variable holds the configuration that the Adam algorithm needs to function. The Adam algorithm is an adaptive learning rate optimization algorithm, and it can be set in Keras. The Adam variable can be set to almost anything, but I found that it functioned well when set to the recommended settings.
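As a sketch of what those recommended settings do, here is one Adam update step for a single parameter, written in plain Python with the default hyperparameters that Keras documents (learning rate 0.001, beta_1 = 0.9, beta_2 = 0.999, epsilon = 1e-7). The function and variable names are my own, for illustration only.

```python
# One Adam update step for a single scalar parameter, using the
# hyperparameter defaults recommended by Keras. Names are my own.

def adam_step(param, grad, m, v, t,
              lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """Return the updated parameter and optimizer state after step t."""
    m = beta_1 * m + (1 - beta_1) * grad       # moving average of gradients
    v = beta_2 * v + (1 - beta_2) * grad ** 2  # moving average of squared gradients
    m_hat = m / (1 - beta_1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta_2 ** t)
    param = param - lr * m_hat / (v_hat ** 0.5 + epsilon)
    return param, m, v

# One step on a parameter starting at 1.0 with gradient 2.0:
p, m, v = adam_step(1.0, 2.0, m=0.0, v=0.0, t=1)
```

Because the step size is scaled by the moving averages, the update stays close to the learning rate (here about 0.001) regardless of the raw gradient magnitude.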

Once the neural network had the expected accuracy, I wanted to examine the algorithm further. I used the ELI5 package to develop a heat-map to explain why the neural network labelled an angry face as a neutral face. Like the Emotion Recognition Using Keras tutorial, the ELI5 tutorial needed some work to make it function with the neural network.

Processing the images and testing the NN

The neural network was tasked to label images from the Cohn-Kanade plus dataset. The dataset consists of many series of images. The image series show a person displaying the required emotion, going from a neutral face to a face showing the maximum amount of emotion.

Images from the dataset follow a naming convention. Take the image name S138_004_00000013_ANG.png as an example. The first part of the name (“S138_004”) identifies the folder where the pictures are stored. The second part of the name (“00000013”) identifies the position of the image in the series, with 00000001 indicating the first image. I attached a three-letter code to the end of the name (“ANG”) to indicate the emotion depicted. The abbreviations were chosen because they all start with different letters, making programming easier.

Three-letter abbreviation    Un-abbreviated word    Emotion
ANG                          Angry                  Angry
FEA                          Fear                   Scared
HAP                          Happy                  Happy
MIS                          Misery                 Sad
SUR                          Surprise               Surprised

The ‘neutral’ emotion has no three-letter code. A neutral emotion is defined, for the purposes of this experiment, as any image taken from the first or second position of each image series.
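The naming convention above can be sketched as a small parsing helper. The function name and the dictionary are my own, written for illustration; images without a three-letter code are treated as ‘neutral’, as defined above.

```python
# Parse the naming convention described above, e.g.
# 'S138_004_00000013_ANG.png' -> folder, position in series, emotion.

EMOTIONS = {"ANG": "Angry", "FEA": "Scared", "HAP": "Happy",
            "MIS": "Sad", "SUR": "Surprised"}

def parse_name(filename):
    """Split a dataset image name into its parts."""
    stem = filename.rsplit(".", 1)[0]          # drop the .png extension
    parts = stem.split("_")
    subject, series, position = parts[0], parts[1], parts[2]
    code = parts[3] if len(parts) > 3 else None  # no code means 'neutral'
    return {
        "folder": f"{subject}_{series}",       # where the series is stored
        "position": int(position),             # 1 = first image in the series
        "emotion": EMOTIONS[code] if code else "Neutral",
    }

info = parse_name("S138_004_00000013_ANG.png")
```

For example, `parse_name("S138_004_00000013_ANG.png")` gives folder “S138_004”, position 13, and the emotion “Angry”.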

I only looked at four pictures of each emotion, except for the neutral emotion, for which I took the first picture of each series. As the CK+ dataset does not contain any explicitly labelled ‘neutral’ images, I believed this was the fairest way of testing the NN on ‘neutral’ images.

Results

The most significant result from this preliminary test is that there is not a strong correlation between using the last image of a series and the accuracy or prediction strength of the neural network.

I have attached a PDF of the results of my preliminary experiments. Each emotion (“angry”, “scared”, “happy”, “sad”, “surprised”, and “neutral”) has a number of slides showing an image of a person emoting, the image with its label, the image with a heat-map showing the areas of interest of the NN, and a graph showing the probability that the photo shows each emotion.

In this slide you can see that the neural network predicted that it was most likely that this image showed a person making a ‘neutral’ face, and that it was second most likely that the person was making an ‘angry’ face. The heat map shows that the neural network is focusing on the person’s brow, chin, and left ear (from the viewer’s perspective).

The Paul Ekman Group, named after Paul Ekman, who designed the Facial Action Coding methodology used in the CK+ dataset to pose expressions, describes an angry face like so: “In anger the eyebrows come down and together, the eyes glare, and there is a narrowing of the lip corners.” The heat-map shows a focus on the eyebrows, one of the corners of the eyes, and one of the corners of the mouth. This shows the neural network is focusing on the expected aspects of the face.

However, the neural network did not correctly identify the person as angry. This may be because of the other areas it is focusing on, as shown in the heat-map. However, more research needs to be done with a broader dataset before coming to any concrete conclusions.

Conclusion

There is a lot more research to be conducted using heat-maps, the CK+ dataset, and neural networks. I think it will be interesting to look further into the correlation between an image’s position in its series and the accuracy or prediction strength of the neural network.

I would also like to determine if the heatmaps will help to give a good explanation of why the NN labelled the image as it did. Further I would like to know if this explanation would help a well-educated non-expert determine if an algorithm was properly designed. I will do this by overfitting and overtraining one NN on the CK+ dataset and comparing it to a properly designed NN.

Footnote

To create the slides showing the four pictures together, I used Microsoft PowerPoint’s photo album function, selected “four pictures” per slide, and arranged the pictures so that they all referred to the same base image.

References

Brownlee, J. (2019, September 19). Machine Learning Mastery. https://machinelearningmastery.com/

Brownlee, J. (2020, August 20). Gentle introduction to the Adam optimization algorithm for deep learning. Machine Learning Mastery. Retrieved October 26, 2020, from https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/

Bushaev, V. (2018, October 24). Adam – latest trends in deep learning optimization. Medium. Retrieved October 26, 2020, from https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c

Lucey, P., Cohn, J. F., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010). The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition – Workshops. https://doi.org/10.1109/cvprw.2010.5543262

Mohammad, C. A. (2020, January 10). Emotion recognition using Keras. Medium. Retrieved October 26, 2020, from https://medium.com/@ee18m003/emotion-recognition-using-keras-ad7881e2c3c6

Paul Ekman Group. (2020, January 30). Anger. Retrieved October 26, 2020, from https://www.paulekman.com/universal-emotions/what-is-anger/#:~:text=Facial%20expression%20of%20anger,a%20split%2Dsecond%20micro%20expression

Robot

Centrelink RoboDebt: A Blog Post

The Centrelink Logo

Introduction

Services Australia (formerly known as Centrelink) is a program of the Australian government that provides “social security payments and services to Australians” (Services Australia). In July 2016 Centrelink launched a new compliance program designed to recover debt from overpaid clients. The Australian press nicknamed this program “RoboDebt” (it is not clear who first coined the phrase) after it caused a great deal of unnecessary stress and anxiety by poorly communicating why clients were being chased for debt, and by asking clients to provide old or non-existent employment records. Almost 10,000 people registered for the class action suit against the Australian government over its mishandling of the audit (Henriques-Gomes, 2020).

From Victoria Legal Aid

The passages describing it in the Ombudsman’s report (Commonwealth Ombudsman, 2017, p. 40) have convinced me that RoboDebt used some AI technology. The report revealed that the Centrelink compliance system (“RoboDebt”) used expert systems and fuzzy logic to find possible debtors, and to dismiss from the list of debtors anyone whom Centrelink had reason to believe should not be there.

The use of AI technology in the creation of the RoboDebt debacle makes it directly relevant to my thesis on Explainable Artificial Intelligence. I am particularly interested in the best way to explain a decision of an AI. “RoboDebt” was not merely an administrative failure to handle the debt collection process; it was a failure to explain the decisions of an AI.

Though some may argue that the failure to explain the decisions of an AI is still a bureaucratic issue, I disagree. I believe that the solution to avoiding similar issues in the future is to facilitate a dialogue between policy makers, business case analysts, coders, customer service people, and clients. There has always been a need for bureaucrats and the people who code their requests to find common ground and work together; Explainable AI is focused on bridging that gap and allowing communication to flow.

Ombudsman’s Review (April 2017)

In April 2017 the Commonwealth Ombudsman published a report focusing on complaints made to it about the RoboDebt compliance scheme. The report covered many aspects of the RoboDebt debacle, but because of my interest in Explainable Artificial Intelligence, I will be covering two aspects in detail: the calculation of debt, and the explanation as to why the client was found to owe money to Centrelink.

The Calculation of Debt

The RoboDebt scheme used Centrelink and Australian Taxation Office (ATO) data to identify overpayments. When the client had not entered a fortnightly income statement, or if there were gaps in the client’s declarations, Centrelink calculated their fortnightly income by dividing the yearly income declared to the ATO into fortnightly amounts. Previously this was done manually by Centrelink compliance officers (Commonwealth Ombudsman, 2017, p. 1). In July 2016 Centrelink rolled out a program (the so-called “RoboDebt”) to identify overpayments to clients automatically (Commonwealth Ombudsman, 2017, p. 1).
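A sketch of why this averaging can misfire, with entirely made-up figures: suppose a client declared $26,000 of yearly income to the ATO, but earned all of it in the first half of the year, before going on payments and correctly declaring $0 income each fortnight.

```python
# Hypothetical figures illustrating the income-averaging step described
# above: a yearly income declared to the ATO is divided evenly across
# the 26 fortnights of the year, regardless of when it was earned.

yearly_income = 26_000                 # declared to the ATO (made up)
fortnights = 26

averaged_fortnightly = yearly_income / fortnights   # assumed income per fortnight

# The client earned everything before going on payments, so they
# truthfully declared $0 income while receiving payments.
declared_while_on_payments = 0

# The averaging makes it look as if the client under-declared by this
# amount every fortnight they were on payments, flagging a debt that
# may not exist.
apparent_gap = averaged_fortnightly - declared_while_on_payments
```

Under the averaging, the client appears to have earned $1,000 in every fortnight of the year, including the fortnights in which they truthfully declared nothing.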

While there was nothing wrong with identifying overpayments as such, requests for information that was old and hard to find, and poor communication about the calculation of overpayments, led to great frustration from the public.

The Communication and Explanation of Debt

The most mismanaged part of the “RoboDebt” scheme was the communication to Centrelink clients about the identification of possible over-payments and the possibility of incurring a debt to Centrelink (Commonwealth Ombudsman, 2017, p. 9). Some clients did not receive the letters as these had been sent to the wrong address (Commonwealth Ombudsman, 2017, p. 10), and some felt they had not been given adequate time to respond (Commonwealth Ombudsman, 2017, p. 14). The clients who did receive the initial letters found them confusing and there was no listed helpline to explain the letter (Commonwealth Ombudsman, 2017, p. 2).

The letters to the clients announcing the possibility that the client had been overpaid, and the debt incurred by the client, were unclear. The letters also did not specify the consequences of ignoring them (Commonwealth Ombudsman, 2017, p. 15). The client was instructed to enter the correct details on the Centrelink website, itself a confusing nightmare of links and fields (Commonwealth Ombudsman, 2017, p. 11).

This lack of clear communication was preventable: a properly rolled out computer program would have had more staff trained to help clients, and would have involved consulting clients and staff about the clarity of the letters and the Centrelink website. The Ombudsman recommended these changes in their report (Commonwealth Ombudsman, 2017, p. 26).

Conclusion

Centrelink is an organisation which regularly deals with some of the most vulnerable members of society. I believe Centrelink has a duty to communicate clearly with its clients, especially when delivering upsetting news. The RoboDebt debacle could have been largely avoided by properly rolling out the new compliance program and consulting with clients and staff about the clarity of its communication and the availability of client support.

Four years on from the first roll-out of the RoboDebt program it still appears in the Australian news as an example of bad government communication.

References

Cambridge University Press. (2020). Ombudsman. In Cambridge dictionary. Retrieved July 4, 2020, from https://dictionary.cambridge.org/dictionary/english/ombudsman

Commonwealth Ombudsman (2017). Centrelink’s automated debt raising and recovery system (02|2017). https://www.ombudsman.gov.au/__data/assets/pdf_file/0022/43528/Report-Centrelinks-automated-debt-raising-and-recovery-system-April-2017.pdf

Henriques-Gomes, L. (2020). Almost 10,000 people register as robodebt class action gains momentum. The Guardian. Retrieved from https://www.theguardian.com/australia-news/2020/feb/04/almost-10000-people-register-as-robodebt-class-action-gains-momentum

Services Australia. (n.d.). Centrelink. Retrieved July 4, 2020, from https://www.servicesaustralia.gov.au/individuals/centrelink

Robot

Computer Algorithms aren’t necessarily Artificial Intelligence Algorithms

Difference Between Algorithms and Artificial Intelligence

Puram, A. D. (2019, January 7). Artificial intelligence vs. a clever algorithm – What’s the difference? AI Trends. Retrieved June 11, 2020, from https://www.aitrends.com/ai-software/software-development/artificial-intelligence-vs-a-clever-algorithm-whats-the-difference/

The above article is an excellent discussion of the difference between regular computer algorithms and AI algorithms. To summarize: the big difference is how the algorithm is programmed. Additionally, computer algorithms have clearly defined outputs, while AI algorithms do not.

A Visualization of How Computer Science, Artificial Intelligence, and Machine Learning Connect

Computer Science (C.S.)

noun The science that deals with the theory and methods of processing information in digital computers, the design of computer hardware and software, and the applications of computers.

Artificial Intelligence (A.I.)

noun The capacity of a computer to perform operations analogous to learning and decision making in humans, as by an expert system, a program for CAD or CAM, or a program for the perception and recognition of shapes in computer vision systems. 

Explainable Artificial Intelligence (X.A.I.)

noun A branch of artificial intelligence that explores the best, and most appropriate way, to explain the decisions of an artificial intelligence algorithm

Machine Learning (M.L.)

noun A branch of artificial intelligence in which a computer generates rules underlying or based on raw data that has been fed into it

All definitions (apart from the Explainable Artificial Intelligence definition) taken from Dictionary.com

References

Dictionary.com. (2020). Retrieved June 11, 2020, from https://www.dictionary.com/

Meers, A., Durvasula, S., Newton, T., & Macleod, L. (2017). Lessons learnt about digital transformation and public administration: Centrelink’s online compliance intervention. Australian Government. Retrieved from https://www.ombudsman.gov.au/__data/assets/pdf_file/0024/48813/AIAL-OCI-Speech-and-Paper.pdf

Robot

Regulating AI for Fun and Profit

The Union Jack and the European Union flag in front of a building in London 
Kellam, D. (2007, February 24). The Union Jack and the European Union flag in front of a building in London [Photograph]. Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Union_Jack_and_the_european_flag.jpg

In January 2012, the European Union set out to regulate data collection and use (Agreement on Commission’s EU data protection reform will boost Digital Single Market, 2015). In December 2015 the European Commission reached agreement on its EU Data Protection Reform, commonly known as the GDPR (General Data Protection Regulation) (Palmer, 2019). The GDPR was an important milestone in the field of Explainable Artificial Intelligence (XAI), as it requires AI algorithms to provide explanations of the decisions made on user-level predictions (Doshi-Velez & Kim, 2017, p. 2).

In January 2020, the United Kingdom’s Information Commissioner’s Office and the Alan Turing Institute started a project to consult on their upcoming publication, Explaining decisions made with AI (“ICO and the Turing consultation on explaining AI decisions guidance,” 2020). The resulting guidance was published in July 2020. It contains three parts: Part 1: The basics of explaining AI; Part 2: Explaining AI in practice; and Part 3: What explaining AI means for your organisation.

Part 1: The basics of explaining AI is a great introduction to XAI for laypeople, and contains valuable research for XAI researchers. I can recommend it to anyone who is interested in the field of XAI.

References

Agreement on Commission’s EU data protection reform will boost Digital Single Market (IP/15/6321). (2015). European Commission. https://ec.europa.eu/commission/presscorner/detail/en/IP_15_6321

Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

ICO and the Turing consultation on explaining AI decisions guidance. (2020, May 20). Information Commissioner’s Office. https://ico.org.uk/about-the-ico/ico-and-stakeholder-consultations/ico-and-the-turing-consultation-on-explaining-ai-decisions-guidance/

Information Commissioner’s Office, & Alan Turing Institute. (2020). Explaining decisions made with AI – Part 1: The basics of explaining AI. https://ico.org.uk/media/about-the-ico/consultations/2616434/explaining-ai-decisions-part-1.pdf

Palmer, D. (2019, May 17). What is GDPR? Everything you need to know about the new general data protection regulations. ZDNet. https://www.zdnet.com/article/gdpr-an-executive-guide-to-what-you-need-to-know/

Parliament and Council of the European Union. (2016). General data protection regulation. https://gdpr-info.eu/

Rogynskyy, O. (2019, September 6). Council post: What GDPR means for businesses with an AI strategy. Forbes. https://www.forbes.com/sites/forbestechcouncil/2019/09/06/what-gdpr-means-for-businesses-with-an-ai-strategy/#456ecf9858dc

Robot

An Explanation of a NN decision to label a knife as a dog

The neural network

I programmed a neural network (NN) to help me understand how to explain NN decision-making. I first had to decide what architecture to use. Since I wanted to be able to create heat-maps highlighting the areas of an image apparently most relevant to the NN’s decisions, I used Keras. Keras was a good choice because several different heat-map programs were available that worked with it.

I followed a Keras tutorial to get started. This classified images of cats and dogs. My NN achieved 95% accuracy in its discriminations, that is, it was able to correctly identify an image as containing a dog (or cat) 95% of the time.

I then followed another tutorial, provided by the coders of the ELI5 program, to create heat-maps. These showed the importance values of each neuron for a particular decision (Korobov & Lopuhin, 2017). I then generated heat-maps for some sample data using a looping program.

I noticed that although the dogs’ faces were always of interest when the NN classified the image as a dog (fig. 1), the images classified as cats had random areas of interest (fig. 2).

The cutting knife of truth

To try to understand these heat-maps, I fed the NN classifier three images that were neither dogs nor cats: in fact, images of knives (fig. 3). The knife images had nothing in the background to distract the NN.

The NN classified the image of the knife as a dog.  (A heat-map of this classification is shown in fig. 4). The NN classifier was 99.99958% sure that fig. 3 showed a dog.

Other people’s research

Samek et al. argue that ‘Although humans are able to intuitively assess the quality of a heatmap by matching with prior knowledge and experience of what is regarded as being relevant, defining objective criteria for heatmap quality is very difficult.’ (Samek, Binder, Montavon, Lapuschkin, & Müller, 2016, p. 5)

I agree. It is easy to see a heat-map as confirmatory evidence for a conclusion already arrived at. Just as neural correlates of a conscious experience are question-beggingly held to ‘underlie’ the experience itself, heat-maps are useful only as ‘proof’ for what is already known.

Conclusion

It is not easy to understand the NN’s ‘reasons’ for classifying fig. 3 as a dog. It may have been the knife’s shadow, colour, or shape, or some combination of these.

Unfortunately, we do not know, and heat-maps do not help us to understand why.

It may be that the task given to the NN was simply beyond its designed capabilities. The NN was programmed to make binary discriminations; perhaps it was unfair to extend this, adding the possibility of neither-P-or-not-P to the original binary choice of P or not-P.

This experiment shows the issues with solely relying on heat-maps for an explanation of a NN decision. Although heatmaps can be an entertaining party trick, they are not suitable by themselves for an in-depth discussion of why an image was classified as it was.

References

Korobov, M., & Lopuhin, K. (2017). ELI5 0.9.0 documentation. Retrieved February 14, 2020, from https://eli5.readthedocs.io/en/latest/index.html

Samek, W., Binder, A., Montavon, G., Lapuschkin, S., & Müller, K.-R. (2016). Evaluating the visualization of what a deep neural network has learned. IEEE transactions on neural networks and learning systems, 28(11), 2660-2673.

Charlotte Y. Writes · XAI

A knife-edge decision

My field of study, called ‘Explainable Artificial Intelligence’ (XAI), investigates and evaluates the reasons that underlie the decisions of artificially intelligent machines (AIs).

As AI becomes more powerful and widely used, there is increasing concern about its potential misuse, and there have been calls for regulation. If some AI applications do indeed need to be controlled and restricted, the first step will be to develop a robust method of explaining how the AI reasoned and how it reached its conclusions.

AI has several forms. The AI I discuss below is called a neural network (NN). [See “Neural Networks and Deep Learning” for an explanation of what an NN is and how it works.]

To test an AI’s reasons for its decisions, I first created a need for an AI explanation. For simplicity, I built an NN that could differentiate between the image of a cat and that of a dog, an apparently straightforward task of binary classification, easy for people and not hard for AIs. Most NNs trained on this sort of task achieve an accuracy of 95%. I wanted to know how my AI arrived at its decisions.

In an earlier project, I used a deep-learning framework called ‘Caffe’. This time I used the widely employed Keras framework, which comes with networks already built for various tasks. More importantly, I used Keras because the Python package ELI5 works with it. (Python is a programming language; ELI5 is a set of Python routines.)

Importantly for my purposes, ELI5 can be made to produce a heat-map, a way of representing data graphically. Just as a pie chart displays the data in slices of a size proportional to quantity, a heat-map displays some given aspect of the data in different colours. A Bureau of Meteorology weather map of Australia that pictures today’s temperatures in shades of more intense and less intense red is a heat map (literally).
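The idea can be sketched in a few lines of plain Python: a grid of importance values is mapped to shades of a single colour, exactly as a weather map maps temperatures to shades of red. The grid values and the mapping function are my own, for illustration; ELI5 does something far more sophisticated internally.

```python
# A toy heat-map: map made-up importance values (0.0 to 1.0) to RGB
# shades of red, so that the most important cells are the most intense.

def to_red_shade(value):
    """Map an importance value in [0, 1] to an RGB shade of red."""
    intensity = int(round(value * 255))
    return (intensity, 0, 0)

importance = [          # hypothetical per-region importance values
    [0.0, 0.2, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.2, 0.0],
]
heatmap = [[to_red_shade(v) for v in row] for row in importance]
```

Overlaying such a grid on the original image shows at a glance which regions counted most in the decision, here the bright-red centre cell.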

I began the experiment by using routines available at machinelearningmastery.com. The accompanying tutorial included instructions for building a simple NN to predict if an image was that of a cat or of a dog. The decision was binary; there was no category ‘Other’.

First I trained the NN to perform the identification—‘training’ means showing the NN correctly-labelled images so that it has success-data upon which to base its ‘reasoning’.

Then I created a heat-map to display which part of the image was of the greatest interest to the NN, that is, to show which area counted most heavily in the NN’s classification decision.

In interpreting the results of this experiment it became clear to me that I had created an NN that could identify dogs from their facial features. If the image did not have the facial features of a dog in it somewhere, it was marked as ‘cat’. My NN could identify dogs and not-dogs but it could not reliably differentiate between dogs and cats.

To test this theory I fed more pictures to the NN—more cats, and then some non-cats that were not dogs. I gave it knives to look at. The NN labelled every knife as a dog.

Although my neural network achieved an accuracy of 95% in identifying dogs, by using heat-maps I was able to show that this was really a spurious success. Elements of the image other than the intended targets (cats and dogs) were being used by the NN to make its classification decision.

Charlotte Y. Writes · Robot

We’re back!


I (Charlotte Y.) have been away for some time. I have been dedicating all my time to my thesis, and then I went on holidays to the beach. I had a lovely time. I received a Second Class Honours for my work this year and I am very pleased.

My thesis was called:

The Importance of Data Selection for Facial Emotion Recognition with a Neural Network

The abstract of the thesis was as follows:

Emotion recognition using machine learning recognises the emotions conveyed by facial expressions from videos. Many emotion-recognition papers are submissions to emotion recognition competitions and aim to submit a complete solution: from the selection of source material through to the identification of features and classification. The type of paper encouraged by competitions gives limited insight into which aspects of a system are most responsible for its performance. However, research aimed at improving parts of emotion recognition systems could also improve the overall performance of a system. Data selection is a key but largely unexplored aspect of the emotion recognition system, and therefore there is a need for research into its effect on the system’s accuracy. This thesis aims to fill the identified gap in research by concentrating on data selection. The hypothesis of this thesis is that the accuracy of facial emotion recognition reported in the literature has been overstated due to issues regarding the selection of data for training and testing. In order to test this hypothesis, the data selection technique of a paper, DeXpression: Deep Convolutional Neural Network for Expression Recognition (Burkert, Trier, Afzal, Dengel, & Liwicki, 2016), was replicated.

Charlotte Y. Writes · Lump the Dog · Pablo · Paloma · Robot

Open Day

Open Day was a success and Tanya and I talked to many people about the university and all the opportunities there.

Lump preparing to meet his fans

Children playing with Pablo

A child uses a mouse to control Paloma and make her draw

Paloma being put through her paces prior to meeting people

Tanya (right) and I (left) collaborating on Paloma

Charlotte Y. Writes · Lump the Dog · Robot

Lump the Dog joins the family

Pablo and Paloma would like to welcome the newest member of the family: Lump the Dog. Lump is an AIBO ERS-7, and was used for artificial intelligence research by Federation University before joining the Picasso family. Lump is about 13 years old this year.

A cousin of Lump the Dog being played with by an unknown child. Source: Wikimedia Commons
Lump is named after Pablo Picasso’s dog, a dachshund. Picasso was very fond of the dog, and while Picasso was not the owner, Lump lived with him for six very happy years. Wikipedia rather delightfully refers to Lump as Picasso’s muse.

Picasso and Lump. Photograph by David Douglas Duncan. From: http://www.anothermag.com/design-living/gallery/1901/picassos-sausage-dog/1

Picasso’s sketch of Lump © Pablo Picasso. From: http://www.anothermag.com/design-living/gallery/1901/picassos-sausage-dog/2
As it is the National Day of Dogs today, I thought it was doubly appropriate to introduce Lump. All the robots will be introduced during the Federation University Open Day tomorrow, so I hope it goes well.