Mimansa Jaiswal awarded Rackham Barbour Scholarship
CSE PhD candidate Mimansa Jaiswal has received a Rackham Barbour Scholarship to support her work creating robust, generalizable, and privacy-preserving emotion recognition systems.
One goal of modern machine learning research is to create models capable of emulating human-like behavior, including the ability to hold a conversation. Models with these abilities could improve automated conversational systems and enhance user experience with chatbots and personal assistants. However, in order to create a truly human-like experience, these chatbots have to go beyond interpreting language to perceiving and producing emotions.
Jaiswal aims to develop systems that enable these emotion recognition capabilities, which lag behind other conversational tasks. But she faces several unique challenges: emotion in language is both subjective and highly dependent on the situation, emotional data is far scarcer than general language data, generating high-quality emotional training data is difficult, and the collection and use of this data run the risk of being highly invasive.
Jaiswal’s dissertation research grappled with these technical challenges and proposed a number of solutions. First, she addressed non-representative data. Real emotional data is difficult to obtain, so many available training sets were recorded in controlled lab settings, and models trained on them make errors when they encounter more authentic speech or writing. To address this, Jaiswal developed and validated the Multimodal Stressed Emotion (MuSE) dataset, which introduces a controlled situational confounder (in this case, stress) into an emotion dataset, and built noise-perturbed versions of the existing Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset, compiling realistic noisy samples in text, audio, and video. These noisy samples allowed Jaiswal to measure how much different kinds of noise change the emotional labels that humans and models assign to a sample.
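To give a sense of this kind of robustness analysis, the sketch below perturbs a toy set of acoustic features with noise of increasing intensity and measures how often a classifier’s predicted emotion label flips. The feature dimensions, noise model, and classifier are illustrative assumptions rather than the actual datasets or models used in Jaiswal’s experiments.

```python
# Illustrative robustness check: add noise to features at several
# intensities and count how often the predicted emotion label changes.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in data: 500 clips x 40-dim acoustic features, 4 emotion classes.
X = rng.normal(size=(500, 40))
y = rng.integers(0, 4, size=500)

# Stand-in "emotion recognizer" trained on clean features.
model = LogisticRegression(max_iter=1000).fit(X, y)
clean_pred = model.predict(X)

def label_flip_rate(noise_scale: float) -> float:
    """Fraction of clips whose predicted emotion changes under additive noise."""
    noisy_pred = model.predict(X + rng.normal(scale=noise_scale, size=X.shape))
    return float(np.mean(noisy_pred != clean_pred))

for scale in (0.1, 0.5, 1.0, 2.0):
    print(f"noise scale {scale:>3}: {label_flip_rate(scale):.1%} of labels flip")
```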
Second, Jaiswal explored the issue of label subjectivity. She focused especially on contextual bias, in which annotators perceive the same sentence or phrase as expressing different emotions depending on whether they see its surrounding context. Many commonly used emotion datasets are annotated in sequence, so the annotator knows what was said previously, yet the models trained on these datasets are often evaluated on isolated utterances. She showed how this design choice during dataset creation propagates into errors in model evaluation.
Finally, Jaiswal worked on privacy concerns surrounding model understanding. She showed that commonly used emotion recognition models can identify the individuals whose speech they were trained on, and devised an adversarially trained alternative that does not retain this information. She also proposed a new metric for privacy preservation in emotion recognition that compares the explanations produced by a general emotion recognition model with those produced by a privacy-preserving model, quantifying how humans judge the quality of the privacy preservation. The result is a cost-effective metric that aligns with human expectations of privacy, can be quantified and optimized, and still maintains emotion recognition performance.
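As a rough illustration of the adversarial approach, the sketch below trains an encoder so that an emotion classifier succeeds while a speaker classifier, connected through a gradient-reversal layer, fails, pushing the encoder toward representations that no longer reveal who is speaking. The layer sizes, loss weighting, and training loop are illustrative assumptions, not the architecture from Jaiswal’s dissertation.

```python
# Illustrative adversarial training for privacy-preserving emotion recognition.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the gradient so the encoder learns to *remove* speaker cues.
        return -ctx.lam * grad_output, None

encoder = nn.Sequential(nn.Linear(40, 64), nn.ReLU())
emotion_head = nn.Linear(64, 4)     # e.g. 4 emotion classes
speaker_head = nn.Linear(64, 10)    # e.g. 10 speakers in the training set

params = (list(encoder.parameters()) + list(emotion_head.parameters())
          + list(speaker_head.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy batch standing in for acoustic features and labels.
feats = torch.randn(32, 40)
emotion_labels = torch.randint(0, 4, (32,))
speaker_labels = torch.randint(0, 10, (32,))

for step in range(100):
    z = encoder(feats)
    emotion_loss = loss_fn(emotion_head(z), emotion_labels)
    # The speaker head receives reversed gradients, so minimizing its loss
    # drives the encoder to discard speaker identity information.
    speaker_loss = loss_fn(speaker_head(GradReverse.apply(z, 1.0)), speaker_labels)
    opt.zero_grad()
    (emotion_loss + speaker_loss).backward()
    opt.step()
```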
Through these three improvements, Jaiswal hopes to create better emotion recognition models and to understand and isolate the factors that influence their development process. She is advised by Prof. Emily Mower Provost.
The Barbour Scholarship was established in 1917 to support women of the highest academic and professional caliber from Asia and the Middle East.