Kavisha Jayathunge is working towards his PhD (supervised by Dr. Xiaosong Yang) in Computer Vision, specifically an unsupervised approach to categorising human emotions from audiovisual input.
He also participates in teaching at Bournemouth University, where he lectures on the MSc Animation Software Engineering unit.
He holds an MEng in Electronics and Software Engineering from the University of Glasgow.
A significant portion of human communication is carried by non-linguistic parts of speech, such as facial expressions and vocal intonation. Collectively, these properties are called paralinguistic information. Because of the way the human brain is structured, information from different modalities (e.g., sight and hearing) is combined in ways that exploit the dependencies between them. This matters in the context of automatic emotion classification software, because paralinguistics convey extra information about the speaker that can be put to use -- for example, a chat-bot could infer the speaker's emotional state and adapt its replies accordingly, improving the user experience by making the conversation feel more 'natural'.
Deep-learning-based tools of this type are often used to provide services to the public, so it is important to ensure that human biases arising from labelling and dataset composition do not discriminate against the users of such systems. The incorporation of human social biases when labelling data manually is a well-known problem in this field. Manual labelling is also time-consuming, which in turn results in small datasets.
Recently, emotion detection models have begun using transformer-based architectures to combine the audio and visual modalities of speech, with promising results. However, most of these models are trained in a supervised learning scheme with labelled data, which suffers from the problems mentioned earlier. We hypothesise that training these models in an unsupervised setting would, by facilitating the inclusion of large datasets of varying social composition, allow the creation of less biased representations of paralinguistic features.
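The fusion step at the heart of such transformer models can be sketched as cross-modal scaled dot-product attention, where features from one modality (here, video frames) act as queries over features from the other (audio frames). This is a minimal illustrative sketch only -- the feature dimensions, sequence lengths, and random features are hypothetical, not drawn from any particular model:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(video_feats, audio_feats):
    """Scaled dot-product attention: video queries attend to audio keys/values.

    video_feats: (T_v, d) array, audio_feats: (T_a, d) array.
    Returns a (T_v, d) audio context aligned to the video timeline.
    """
    d = video_feats.shape[-1]
    scores = video_feats @ audio_feats.T / np.sqrt(d)  # (T_v, T_a) similarity
    weights = softmax(scores, axis=-1)                 # attention over audio frames
    return weights @ audio_feats

# Toy features: 4 video frames and 6 audio frames, 8-dim embeddings (hypothetical).
rng = np.random.default_rng(0)
video = rng.standard_normal((4, 8))
audio = rng.standard_normal((6, 8))

# Concatenate each video frame with its attended audio context.
fused = np.concatenate([video, cross_modal_attention(video, audio)], axis=-1)
print(fused.shape)  # (4, 16)
```

In a full transformer this projection would use learned query/key/value weights and multiple heads; the fused sequence would then feed further self-attention layers before the (supervised or unsupervised) training objective.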
Postgraduate Teaching Profile
- Animation Software Engineering 22/23
- MEng in Electronics and Software Engineering (University of Glasgow, 2019)