My name is Tom Smith, and I am a third-year PhD student in the Computer Vision Lab at the University of Nottingham. My current research focuses on improving learnt computer vision techniques when only a small amount of data is available, with the overall goal of creating a system to help paediatricians analyse the performance of resuscitation techniques on postnatal babies. Helping me are my four supervisors: Dr Michel Valstar, Dr Don Sharkey, Dr Jon Crowe, and Dr Mercedes Torres Torres.
Learning Superpixels with Multichannel Connected Graphs
Using Convolutional Neural Networks (CNNs) to learn features has surpassed the capabilities of hand-crafted algorithms for computer vision problems, as long as there is a sufficiently large dataset. Theoretically this should also hold for the unsolved problem of semantic segmentation of arbitrary objects using CNNs. However, we are interested in scenarios where there is very little data, which makes it challenging to benefit from the power of deep learning. To overcome this challenge we would like to use superpixels, yet to enable an end-to-end deep learned system those superpixels must be learned rather than hand-crafted. In this paper we propose a novel technique for deep learning superpixels: we train a network using hand-crafted algorithms to provide the ground truth, so training data can be generated in effectively unlimited amounts. We show that it is not possible to learn superpixels directly from RGB pixels and introduce a novel graph-based representation of superpixels that allows a CNN to learn the superpixels as a multichannel output, which we call a multichannel connected graph (MCG). Our results show that this does enable a CNN to learn the target MCG representation of superpixels. Surprisingly, using these two in conjunction has led the network to learn to abstract from superpixels to segmentation.
Semantic Segmentation using Multichannel Connectivity Graphs and Deep Learned Superpixels
Using Convolutional Neural Networks to semantically segment meaningful parts of an image or video is still an unsolved problem. This becomes even more apparent when a relatively small dataset is used. While RGB input is sufficient for a large labelled dataset, achieving high accuracy on a small dataset can be difficult, because only a limited amount of knowledge can be learnt from a small dataset without overfitting. We show that adding superpixels to represent an image in our network improves the segmentation, provided those superpixels are appropriately represented. Here we use the recently proposed multichannel connected graphs (MCGs) to indicate which pixels belong to each superpixel by noting in separate channels whether the pixels to the left, right, above, or below belong to the same superpixel. We show how using deep learned superpixels can result in an additional improvement, allowing superpixel learning on very large unlabelled datasets and end-to-end fine-tuning on small domain-specific datasets. We show that the addition of MCGs and the use of deep learned superpixels improves segmentation accuracy on our domain-specific datasets of around 300 instances. Additionally, our system can be easily extended to work with a large number of individually labelled segments by simply changing the number of channels for the output mask.
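As a rough illustration of the MCG idea described above, the following minimal Python sketch converts a grid of superpixel IDs into four binary channels, one per neighbour direction. It is not the implementation used in the papers: the function name `superpixel_labels_to_mcg`, the channel names, and the choice to mark pixels on the image border as unconnected are all assumptions made for this example.

```python
def superpixel_labels_to_mcg(labels):
    """Turn a 2D grid of superpixel IDs into four binary connectivity channels.

    Hypothetical helper for illustration only. Each channel holds 1 where the
    neighbour in that direction has the same superpixel ID, and 0 otherwise.
    Assumption: neighbours outside the image count as not connected.
    """
    height = len(labels)
    width = len(labels[0])
    # (row offset, column offset) of the neighbour each channel describes
    offsets = {"left": (0, -1), "right": (0, 1), "above": (-1, 0), "below": (1, 0)}
    channels = {}
    for name, (dy, dx) in offsets.items():
        channel = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and labels[ny][nx] == labels[y][x]:
                    channel[y][x] = 1
        channels[name] = channel
    return channels


# A tiny 2x3 image split into two superpixels (IDs 0 and 1)
labels = [
    [0, 0, 1],
    [0, 1, 1],
]
mcg = superpixel_labels_to_mcg(labels)
for name, channel in mcg.items():
    print(name, channel)
```

In this form the superpixel boundaries are encoded locally, per pixel and per direction, rather than through arbitrary region IDs, which is what lets the representation be treated as a fixed-size multichannel image.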
Computer Vision Representative for PGR-LCF (2017-18)
The Learning Community Forum (LCF) aims to ensure that the views of postgraduate research (PGR) students are given proper weight and that any concerns they have about supervision, progress, specific training, development opportunities, careers, etc. are addressed. Minutes of the Forum are taken into consideration in the review to promote a vibrant and thriving learning community for the students.
- Lead Demonstrator, G53MLE Machine Learning (2018-2019)
- Lab Assistant, G53MLE Machine Learning (2017-2018)
- Lab Assistant, G53GRA Computer Graphics (2016-2017)
I am one of the web chairs for http://acii-conf.org/2019/
Annotated two emotion datasets for DynEmo
Annotated the LITTER dataset for the MCG work
Feel free to contact me at Thomas.Smith3@nottingham.ac.uk regarding anything related to my research.
If you need to find me at the university, I am located at the following address:
B82 Computer Science Building
The University of Nottingham