Award Date
12-15-2019
Degree Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Electrical and Computer Engineering
First Committee Member
Brendan Morris
Second Committee Member
Emma Regentova
Third Committee Member
Venkatesan Muthukumar
Fourth Committee Member
Mohamed Trabia
Number of Pages
83
Abstract
In this dissertation, we tackle the task of quantifying the quality of actions, i.e., how well an action was performed, using computer vision. Existing methods have used human body pose-based features to express the quality contained in an action sample. Human body pose estimation in sports actions such as diving and gymnastic vault is particularly challenging, since the athletes undergo convoluted transformations while performing their routines. Moreover, pose-based features do not take into account visual cues, such as the water splash in diving, that human judges do consider. In our first work, we show that a visual representation, namely spatiotemporal features computed using a 3D convolutional neural network, is more suitable, as it attends to the appearance and salient motion patterns of the athlete's performance. Along with developing three action quality assessment (AQA) frameworks, we also compile a diving and gymnastic vault dataset. In our second work, rather than learning an action-specific model, we show that learning to assess the quality of multiple actions jointly is more efficient, as it can exploit elements of quality shared among different actions. All-action modeling makes better use of the data and shows better generalization and adaptation to unseen action classes. Taking inspiration from the 'learning by teaching' method, we propose a multitask learning (MTL) approach to AQA, unlike existing approaches, which follow the single-task learning (STL) paradigm. In our MTL approach, we force the network to delineate the action sample, that is, to recognize the action in detail and to commentate on the good and bad points of the performance, in addition to the main task of AQA scoring. Through this better characterization of the action sample, we are able to obtain state-of-the-art results on the task of AQA. To enable our MTL approach, we also released the largest multitask AQA dataset, MTL-AQA.
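The multitask idea in the abstract (AQA scoring as the main task, with detailed action recognition and commentary generation as auxiliary tasks) can be illustrated with a minimal sketch of a combined training objective. Everything below is an assumption made for illustration: the abstract does not state the loss functions, the task weights, or the network architecture, and the function names and the weighted-sum form are hypothetical rather than the dissertation's actual formulation.

```python
import math


def score_loss(pred_score, true_score):
    """Squared error for the main AQA scoring task (assumed loss)."""
    return (pred_score - true_score) ** 2


def recognition_loss(class_probs, true_class):
    """Cross-entropy for detailed action recognition (assumed loss)."""
    return -math.log(class_probs[true_class])


def commentary_loss(reference_token_probs):
    """Mean negative log-likelihood of the reference commentary tokens.

    reference_token_probs holds the probability the model assigned to
    each token of the reference commentary (assumed loss).
    """
    nll = [-math.log(p) for p in reference_token_probs]
    return sum(nll) / len(nll)


def multitask_loss(pred_score, true_score, class_probs, true_class,
                   reference_token_probs, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses.

    The weights are hypothetical hyperparameters, not values taken
    from the dissertation.
    """
    w_score, w_cls, w_cap = weights
    return (w_score * score_loss(pred_score, true_score)
            + w_cls * recognition_loss(class_probs, true_class)
            + w_cap * commentary_loss(reference_token_probs))
```

In a sketch like this, setting the two auxiliary weights to zero recovers a single-task (STL) objective, which is one simple way to compare the two paradigms on the same network.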
Keywords
action quality assessment; action recognition; caption generation; computer vision; image processing; machine learning
Disciplines
Electrical and Computer Engineering
File Format
File Size
2.7 MB
Degree Grantor
University of Nevada, Las Vegas
Language
English
Repository Citation
Parmar, Paritosh, "On Action Quality Assessment" (2019). UNLV Theses, Dissertations, Professional Papers, and Capstones. 3833.
http://dx.doi.org/10.34917/18608746
Rights
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/