Kelly Jakubowski, Tuomas Eerola, Paolo Alborno, Gualtiero Volpe, Antonio Camurri, Martin Clayton


Extracting Coarse Body Movements from Video in Music Performance: A Comparison of Automated Computer Vision Techniques with Motion Capture Data


Frontiers in Digital Humanities, 6 April 2017

Institution of corresponding author


Corresponding author

Martin Clayton


Music, computer vision, video analysis, musical ensemble coordination

Catalogue entry

The researchers are interested in interpersonal interaction and coordination during music performance across a range of genres. They study this through a variety of methods, including video recording, computer vision techniques and video analysis.

In this article they apply three different computer vision techniques to performances in three genres: classical, jazz and pop.

The study shows that three relatively inexpensive computer vision techniques (frame differencing, optical flow and kernelized correlation filters) can replicate results obtained with motion capture, enabling the study of coordination and interaction outside constrained laboratory settings.
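As a rough illustration of the simplest of these techniques, frame differencing reduces a video to a coarse movement signal by counting how many pixels change between consecutive frames. The sketch below uses NumPy on synthetic grayscale frames; the function name, threshold and test data are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

def frame_difference_motion(frames, threshold=10):
    """Coarse movement signal via frame differencing: for each pair of
    consecutive grayscale frames, count pixels whose absolute intensity
    change exceeds a threshold. Simplified sketch, not the article's
    exact implementation."""
    motion = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        diff = np.abs(curr.astype(int) - prev.astype(int))
        motion.append(int(np.count_nonzero(diff > threshold)))
    return motion

# Synthetic example: a bright square shifts one pixel per frame,
# standing in for a performer's movement in a video.
frames = []
for shift in range(3):
    f = np.zeros((32, 32), dtype=np.uint8)
    f[10:20, 10 + shift:20 + shift] = 255
    frames.append(f)

signal = frame_difference_motion(frames)
```

Each step the 10-pixel-tall square vacates one column and occupies another, so the signal registers a constant amount of movement between frames; larger values would indicate more vigorous motion.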

You will find this article useful in analysing performance and interaction between performers, and more widely in studies of the relationship between movement, gesture and leader/follower dynamics.