Biometrics
Over lunch with a fellow student one day, it transpired that his brother was running a sports-analysis company, Quintic.
Quintic offered a range of software and services that allowed coaches and sportspeople to analyse and improve performance.
Quantifying powerlifting
As we were chatting, my friend mentioned that one of their current services allowed powerlifters to measure their power output using only a camera and a laptop.
Power = work done / time taken
Work done = force x distance = weight on bar x height lifted
Power = (weight x delta-height) / time per frame = (weight x delta-height) x frame rate
Measure the height of the bar in every frame of the video and you can calculate the power output.
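The calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `power_per_frame`, the sample heights, the 100 kg load and the 25 fps frame rate are all made up for the example, and the bar heights are assumed to be already converted from pixels to metres. Like the formulas above, it only counts the work done against gravity, with the time taken per frame being 1 / frame rate.

```python
G = 9.81  # gravitational acceleration in m/s^2


def power_per_frame(heights_m, mass_kg, fps):
    """Average power (watts) between each pair of consecutive frames."""
    force_n = mass_kg * G      # weight on the bar as a force, in newtons
    frame_time_s = 1.0 / fps   # time elapsed between two consecutive frames
    return [
        force_n * (h1 - h0) / frame_time_s   # work done / time taken
        for h0, h1 in zip(heights_m, heights_m[1:])
    ]


# Hypothetical bar heights (metres) in four consecutive frames.
heights = [0.50, 0.52, 0.55, 0.59]
powers = power_per_frame(heights, mass_kg=100.0, fps=25.0)
```

With N measured heights this gives N - 1 power values, one for each gap between frames. Noise in the height measurements is multiplied by the frame rate, which is why tracking accuracy matters so much here.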
The challenge - accuracy
Quintic had already developed an algorithm for providing this service but were having real problems with the accuracy of their tracking. The algorithm worked but was producing unacceptably noisy data.
The solution - a new algorithm
After a couple of weeks' work, I was able to develop an algorithm that delivered the accuracy they were looking for.
The video linked below shows a man lifting a weight bar. Superimposed onto the original video is a blue line, which marks the position of the bar as calculated by the tracking algorithm.
How it worked
Most problems in vision are about separating signal from noise. How do you separate the information you want (the height of the bar) from the information you don't (everything else in the image)?
Sadly I no longer have images that illustrate the process. The algorithm itself is reasonably straightforward, although perhaps a little difficult to visualise.
1. Noise reduction
In this problem, the noise is everything in the image that isn't the weightlifting bar. The only thing we want to know is the height of the bar in each frame.
First off we convert the video from 3-channel colour to 1-channel greyscale. The entire sequence can now be visualised as a huge 3-D matrix of pixel values:
Matrix size = image height (px) x width (px) x no. frames
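As a rough sketch of this step, the snippet below converts tiny made-up RGB frames to greyscale and reports the size of the resulting stack. The luma weights are the standard ITU-R BT.601 coefficients, which is an assumption, since the weighting actually used is not recorded here. Plain nested lists are used so the sketch runs without dependencies; an array library would be the more natural choice for real video. The helper names `to_grey` and `video_shape` are hypothetical.

```python
def to_grey(frame_rgb):
    """Convert one frame of (r, g, b) tuples to a 2-D grid of grey values."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in frame_rgb
    ]


def video_shape(video):
    """Return (image height, image width, number of frames)."""
    return (len(video[0]), len(video[0][0]), len(video))


# Two made-up frames, each 2 rows high and 3 pixels wide.
white, black = (255, 255, 255), (0, 0, 0)
frames_rgb = [
    [[white, white, white], [black, black, black]],
    [[black, black, black], [white, white, white]],
]
video = [to_grey(frame) for frame in frames_rgb]  # indexed video[frame][y][x]
shape = video_shape(video)
```

Note that the list is indexed frame first, then row, then column, so the depth axis of the matrix described above is the outermost index here.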
Remove the background
Looking head-on at the first image in this matrix, each pixel is the top of a shaft of pixels in which each layer of depth corresponds to another frame of the video.
In a video with a static background, we can calculate the average or median image over the duration of the video. This median image represents our background.
Having done this, we then subtract the background from each and every frame to give us the foreground. By reducing the video only to foreground information, we instantly get rid of a large chunk of noise.
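A minimal sketch of median background removal follows, using only the standard library. The toy video, the helper names and the use of an absolute difference (so that darker-than-background pixels also count as foreground) are assumptions made for the example, not details of the original implementation.

```python
from statistics import median


def median_background(video):
    """Per-pixel median over all frames; video is indexed video[frame][y][x]."""
    n_rows, n_cols = len(video[0]), len(video[0][0])
    return [
        [median(frame[y][x] for frame in video) for x in range(n_cols)]
        for y in range(n_rows)
    ]


def subtract_background(video, background):
    """Absolute difference between every frame and the background image."""
    return [
        [
            [abs(value - background[y][x]) for x, value in enumerate(row)]
            for y, row in enumerate(frame)
        ]
        for frame in video
    ]


def make_frame(bar_row):
    """Toy 3x4 greyscale frame: grey wall (10) with a bright bar (200)."""
    return [[200 if y == bar_row else 10 for _ in range(4)] for y in range(3)]


video = [make_frame(0), make_frame(1), make_frame(2)]  # bar moves downwards
background = median_background(video)
foreground = subtract_background(video, background)
```

In this toy example the bar covers each pixel in only one frame out of three, so the median recovers the plain wall. A pixel that the bar covers for more than half of the frames would instead be absorbed into the background.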
Edge detection
Now we have the foreground including part of the weightlifter and bits of noise in the background. All we really want though is the bar. Since the bar is essentially just one long horizontal line we can now remove even more noise by reducing the image to edges.
Not only do we reduce the image to edges but, since we have no interest in anything that isn't the bar, we can ignore vertical edges and look only at the horizontal edges in the image.
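To make the idea of keeping only horizontal edges concrete, here is a small sketch. The edge operator actually used is not recorded here, so this example assumes the simplest option: a central difference in the vertical direction, which responds where brightness changes from one row to the next (a horizontal edge) and ignores changes along a row (a vertical edge). The function name `horizontal_edges` and the test images are made up.

```python
def horizontal_edges(image):
    """Strength of horizontal edges, from a vertical central difference.

    image is a 2-D grid indexed image[y][x]; the top and bottom rows are
    left at zero because they have no neighbour on one side.
    """
    n_rows, n_cols = len(image), len(image[0])
    edges = [[0.0] * n_cols for _ in range(n_rows)]
    for y in range(1, n_rows - 1):
        for x in range(n_cols):
            edges[y][x] = abs(image[y + 1][x] - image[y - 1][x])
    return edges


# A horizontal bar (row 2 is bright) and a vertical stripe (column 1 is bright).
bar = [[100 if y == 2 else 0 for _ in range(3)] for y in range(5)]
stripe = [[100 if x == 1 else 0 for x in range(3)] for _ in range(5)]

bar_edges = horizontal_edges(bar)
stripe_edges = horizontal_edges(stripe)
```

The bar produces a strong response on the rows just above and below it, while the vertical stripe produces none at all.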
2. Reduce 3D video to 2D edge-intensity
We now have a sequence of video that shows little more than a horizontal line moving up and down in the image.
However, we still need to calculate exactly what height that line is at. To do this, we sum each row of the image across its width, projecting the image onto the y-axis to give a measurement of total horizontal edge plotted against height.
Doing this for each frame and plotting these measurements together then gives us a plot showing the amount of horizontal edge detected in each row of every frame. The image below shows this for a sample section of video.
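The projection and stacking can be sketched as below. The helper names and the tiny edge video are invented for the example; the only assumption about the data is that each frame is a 2-D grid of edge strengths indexed `frame[y][x]`.

```python
def row_profile(edge_image):
    """Project one edge image onto the y-axis: total edge strength per row."""
    return [sum(row) for row in edge_image]


def edge_intensity_map(edge_video):
    """Stack the per-frame profiles side by side.

    The result is indexed result[y][frame], so each column is one frame and
    each row is one image height.
    """
    profiles = [row_profile(frame) for frame in edge_video]
    return [list(column) for column in zip(*profiles)]


# Two made-up edge frames, 3 rows high and 2 pixels wide.
edge_video = [
    [[0, 0], [5, 5], [0, 0]],  # frame 0: edge on row 1
    [[4, 4], [0, 0], [0, 0]],  # frame 1: edge on row 0
]
intensity_map = edge_intensity_map(edge_video)
```

Summing along the bar's length is what makes this step robust: a long horizontal edge adds up to a large total, while scattered noise pixels contribute very little to any one row.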
3. Threshold the image to calculate the height
With almost all the noise removed from the image, we can be confident that the bright line in the image above is an edge-peak corresponding to the powerlifting bar.
Calculating the exact height of the bar in each frame is now a simple matter of thresholding the image and measuring the height of the line for each frame in the sequence.
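One possible way to do the thresholding is sketched below. The details are assumptions rather than a record of the original method: the threshold is taken relative to the strongest row in each frame (`rel_threshold`, a hypothetical parameter), and the bar position is taken as the mean of the rows that pass it. The result is in pixel rows counted from the top of the image, so converting it to a physical height would need a separate calibration.

```python
def bar_rows(intensity_map, rel_threshold=0.5):
    """Estimate the bar's row in each frame of an edge-intensity map.

    intensity_map is indexed intensity_map[y][frame]. Returns one value per
    frame: the mean index of the rows at or above the threshold, or None if
    the frame contains no edge response at all.
    """
    n_rows, n_frames = len(intensity_map), len(intensity_map[0])
    estimates = []
    for t in range(n_frames):
        column = [intensity_map[y][t] for y in range(n_rows)]
        peak = max(column)
        if peak <= 0:
            estimates.append(None)  # nothing detected in this frame
            continue
        rows = [y for y, v in enumerate(column) if v >= rel_threshold * peak]
        estimates.append(sum(rows) / len(rows))
    return estimates


# Made-up map: 4 image rows, 3 frames.
toy_map = [
    [0, 8, 0],
    [10, 8, 0],
    [10, 0, 0],
    [0, 0, 0],
]
rows_per_frame = bar_rows(toy_map)
```

Averaging the rows that pass the threshold gives a sub-pixel estimate when the edge response straddles two rows, which helps keep the resulting height curve smooth.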
Other examples
The video below shows a second example of the algorithm at work as well as a slightly adapted version of it being used to track the shaft of a golf club.