With the announcement by Raspberry Pi of the ultra-low-cost Pi Compute Module (see www.raspberrypi.org/raspberry-pi-compute-module-new-product/), it is now possible to create volume products using the Raspberry Pi, building on its easy programming environment, simplicity of attaching hardware and wealth of developer resources.
The Pi has powerful multimedia capabilities, but whereas general programming on the Pi is straightforward, getting the best out of the multimedia system and programming the VideoCore blocks that make it up is more complex.
Argon Design is expert in programming these systems and can turbocharge multimedia algorithms running on the Pi. To mark the availability of the Compute Module and illustrate our skills, we have produced a demonstration.
The demonstration is of stereo depth perception: obtaining depth information from the different views seen by two cameras spaced a short distance apart. This also makes use of another exciting feature of the Pi Compute Module, namely its support for two cameras (the standard Pi supports only one).
There are several good algorithms documented for depth perception in the literature. Many of these share similarities with video compression algorithms, a field with which Argon Design has a great deal of experience. Both are based on dividing images into blocks and, for each block in one image, searching for "matching" blocks in one or more other images.
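As a rough illustration of the block-matching idea described above (this is not the code used in the demo), the sketch below computes a disparity for each pixel by comparing a small block in the left image with horizontally shifted blocks in the right image. The cost function (sum of absolute differences), the function names and the parameters `max_disp` and `half` are all assumptions made for the example. Images are plain lists of rows of grey-level integers.

```python
def sad(left, right, x, y, d, half):
    """Sum of absolute differences between the block centred at (x, y) in the
    left image and the block shifted d pixels to the left in the right image."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total


def block_match(left, right, max_disp, half=1):
    """Return a disparity map as a list of rows; border pixels are left at 0."""
    height, width = len(left), len(left[0])
    disp = [[0] * width for _ in range(height)]
    for y in range(half, height - half):
        for x in range(half, width - half):
            best_cost, best_d = None, 0
            # Only try shifts that keep the right-image block inside the image.
            for d in range(0, min(max_disp, x - half) + 1):
                cost = sad(left, right, x, y, d, half)
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp
```

A larger disparity means the point is closer to the cameras. Video compression performs a similar search, but in two dimensions and with less need for the match to be physically correct.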
However, we need much more reliable results for depth perception than for video compression. The literature also documents many ways to improve the output of the basic algorithm. As this was a proof-of-concept, our criteria were to choose improvements which would not require too much time to implement, which were not too computationally expensive and which gave decent quality improvements. The basic differences between block-based video compression and our final algorithm are:
Specifically, we decided on a combination of the "C5" correlation function from [2] with the multi-window scheme of [1]. Our program can use either the 5x5 windowing scheme or the 7x7 scheme; overall, the 7x7 version takes longer to calculate but is slightly more accurate.
There are more sophisticated "global" schemes for stereo depth, such as Semi-Global Matching, Belief Propagation and Graph-Cut. However, these schemes are too complex to run in real time without specialised hardware.
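To give a flavour of what a multi-window scheme can look like, here is a minimal sketch of one common variant: the cost of a central window is combined with the two best of four surrounding windows, so that windows straddling a depth edge can be ignored. Whether this matches the scheme of [1] is an assumption, and the plain absolute-difference cost used here is a stand-in, not the "C5" function of [2]. The window layout, names and parameters are all hypothetical.

```python
def window_sad(left, right, cx, cy, d, half):
    """Absolute-difference cost of one square window centred at (cx, cy)."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[cy + dy][cx + dx] - right[cy + dy][cx + dx - d])
    return total


def multi_window_cost(left, right, x, y, d, half=1):
    """Centre window cost plus the two lowest costs of four diagonal windows.

    The caller must keep every window inside both images, i.e.
    x - 2 * half - d >= 0 and (x, y) at least 2 * half from each border.
    """
    centre = window_sad(left, right, x, y, d, half)
    step = half  # assumed offset of the surrounding windows
    offsets = [(-step, -step), (step, -step), (-step, step), (step, step)]
    around = sorted(
        window_sad(left, right, x + ox, y + oy, d, half) for ox, oy in offsets
    )
    # Dropping the two worst windows reduces errors near object boundaries.
    return centre + around[0] + around[1]
```

The disparity search itself is unchanged: for each pixel, the candidate shift with the lowest combined cost is chosen.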
The implementation went through three different versions:
The approximate processing times for a moderate-sized image (768x576 pixels) in each version were as follows:
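Note that the VideoCore version took only around 50% longer than the x86 C version, even though the x86 processor has around 12x the clock speed. In other words, the VideoCore needed roughly 8x fewer clock cycles for the same work (12 / 1.5 = 8). This demonstrates the improvement which can be had by using a specialised digital signal processor.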
We also created a demo application which would read images from the two cameras on the Compute Module, process them and display the calculated depth as a colour map. The camera processing and display added a small amount of overhead, so to compensate we reduced the image size to VGA (640x480). This resulted in a final framerate of 12fps. This is sufficient for a proof-of-concept and shows the sort of image processing tasks that can be implemented with reasonable speed on the Raspberry Pi.
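Displaying depth as a colour map can be sketched as follows. This is only an illustration: the colour scheme used in the demo is not stated, so the blue-to-red ramp, the function name and the `max_disp` parameter are assumptions.

```python
import colorsys


def disparity_to_rgb(d, max_disp):
    """Map a disparity in [0, max_disp] to an (R, G, B) tuple of 0..255 values.

    Far points (small disparity) come out blue, near points (large
    disparity) red, passing through green in between. max_disp must be > 0.
    """
    t = max(0.0, min(1.0, d / max_disp))   # normalise and clamp to [0, 1]
    hue = (1.0 - t) * (2.0 / 3.0)          # hue 2/3 is blue, hue 0 is red
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))
```

Applying this function to every entry of a disparity map gives an image in which nearby objects stand out in warm colours.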
Please don’t ask us if we can give you the code for your own project. The demo is not in a form that is suitable to be released and we can only provide advice on stereo algorithms on a commercial basis.
The demonstration application we produced can display the results in three different ways:
Screenshots of the results are below, along with some photos of the equipment used.
However, these are far from the only possible applications. Others include:
Do you have a project that you would like to discuss with us, or a general enquiry? Please feel free to contact us.