GPUCV: A free GPU-accelerated library for image processing and computer vision
April 2nd, 2007
GPUCV is a free GPU-accelerated library for image processing and computer vision. It offers an Intel OpenCV-like programming interface for easily porting existing applications. A one-page description is available. A longer presentation and discussion was published at IEEE ICME 2006. (J.-P. Farrugia, P. Horain, E. Guehenneux, Y. Allusse, “GPUCV: A framework for image processing acceleration with graphics processors”, CDROM proc. of the IEEE International Conference on Multimedia & Expo, July 9-12, 2006, Toronto, Ontario, Canada.)
Multi-view stereo vision challenge
November 7th, 2006
A multi-view stereo evaluation has been proposed by Steve Seitz et al. The challenge involves recovering 3D reconstructions of complete objects from a large number of views. Among the reported techniques, two out of nine make intensive use of GPUs, both yielding large speedups: the work by Pons, Keriven and Labatut, which took part in the original competition at CVPR06, and the work by Hornung and Kobbelt. Running times, accuracy and completeness of the methods are reported here. (Steve Seitz et al. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), New York, 2006.)
Robust and Efficient Photo Consistency Estimation for Volumetric 3D Reconstruction
October 24th, 2006
The computational power of GPU-based algorithms is receiving increased attention in research on computer vision and 3D stereo reconstruction from images. In this context, one of the most important ingredients of any 3D stereo reconstruction technique is the estimation of photo consistency. This ECCV06 paper presents a new, illumination-invariant photo consistency measure for high-quality, volumetric 3D reconstruction from calibrated images. In contrast to current standard methods such as normalized cross-correlation, it supports unconstrained camera setups and non-planar surface approximations. The paper shows how this measure, as well as the other important stages of the volumetric reconstruction pipeline, can be implemented in a highly efficient way by exploiting current graphics processors. The authors’ GPU implementation achieves speedups of up to a factor of 85 in comparison to CPU-based algorithms, and allows reconstruction of high-quality models with computation times of only a few seconds to minutes, even for large numbers of cameras and high volumetric resolutions. (Robust and Efficient Photo-Consistency Estimation for Volumetric 3D Reconstruction. Alexander Hornung and Leif Kobbelt. European Conference on Computer Vision (ECCV 2006), LNCS, vol. 3952, Springer, 179-190.)
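For readers unfamiliar with the baseline that the paper compares against, the following is a minimal sketch of normalized cross-correlation (NCC) between two image patches. It is not the paper's new measure, and it runs on the CPU; the function name `ncc` and the flat-list patch layout are assumptions made for illustration.

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized patches (flat lists).

    Illustrative baseline only; this is NOT the measure proposed in the paper.
    """
    n = len(patch_a)
    if n == 0 or n != len(patch_b):
        raise ValueError("patches must be non-empty and equally sized")
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    # Subtracting the means removes any constant brightness offset.
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    num = sum(x * y for x, y in zip(da, db))
    # Dividing by the norms removes any positive gain factor.
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A constant patch has no texture to correlate; report 0 by convention.
    return num / den if den > 0 else 0.0

reference = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in reference]  # same patch, different gain and offset
inverted = [50 - v for v in reference]     # contrast-reversed patch

score_same = ncc(reference, brighter)
score_inverted = ncc(reference, inverted)
```

Because the mean and the norm are factored out, a patch seen under a different gain and offset still scores close to 1, while a contrast-reversed patch scores close to -1.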
GPU_KLT: A GPU-based Implementation of the Kanade-Lucas-Tomasi Feature Tracker
August 10th, 2006
GPU_KLT is an implementation (using OpenGL/Cg) of the popular KLT feature tracker which runs primarily on the graphics processing unit (GPU). The GPU-based implementation emulates Stan Birchfield’s KLT implementation of the original algorithm proposed by Kanade, Lucas and Tomasi (1991). GPU_KLT tracks approximately 1000 feature points within 1024×768 resolution video at 30 Hz on an ATI 1900 XT and at 25 Hz on an NVIDIA GeForce 7900 GTX. It can be used for real-time computer vision systems involving object detection, structure from motion, robot navigation and video surveillance. Source code is available for research use on the GPU_KLT webpage. (Sudipta N. Sinha, Jan-Michael Frahm, Marc Pollefeys and Yakup Genc, “Feature Tracking and Matching in Video Using Programmable Graphics Hardware”, submitted to Machine Vision and Applications, July 2006.)
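The core of any KLT-style tracker is a small least-squares solve for the translation of an image patch. The sketch below shows one such Lucas-Kanade step in plain Python on the CPU. It is not the GPU_KLT code and omits feature selection, image pyramids and iteration; `lucas_kanade_step` and `make_image` are hypothetical helper names, and the synthetic quadratic image is an assumption chosen so that a single step recovers the shift.

```python
def lucas_kanade_step(img0, img1):
    """One Lucas-Kanade step for a pure translation (u, v).

    Assumes img1(x, y) is approximately img0(x - u, y - v).
    Images are lists of rows; only interior pixels are used.
    """
    h, w = len(img0), len(img0[0])
    gxx = gxy = gyy = bx = by = 0.0
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            ix = (img0[r][c + 1] - img0[r][c - 1]) / 2.0  # spatial gradient in x
            iy = (img0[r + 1][c] - img0[r - 1][c]) / 2.0  # spatial gradient in y
            it = img1[r][c] - img0[r][c]                  # temporal difference
            gxx += ix * ix
            gxy += ix * iy
            gyy += iy * iy
            bx -= ix * it
            by -= iy * it
    det = gxx * gyy - gxy * gxy
    if abs(det) < 1e-12:
        raise ValueError("patch has too little texture to track")
    # Solve the 2x2 system [[gxx, gxy], [gxy, gyy]] [u, v]^T = [bx, by]^T.
    u = (gyy * bx - gxy * by) / det
    v = (gxx * by - gxy * bx) / det
    return u, v

def make_image(size, dx=0.0, dy=0.0):
    """Synthetic quadratic bowl centred on the patch, shifted by (dx, dy)."""
    half = size // 2
    return [[(c - half - dx) ** 2 + (r - half - dy) ** 2 for c in range(size)]
            for r in range(size)]

frame0 = make_image(9)
frame1 = make_image(9, dx=0.3, dy=-0.2)
u, v = lucas_kanade_step(frame0, frame1)
```

Each feature point is solved independently of the others, which is one reason this kind of tracker maps well onto a GPU.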
Real-Time, GPU-Based Foreground-Background Segmentation
October 6th, 2005
Robust and accurate foreground-background segmentation is a relatively small but crucial step in several computer vision applications. It is a key element in surveillance, 3D modelling from silhouettes, motion capture, and gesture analysis for human-computer interaction (HCI). For several of these, real-time processing is essential, so the segmentation step must be extremely fast. This work by Andreas Griesser of ETH Zurich proposes a high-speed GPU-based implementation that processes image sequences in less than 4 ms per frame and frees the CPU from this processing step altogether. The resulting segmentation is compact and smooth in foreground areas and temporally consistent between frames. (Project homepage and software download, Andreas Griesser, Computer Vision Lab, ETH Zuerich.)
RoboGamer: Development of robotic TV game player using haptic interface and GPU image recognition
May 26th, 2005
“RoboGamer” is a robotic system that is able to play a video game together with a human player. The project realizes a physically connected, friendly computer player using a simple robotic system composed of a video camera, the wire-based force feedback display SPIDAR, and fast GPU image recognition software, without any modification of the original video game system. RoboGamer has three functions: autonomous play; augmented effects such as force feedback and/or rich graphics added to original old video games; and collaborative play between the A.I. and a human player via force feedback on the joystick. (http://akihiko.shirai.as/projects/RoboGamer/)
Interactive marker-less tracking of human limbs
March 21st, 2005
This paper by Rao et al. at UNC Charlotte describes an algorithm to track human limbs at interactive rates without using markers. 3D point cloud data is derived from a modified visual hull algorithm. This data is fed into a particle filtering algorithm that runs on the GPU. (Interactive marker-less tracking of human limbs. Rao S., Hodges L.F., to be submitted to Transactions on Visualization and Computer Graphics.)
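As a rough illustration of the particle filtering idea mentioned above, here is a minimal bootstrap particle filter for a single scalar state. It is only a sketch: the paper tracks limb poses from 3D point clouds on the GPU, whereas this example tracks one number on the CPU. The function name, the Gaussian motion and observation models, and all parameter values are assumptions.

```python
import math
import random

def particle_filter_step(particles, observation, rng,
                         motion_sigma=0.1, obs_sigma=0.5):
    """One predict / weight / resample cycle of a bootstrap particle filter.

    All models here are illustrative assumptions, not the paper's models.
    """
    # Predict: diffuse every particle with random-walk motion noise.
    predicted = [p + rng.gauss(0.0, motion_sigma) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_sigma) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]
    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample: draw a new particle set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate

rng = random.Random(0)  # fixed seed so the run is repeatable
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
estimate = None
for _ in range(10):
    # The hidden state is assumed to sit at 5.0 and to be observed exactly.
    particles, estimate = particle_filter_step(particles, 5.0, rng)
```

The prediction and weighting steps treat every particle independently, which is what makes the method a natural fit for data-parallel hardware.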
Real-Time Motion Estimation and Visualization on Graphics Cards
November 27th, 2004
This paper by Strzodka and Garbe presents a tool for real-time visualization of motion features in 2D image sequences. The motion is estimated through an eigenvector analysis of the spatio-temporal structure tensor at every pixel location. Post-processing in the form of coloring, blending, thresholding, fading and smoothing helps to select the desired motion features for display. The paper demonstrates several examples of test sequences containing people moving at different velocities. These people are visually marked in the real-time display of the image sequence. The tool is also applied to angiography sequences to emphasize the blood flow and its distribution. The implementation uses DX9 graphics hardware and centers around a vectorized version of the Jacobi method for matrix diagonalization. (Real-Time Motion Estimation and Visualization on Graphics Cards. Robert Strzodka and Christoph Garbe in Proceedings of Visualization 2004, pages 545-552, 2004.)
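To make the eigenvector analysis concrete, the sketch below diagonalizes a 3×3 symmetric structure tensor with the cyclic Jacobi method and reads a velocity off the eigenvector of the smallest eigenvalue. This is a scalar CPU illustration, not the paper's vectorized DX9 implementation. The function name `jacobi_eigen`, the hand-built example tensor, and the velocity read-out (dividing by the temporal component, a common convention for structure-tensor methods) are assumptions.

```python
import math

def jacobi_eigen(matrix, tol=1e-12, max_sweeps=50):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, v), where column k of v is the k-th eigenvector.
    """
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = max(abs(a[i][j]) for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < tol:
                    continue
                # Rotation angle that zeroes the off-diagonal entry a[p][q].
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                sign = 1.0 if theta >= 0.0 else -1.0
                t = sign / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # columns: a <- a J and v <- v J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
                for k in range(n):  # rows: a <- J^T a
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)], v

# Hand-built tensor (an assumption): sum of g g^T for gradients g = (Ix, Iy, It)
# of a pattern moving with velocity (2, -1), so that It = -(2*Ix - Iy).
tensor = [[2.0, 1.0, -3.0],
          [1.0, 2.0, 0.0],
          [-3.0, 0.0, 6.0]]
values, vectors = jacobi_eigen(tensor)
k = min(range(3), key=lambda i: values[i])  # index of the smallest eigenvalue
ex, ey, et = (vectors[r][k] for r in range(3))
velocity = (ex / et, ey / et)
```

The same small fixed-size diagonalization runs at every pixel, so the work is uniform across the image and suits per-pixel GPU processing.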
OpenVIDIA Computer Vision on GPUs Project
May 18th, 2004
The OpenVIDIA project is a GPL’d Free/Open Source project which implements computer vision algorithms on the GPU using OpenGL and Cg. These papers describe the GPU implementation of a projective image registration algorithm in OpenVIDIA. The current release includes a hand-tracking program used for a gesture recognition interface, some simple programming examples, and FireWire video input support. OpenVIDIA explores the use of GPU hardware-accelerated computer vision in the context of creating Computer Mediated Reality. (James Fung, Felix Tang, Steve Mann, “Mediated Reality Using Computer Graphics Hardware for Computer Vision”, Proceedings of the International Symposium on Wearable Computing 2002 (ISWC2002), Seattle, Washington, USA, Oct 7-10, 2002, pp. 83-89.
James Fung, Steve Mann, “Computer Vision Signal Processing on Graphics Processing Units”, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), Montreal, Quebec, Canada, May 17-21, 2004.)
A Graphics Hardware Implementation of the Generalized Hough Transform for fast Object Recognition, Scale, and 3D Pose Detection
March 16th, 2004
This paper presents an implementation of the Generalized Hough Transform (GHT) in DX8 graphics hardware. Given the 3D geometry of an object, the GHT is used to determine its pose, scale and position in an uncalibrated image. Without any a priori knowledge about the image, many different poses and scales must be tested. The implementation achieves a considerable speedup by increasing the operation count in favor of a data stream processing of the otherwise irregular memory access pattern of the GHT. The additional operations are used to regularize the problem, decreasing the number of the required candidate poses. (A Graphics Hardware Implementation of the Generalized Hough Transform for fast Object Recognition, Scale, and 3D Pose Detection. Robert Strzodka, Ivo Ihrke and Marcus Magnor in Proceedings ICIAP 2003, pp. 188-193, 2003.)
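The voting scheme behind the GHT can be sketched in a few lines. The example below handles translation only, with a toy 2D model; the paper additionally searches over pose and scale and restructures the computation for graphics hardware, none of which is shown here. The function names, the edge-point format `(x, y, orientation_bin)` and the toy data are assumptions.

```python
from collections import Counter, defaultdict

def build_r_table(model_edges, reference):
    """Map each orientation bin to the offsets from edge point to reference."""
    table = defaultdict(list)
    for x, y, orientation in model_edges:
        table[orientation].append((reference[0] - x, reference[1] - y))
    return table

def ght_vote(image_edges, r_table):
    """Accumulate votes for the reference point position (translation only)."""
    accumulator = Counter()
    for x, y, orientation in image_edges:
        # Every stored offset for this orientation is a candidate position.
        for dx, dy in r_table.get(orientation, ()):
            accumulator[(x + dx, y + dy)] += 1
    return accumulator

# Toy model (an assumption): six edge points with quantized orientations 0..3.
model_edges = [(0, 0, 0), (1, 0, 0), (2, 0, 1), (2, 1, 1), (2, 2, 2), (0, 2, 3)]
reference = (1, 1)
r_table = build_r_table(model_edges, reference)

# The same shape shifted by (5, 3), plus two clutter edge points.
image_edges = [(x + 5, y + 3, o) for x, y, o in model_edges]
image_edges += [(9, 9, 0), (1, 8, 2)]

accumulator = ght_vote(image_edges, r_table)
best_position, best_votes = accumulator.most_common(1)[0]
# best_position is (6, 4) with 6 votes: the reference point shifted by (5, 3).
```

The scattered accumulator writes in `ght_vote` are an instance of the irregular memory access pattern that the paper reorganizes into stream processing.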