Posted on 2021-06-08, 08:14. Authored by Niclas Zeller.
This thesis presents the development of image processing algorithms based on a Microsoft Kinect camera system. The algorithms are applied to the depth image received from the Kinect and build a three-dimensional, object-based representation of the recorded scene. The motivation is to develop a system that assists visually impaired people in navigating unknown environments. The developed system detects obstacles in the recorded scene and warns the user about them. Since the goal of this thesis was not to develop a complete real-time system but to devise reliable algorithms for this task, the algorithms were developed in MATLAB. Additionally, control software was developed with which depth as well as color images can be received from the Kinect.
The developed algorithms combine established plane fitting algorithms with novel approaches. They perform a plane segmentation of the 3D point cloud and model objects from the resulting segments. Each obstacle is described by a cuboid box and can thus be conveyed easily to a blind person. For plane segmentation, different approaches were compared to find the most suitable one. The first algorithm analyzed in this thesis is a normal vector based plane fitting algorithm. It delivers very accurate results but has a high computational cost. The second approach, which was finally implemented, combines a gradient based 2D image segmentation with a RANSAC plane segmentation (6) in the 3D point cloud. This approach has the advantage of finding very small edges within the scene while still building planes based on global constraints.
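The RANSAC plane segmentation step mentioned above can be illustrated with a minimal sketch. This is not the thesis's MATLAB implementation: the function names `plane_from_points` and `ransac_plane`, the distance threshold (assumed to be in meters), and the iteration count are all hypothetical choices made for illustration, and the gradient based 2D segmentation stage is omitted entirely.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal, offset d) of the plane through three points.

    The plane satisfies n . x + d = 0. Returns None if the points are
    (nearly) collinear and therefore do not define a plane.
    """
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # Cross product u x v gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Find the plane supported by the most points (assumed parameters).

    points     -- list of (x, y, z) tuples
    threshold  -- maximum point-to-plane distance for an inlier
    iterations -- number of random three-point hypotheses to try
    Returns ((normal, d), inlier_indices), or (None, []) if no valid
    hypothesis was found.
    """
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate sample, try another one
        n, d = plane
        inliers = [
            i for i, (x, y, z) in enumerate(points)
            if abs(n[0] * x + n[1] * y + n[2] * z + d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers
```

To segment several planes with such a routine, one common pattern is to remove the inliers of the best plane from the point cloud and run the search again on the remainder. Whether the thesis proceeds this way is not stated here.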
In addition to the development of the algorithm, results of the image processing are presented. These results are promising, so the algorithm merits further development. The developed algorithm detects very small but significant obstacles while keeping the scene representation coarse enough that the result can be conveyed accurately to a blind person.