
Project Computer Vision – 3D Box Detection
Motivation
In this exercise, we acquire a brightness image (424×512), a depth image (424×512) and a 3D point cloud matrix (424×512×3) from a Microsoft Kinect camera. From these data, we are asked to estimate the dimensions of the object that lies in the center of the scene.
Visualization of the raw data



RANSAC floor plane fitting
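
As a rough illustration of this step (not the project's actual implementation), a basic RANSAC plane fit over an (N, 3) array of valid points might look like the sketch below. The function name `ransac_plane`, the 1 cm inlier threshold, the iteration count and the fixed seed are all assumptions chosen for the example, and the data are synthetic.

```python
import numpy as np

def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Fit a plane n.x + d = 0 to an (N, 3) float array with a basic RANSAC loop.

    threshold (metres), iterations and seed are illustrative defaults.
    Returns (unit normal, offset d, boolean inlier mask).
    """
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(iterations):
        # a plane is defined by three randomly chosen points
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:  # (nearly) collinear sample, no unique plane
            continue
        normal = normal / norm
        d = -normal.dot(sample[0])
        # inliers are the points closer to the plane than the threshold
        mask = np.abs(points @ normal + d) < threshold
        if mask.sum() > best_mask.sum():
            best_mask, best_model = mask, (normal, d)
    if best_model is None:
        raise ValueError("no non-degenerate sample found")
    return best_model[0], best_model[1], best_mask

# Synthetic check: a flat "floor" at z = 1 m plus a small patch at z = 0.8 m
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(-1, 1, (500, 2)), np.full(500, 1.0)])
patch = np.column_stack([rng.uniform(-0.2, 0.2, (100, 2)), np.full(100, 0.8)])
normal, d, inliers = ransac_plane(np.vstack([floor, patch]))
print(inliers.sum(), "points assigned to the dominant plane")
```

For the Kinect data, the 424×512×3 point cloud would first be reshaped to (N, 3) and stripped of invalid points before such a function is called.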

Relation between camera and point cloud

Fitting the top plane and refining the candidate region
After excluding the invalid 3D points and the floor plane segmented above, a large, well-connected region remains in the center of the image. However, when we apply the RANSAC algorithm again to fit the top plane of the box, some regions near the image boundary are often falsely classified as part of the top plane. We therefore apply a flood-fill algorithm, seeded at the median valid pixel, to keep only the connected region in the middle.

    import numpy as np
    import skimage.morphology

    def floodfill_topplane(topPlaneImg):
        """
        Apply the flood-fill algorithm to obtain a cleaner top-plane mask.

        Input:
            topPlaneImg (array(bool)): top plane image after RANSAC plane fitting
        Return:
            returnImg (array(bool)): region connected to the seed pixel
        """
        (height, width) = topPlaneImg.shape
        returnImg = np.zeros(topPlaneImg.shape, dtype=np.int64)
        maskIndices = np.arange(height * width).reshape(height, width)
        validSet = maskIndices[topPlaneImg.astype(bool)]
        # retrieve the median valid "pixel" (in raster order) as the seed
        idx = validSet.size // 2
        x = validSet[idx] % width
        y = validSet[idx] // width
        # 8-connected flood fill starting from the seed
        mask = skimage.morphology.flood(np.int64(topPlaneImg), (y, x), connectivity=2)
        returnImg[mask] = 1
        return returnImg.astype(bool)

Sorting the output of the corner detector
The corner detector returns its output sorted by "cornerness", the score it uses to rank potential corner candidates. If we draw the edges of the box by connecting the corners in that order, the result is not guaranteed to be a convex polygon. We therefore compute the azimuth of each of the 4 corners w.r.t. their geometric center and sort the corners by azimuth in ascending order. The angles lie in the range -π to π, and the resulting direction of rotation is counter-clockwise.

    import numpy as np

    def sort_corners(corners):
        """
        Sort the corners by azimuth in counter-clockwise order
        (3rd -> 4th -> 1st -> 2nd quadrant) w.r.t. the geometric center
        (mean value of the 4 vertices) of the box.

        Input:
            corners (array(int)): output (ndarray) of the corner detection
        Return:
            returnMatx (array(int)): corners sorted by azimuth
        """
        returnMatx = np.zeros(corners.shape, dtype=np.int64)
        # azimuth of each corner relative to the centroid, between -pi and pi
        centroid = np.mean(corners, axis=0)
        vector = corners - centroid
        angle = np.arctan2(vector[:, 0], vector[:, 1])
        ind = np.argsort(angle)
        for i in range(ind.size):
            returnMatx[i, :] = corners[ind[i], :]
        return returnMatx
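
To see why the ordering matters, the small check below joins four made-up corner coordinates first in a crossing order and then in azimuth order, and compares the enclosed area with the shoelace formula. The helper names `order_by_azimuth` and `shoelace_area` and the sample coordinates are hypothetical and only serve this illustration.

```python
import numpy as np

def order_by_azimuth(corners):
    """Order 2D points by their angle about the centroid (ascending)."""
    corners = np.asarray(corners, dtype=float)
    offsets = corners - corners.mean(axis=0)
    angles = np.arctan2(offsets[:, 0], offsets[:, 1])
    return corners[np.argsort(angles)]

def shoelace_area(points):
    """Absolute area enclosed when the points are joined in the given order."""
    points = np.asarray(points, dtype=float)
    r, c = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(c, np.roll(r, -1)) - np.dot(r, np.roll(c, -1)))

# Made-up corners, listed in an order whose edges cross (a "bow tie")
unsorted_corners = np.array([[100, 100], [200, 300], [100, 300], [200, 100]])
ordered_corners = order_by_azimuth(unsorted_corners)

print("area, detector order:", shoelace_area(unsorted_corners))
print("area, azimuth order: ", shoelace_area(ordered_corners))
```

In the crossing order the signed contributions cancel, whereas the azimuth order encloses the full 100 × 200 rectangle.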


Box dimension estimation
Dataset | Length (m) | Width (m) | Height (m) |
---|---|---|---|
cloud1 | 0.4402 | 0.3319 | 0.1837 |
cloud2 | 0.4386 | 0.3381 | 0.1904 |
cloud3 | 0.4327 | 0.3416 | 0.1932 |
cloud4 | 0.4824 | 0.3096 | 0.1828 |
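
To summarise how consistent the four estimates are, the per-dimension mean and range can be computed directly from the table. The snippet below is only a convenience sketch. Its variable names are illustrative and not part of the original project code.

```python
import numpy as np

# Length, width, height in metres, copied from the table above
dims = np.array([
    [0.4402, 0.3319, 0.1837],  # cloud1
    [0.4386, 0.3381, 0.1904],  # cloud2
    [0.4327, 0.3416, 0.1932],  # cloud3
    [0.4824, 0.3096, 0.1828],  # cloud4
])

mean = dims.mean(axis=0)                      # average per dimension
spread = dims.max(axis=0) - dims.min(axis=0)  # max minus min per dimension

for name, m, s in zip(("length", "width", "height"), mean, spread):
    print(f"{name}: mean {m:.4f} m, range {s:.4f} m")
```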
Hardware
Component | Specification |
---|---|
CPU | Intel Core i7-8700K @ 4.4 GHz (all cores) |
RAM | 16 GB @ 3000 MHz |
NVMe SSD | 512 GB |
GPU | GTX 1070 Ti 8 GB |
As far as we can tell from the estimated dimensions, the object is a DHL parcel box of size L:
https://www.dhl.de/en/privatkunden/pakete-versenden/pakete-abgeben/verpacken.html
Fitting both the floor plane and the top plane with RANSAC takes about 7 seconds on the hardware listed above.