Today: June 21, 2025
admin Posted on 6:31 pm

Project Computer Vision – 3D Box Detection

Motivation

In this exercise, we acquire a brightness image (424×512), a depth image (424×512) and a 3D point cloud matrix (424×512×3) from a Microsoft Kinect camera. From these datasets, we are asked to estimate the dimensions of the object that lies in the center of the scene.

Visualization of the raw data

[Figure: Brightness (after histogram equalization)]
[Figure: Depth (after histogram equalization)]
[Figure: View point: azimuth = -45 / elevation = -135]

RANSAC floor plane fitting

[Figure: Floor plane segmentation (RANSAC fitting)]
[Figure: Relation between camera and point cloud]
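
The post does not show the plane-fitting code itself, so the following is only a minimal sketch of how a RANSAC plane fit over the point cloud could look. The function name `fit_plane_ransac`, the iteration count and the 1 cm inlier threshold are assumptions chosen for illustration, not the values used for the results reported here.

```python
import numpy as np

def fit_plane_ransac(points, n_iter=200, threshold=0.01, seed=0):
    """Return (normal, d, inlier_mask) for the plane normal . p + d = 0.

    points: (N, 3) array of valid 3D points (invalid measurements removed).
    threshold: maximum point-to-plane distance for an inlier (assumed value).
    """
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    best_plane = (np.array([0.0, 0.0, 1.0]), 0.0)
    for _ in range(n_iter):
        # a plane hypothesis needs three non-collinear points
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:
            continue  # degenerate (collinear) sample, try again
        normal = normal / norm
        d = -normal.dot(p0)
        # unsigned distance of every point to the hypothesised plane
        mask = np.abs(points @ normal + d) < threshold
        if mask.sum() > best_mask.sum():
            best_mask, best_plane = mask, (normal, d)
    return best_plane[0], best_plane[1], best_mask

# synthetic check: a slightly noisy floor at z = 1.5 plus scattered clutter
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(-1, 1, 500),
                         rng.uniform(-1, 1, 500),
                         1.5 + rng.normal(0, 0.002, 500)])
clutter = rng.uniform(-1, 1, (100, 3))
normal, d, inliers = fit_plane_ransac(np.vstack([floor, clutter]))
```

On the real data, the 424×512×3 matrix would first be flattened to an (N, 3) array and the invalid points dropped. The inlier mask then has to be mapped back to image coordinates to obtain a segmentation image like the one above.
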

Fitting the top plane and refining the candidate region

After excluding the invalid 3D points and the floor-plane segment found above, a large, well-connected region remains in the center of the image. However, if we apply the RANSAC algorithm again to fit the top plane of the box, some regions near the image boundary are often falsely labeled as “top plane”. We therefore apply a flood-fill algorithm to keep only the connected region in the middle.

import numpy as np
from skimage.morphology import flood

def floodfill_topplane(topPlaneImg):
    """
    apply floodfill algorithm to define more concise top plane
    Input:
    topPlaneImg(array(bool)): top plane image after RANSAC plane fitting
    Return:
    returnImg(array(bool)): connected top plane region around the seed pixel
    """
    topPlaneImg = np.asarray(topPlaneImg, dtype=bool)
    # row-major (flat) indices of all pixels labeled as top plane
    validSet = np.flatnonzero(topPlaneImg)
    if validSet.size == 0:
        # nothing was labeled as top plane, so there is nothing to fill
        return np.zeros(topPlaneImg.shape, dtype=bool)
    # retrieve the median "pixel" of the flat index list and use it as seed
    y, x = np.unravel_index(validSet[validSet.size // 2], topPlaneImg.shape)
    # connectivity=2 also treats diagonal neighbours as connected
    return flood(topPlaneImg.astype(np.int64), (int(y), int(x)), connectivity=2)
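
To illustrate what the flood fill removes, here is a small dependency-free sketch of an 8-connected flood fill on a toy mask. The helper `flood_region` and the 9×9 mask are made up for this example; the function is a stand-in for the scikit-image call above, not the code used for the results.

```python
from collections import deque
import numpy as np

def flood_region(mask, seed):
    """Return the 8-connected region of True pixels in `mask` containing `seed`."""
    mask = np.asarray(mask, dtype=bool)
    height, width = mask.shape
    region = np.zeros_like(mask)
    if not mask[seed]:
        return region  # seed is not on the plane, nothing to fill
    queue = deque([seed])
    region[seed] = True
    while queue:
        y, x = queue.popleft()
        # visit all 8 neighbours (the centre pixel is already marked)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < height and 0 <= nx < width
                        and mask[ny, nx] and not region[ny, nx]):
                    region[ny, nx] = True
                    queue.append((ny, nx))
    return region

# toy RANSAC result: the box top in the middle, a false positive at the border
top = np.zeros((9, 9), dtype=bool)
top[3:6, 3:6] = True   # central region (9 pixels)
top[0, 0:3] = True     # spurious border region (3 pixels)

# same seed rule as above: the median entry of the flat index list
flat = np.flatnonzero(top)
seed = tuple(int(v) for v in np.unravel_index(flat[flat.size // 2], top.shape))
cleaned = flood_region(top, seed)
```

Note that the median-index seed lands on the box only if the central region contains the middle-ranked pixel in row-major order. That holds in this toy mask and presumably in the recorded scenes, but it is not guaranteed in general.
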

[Figure: Top plane segmentation (RANSAC fitting + flood fill)]

Sorting the result of the corner detector

The output of the corner detector is sorted by “cornerness”, the score the detector uses to rank potential corner candidates. If we draw the edges of the box by connecting the corners in that order, we are not guaranteed to obtain a convex polygon. We therefore compute the azimuth of the 4 corners w.r.t. their geometric center and sort the corners by it in ascending order. The angles lie within -π to π, and the resulting direction of rotation is counterclockwise.

import numpy as np

def sort_corners(corners):
    """
    sort corners' azimuth in counter clockwise order (3rd->4th->1st->2nd quadrant)
    w.r.t geometric center(mean value of 4 vertices) of the box
    Input:
    corners(array(int)): output(ndarray) of corner detection
    Return:
    returnMatx(array(int))
    """
    corners = np.asarray(corners)
    # geometric center = mean value of the vertices
    centroid = np.mean(corners, axis=0)
    vector = corners - centroid
    # azimuth of each corner in (-pi, pi]; column 0 is passed as the "y" argument
    angle = np.arctan2(vector[:, 0], vector[:, 1])
    # ascending azimuth gives a consistent traversal order around the center
    ind = np.argsort(angle)
    return corners[ind].astype(np.int64)
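
As a quick sanity check of the sorting idea, the sketch below takes four hypothetical corner coordinates in a "bow-tie" order and verifies that sorting by azimuth yields a convex quadrilateral. The names `order_by_azimuth` and `is_convex` and the coordinates are invented for this example.

```python
import numpy as np

def order_by_azimuth(corners):
    """Sort 2D points by their angle around the centroid (same idea as sort_corners)."""
    corners = np.asarray(corners, dtype=float)
    vector = corners - corners.mean(axis=0)
    return corners[np.argsort(np.arctan2(vector[:, 0], vector[:, 1]))]

def is_convex(polygon):
    """True if consecutive edges always turn the same way (sufficient for 4 points)."""
    edges = np.roll(polygon, -1, axis=0) - polygon
    nxt = np.roll(edges, -1, axis=0)
    # z component of the cross product of each edge with the following edge
    cross = edges[:, 0] * nxt[:, 1] - edges[:, 1] * nxt[:, 0]
    return bool(np.all(cross > 0) or np.all(cross < 0))

# hypothetical detector output, ordered by cornerness rather than by position
detected = np.array([[10, 10], [60, 80], [10, 80], [60, 10]])
ordered = order_by_azimuth(detected)

print(is_convex(detected))  # False: connecting in this order self-intersects
print(is_convex(ordered))   # True: the azimuth order walks around the box
```
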

[Figure: 3D box detection]
[Figure: 3D point cloud (blue: floor plane / orange: top plane)]

Box dimension estimation

          Length (m)   Width (m)   Height (m)
cloud1    0.4402       0.3319      0.1837
cloud2    0.4386       0.3381      0.1904
cloud3    0.4327       0.3416      0.1932
cloud4    0.4824       0.3096      0.1828
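
The post does not spell out how these values are derived. One plausible way, sketched below under that caveat, is to take the 3D coordinates of the four sorted top-plane corners, average opposite edge lengths for length and width, and measure the height as the distance of the corners from the fitted floor plane. The function `box_dimensions` and the synthetic numbers are assumptions made for this example.

```python
import numpy as np

def box_dimensions(corners_3d, floor_normal, floor_d):
    """Estimate (length, width, height) in the units of the point cloud.

    corners_3d: (4, 3) top-plane corners, ordered around the box (e.g. by azimuth).
    floor_normal, floor_d: floor plane with unit normal, normal . p + d = 0.
    """
    corners_3d = np.asarray(corners_3d, dtype=float)
    floor_normal = np.asarray(floor_normal, dtype=float)
    # edge i connects corner i with corner i+1 (wrapping around)
    edges = np.linalg.norm(np.roll(corners_3d, -1, axis=0) - corners_3d, axis=1)
    # opposite edges of a rectangle should match, so average them
    side_a = 0.5 * (edges[0] + edges[2])
    side_b = 0.5 * (edges[1] + edges[3])
    # height: mean distance of the top corners from the floor plane
    height = float(np.mean(np.abs(corners_3d @ floor_normal + floor_d)))
    return float(max(side_a, side_b)), float(min(side_a, side_b)), height

# synthetic box: 0.44 x 0.33 top face at depth 1.3, floor plane at depth 1.5
corners = [[0.00, 0.00, 1.3],
           [0.44, 0.00, 1.3],
           [0.44, 0.33, 1.3],
           [0.00, 0.33, 1.3]]
length, width, height = box_dimensions(corners, [0.0, 0.0, 1.0], -1.5)
```

With real data, the 3D corners would be looked up in the point cloud matrix at the pixel positions returned by the corner detector, assuming those pixels carry valid measurements.
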

Hardware

CPU         Intel 8700K @ 4.4 GHz all cores
RAM         16 GB @ 3000 MHz
NVMe SSD    512 GB
GPU         GTX 1070 Ti 8 GB

As far as we can tell from the estimated dimensions, the object is a DHL parcel box, size L:
https://www.dhl.de/en/privatkunden/pakete-versenden/pakete-abgeben/verpacken.html

The execution time for RANSAC plane fitting of both the floor plane and the top plane is about 7 seconds.
