{"id":27,"date":"2020-05-02T18:31:25","date_gmt":"2020-05-02T18:31:25","guid":{"rendered":"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/?p=27"},"modified":"2020-05-28T11:41:48","modified_gmt":"2020-05-28T10:41:48","slug":"project-computer-vision-3d-box-detection","status":"publish","type":"post","link":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/project-computer-vision-3d-box-detection\/","title":{"rendered":"Project Computer Vision &#8211; 3D Box Detection"},"content":{"rendered":"<h6>Motivation<\/h6>\n<p>In this exercise, we acquire a brightness image (424&#215;512), a depth image (424&#215;512) and a 3D point cloud matrix (424&#215;512&#215;3) from a Microsoft Kinect camera. From these data, we are asked to estimate the dimensions of the object that lies in the center of the scene.<\/p>\n<h6>Visualization of the raw data<\/h6>\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/Intensity.png\" alt=\"\" class=\"wp-image-59\" width=\"626\" height=\"469\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Intensity.png 640w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Intensity-300x225.png 300w\" sizes=\"auto, (max-width: 626px) 100vw, 626px\" \/><figcaption>Brightness <em>(after Histogram Equalization)<\/em><\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/Distance.png\" alt=\"\" class=\"wp-image-81\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Distance.png 640w, 
https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Distance-300x225.png 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><figcaption>Depth (after Histogram Equalization)<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/Point_Cloud_1.png\" alt=\"\" class=\"wp-image-176\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Point_Cloud_1.png 640w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Point_Cloud_1-300x225.png 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><figcaption>View point: azimuth = -45\/ elevation = -135<\/figcaption><\/figure>\n\n\n<h6>RANSAC floor plane fitting<\/h6>\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/floor_plane_segmentation.png\" alt=\"\" class=\"wp-image-58\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/floor_plane_segmentation.png 640w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/floor_plane_segmentation-300x225.png 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><figcaption>Floor plane segmentation (RANSAC fitting)<\/figcaption><\/figure>\n\n\n<h6>Relation between camera and point cloud<\/h6>\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"527\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/kinect-camera-assumption-1024x527.png\" alt=\"\" class=\"wp-image-198\" 
srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/kinect-camera-assumption-1024x527.png 1024w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/kinect-camera-assumption-300x154.png 300w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/kinect-camera-assumption-768x395.png 768w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/kinect-camera-assumption-1536x790.png 1536w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/kinect-camera-assumption.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n<h6>Fitting the top plane and refining the potential region<\/h6>\n<p>After excluding the invalid 3D points and the floor plane segmented above, a large, well-connected region remains in the center of the image. However, if we apply the RANSAC algorithm again to fit the top plane of the box, some regions near the image boundary are often falsely classified as &#8220;top plane&#8221;. We therefore apply a flood-fill algorithm to keep only the connected region in the middle.<\/p>\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"527\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/Floodfill-seed-point-search-1024x527.png\" alt=\"\" class=\"wp-image-174\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Floodfill-seed-point-search-1024x527.png 1024w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Floodfill-seed-point-search-300x154.png 300w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Floodfill-seed-point-search-768x395.png 768w, 
https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Floodfill-seed-point-search-1536x790.png 1536w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Floodfill-seed-point-search.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height: 250px; position:relative; margin-bottom: 50px;\" class=\"wp-block-simple-code-block-ace\"><pre class=\"wp-block-simple-code-block-ace\" style=\"position:absolute;top:0;right:0;bottom:0;left:0\" data-mode=\"python\" data-theme=\"monokai\" data-fontsize=\"14\" data-lines=\"Infinity\" data-showlines=\"true\" data-copy=\"false\">import numpy as np\nimport skimage.morphology\n\n\ndef floodfill_topplane(topPlaneImg):\n    \"\"\"\n    Apply the flood-fill algorithm to keep only the connected top-plane region.\n    Input:\n    topPlaneImg(array(bool)): top plane image after RANSAC plane fitting\n    Return:\n    returnImg(array(bool)): mask of the region connected to the seed pixel\n    \"\"\"\n    (height,width) = topPlaneImg.shape\n    returnImg = np.zeros(topPlaneImg.shape,dtype=np.int64)\n    # flat (row-major) index of every pixel\n    maskIndices = np.arange(height*width).reshape(height,width)\n    validSet = maskIndices[topPlaneImg.astype(bool)]\n    # use the median valid pixel (in row-major order) as the seed\n    idx = np.floor(validSet.size\/2).astype(int)\n    x = validSet[idx]%width\n    y = np.floor(validSet[idx]\/width).astype(int)\n    # connectivity=2 also joins diagonal neighbours\n    mask = skimage.morphology.flood(np.int64(topPlaneImg),(y,x),connectivity=2)\n    returnImg[mask] = 1\n    return returnImg.astype(bool)<\/pre><\/div>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/top_plane_segmentation.png\" alt=\"\" class=\"wp-image-57\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/top_plane_segmentation.png 640w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/top_plane_segmentation-300x225.png 
300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><figcaption>Top plane segmentation (RANSAC fitting + Floodfill)<\/figcaption><\/figure>\n\n\n<h6>Sorting the result of the corner detector<\/h6>\n<p>The output of the corner detector is sorted by &#8220;<strong>cornerness<\/strong>&#8221;, the score the detector uses to rank corner candidates. If we draw the edges of the box directly in that order, we are not guaranteed to obtain a convex polygon. We therefore compute the azimuth of the 4 corners w.r.t. their geometric center and sort the corners in ascending order of azimuth, so that the angles lie within -\u03c0 to \u03c0 and the corners are traversed counter-clockwise.<\/p>\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"527\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/sort-corners-1-1024x527.png\" alt=\"\" class=\"wp-image-172\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/sort-corners-1-1024x527.png 1024w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/sort-corners-1-300x154.png 300w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/sort-corners-1-768x395.png 768w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/sort-corners-1-1536x790.png 1536w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/sort-corners-1.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div style=\"height: 250px; position:relative; margin-bottom: 50px;\" class=\"wp-block-simple-code-block-ace\"><pre class=\"wp-block-simple-code-block-ace\" style=\"position:absolute;top:0;right:0;bottom:0;left:0\" data-mode=\"python\" data-theme=\"monokai\" 
data-fontsize=\"14\" data-lines=\"Infinity\" data-showlines=\"true\" data-copy=\"false\">import numpy as np\n\n\ndef sort_corners(corners):\n    \"\"\"\n    sort corners by azimuth in counter clockwise order (3rd->4th->1st->2nd quadrant) \n    w.r.t geometric center(mean value of 4 vertices) of the box\n    Input:\n    corners(array(int)): output(ndarray) of corner detection\n    Return:\n    returnMatx(array(int))\n    \"\"\"\n    returnMatx = np.zeros(corners.shape,dtype=np.int64)\n    # geometric center of the corners\n    centroid = np.mean(corners,axis=0)\n    vector = corners - centroid\n    # azimuth of each corner, in [-pi, pi]\n    angle = np.arctan2(vector[:,0],vector[:,1])\n    # reorder the corners by ascending azimuth\n    ind = np.argsort(angle)\n    for i in range(ind.size):\n        returnMatx[i,:] = corners[ind[i],:]\n    return returnMatx<\/pre><\/div>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/Figure_1.png\" alt=\"\" class=\"wp-image-5\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Figure_1.png 640w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/Figure_1-300x225.png 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><figcaption>3D Box Detection<\/figcaption><\/figure><\/div>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"https:\/\/jasoninerlangen.myqnapcloud.com:8081\/WordPress\/wp-content\/uploads\/2020\/05\/3D_point_cloud.png\" alt=\"\" class=\"wp-image-157\" srcset=\"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/3D_point_cloud.png 640w, https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-content\/uploads\/2020\/05\/3D_point_cloud-300x225.png 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><figcaption>3D Point Cloud (blue: floor plane\/orange: top 
plane)<\/figcaption><\/figure>\n\n\n<h2 id=\"tablepress-1-name\" class=\"tablepress-table-name tablepress-table-name-id-1\">Box dimension estimation<\/h2>\n\n<table id=\"tablepress-1\" class=\"tablepress tablepress-id-1\" aria-labelledby=\"tablepress-1-name\">\n<thead>\n<tr class=\"row-1\">\n\t<td class=\"column-1\"><\/td><th class=\"column-2\">Length(m)<\/th><th class=\"column-3\">Width(m)<\/th><th class=\"column-4\">Height(m)<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">cloud1<\/td><td class=\"column-2\">0.4402<\/td><td class=\"column-3\">0.3319<\/td><td class=\"column-4\">0.1837<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">cloud2<\/td><td class=\"column-2\"> 0.4386<\/td><td class=\"column-3\">0.3381<\/td><td class=\"column-4\"> 0.1904<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">cloud3<\/td><td class=\"column-2\"> 0.4327<\/td><td class=\"column-3\">0.3416<\/td><td class=\"column-4\"> 0.1932<\/td>\n<\/tr>\n<tr class=\"row-5\">\n\t<td class=\"column-1\">cloud4<\/td><td class=\"column-2\">0.4824<\/td><td class=\"column-3\">0.3096<\/td><td class=\"column-4\">0.1828<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-1 from cache -->\n\n\n<h2 id=\"tablepress-2-name\" class=\"tablepress-table-name tablepress-table-name-id-2\">Hardware<\/h2>\n\n<table id=\"tablepress-2\" class=\"tablepress tablepress-id-2\" aria-labelledby=\"tablepress-2-name\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\">CPU<\/th><th class=\"column-2\">Intel 8700K @4.4GHz all cores<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">RAM<\/td><td class=\"column-2\">16GB @3000MHz<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">NVMe SSD<\/td><td class=\"column-2\">512GB<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">GPU<\/td><td class=\"column-2\">GTX 1070Ti 
8GB<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-2 from cache -->\n\n\n<p>As far as we can tell, the object is a DHL packet box, size L:<br><a href=\"https:\/\/www.dhl.de\/en\/privatkunden\/pakete-versenden\/pakete-abgeben\/verpacken.html\">https:\/\/www.dhl.de\/en\/privatkunden\/pakete-versenden\/pakete-abgeben\/verpacken.html<\/a><\/p>\n<p>The execution time of the RANSAC plane fitting for both the top plane and the floor plane is about 7 seconds.<\/p>\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Motivation In this exercise, we acquire a brightness image (424&#215;512), a depth image (424&#215;512) and a 3D point cloud matrix [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":5,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[2],"tags":[7],"class_list":["post-27","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-coding","tag-computer-vision"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/posts\/27","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/comments?post=27"}],"version-history":[{"count":53,"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/p
osts\/27\/revisions"}],"predecessor-version":[{"id":200,"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/posts\/27\/revisions\/200"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/media\/5"}],"wp:attachment":[{"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/media?parent=27"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/categories?post=27"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jasoninerlangen.myqnapcloud.com\/WordPress\/wp-json\/wp\/v2\/tags?post=27"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}