





Abstract: In the real world, point cloud data is collected with LiDAR, binocular cameras, and depth cameras, but owing to factors such as device resolution and the surrounding environment during robotic collection, the acquired point clouds are usually incomplete. To solve the problem of missing object shapes, this paper proposes a network architecture for automatic 3D object shape completion that uses local neighborhood information. The architecture consists of a point cloud feature extraction module and a point cloud generation module; it takes the incomplete point cloud shape as input, outputs the point cloud of the missing part, and merges the input and output point clouds to complete the object shape. Chamfer distance and earth mover's distance are used for evaluation. Experimental results on the ShapeNet dataset show that the average chamfer distance and average earth mover's distance, 0.000 84 and 0.028 respectively, are both smaller than those of the multi-layer perceptron feature extraction network and of PCN. Completion of point clouds scanned from real objects also achieves the expected results, indicating that the network generalizes well and can repair objects of different categories.
Keywords: computer vision; deep learning; point cloud data; object completion
CLC number: TP389.1    Document code: A
Article ID: 1001-3695(2022)05-052-1586-04
doi:10.19734/j.issn.1001-3695.2021.09.0383
3D object shape completion by learning point cloud neighborhood information
Zhang Jingjun, Zheng Can, Gao Ruizhen
(School of Mechanical & Equipment Engineering, Hebei University of Engineering, Handan, Hebei 056000, China)
Abstract: In the real world, point cloud data is collected with LiDAR, binocular cameras, and depth cameras, but owing to factors such as device resolution and the surrounding environment during robotic collection, the collected point cloud data is usually incomplete. To solve the problem of missing object shapes, this paper proposed a 3D point cloud object shape auto-completion network that learned local neighborhood information. The network consisted of a point cloud feature extraction network and a point cloud generation network: it took the incomplete point cloud shape as input, produced the missing part of the point cloud as output, and merged the input and output point clouds to complete the shape of the object. Chamfer distance and earth mover's distance were used for evaluation. The experimental results show that on the ShapeNet dataset, the average chamfer distance and the average earth mover's distance, 0.000 84 and 0.028 respectively, are smaller than the values of the multi-layer perceptron feature extraction network and of PCN. Completing point cloud data scanned in the real world also achieves the expected results, which indicates that the network generalizes well and can repair objects of different categories.
Key words: computer vision; deep learning; point cloud data; object completion
0 Introduction
In the real world, humans can infer the structure of a scene and the shapes of the objects in it from limited information, and can guess plausible object shapes even for heavily occluded regions [1]. This ability stems from strong prior knowledge, which humans accumulate by learning from and perceiving their surroundings as they grow. Extending this ability to robots, i.e., inferring complete shapes from incomplete 3D objects, has broad applications in robotics and perception, such as robotic grasping and motion planning [2]. Following the great success of deep neural networks on 2D images, many researchers have in recent years explored their application to 3D data such as point clouds [3] and voxels [4].
3D point clouds are data collected by sensors such as LiDAR, binocular cameras, and depth cameras. However, limited sensor resolution, improper operation, or self-occlusion of the object leads to sparse and missing points, so the object loses geometric and semantic information [5,6]. In most cases, the point clouds acquired by sensors are incomplete. The main task studied here is object shape completion on 3D point clouds: the coordinate array of an incomplete point cloud shape is taken as input, the coordinate array of the missing part is produced as output, and the input and output point clouds are finally merged to obtain the complete shape. This paper proposes an encoder network that learns local neighborhood information, performs quantitative and qualitative analysis on the ShapeNet dataset, and visualizes the completion results.
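To make this input/output convention concrete, the following minimal sketch (in Python with NumPy; the array sizes are illustrative and `partial`/`missing` merely stand in for a real scan and the network output) shows how the observed partial shape and the predicted missing part are merged into the completed shape:

```python
import numpy as np

def merge_completion(partial_pts: np.ndarray, missing_pts: np.ndarray) -> np.ndarray:
    """Concatenate the observed partial shape (N, 3) with the predicted
    missing part (M, 3) to obtain the completed shape (N + M, 3)."""
    assert partial_pts.shape[1] == 3 and missing_pts.shape[1] == 3
    return np.concatenate([partial_pts, missing_pts], axis=0)

# Illustrative sizes only: a 2 048-point partial scan and a 512-point predicted part
partial = np.random.rand(2048, 3).astype(np.float32)   # stand-in for a real partial scan
missing = np.random.rand(512, 3).astype(np.float32)    # stand-in for the network output
completed = merge_completion(partial, missing)          # shape (2560, 3)
```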
1 Related work on point clouds
For object classification on 3D data, PointNet[6] pioneered training neural networks directly on point coordinates (x, y, z). A point cloud is an unordered set, meaning that reordering the points does not change the geometry of the cloud, so PointNet designed a permutation-invariant feature extractor: a multi-layer perceptron first maps each point to a per-point feature, and a max pooling layer then takes the maximum over all points to produce a global feature vector. PointNet++[7] captures geometric relations between points within each point's neighborhood; its core structure is the set abstraction layer, composed of a sampling layer, a grouping layer, and a PointNet layer. To learn point clouds better, PointNet++ stacks several abstraction layers, learning features from local geometric relations and extracting local features layer by layer. PointConv[8] is a deep convolutional network that performs convolution directly on 3D point clouds; to keep permutation and translation invariance, it trains multi-layer perceptrons on local point coordinates to approximate the weight and density functions of the convolution filters.
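As a concrete reference for the PointNet-style pattern described above (a shared per-point MLP followed by symmetric max pooling), the sketch below gives a minimal PyTorch encoder; the layer widths are illustrative and do not correspond to the exact configuration of PointNet or of the network proposed in this paper.

```python
import torch
import torch.nn as nn

class PointNetStyleEncoder(nn.Module):
    """Permutation-invariant global feature extractor in the spirit of PointNet[6]."""
    def __init__(self, feat_dim: int = 1024):
        super().__init__()
        # Shared per-point MLP, implemented with kernel-size-1 1D convolutions
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, feat_dim, 1),
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (B, N, 3) point coordinates -> (B, 3, N) for Conv1d
        per_point = self.mlp(xyz.transpose(1, 2))   # (B, feat_dim, N) per-point features
        return per_point.max(dim=2).values          # (B, feat_dim) global feature via max pooling

# Example: feat = PointNetStyleEncoder()(torch.rand(4, 2048, 3))  # -> (4, 1024)
```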
For 3D point cloud completion: during acquisition, factors such as weather and self-occlusion of the object often make the captured data incomplete, so the missing points need to be recovered. PCN[9] is a sparse-to-dense point cloud completion network. Inspired by PointNet, PCN stacks the PointNet layer twice to extract a feature vector from the point cloud, and combines the advantages of a fully connected decoder and a folding-based decoder: the feature vector is first fed to the fully connected decoder to generate a sparse but complete point cloud, which is then fed to the folding decoder to generate a dense complete point cloud. To obtain high-fidelity dense point clouds, Liu et al.[10] proposed a dense point cloud completion network with two stages: in the first stage, a collection of parameterized surface elements predicts a complete but sparse point cloud; in the second stage, a sampling network fuses the sparse prediction with the input point cloud to obtain a dense point cloud. TopNet[11] is a novel point cloud decoder that generates a structured point cloud without assuming any specific structure or topology of the underlying point set. PCN and TopNet process points directly with multi-layer perceptrons; because they do not fully consider the geometric relations between points, details may be lost. GRNet[12] is a gridding network: it first converts the point cloud into an evenly spaced voxel grid, extracts features from the grid with 3D convolution layers, and then feeds the extracted 3D feature vector to a de-gridding layer to generate the predicted point cloud.
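The coarse-to-fine idea behind PCN[9] can be summarized with the following sketch (dimensions, grid size, and layer widths are illustrative rather than PCN's published configuration): a fully connected decoder first regresses a sparse coarse point cloud from the global feature, and a folding step then deforms a small 2D grid around each coarse point to densify it.

```python
import torch
import torch.nn as nn

class CoarseToFineDecoder(nn.Module):
    """Illustrative coarse-to-fine decoder in the spirit of PCN[9]."""
    def __init__(self, feat_dim: int = 1024, num_coarse: int = 512, grid: int = 4):
        super().__init__()
        self.num_coarse, self.grid = num_coarse, grid
        # Fully connected decoder: global feature -> sparse coarse point cloud
        self.fc = nn.Sequential(nn.Linear(feat_dim, 1024), nn.ReLU(),
                                nn.Linear(1024, num_coarse * 3))
        # Folding MLP: (global feature, coarse point, 2D grid coordinate) -> refined offset
        self.fold = nn.Sequential(nn.Linear(feat_dim + 3 + 2, 512), nn.ReLU(),
                                  nn.Linear(512, 3))
        # Fixed 2D grid that is folded around every coarse point
        u = torch.linspace(-0.05, 0.05, grid)
        self.register_buffer("grid2d",
                             torch.stack(torch.meshgrid(u, u), dim=-1).reshape(-1, 2))

    def forward(self, feat: torch.Tensor):
        B, G = feat.size(0), self.grid ** 2
        coarse = self.fc(feat).view(B, self.num_coarse, 3)                    # (B, C, 3) sparse output
        grid = self.grid2d.view(1, 1, G, 2).expand(B, self.num_coarse, G, 2)  # (B, C, G, 2)
        centers = coarse.unsqueeze(2).expand(B, self.num_coarse, G, 3)        # (B, C, G, 3)
        glob = feat.view(B, 1, 1, -1).expand(B, self.num_coarse, G, feat.size(1))
        offsets = self.fold(torch.cat([glob, centers, grid], dim=-1))         # (B, C, G, 3)
        return coarse, (centers + offsets).reshape(B, -1, 3)                  # coarse and dense clouds
```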
4 Experiments
The proposed network was implemented on Ubuntu 18.04. The hardware was a computer with an Intel Core i7-10700K processor (3.80 GHz), an NVIDIA GeForce RTX 2070 graphics card with 8 GB of video memory, and 16 GB of RAM.
4.1 Training process
The network was trained on the public ShapeNet16 dataset, split into training and test sets at a ratio of 8:2, in a Python 3.6 and PyTorch 1.4.0 environment. Training used the Adam optimizer with an initial learning rate of 0.000 1 and a batch size of 12, for a total of 101 200 steps; the learning rate was halved every 1 000 steps. As the number of steps increased, the loss decreased (Fig. 6) and the chamfer distance between the predicted and ground-truth point clouds shrank, indicating that the predicted point cloud became closer to the ground truth.
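Under the assumption that the schedule above means the learning rate is halved every 1 000 steps, the configuration can be reproduced with standard PyTorch components as sketched below; `model`, `train_loader`, and `chamfer_distance` are placeholders for the completion network, the ShapeNet16 data loader (batch size 12), and the training loss, which are not shown here.

```python
import torch

# model, train_loader, chamfer_distance are placeholders (see the note above)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # initial learning rate 0.000 1
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000, gamma=0.5)  # halve every 1 000 steps

step = 0
while step < 101200:                                   # total of 101 200 training steps
    for partial, gt_missing in train_loader:           # batches of 12 incomplete shapes
        pred_missing = model(partial)                  # predict the missing part of the shape
        loss = chamfer_distance(pred_missing, gt_missing).mean()   # average loss over the batch
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()                               # scheduler advanced once per training step
        step += 1
        if step >= 101200:
            break
```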
4.2 Experimental analysis
The network was validated and evaluated on the public ShapeNet16 dataset, and the completion results of the two feature extraction networks were analyzed with two evaluation metrics, chamfer distance and earth mover's distance. For ease of comparison, chamfer distance values are multiplied by 1 000 and earth mover's distance values by 100; the smaller the value, the closer the completed point cloud is to the ground truth.
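For reference, the sketch below gives a minimal brute-force implementation of a symmetric chamfer distance of the kind used for evaluation (the exact variant, e.g. squared versus unsquared distances, is an assumption); the earth mover's distance additionally requires solving an optimal assignment between the two point sets[14] and is therefore usually taken from an existing implementation.

```python
import torch

def chamfer_distance(p1: torch.Tensor, p2: torch.Tensor) -> torch.Tensor:
    """Symmetric chamfer distance between point clouds p1 (B, N, 3) and p2 (B, M, 3):
    average nearest-neighbour distance in both directions (one common variant)."""
    dist = torch.cdist(p1, p2)   # (B, N, M) pairwise Euclidean distances
    return dist.min(dim=2).values.mean(dim=1) + dist.min(dim=1).values.mean(dim=1)

# Reported numbers are scaled for table readability, as described above:
# cd_reported  = 1000 * chamfer_distance(pred, gt).mean()
# emd_reported = 100  * earth_movers_distance(pred, gt).mean()   # EMD implementation not shown here
```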
As shown in Table 1, the proposed point-neighborhood feature extraction network has roughly a quarter of the parameters of the multi-layer perceptron feature extraction network. As shown in Table 2, the chamfer distance and earth mover's distance between the point clouds completed by the point-neighborhood network and the ground truth are both smaller than those obtained with the multi-layer perceptron network, and both networks outperform PCN on these metrics. With fewer parameters, the proposed point-neighborhood feature extraction network therefore produces completed point clouds that are closer to the ground truth.
Table 3 lists the chamfer distance between the completed point clouds and the ground truth for the two feature extraction networks and PCN. On the test set, the chamfer distances of both the point-neighborhood network and the multi-layer perceptron network are smaller than those of PCN; for the bag, cap, and mug categories, the values of the point-neighborhood network are slightly larger than those of the multi-layer perceptron network.
Table 4 lists the earth mover's distance between the completed point clouds and the ground truth for the two feature extraction networks and PCN. The values of both networks are smaller than those of PCN; the multi-layer perceptron network is slightly better than the point-neighborhood network on the cap category, and the two networks are tied on the car, laptop, and skateboard categories.
Table 5 visualizes the results of the proposed point-neighborhood completion network on objects from the ShapeNet dataset from which parts have been removed. The proposed method recovers the missing region without altering the points at other locations, whereas PCN rearranges the entire point cloud to obtain the complete shape. In Table 5, although PCN produces a complete chair, it deviates considerably from the ground truth, and PCN fails to capture the details of the round table.
Real objects were also scanned and the resulting point clouds fed into the proposed network, with the visual results shown in Table 6. A stool was first scanned with a FreeScan X5 handheld laser 3D scanner, and the acquired point cloud was then fed into both the proposed point-neighborhood completion network and PCN for completion. As Table 6 shows, the proposed network completes the point cloud, whereas the point cloud produced by PCN lacks the crossbars between the stool legs.
5 Conclusion
This paper designed a network architecture for automatic 3D object shape completion that learns local neighborhood information and produces a complete point cloud. Compared with other point cloud completion networks, this network takes the geometric structure between points and local neighborhood information into account. Experimental results on the ShapeNet dataset show that the average chamfer distance and average earth mover's distance, 0.000 84 and 0.028 respectively, are smaller than those of the multi-layer perceptron feature extraction network and of PCN, and completion of point clouds scanned from real objects also achieves the expected results. The network can be widely applied to object shape completion; in future work it will be applied to mobile robot perception to recognize reconstructed and completed 3D objects.
References:
[1]Mandikal P,Radhakrishnan V B.Dense 3D point cloud reconstruction using a deep pyramid network[C]//Proc of IEEE Winter Conference on Applications of Computer Vision.Piscataway,NJ:IEEE Press,2019:1052-1060.
[2]Mandikal P,Navaneet K L,Agarwal M,et al.3D-LMNet:latent embedding matching for accurate and diverse 3D point cloud reconstruction from a single image[EB/OL].(2019-03-26).https://arxiv.org/abs/1807.07796.
[3]Gadelha M,Wang Rui,Maji S.Multiresolution tree networks for 3D point cloud processing[C]//Proc of European Conference on Computer Vision.Cham:Springer,2018:105-122.
[4]Wang Weiyue,Huang Qiangui,You Suya,et al.Shape inpainting using 3D generative adversarial network and recurrent convolutional networks[C]//Proc of IEEE International Conference on Computer Vision.Piscataway,NJ:IEEE Press,2017:2317-2325.
[5]Luo Kaiqian,Zhu Jiangping,Zhou Pei,et al.Point cloud completion network based on multi-branch structure[J].Progress in Laser and Optoelectronics,2020,57(24):209-216.(in Chinese)
[6]Qi C R,Su Hao,Mo Kaichun,et al.PointNet:deep learning on point sets for 3D classification and segmentation[C]//Proc of IEEE Conference on Computer Vision and Pattern Recognition.2017:652-660.
[7]Qi C R,Yi Li,Su Hao,et al.PointNet++:deep hierarchical feature learning on point sets in a metric space[C]//Proc of the 31st International Conference on Neural Information Processing Systems.Red Hook,NY:Curran Associates Inc.,2017:5105-5114.
[8]Wu Wenxuan,Qi Zhongang,Li Fuxin.PointConv:deep convolutional networks on 3D point clouds[C]//Proc of IEEE/CVF Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE Press,2019:9613-9622.
[9]Yuan Wentao,Khot T,Held D,et al.PCN:point completion network[C]//Proc of International Conference on 3D Vision.Piscataway,NJ:IEEE Press,2018:728-737.
[10]Liu Minghua,Sheng Lu,Yang Sheng,et al.Morphing and sampling network for dense point cloud completion[C]//Proc of AAAI Conference on Artificial Intelligence.2020:11596-11603.
[11]Tchapmi L P,Kosaraju V,Rezatofighi H,et al.TopNet:structural point cloud decoder[C]//Proc of IEEE/CVF Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE Press,2019:383-392.
[12]Xie Haozhe,Yao Hongxun,Zhou Shangchen,et al.GRNet:gridding residual network for dense point cloud completion[C]//Proc of European Conference on Computer Vision.Cham:Springer,2020:365-381.
[13]Wang Yue,Sun Yongbin,Liu Ziwei,et al.Dynamic graph CNN for learning on point clouds[J].ACM Trans on Graphics,2019,38(5):1-12.
[14]Fan Haoqiang,Su Hao,Guibas L J.A point set generation network for 3D object reconstruction from a single image[C]//Proc of IEEE Conference on Computer Vision and Pattern Recognition.Piscataway,NJ:IEEE Press,2017:2463-2471.
[15]Savva M,Yu F,Su Hao,et al.SHREC16 track:large-scale 3D shape retrieval from ShapeNet Core55[C]//Proc of Eurographics Workshop on 3D Object Retrieval.[S.l.]:The Eurographics Association,2016.