Peer-Reviewed Article

Certifiable Object Pose Estimation: Foundations, Learning Models, and Self-Training

2023; Institute of Electrical and Electronics Engineers; Volume: 39; Issue: 4; Language: English

10.1109/tro.2023.3271568

ISSN

1941-0468

Authors

Rajat Talak, Lisa Peng, Luca Carlone

Topic(s)

Domain Adaptation and Few-Shot Learning

Abstract

In this article, we consider a certifiable object pose estimation problem, where, given a partial point cloud of an object, the goal is not only to estimate the object pose, but also to provide a certificate of correctness for the resulting estimate. Our first contribution is a general theory of certification for end-to-end perception models. In particular, we introduce the notion of $\zeta$-correctness, which bounds the distance between an estimate and the ground truth. We then show that $\zeta$-correctness can be assessed by implementing two certificates: 1) a certificate of observable correctness, which asserts whether the model output is consistent with the input data and prior information; and 2) a certificate of nondegeneracy, which asserts whether the input data are sufficient to compute a unique estimate. Our second contribution is to apply this theory and design a new learning-based certifiable pose estimator. In particular, we propose C-3PO, a semantic-keypoint-based pose estimation model, augmented with the two certificates, to solve the certifiable pose estimation problem. C-3PO also includes a keypoint corrector, implemented as a differentiable optimization layer, that can correct large detection errors (e.g., due to the sim-to-real gap). Our third contribution is a novel self-supervised training approach that uses our certificate of observable correctness to provide the supervisory signal to C-3PO during training. During self-training, the model trains only on the observably correct input–output pairs produced in each batch at each iteration. As training progresses, the fraction of observably correct input–output pairs grows, eventually reaching nearly 100% in many cases. We conduct extensive experiments to evaluate the performance of the corrector, the certification, and the proposed self-supervised training using the ShapeNet and YCB datasets. The experiments show that 1) standard semantic-keypoint-based methods (which constitute the backbone of C-3PO) outperform more recent alternatives in challenging problem instances; 2) C-3PO further improves performance and significantly outperforms all the baselines; and 3) C-3PO's certificates are able to discern correct pose estimates.
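
The idea of training only on observably correct input–output pairs can be illustrated with a minimal sketch. It is not the authors' implementation: the distance test (every observed point must lie within `eps` of the posed object model), the threshold `eps`, and the names `max_nearest_distance`, `observably_correct`, and `select_certified` are all assumptions made for illustration.

```python
import numpy as np


def max_nearest_distance(input_points, model_points):
    """Largest distance from any input point to its nearest model point.

    input_points: (N, 3) array, model_points: (M, 3) array.
    """
    diffs = input_points[:, None, :] - model_points[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)  # (N, M) pairwise distances
    return dists.min(axis=1).max()


def observably_correct(input_points, model_points, rotation, translation, eps):
    """Hypothetical certificate: the partial input cloud must be explained
    by the object model placed at the estimated pose (assumed criterion)."""
    posed_model = model_points @ rotation.T + translation
    return bool(max_nearest_distance(input_points, posed_model) < eps)


def select_certified(batch, eps):
    """Indices of batch samples whose estimates pass the certificate.

    Each sample is a dict with keys "points", "model", "R", "t" (assumed layout).
    Only these samples would contribute to the self-supervised loss.
    """
    return [
        i
        for i, s in enumerate(batch)
        if observably_correct(s["points"], s["model"], s["R"], s["t"], eps)
    ]
```

In a training loop, the loss would be computed only over the indices returned by `select_certified`, so the share of samples contributing to training can grow as the estimates improve.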
