TY - JOUR
T1 - Deep Image Matting with Sparse User Interactions
AU - Wei, Tianyi
AU - Chen, Dongdong
AU - Zhou, Wenbo
AU - Liao, Jing
AU - Zhao, Hanqing
AU - Zhang, Weiming
AU - Hua, Gang
AU - Yu, Nenghai
N1 - Research Unit(s) information for this publication is provided by the author(s) concerned.
PY - 2024/2
Y1 - 2024/2
N2 - Image matting is a fundamental and challenging problem in computer vision and graphics. Most existing matting methods leverage a user-supplied trimap as an auxiliary input to produce a good alpha matte. However, obtaining a high-quality trimap is itself arduous. Recently, some hint-free methods have emerged; however, their matting quality still falls far behind that of trimap-based methods. The main reason is that some hints are essential for removing semantic ambiguity and improving matting quality. Apparently, there is a trade-off between interaction cost and matting quality. To balance performance and user-friendliness, we propose an improved deep image matting framework that is trimap-free and needs only sparse user click or scribble interactions, minimizing the required auxiliary constraints while still allowing interactivity. Moreover, we introduce uncertainty estimation that predicts which parts need polishing and conduct uncertainty-guided refinement. To trade off runtime against refinement quality, users can also choose among different refinement modes. Experimental results show that our method performs better than existing trimap-free methods and comparably to state-of-the-art trimap-based methods with minimal user effort. Finally, we demonstrate the extensibility of our framework to video human matting without any structural modification, by adding optical flow-based sparse hint propagation and temporal consistency regularization imposed on single frames. © 2023 IEEE.
AB - Image matting is a fundamental and challenging problem in computer vision and graphics. Most existing matting methods leverage a user-supplied trimap as an auxiliary input to produce a good alpha matte. However, obtaining a high-quality trimap is itself arduous. Recently, some hint-free methods have emerged; however, their matting quality still falls far behind that of trimap-based methods. The main reason is that some hints are essential for removing semantic ambiguity and improving matting quality. Apparently, there is a trade-off between interaction cost and matting quality. To balance performance and user-friendliness, we propose an improved deep image matting framework that is trimap-free and needs only sparse user click or scribble interactions, minimizing the required auxiliary constraints while still allowing interactivity. Moreover, we introduce uncertainty estimation that predicts which parts need polishing and conduct uncertainty-guided refinement. To trade off runtime against refinement quality, users can also choose among different refinement modes. Experimental results show that our method performs better than existing trimap-free methods and comparably to state-of-the-art trimap-based methods with minimal user effort. Finally, we demonstrate the extensibility of our framework to video human matting without any structural modification, by adding optical flow-based sparse hint propagation and temporal consistency regularization imposed on single frames. © 2023 IEEE.
KW - Estimation
KW - Image Matting
KW - Image segmentation
KW - Runtime
KW - Semantics
KW - Sparse Interactions
KW - Task analysis
KW - Training
KW - Uncertainty
KW - Uncertainty Estimation
KW - Video Human Matting
UR - http://www.scopus.com/inward/record.url?scp=85176307617&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85176307617&origin=recordpage
U2 - 10.1109/TPAMI.2023.3326693
DO - 10.1109/TPAMI.2023.3326693
M3 - RGC 21 - Publication in refereed journal
SN - 0162-8828
VL - 46
SP - 881
EP - 895
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 2
ER -