A Dense Prediction ViT Network for Single Image Bokeh Rendering
2022; Springer Science+Business Media; Language: English
DOI: 10.1007/978-3-031-18916-6_18
ISSN: 1611-3349
Topic(s): Generative Adversarial Networks and Image Synthesis
Abstract: Rendering bokeh effects has become a research hotspot in the field of computational photography. Its essence is to keep the foreground area of interest in focus while blurring the background to meet aesthetic requirements. Witnessing the great success of vision transformers on dense prediction tasks, in this paper we further extend the transformer to a new task and propose a dense prediction ViT structure for single-image bokeh rendering. We adopt a vision transformer as the backbone network, which operates on high-level "bag-of-words" representations of the image. Image-like feature representations at different resolutions are then aggregated to obtain the final dense prediction. The proposed network has been compared with several state-of-the-art methods on a public large-scale bokeh dataset, the "EBB!" dataset. The experimental results demonstrate that the proposed network achieves new state-of-the-art performance on the SSIM, LPIPS, and MOS criteria, and its predicted bokeh effects are more in line with popular perception. The source code and pre-trained models of the proposed network will be available soon at https://github.com/zfw-cv/BEViT .
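To make the "bag-of-words" idea in the abstract concrete, the sketch below shows, in plain Python, how an image is split into non-overlapping patches that are flattened into token vectors (the representation a ViT backbone operates on), and how tokens can be reassembled into an image-like grid for dense prediction. This is a minimal illustration of the general ViT tokenization scheme, not the paper's actual implementation; all function names are hypothetical.

```python
# Illustrative sketch of ViT-style patch tokenization ("bag-of-words" over
# image patches) and its inverse, used by dense-prediction heads to recover
# image-like feature maps. Not the paper's code; names are hypothetical.

def patchify(image, patch):
    """Split an H x W x C image (nested lists) into flattened patch tokens."""
    h, w = len(image), len(image[0])
    tokens = []
    for py in range(0, h, patch):          # top-left corner of each patch
        for px in range(0, w, patch):
            tok = []
            for dy in range(patch):        # flatten the patch row by row,
                for dx in range(patch):    # concatenating channel values
                    tok.extend(image[py + dy][px + dx])
            tokens.append(tok)
    return tokens                          # (H/patch)*(W/patch) tokens

def unpatchify(tokens, h, w, patch, channels):
    """Reassemble patch tokens into an image-like H x W x C grid."""
    image = [[[0] * channels for _ in range(w)] for _ in range(h)]
    cols = w // patch                      # patches per row
    for i, tok in enumerate(tokens):
        py, px = (i // cols) * patch, (i % cols) * patch
        k = 0
        for dy in range(patch):
            for dx in range(patch):
                image[py + dy][px + dx] = tok[k:k + channels]
                k += channels
    return image
```

In a real network the flattened patches would be linearly projected into embeddings and processed by transformer blocks; here the round trip only demonstrates that the token grid losslessly encodes the spatial layout needed for dense prediction.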