Peer-Reviewed Article

Image completion using multispectral imaging

2018; Institution of Engineering and Technology; Volume: 12; Issue: 7; Language: English

10.1049/iet-ipr.2017.1203

ISSN

1751-9667

Authors

Frédéric Bousefsaf, Mohamed Tamaazousti, Souheil Hadj Said, Rémi Michel

Topic(s)

Advanced Image Processing Techniques

Abstract

IET Image Processing, Volume 12, Issue 7, pp. 1164–1174. Research Article. Free Access.

Image completion using multispectral imaging

Frédéric Bousefsaf (corresponding author, frederic.bousefsaf@univ-lorraine.fr, ORCID 0000-0002-6235-6761), Mohamed Tamaazousti, Souheil Hadj Said, and Rémi Michel. CEA LIST, Vision and Content Engineering Laboratory, CEA Saclay Nano-INNOV, Bat. 861-PC142, F-91191 Gif-sur-Yvette, France. Current affiliation of F. Bousefsaf: LCOMS Laboratory, Université de Lorraine, 7 rue Marconi, 57070 Metz, France.

First published: 01 July 2018. https://doi.org/10.1049/iet-ipr.2017.1203. Citations: 2.

Abstract

Here, the authors explore the potential of multispectral imaging applied to image completion. Snapshot multispectral cameras are a breakthrough technology suitable for everyday use and therefore an interesting alternative to digital RGB cameras. In the experiments, multispectral images are acquired using an ultracompact snapshot camera-recorder that senses 16 different spectral channels in the visible spectrum. Direct exploitation of completion algorithms by extension of the spectral channels exhibits only minimal enhancement. A dedicated method, consisting of a prior segmentation of the scene, has been developed to address this issue. The segmentation derives from an analysis of the spectral data and is employed to constrain the search area of exemplar-based completion algorithms. The full processing chain benefits from standard methods developed by both the hyperspectral imaging and computer vision communities. Results indicate that image completion constrained by spectral presegmentation ensures better consideration of the surrounding materials and simultaneously improves rendering consistency, in particular for the completion of flat regions that present no clear gradients and little structure variance.
The authors validate their method with a perceptual evaluation based on 20 volunteers. This study shows for the first time the potential of multispectral imaging applied to image completion.

1 Introduction

Image completion consists of filling or restoring missing or damaged regions of an image in a visually plausible way. This image processing technique has many applications, such as the removal of unwanted objects in photos and panoramas [1], image restoration [2], and diminished reality [3]. Research in this field has reached an advanced level of maturity, with some methods incorporated in raster graphics editors [e.g. PatchMatch [4] in Photoshop CS5 (http://www.adobe.com/technology/projects/patchmatch.html)]. The completion task is non-trivial and of growing importance in computer vision and computer graphics. New completion methods were recently proposed to guide the filling of missing regions using prior information about structures [5] and perspectives [6], guidance maps [7], or statistics of similar patches [8]. This high-level information corresponds to prior knowledge of the geometry of the scene. Ultimately, the completion process is performed and represented with red, green, and blue (RGB) values. Rather than employing RGB cameras, multispectral camera-recorders [9] provide more detailed information about the spectrum of objects present in the scene. These cameras may help address standard computer vision tasks [10], especially considering the recent introduction of snapshot multispectral camera-recorders [11]. Basically, the content of an image depends on both its geometrical and spectral dimensions [10]. Multispectral images are represented as three-dimensional (3D) datacubes, in which a set of 2D images is acquired at different bands of wavelengths using dedicated optical devices [9].
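As a concrete illustration of the datacube representation, the snippet below handles a multispectral image as a 3D array. The shapes match the 16-channel, 512 × 256 sensor described later in Section 3.1; variable names are hypothetical.

```python
import numpy as np

# Hypothetical datacube: 256 rows x 512 columns x 16 spectral channels,
# matching the sensor layout described in Section 3.1
cube = np.zeros((256, 512, 16), dtype=np.float32)

# One 2D image per band of wavelengths
band_7 = cube[:, :, 7]            # (256, 512) slice at channel 7

# The spectral dimension: one spectrum per pixel
spectrum = cube[128, 256, :]      # (16,) values for a single pixel
```

Each spatial location thus carries a full spectrum rather than three RGB values, which is the information the rest of the paper exploits.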
In the fields of Earth and planetary sciences, datacubes delivered by multispectral or hyperspectral cameras are processed and analysed to provide relevant information about the chemical composition of the recorded scenes. An important advantage of this technique is that physical processes such as absorption, reflectance, or fluorescence spectra can be estimated for each pixel in the image. This allows the detection of chemical changes in objects that cannot be identified with monochromatic or colour (RGB) data [12]. The spectral information has notably been employed to characterise ocean colour [13], classify glacier surfaces [14], or sense gypsum on Mars [15]. Spectral imaging is also a powerful analytical tool for biological and biomedical research, notably for identifying tissue abnormalities [12]. The spectral signature of a pure material is sufficiently scale-invariant to provide valuable cues for better understanding the contents of an image [10]. Material recognition is presumed to reinforce image processing and understanding techniques such as object detection, object recognition, and image segmentation [16]. To date, there have been no studies analysing the relevance of multispectral imaging in the image completion context. Analysing multispectral frames instead of RGB frames amounts to processing the spectral dimension at each pixel of the image. This information can be used to improve renderings by properly updating photometric parameters, in particular for diminished reality applications [3]. In this study, we investigate the relevance of multispectral frames applied to image completion, an application initially dedicated to 3D RGB images. Section 2 first provides a focused description of the image completion algorithms that will be of interest for the use of multispectral images in the next sections.
Some basics of multispectral imaging (sensor specifications and pre-processing operations) are then presented in Section 3. As the main purpose is to better complete images dedicated to visualisation, this section also includes elements about the conversion from the recorded multispectral channels to the standard RGB colour space. In Section 4, we describe the behaviour of a reference completion algorithm on multispectral datacubes when directly extending its input (from 3D RGB images to 16 multispectral channels). Section 5 presents a better, two-step method dedicated to the use of multispectral channels for image completion. In the first step, a pre-segmentation of the geometry of the scene is computed from the spectral dimension. The search for substitution pixels is then geometrically constrained to a predefined area: only the segments located in the vicinity of the missing region are considered (see Fig. 1 for a representative example).

[Fig. 1: Image completion constrained by spectral segmentation. (a) Image recorded by the multispectral camera and converted to RGB; the red box on the paperboard was selected by the user and corresponds to the area to be completed (missing region). (b) Pixels selected by a standard exemplar-based completion algorithm (i.e. PatchMatch [4]) to complete the missing region are highlighted in green; the algorithm mistakenly uses some pixels from the curtains to complete the paperboard, their RGB values being very similar. (c) The resulting completion is visually altered and partially grey. (d) Spectral segmentation derived from noise-adjusted principal component analysis of the multispectral image; note that the spectral segmentation produces regions that seem consistent with the geometry and materials of the objects. (e) The search is spatially limited to the segments in the neighbourhood of the region to be completed (i.e. the magenta segment in d). (f) Completion constrained by the spectral segments is more compatible with standard visual assessment in computer vision and computer graphics.]

Section 6 is dedicated to the analysis of results from a perceptual quality assessment procedure based on standard subjective questionnaires over a panel of 20 observers. The proposed method (presented in Section 5) delivers completed images that are more compatible with standard visual assessment in computer vision and computer graphics.

2 Related work

Image completion methods can be classified into three categories in the literature [2]: diffusion-based [17], exemplar-based [4, 6, 8, 18], and learning-based methods. Recent studies provided methods to extend single-frame completion to video sequences [19] or to real time [3] for diminished reality applications. More recently, Baek et al. [20] proposed to complete both colour and depth channels from multiview images.

2.1 Diffusion-based methods

Diffusion-based techniques were developed to fill small or narrow holes by propagating adjacent image pixels into the missing area [17]. Smoothness priors are introduced through parametric models or through partial differential equations to diffuse local structures [21]. These techniques tend to blur and are less effective in handling large missing regions, due to their inability to properly recover textures. Tensor decomposition and completion [22] propagate structures into missing regions by estimating missing values in tensors of visual data. These methods are based on matrix completion, which uses the matrix trace norm, extended to the tensor case. Thus, tensor completion corresponds to a high-order extension of matrix completion and is formulated as a global convex optimisation problem [2, 22].

2.2 Exemplar-based methods

Exemplar-based methods take their origin from texture synthesis methods [23].
These approaches use textures in an image as exemplars, based on the assumption that patches in the target region are similar to those in the rest of the image [4, 6, 8, 18]. Texture is progressively copied to the boundary of the target region. Although this strategy can replicate complex textures in the missing region, the success of the structure propagation largely depends on the order of copy [18]. To tackle this issue and produce more plausible results, the order of copy can be determined with particular criteria. For example, Criminisi et al. [18] proposed a gradient-based method that encourages the propagation of textures along linear structures such as strong edges. As they progress patch by patch in a greedy fashion, the previous approaches do not ensure global consistency. To address this issue, Wexler et al. [24] proposed to constrain the missing values by solving a global optimisation problem. Building on this work, and to reduce the computational burden, Barnes et al. developed PatchMatch [4], a fast algorithm that iteratively generates textures in the target region by minimising an objective function based on pattern similarity between the missing region and the rest of the image. The relative location from which exemplar-based methods copy content (a pixel or a patch) is called an offset. All possible offsets are generally accepted in the optimisation process. Nevertheless, constraints on offsets can be imposed to produce better results in terms of quality, particularly by using statistics of similar patches [8]. Additional constraints can be included in exemplar-based algorithms by, for example, guiding the reconstruction with a prior geometric scheme in order to propagate long edges [5], or by segmenting the known region into planes to properly consider the perspectives of the scene [6].
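One way to impose such a constraint, in the spirit of the segmentation-based restriction this paper introduces in Section 5, is to build a boolean source mask from a label map: only segments touching the hole boundary are allowed to supply exemplars. The sketch below is a minimal illustration under that assumption; the function and parameter names are hypothetical, not the authors' implementation.

```python
import numpy as np

def constrained_source_mask(labels, target_mask, dilation=5):
    """Restrict the exemplar search to segments adjacent to the hole.

    labels: (H, W) integer segmentation map (e.g. from spectral clustering)
    target_mask: (H, W) boolean mask of the missing region
    Returns a boolean mask of allowed source pixels.
    """
    # Dilate the target mask to find its immediate neighbourhood
    neigh = target_mask.copy()
    for _ in range(dilation):
        shifted = np.zeros_like(neigh)
        shifted[1:, :] |= neigh[:-1, :]   # 4-neighbour dilation
        shifted[:-1, :] |= neigh[1:, :]
        shifted[:, 1:] |= neigh[:, :-1]
        shifted[:, :-1] |= neigh[:, 1:]
        neigh |= shifted
    border = neigh & ~target_mask
    # Segments touching the hole boundary define the allowed materials
    allowed = np.unique(labels[border])
    return np.isin(labels, allowed) & ~target_mask
```

An exemplar-based method would then simply reject candidate offsets whose source pixel falls outside this mask.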
2.3 Learning-based methods

Hays and Efros [25] proposed to pair the image completion process with a database containing a large number of reference images. The missing regions are completed by copying similar patches from the database. The candidate patches must be seamless and semantically consistent. Pathak et al. [26] used convolutional neural networks trained to generate the contents of a missing region by capturing the context of the image. Recent results show that approaches based on deep neural networks can effectively compete with the most recent exemplar-based completion methods [27]. In this study, we employed PatchMatch [4] as the reference image completion technique. The algorithm ensures consistency by solving a global optimisation problem and is faster than comparable completion techniques.

3 Multispectral data

This section presents details about the multispectral device, in addition to the image processing operations that were employed to analyse the multispectral data.

3.1 Camera specifications

The multispectral imaging technology used in this study (Fig. 2a) was designed by IMEC [11]. The device is a snapshot (i.e. non-scanning) and ultra-compact spectrometer. The camera records the spectral irradiance of a scene through a multispectral image, i.e. a 3D data set typically called a datacube or hypercube [9]. The device can nominally deliver 170 datacubes per second in real time; in practice, this value is constrained by the exposure time.

[Fig. 2: Multispectral camera specifications. (a) Snapshot real-time multispectral camera designed by IMEC [11]. (b) Spectral sensitivity of the 16 camera channels, which uniformly encompass most of the visible spectrum (475–650 nm); spectral bandwidth is ∼20 nm per channel; in practice, partial correlation between channels results in 14 independent components instead of 16. (c) ColorChecker Classic (X-Rite); the colour chart contains 24 colour patches [28]; their reference spectra, defined between 380 and 730 nm, are provided by the manufacturer. (d) Image and spectra derived from the multispectral camera; reference and reconstructed spectra match within up to 90% RMS; the slight discrepancies result from uncertainties in the spectral calibration procedure.]

In practice, the camera senses 16 different spectral bands between 475 and 650 nm. The bandwidth of each band lies between 15 and 20 nm (Fig. 2b). The full resolution of the CMOS sensor is 2048 × 1024 pixels, reduced to 512 × 256 pixels for each spectral channel (each cell being formed by a 4 × 4 multispectral mosaic [11]). Pixel intensity (bit depth) is coded over 10 bits.

3.2 Pre-processing

3.2.1 Spectral reconstruction

Spectral reconstruction is a primary procedure essentially employed to calibrate multi- or hyperspectral sensors in order to assess apparent reflectances from raw spectral channels [29]. In the present study, spectral reconstruction was performed using a colour chart that includes 24 different colour patches (see Fig. 2c). Given that all the optical parameters cannot be estimated beforehand, an indirect method was employed to calibrate the multispectral sensor. For the sake of completeness, the interested reader can refer to the original article [29] for the full implementation details. The reconstructed reflectances of the blue, red, and green patches of the colour chart are illustrated in Fig. 2d. The observable discrepancies result from uncertainties in the calibration procedure, which closely depends on the spectral sensitivity responses (Fig. 2b).

3.2.2 Multispectral to RGB conversion

As completion algorithms deliver images that are displayed on screen and visually evaluated by humans, a conversion to the standard RGB colour space is required.
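Such a conversion can be sketched as follows. This is a generic reflectance-to-sRGB pipeline under a flat-illuminant assumption, not the authors' exact procedure; the function name is hypothetical, and the CIE colour matching functions are assumed to have been resampled at the camera band centres.

```python
import numpy as np

def spectra_to_srgb(refl, wavelengths, cmf, illuminant=None):
    """Convert a reflectance datacube (H, W, B) to sRGB.

    refl: apparent reflectance per band
    wavelengths: (B,) band centres in nm
    cmf: (B, 3) CIE 1931 colour matching functions (x-bar, y-bar, z-bar)
         resampled at the band centres
    illuminant: (B,) spectral power of the illuminant (flat if None)
    """
    if illuminant is None:
        illuminant = np.ones_like(wavelengths, dtype=float)
    # Integrate reflectance * illuminant against the matching functions
    weighted = refl * illuminant[None, None, :]
    xyz = weighted @ cmf                        # (H, W, 3)
    xyz /= (illuminant @ cmf[:, 1])             # normalise by the Y of white
    # XYZ -> linear sRGB matrix (IEC 61966-2-1, D65 white point)
    m = np.array([[ 3.2406, -1.5372, -0.4986],
                  [-0.9689,  1.8758,  0.0415],
                  [ 0.0557, -0.2040,  1.0570]])
    rgb = np.clip(xyz @ m.T, 0.0, 1.0)
    # sRGB gamma encoding
    return np.where(rgb <= 0.0031308,
                    12.92 * rgb,
                    1.055 * rgb ** (1 / 2.4) - 0.055)
```

The discrete sum over bands replaces the continuous integral against the matching functions, which is adequate given the ∼20 nm channel bandwidth.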
In practice, this conversion is achieved using the apparent reflectances derived from the camera calibration procedure (Section 3.2.1) and by means of the CIE colour matching functions (see Fig. 3b). An example of standard RGB conversion is presented in Fig. 2d.

[Fig. 3: Spectral resolution significance. Averaged spectra, along with their respective RGB values, have been extracted from the P1 and P2 patches. The RGB values indicate that the colours are very similar, whereas the multispectral sampling allows a more precise observation of chromatic differences. Reducing a spectrum to three R, G, and B values smooths and filters out spectral details, in particular when relevant variations are cancelled by integration against the CIE matching functions (the x̄(λ), ȳ(λ), and z̄(λ) curves).]

4 Preliminary analyses

4.1 Significance of the spectral sampling

Image completion is based on colour and brightness analysis of different image patches. Fig. 3a presents a typical example, where P1 and P2 correspond to patches of similar RGB colour. Working with more spectral bands (by increasing the spectral sampling) can help reveal additional relevant information. Fig. 3b presents the spectrum, along with the corresponding RGB values, of both the P1 and P2 patches. Here, important chromatic differences appear between 590 and 730 nm. These disparities are partially cancelled by integration against the CIE colour matching functions (x̄, ȳ, and z̄ in Fig. 3b). Employing more spectral bands therefore seems relevant in order to better consider chromatic variations when performing image completion.

4.2 Experimental procedure

A set of ten different multispectral images was employed to assess the relevance of multispectral data applied to image completion. The frames were recorded with the multispectral device presented in Section 3.1, the scenes being selected to emphasise current image completion limits.
To this purpose, objects and backgrounds of similar colours were employed (Fig. 4a).

[Fig. 4: Experimental set-up. (a) Typical image acquired with the multispectral device and converted to RGB; the red region, selected by the user, denotes the area to be completed (missing region). (b) Ground truth material mask (M_gt); the white region has been manually segmented and corresponds to the best search zone (in terms of material) for completion candidates.]

Each area to be completed was manually chosen and presents no clear gradients and little spatial structure variance. For validation purposes, the regions were defined to avoid entirely overlapping an object and lie within a single material. To evaluate the behaviour of the completion procedure, ground truth material masks were manually defined (Fig. 4b). They correspond to the region covered by the same material as the one surrounding the area to be completed. These material masks are also used to evaluate the relevance of the spectral segmentation proposed in this study (see Section 5.2.2). We propose to assess the behaviour of a standard completion algorithm (Section 4.3) with regard to the materials that surround the region to be completed (Section 4.4), in particular when increasing the number of multispectral channels. We also propose to empirically evaluate the quality of the completion by comparing the synthesised area with its original content (Section 4.5).

4.3 Implementation details

PatchMatch [4], initially proposed by Barnes et al., is used as the reference image completion technique. The algorithm ensures consistency by solving a global optimisation problem and is faster than comparable completion techniques. The method is composed of a sequence of specific steps. The interested reader can refer to the original article [4] for the full implementation details.
Briefly, the method comprises three main steps: (i) initialisation: a random patch offset is given to each pixel at the coarsest pyramid level of the image, and the result is propagated to the next pyramid level, where propagation and random search steps are applied at each level; (ii) propagation: the pertinence of the offsets is evaluated with respect to the neighbouring patches at each iteration using an objective function; (iii) random search: a search step looks for a better patch within a concentric radius around the current offset, and the new offset is adopted if it lowers the objective function. A particular implementation of the initialisation step was employed in this study: a first exhaustive search of the best matching offsets is performed [30] instead of a random one. The patch size was set to 13 × 13 pixels. Owing to the random process included in PatchMatch, 50 trials per image were launched to compute statistical tendencies, a single run being non-representative.

4.4 Materials consideration

In this section, we assess the behaviour of the completion algorithm with regard to the materials that surround the region to be completed, in particular when increasing the number of multispectral channels. The full image I is separated into two disjoint sets: T corresponds to the target (or missing) region, completed using pixels in S (source region), with S ∪ T = I, S ∩ T = ∅, and T ⊂ I. The image completion algorithm replaces all pixels included in T. The offsets represent the difference in position between a pixel in the area to be completed (target region) and its corresponding candidate in the source region. Offsets are defined through a mapping function f that maps each target position to a source position (see Figs. 1b and e for typical examples):

f : T → S    (1)

f corresponds to a transformation that solves a global minimisation problem and is determined for each target pixel.
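The three steps summarised in Section 4.3 and the mapping f can be sketched together as a nearest-neighbour-field search. The sketch below is a simplified, single-scale illustration (no pyramid, no exhaustive initialisation), not the authors' implementation; it also assumes the hole has been pre-filled so that patch costs can be evaluated. All names are hypothetical.

```python
import numpy as np

def patchmatch_like(img, target_mask, r=2, iters=4, seed=0):
    """Single-scale PatchMatch-style search on a grayscale image.

    img: (H, W) float image (hole assumed pre-filled for cost evaluation)
    target_mask: (H, W) boolean mask of missing pixels
    Returns a dict mapping each interior target pixel to its best source.
    """
    rng = np.random.default_rng(seed)
    H, W = img.shape

    def valid(y, x):  # patch fits and pixel is a legitimate source
        return r <= y < H - r and r <= x < W - r and not target_mask[y, x]

    def cost(ty, tx, sy, sx):  # sum of squared differences between patches
        d = (img[ty - r:ty + r + 1, tx - r:tx + r + 1]
             - img[sy - r:sy + r + 1, sx - r:sx + r + 1])
        return float((d * d).sum())

    targets = [(y, x) for y in range(r, H - r) for x in range(r, W - r)
               if target_mask[y, x]]
    sources = [(y, x) for y in range(r, H - r) for x in range(r, W - r)
               if not target_mask[y, x]]
    # (i) initialisation: a random source per target pixel
    nnf = {t: sources[rng.integers(len(sources))] for t in targets}
    best_c = {t: cost(*t, *nnf[t]) for t in targets}

    for it in range(iters):
        scan = targets if it % 2 == 0 else targets[::-1]
        for ty, tx in scan:
            # (ii) propagation: adopt a neighbour's offset if it is better
            for dy, dx in ((0, -1), (-1, 0), (0, 1), (1, 0)):
                n = (ty + dy, tx + dx)
                if n in nnf:
                    sy, sx = nnf[n][0] - dy, nnf[n][1] - dx
                    if valid(sy, sx):
                        c = cost(ty, tx, sy, sx)
                        if c < best_c[(ty, tx)]:
                            nnf[(ty, tx)], best_c[(ty, tx)] = (sy, sx), c
            # (iii) random search within a shrinking concentric radius
            rad = max(H, W)
            while rad >= 1:
                sy = nnf[(ty, tx)][0] + int(rng.integers(-rad, rad + 1))
                sx = nnf[(ty, tx)][1] + int(rng.integers(-rad, rad + 1))
                if valid(sy, sx):
                    c = cost(ty, tx, sy, sx)
                    if c < best_c[(ty, tx)]:
                        nnf[(ty, tx)], best_c[(ty, tx)] = (sy, sx), c
                rad //= 2
    return nnf
```

The returned dictionary plays the role of the mapping f in (1): each target position is associated with the source position from which its value is copied.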
The synthesised image is then created by replacing all target pixels with their corresponding source pixels. It is important to note that only the offsets, i.e. the differences in position between a pixel included in the area to be completed and its corresponding candidate in the rest of the image, are susceptible to fluctuate. The synthesising procedure (pixel copy) is ultimately performed on RGB frames using the defined offsets. To understand whether the completion algorithm is able to correctly use pixels from surrounding materials, the percentage of good matches [τ in (2)] between the offsets and the ground truth material mask was assessed for each of the ten input images. It corresponds to the number of times the completion algorithm uses a pixel from the ground truth material region over the total number of pixels in the target area:

τ = (100 / N) Σ_{p ∈ T} δ(p)    (2)

δ(p) = 1 if f(p) ∈ M_gt, and 0 otherwise    (3)

where M_gt corresponds to the ground truth material region (Fig. 4b), δ(p) is defined for each target pixel p, N corresponds to the total number of pixels of the target region, and τ to the match rate (units: %). Results are presented in Figs. 5a and b using boxplot representations. Each boxplot includes 500 computed match rates (10 images recorded by the multispectral camera × 50 completion trials per image). For comparison purposes, the match rates computed using RGB images are reported on these figures (red boxes).

[Fig. 5: Respect of the surrounding materials by the completion algorithm. The match rates are computed between offsets and the ground truth material mask for each image. The presented results integrate all 500 trials. For comparison purposes, match rates computed using RGB images are indicated on each figure (red boxplot). (a) Match rates computed on raw multispectral channels, starting from a single (monochromatic) channel up to all 16 channels. (b) Match rates computed on noise-adjusted principal components.]

Fig. 5a presents the match rates computed when completion is performed on raw multispectral channels. Starting from all 16 channels, we progressively averaged the spectral image two channels by two channels until reaching a single channel (monochromatic image). Fig. 5b presents the same percentage of good matches, but when performing completion on principal components. The latter were computed from a noise-adjusted principal component analysis, a transformation developed to sort principal components by image quality (decreasing image quality with increasing component number). We employed minimum/maximum autocorrelation factors to estimate the noise covariance matrix. The method was proposed by Green et al. [31] and uses between-neighbour differences to estimate the noise covariance. Results presented in Fig. 5a exhibit an increase in the match rates correlated with the increase in the number of channels. The boxplot lengths also indicate that the variance tends to simultaneously decrease. Adding more precise spectral information to the completion algorithm leads to better consideration of the physical properties of materials: subtle variations that were not necessarily observable in the standard RGB colour space are taken into account (see Section 4.1). Image completion based on principal components (Fig. 5b) tends to better consider the surrounding materials, the maximum median value being equal to 99% (instead of at most 80% when considering raw multispectral channels). In addition, only four components are necessary to achieve this score. As the last principal components contain more and more noise, the induced artefacts generate a bias that leads the completion to pick patches in a random fashion, thus reducing the mean percentage of good matches while increasing the variance.
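A minimal sketch of a noise-adjusted principal component transform, with the noise covariance estimated from between-neighbour differences in the spirit of Green et al. [31], is given below. It is a simplified illustration (horizontal differences only, no noise whitening refinements), not the authors' implementation; the function name is hypothetical.

```python
import numpy as np

def noise_adjusted_pca(cube):
    """Noise-adjusted PCA (minimum noise fraction) of a datacube.

    cube: (H, W, B) multispectral image.
    Returns components (H, W, B) sorted by decreasing estimated SNR.
    """
    H, W, B = cube.shape
    X = cube.reshape(-1, B).astype(float)
    X -= X.mean(axis=0)
    # Noise estimate: horizontal between-neighbour differences
    d = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, B) / np.sqrt(2)
    Sn = np.cov(d, rowvar=False)           # noise covariance
    S = np.cov(X, rowvar=False)            # signal + noise covariance
    # Solve the generalised eigenproblem S v = lambda Sn v
    evals, evecs = np.linalg.eig(np.linalg.solve(Sn, S))
    order = np.argsort(evals.real)[::-1]   # decreasing SNR
    V = np.real(evecs[:, order])
    return (X @ V).reshape(H, W, B)
```

Sorting by the generalised eigenvalues orders the components by estimated signal-to-noise ratio, which is why image quality decreases with increasing component number, as observed in Fig. 5b.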
4.5 Rendering analysis

In this section, we empirically assess the quality of the completion by comparing the synthesised area with its original content using an error function, namely the Euclidean distance over the R, G, and B channels, computed for each pixel of the target region. Fig. 6a presents the Euclidean errors computed using offsets that were determined on raw multispectral channels. As before (see Section 4.4), and starting from all 16 channels, we progressively averaged the spectral image two channels by two channels until reaching a single channel (monochromatic image). Fig. 6b presents the same information, but when performing completion on principal components. A close-up view is displayed at the top of the figure to identify the error minimum. Errors computed when completion is performed on standard RGB images are reported on the two figures using red boxes.

[Fig. 6: Rendering analysis assessed using Euclidean errors computed between synthesised and original RGB images. The results are averaged over the 500 trials. For comparison purposes, the errors computed when completion used offsets defined on RGB frames are indicated on each figure using a red boxplot. (a) Errors computed using offsets determined on raw multispectral channels. (b) Errors computed using offsets determined on principal components.]

Completion based on four multispectral channels (Fig. 6a) presents the overall minimum error. From Fig. 6b, completion based on the first two principal components presents the minimum error. Employing more principal components gives worse completion results. This effect is inherent to the noise-adjusted principal component transform: as the last components contain more and more noise, the induced artefacts generate a bias that leads the completion to pick patches in a random fashion.
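The two evaluation measures used in Sections 4.4 and 4.5 can be sketched as follows: a match rate in the spirit of (2)-(3), and a mean per-pixel Euclidean RGB error over the target region. Here the offset field is assumed to be a mapping from target pixels to their chosen source pixels; function and variable names are hypothetical.

```python
import numpy as np

def match_rate(nnf, material_mask):
    """Percentage of target pixels completed from the ground truth
    material region, in the spirit of eqs. (2)-(3).

    nnf: dict mapping target (y, x) -> chosen source (y, x)
    material_mask: (H, W) boolean ground truth material region
    """
    hits = sum(bool(material_mask[sy, sx]) for sy, sx in nnf.values())
    return 100.0 * hits / len(nnf)

def euclidean_error(original, completed, target_mask):
    """Mean per-pixel Euclidean distance over R, G, B in the target.

    original, completed: (H, W, 3) RGB images
    target_mask: (H, W) boolean mask of the completed region
    """
    diff = original[target_mask].astype(float) - completed[target_mask]
    return float(np.mean(np.sqrt(np.sum(diff ** 2, axis=-1))))
```

Averaging these scores over repeated trials reproduces the kind of statistics reported in Figs. 5 and 6.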
In addition to these statistical tendencies, illustrative completion results are presented in Fig. 7 for visual comparison of the renderings. From these results, we can conclude that completion based on four multispectral channels produces plausible results when the materials are respected (i.e. when only the pixels included in the region defined by the material surrounding the missing region are used for completion; see Fig. 7, images #1 and 6, for a typical example). In comparison, completion based on the first four principal components produces less consistent results: the chromaticity (colours) is respected, but the intensity (brightness) seems inconsistently distributed. In contrast, completion based on four multispectral channels tends to produce chromatic inconsistencies when the materials are not respected (see results of images #8 and 9 in Fig. 7). In that case, completion based on the first four principal components delivers more plausible results, even if brightness discrepancies can still be noted. Fig. 7: Spectral completion. (a) Source image with (b) its corresponding close-up view. The red pattern corresponds to the area to be completed. (c) G
