Structure preservation in content‐aware image retargeting using multi‐operator

Article, open access, peer-reviewed

2020; Institution of Engineering and Technology; Volume: 14; Issue: 13; Language: English

10.1049/iet-ipr.2019.1032

ISSN

1751-9667

Authors

Ankit Garg, Ashish Negi

Topic(s)

Advanced Image Fusion Techniques

Summary

IET Image Processing, Volume 14, Issue 13, p. 2965-2975. Research Article, Free Access.

Ankit Garg (corresponding author, agarg1@ggn.amity.edu), Department of Computer Science & Engineering, UTU, Premnagar, Dehradun, India
Ashish Negi, Department of Computer Science & Applications, GBPEC, Ghurdauri, Pauri, India

First published: 01 November 2020. https://doi.org/10.1049/iet-ipr.2019.1032

Abstract

The evolution of image retargeting techniques demands the exploitation of multi-operators, since they are capable of preserving the structure and salient objects of the image. However, these multi-operators are mostly based on seam carving with scaling or cropping operators, which lead to significant distortions in the retargeted image. This study proposes a new multi-operator scheme in which seam carving is improved through the proposed seam diversion based image retargeting algorithm and integrated with the cropping and warping operators.
A total of six different multi-operator schemes are proposed, of which the MO6 technique gave remarkable results in terms of image quality, least distortion, and lowest run-time. To simplify image retargeting operations, an optimised image distance function was formulated. It combines the bidirectional image Euclidean distance, the dominant colour descriptor, and an energy-based coefficient, and is used to bypass seams from the point where seams start clashing and the defined threshold is therefore violated. By integrating cropping and warping into the proposed algorithm, the salient features of the retargeted images are preserved. Typical results are presented which demonstrate the effectiveness of the proposed methods. A user-based subjective analysis has also been carried out, which shows that images retargeted using the MO6 technique have high user preference.

1 Introduction

Today, a large number of images and videos are uploaded to the internet. Methods for changing the size of images are gaining importance with the advancement of display devices (PDAs, PCs, cell phones etc.). To resolve the problem of image incompatibility with different display sizes, content-aware image resizing has gained popularity and has become an active topic of research. In content-aware image resizing, it is essential to preserve the visual quality of the retargeted image. After the image retargeting process, the main criterion of image quality assessment is whether the resized image is a good approximation of the original image. Non-uniform warping and seam carving (SC) offer better results than content-unaware techniques such as cropping or scaling. For better results, a content-aware image retargeting operator relies on an importance map to safeguard the salient region of the image at the expense of less important content.
Importance measures comprise image gradients, saliency detectors, and entropy, along with high-level cues such as face and motion detectors. However, content-aware image retargeting methods do not succeed in all cases of image resizing. If the important object covers a large area of the image, it might be distorted during retargeting, degrading the quality of the result. Hence, the results of image retargeting methods depend on the content of the image. A retargeting method that is suitable for one image type may not be suited to other, similar image types. In this situation, a combination of multiple existing operators might attain better results than any standalone method. Multiple-operator image retargeting defines a sequence of existing media operators to retarget an image to a desired size. Many sequences of media retargeting operators might be suggested to give better retargeting results. Because the operators in a sequence are not commutative, many paths are possible.

This paper proposes a new multi-operator scheme in which SC is improved through the proposed seam diversion based image retargeting (SDIR) algorithm and integrated with the cropping and warping operators. The major contributions can be summarised as follows:

(i) The SDIR algorithm improves the conventional SC technique by preventing the intersection of seams at single and multiple points, and also restricts seam removal from a single row/column when the seam covers pixels beyond a defined threshold.
(ii) In the SDIR algorithm, to bypass the seam from the intersection point, the energy of the pixel at that point is increased by searching for the maximum energy in the corresponding row/column and assigning it to that pixel.
(iii) To further improve the efficiency of the SDIR algorithm, a hybrid approach is suggested in which the SDIR algorithm is combined and tested with different image retargeting operators such as scaling, cropping, and warping.
(iv) To compare the proposed multi-operator with other state-of-the-art image retargeting techniques, a qualitative and quantitative analysis is presented. The proposed technique shows enhanced performance during the image retargeting process, with minimal distortion of salient objects.

2 Background study

SC is a content-aware image retargeting technique that removes optimal seams of low-energy pixels from the image. The seam removal process changes the aspect ratio of the image. Avidan and Shamir [1] proposed an SC algorithm which retargets the image in a content-aware fashion. In 2008, Rubinstein et al. [2] proposed the forward energy criterion, which selects the seam whose removal reintroduces the least energy into the image and therefore causes the fewest artefacts. The discrete method of content-aware image resizing is based on the scaling method of SC. This method considers the image as discrete pixels and deletes pixels from the image based on their importance [3, 4]. The Seam Carving - Scaling - Combined optimized Image resizing (SCSC) algorithm is a discrete method that coherently executes line cropping and uniform scaling to scale an image. The continuous method of content-aware image resizing is based on mesh deformation [5, 6]. The results of image retargeting algorithms often show content loss and deformation of important structures present in the image. Different image resizing methods [7] may produce different amounts of content loss in the salient region of the image [8]. Discrete content-aware image retargeting methods consider images as pixels [9]; structural distortion in the salient region of the image occurs due to the insertion and deletion of pixels.
Duplication and deletion of pixels result in jagged lines and edges and a change in aspect ratio (e.g. cropping, SC, and shift-map). Continuous retargeting methods consider the image as a composition of meshes. In the continuous methods, image distortion occurs due to the adjustment of mesh size under some constraints (e.g. non-homogeneous warping [5], scaling, scale-and-stretch [10], and streaming video [11]).

2.1 Scaling

Muñoz et al. [12] proposed an optimal spline-based algorithm with arbitrary scale factors. The suggested algorithm minimises information loss in the least-squares sense. The complexity of this approach is independent of the scale factor, and it outperforms the standard interpolation technique for image retargeting. Kim et al. [13] proposed an algorithm that performs the scaling transformation using an area pixel model. In this algorithm, a maximum of four pixels of the source image are used to compute one pixel of the target image. Hardware for the suggested algorithm was designed using an FPGA, and some test scenes were displayed on an LCD. Andreadis and Amanatiadis [14] proposed a scaling algorithm that uses a mask of at most four pixels and computes the luminosity of each pixel by combining two factors: the percentage of area covered by the mask and the difference in luminosity among the source pixels. Shi et al. [15] proposed a technique based on morphological edge interpolation. Morphological opening and closing operations are used to eliminate noise and to smooth the image. Two interpolation algorithms are used for the plain and edge regions of the image. Jiang et al. [16] proposed an efficient edge-adaptive scaling algorithm in which the input image is segmented into four types of image blocks using a directional edge detector. Interpolation is applied along the direction of edges. Liang et al.
[17] proposed a patch-wise scaling method that focuses on the salient area of the image and also tries to preserve the global visual effect. Local bidirectional similarity measures and smoothness measures are used to assess the quality of the retargeted image. Based on the saliency map, the source image is segmented into important and unimportant patches. In the reported scaling literature [12–17], there is scope for further improvement in the areas of aliasing and image artefacts, distortion of line structures, and interpolation problems.

2.2 Cropping

The cropping technique generates the best results if the image contains only one object, centred in the image. If the image contains multiple important objects at the image borders, one of the objects might be cut through. Setlur et al. [18] suggested a technique that avoids this situation by cutting out the salient regions and adding them back into the retargeted image. Kopf et al. [19] enhanced the automatic cropping technique presented by Suh et al. [20]. The cropping techniques presented in this section first calculate an importance map and then search for the cropping window that covers the most significant regions. This search is always heuristic: a heuristic graph cut [21], backtracking [22], or a greedy search [18–20]. In the attention-based cropping approaches [18–22], the search for an optimum cropping rectangle and the loss of image information around the image periphery need further improvement.

2.3 Seam carving

Yan et al. [23] computed a coherent scaling factor for each pixel, from which seams are identified. Qi et al. [24] suggested an algorithm to resize the image to a non-rectangular shape. Kiess et al. [25] enhanced SC to preserve the line structures present in the image. Patel and Shanmuganathan [26] proposed a technique to speed up the naive SC process by removing or inserting a batch seam several pixels wide in a single iteration, rather than a seam one pixel wide.
Along with the energy of the pixels to be removed, the energy inserted after the removal of a batch seam is also minimised to prevent the introduction of false edges. The width of a batch seam is a critical factor and is made adaptive during the retargeting process to preserve the energy of the image. Zhou et al. [27] proposed a significance detection method which considers edge information and visual saliency. To assess the compressibility of the image, a wall-seam model was built to assign an accurate number of seams to each direction. Finally, the algorithm is supplemented with uniform scaling: the stretched image is resized to the target size with the least structural distortion. Liu et al. [28] presented an adaptive image retargeting approach using saliency-based continuous seam carving (CSC) to display images efficiently on small screens. A multiscale contrast-based saliency map is first generated and used as the energy map for conventional discrete SC, and a reasonable number of seams is adaptively extracted. The reduced dimensions allocated to CSC and possible scaling are then determined by analysing the energy curve of the extracted seams. In most existing SC literature [1–3, 23, 25, 27, 28], the problems of seam intersection, segmentation, and removal of multiple pixels within a single row/column, which lead to distortion, have not been addressed completely.

2.4 Warping

Warping is another retargeting operator used in a content-aware fashion; it controls scaling factors in an efficient manner. In downsampling, applying a high scale factor to the image can modify ROIs. To overcome this problem, the image is partitioned into a mesh grid of triangles or quads. Liu and Gleicher [29] proposed a non-photorealistic fisheye-view warping method, suggesting that an image can be reduced to smaller sizes without cropping. The main deficiency of this method is that it is not capable of generating photorealistic results.
Panozzo et al. [30] suggested the axis-aligned deformation image resizing technique, which suppresses the distortions of the fisheye-view warping method. To adjust the height or width of rows or columns, axis-aligned deformation uses a rectangular grid. Liang et al. [31] proposed a similar approach. Hu et al. [32] also adopted axis-aligned deformation to retarget images, applying a Laplacian regularisation term to adjacent cells to suppress unexpected deformations. Another early warping-based image retargeting technique was proposed by Gal et al. [33] for 2D texture mapping. Wang et al. [10] suggested a warping-based approach which subdivides the image and iteratively defines a scaling factor for each subdivision. Guo et al. [34] used a triangular mesh to warp the original image to the desired size. The limitation of this approach is the small pixel area covered by the triangles. The literature on homogeneous and non-homogeneous warping [10, 29, 30, 33, 34] reports problems in pixel mapping and resampling during forward and reverse mapping. The selection of an accurate mesh is also a major challenge.

2.5 MultiOp (MO)

As stated in the previous sections, each operator has some drawbacks. To rectify these deficiencies, a sequence of operators has to be designed in which the operators are applied one after the other. Rubinstein et al. [35] included cropping in addition to scaling and SC in a multi-operator image retargeting approach. Dong et al. [36] proposed an operator-cost-based approach that combined scaling, cropping, and SC to attain faster processing times. Wu et al. [37] used SC and warping in two consecutive steps. The first step detects and removes seams until the degradation in visual image quality reaches a defined threshold. In the next step, warping is applied. Zhang et al.
[38] combined SC and warping: SC detects ROIs by considering seams with maximum energy, and warping retargets the image. Dong et al. [39] suggested a retargeting approach that treats textured and non-textured regions differently. Kiess et al. [40] suggested the SeamCrop algorithm, which combines SC with cropping, and Kiess et al. [41] proposed an extension of SeamCrop. Wang et al. [42] outlined an optimisation problem that considers the parameters for scaling, cropping, and warping in a single step. The aforementioned literature reports problems of computational complexity and performance evaluation, along with the optimisation of the different operators and their sequence. Zeng et al. [43] suggested an efficient mosaic algorithm that combines SIFT and dynamic programming. It uses an improved optimal seam searching criterion that provides ‘protection mechanisms’ for moving objects with an edge-enhanced weighting intensity difference operator, and ultimately solves the ghosting and incomplete effects induced by moving objects. Li et al. [44] proposed a robust mosaicking procedure. It performs tonal adjustment using a local moment matching algorithm, then detects the optimal seamline in the overlapped area with an automatic piecewise dynamic programming algorithm. A cosine distance weighted blending method is also proposed to make the seamline as invisible as possible. In contrast to the aforementioned approaches, the proposed work focuses on several overlooked aspects: seam intersection and multiple pixel removal within a single row/column; avoiding image artefacts; preservation of lines and curved structures; searching for the optimum cropping rectangle; and improving computational complexity and performance evaluation.

3 Preliminary algorithm

The algorithm is based on seam carving and simultaneously optimises and improves the existing seam carving technique.
The SDIR algorithm counters the problems commonly faced during SC image retargeting, viz. intersection of seams at single or multiple points during seam removal, and removal of a seam in a single row/column when it covers a number of pixels beyond a defined threshold, both of which result in distortion in the retargeted image. To get the minimum energy of pixels along the optimal seam path, (1a) and (1b) are used.

From (1a), the obtained index of the minimum energy in the first row of energy map M is 1, where (i = 0 to j = 1). In the backtrack matrix, the values are changed (where i = 1, j = 0, Idx = 1). Hence, backtrack [1, 0] = 1. After comparing the energy of neighbouring pixels, the obtained least energy is 3. After the first iteration, the minimum energy value is added to 10 in the energy map, which is indexed at M [0, 1]. If j > n, the conditions in (1a) are false.

From (1b), if j ≠ 0 then Idx = min (M [1, 0…3]); here M [1, 0…3] yields the row with energies 13, 12, and 11. Idx = min (M [1, 0…3]) gives Idx = 2, and backtrack [i, j] = Idx + j − 1 gives backtrack [2, 1] = 2 + 1 − 1, so 2 is stored at backtrack [2, 1]. Minenergy = M [i − 1, Idx + j − 1] is then used to obtain Minenergy, which is 8. Finally, the value of Minenergy is added to M [2, 1], i.e. 5 + 11 = 16.

Equation (2) is used to obtain the location of the intersection point, so that the seam can bypass it in a single row and column. Initially both masks (Mask1 and Mask2) are set to zero. Mask1 and Mask2 are set to 1 at the location where Minenergy is obtained in the energy map. After the Minenergy location in the energy map is calculated in the next iteration, the corresponding locations in both masks are checked; if Mask1 (i, j) and Mask2 (i, j) are both already set to 1, the seams are about to clash.
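The cumulative-energy search and the clash check can be sketched as follows. This is a minimal illustration and not the authors' Algorithm 1: it assumes vertical seams only, a fixed energy map from which no pixels are actually removed, and a single boolean mask standing in for Mask1 and Mask2. The function names and the `max_retries` parameter are hypothetical. When a candidate seam lands on a pixel already claimed by an earlier seam, that pixel's energy is raised to the maximum of its row and the seam is searched again, which is the diversion step of the SDIR description.

```python
import numpy as np

def min_vertical_seam(energy):
    """Return, for each row, the column of the minimum-energy vertical seam."""
    rows, cols = energy.shape
    cost = energy.astype(float).copy()            # cumulative energy map
    backtrack = np.zeros((rows, cols), dtype=int)
    for i in range(1, rows):
        for j in range(cols):
            lo, hi = max(j - 1, 0), min(j + 2, cols)   # up to three neighbours above
            idx = lo + int(np.argmin(cost[i - 1, lo:hi]))
            backtrack[i, j] = idx
            cost[i, j] += cost[i - 1, idx]
    seam = np.zeros(rows, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(rows - 1, 0, -1):               # walk back up the stored indices
        seam[i - 1] = backtrack[i, seam[i]]
    return seam

def diverted_seams(energy, count, max_retries=50):
    """Find up to `count` seams that share no pixel, diverting on each clash."""
    energy = energy.astype(float).copy()
    used = np.zeros(energy.shape, dtype=bool)      # pixels claimed by earlier seams
    row_idx = np.arange(energy.shape[0])
    seams = []
    for _seam_no in range(count):
        for _attempt in range(max_retries):
            seam = min_vertical_seam(energy)
            clash = used[row_idx, seam]
            if not clash.any():
                break
            # Diversion: raise each clashing pixel to the maximum energy of its row.
            for i in np.flatnonzero(clash):
                energy[i, seam[i]] = energy[i].max()
        else:
            break                                  # no clash-free seam found; stop
        used[row_idx, seam] = True
        seams.append(seam)
    return seams
```

Raising a pixel to the row maximum makes it unattractive but does not forbid it, so the sketch bounds the number of retries instead of assuming that a clash-free seam always exists.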
In this case, the maximum energy is searched for in the corresponding row of the energy map and the energy of the intersection point is replaced with it, M (i, j) = Maxenergy; otherwise the search for the least-energy pixel in the energy map continues.

From (3), the seam bypassing operation is performed if the seam covers a number of pixels in a single row beyond the defined threshold (T). In each iteration, where j = 1, 2, 3…n, the number of pixels covered by the optimal seam is counted. If the pixel count Rpixelscount > T, the entry location of the seam in the corresponding row is updated with the maximum energy; otherwise the seam travels along the row to obtain the optimal pixel locations, which can be added to the path of the optimal seam.

From (4), the seam bypassing operation is performed if the number of pixels covered by a seam in a single column exceeds the defined threshold (T). In each iteration, where i = 1, 2, 3…m, the number of pixels covered by the optimal seam is counted. If the pixel count Cpixelscount > T, the entry location of the seam in the corresponding column is updated with the maximum energy; otherwise the seam travels along the column to obtain the optimal pixel locations, which can be added to the path of the optimal seam (see Fig. 1: Algorithm 1).

Fig. 1 Algorithm 1: SDIR algorithm

4 Proposed MO framework

In this section, the SDIR algorithm (PA) is combined with different operators such as scaling (Sl), cropping (Cr), and warping (W) to obtain a retargeted image with minimal distortion. In the scaling algorithm (see Fig. 2: Algorithm 2), (5) changes the aspect ratio of the original image p using the new width and height. To find the relative position of a pixel in the source image p, (6) is used, which relates the location of the lower-left corner of the scaled image to the location of the lower-left corner of the original image p. These values are found in each iteration.
Each iteration yields the relative positions of pixels from the source image in the target image I′. To set the RGB colour at the relative locations to that of the original image p, the intensity of the pixels is calculated using (7), where Yr, Yg, Yb are the colour components at the relative positions in the retargeted image I′.

Fig. 2 Algorithm 2: image retargeting using scaling

In the graphics and photography industries, the cropping operator is used to remove unwanted areas from the periphery of the original image. It is one of the most basic image retargeting operators for photo-manipulation tasks such as removing noise and unwanted objects from the margins of the image and changing its aspect ratio. Cropping is done by cutting off the top, bottom, left, and right margins of the image. In the proposed cropping algorithm (see Fig. 3: Algorithm 3), the saliency map is obtained from the proposed image retargeting algorithm shown in Fig. 1. Using (8), in each iteration, where j = 0, 1, 2…m, the number of rows in the saliency map M with M (i, j) ≠ 0 is counted, and the count K1 is stored. Using (9), the number of columns with M (j, i) ≠ 0 is counted, and the count K2 is stored. Rows and columns with all-zero intensity represent the portion of the saliency map where salient objects are not present, as the gradient in the saliency map is not changing continuously.

Fig. 3 Algorithm 3: image retargeting using cropping

To find the corner locations of the cropping window which entirely covers the salient objects, the locations of the first and last non-zero pixels are saved horizontally and vertically, where j = 0, 1, 2, 3…n − 1.
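The search for the first and last non-zero rows and columns can be sketched as below. This is a minimal illustration that assumes the saliency map is a 2D NumPy array in which non-salient pixels are exactly zero; the function names are hypothetical and the margin selection of the paper's Algorithm 3 is not reproduced.

```python
import numpy as np

def salient_crop_window(saliency):
    """Return (top, bottom, left, right) of the tightest window holding all non-zero pixels."""
    rows = np.flatnonzero(saliency.any(axis=1))   # rows containing a non-zero pixel
    cols = np.flatnonzero(saliency.any(axis=0))   # columns containing a non-zero pixel
    if rows.size == 0:
        return None                               # nothing salient: no window
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])

def crop_to_salient(image, saliency):
    """Crop `image` to the window found in `saliency`; return it unchanged if none exists."""
    window = salient_crop_window(saliency)
    if window is None:
        return image
    top, bottom, left, right = window
    return image[top:bottom + 1, left:right + 1]
```

The returned bounds are inclusive, so the crop slices to `bottom + 1` and `right + 1`.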
The saved locations are those of pixels whose intensity is non-zero in the X and Y directions. To calculate the left, right, top, and bottom margins in the cropping algorithm, these locations identify the rows and columns of the saliency map which contain pixels with non-zero intensities. The width and height of the salient region to be cropped are obtained by choosing appropriate margins using (10).

The warp operator manipulates the structural properties of the image in such a way that the prominent objects in the image are not significantly distorted. To find the relative position of each pixel location from the source image in the retargeted image, forward and reverse mapping are applied pixel by pixel. In the warping algorithm (see Fig. 4: Algorithm 4), ideal resampling requires correct filtering to minimise image artefacts. In resampling with a filter, the output is the weighted average of the input, where the weights are normalised values of the filter kernel. Resampling with a filter is applied using (11). Reverse mapping in image warping is performed by (12), in which each pixel location in the destination image is mapped to a pixel location in the source image by a pair of reverse mapping functions, and the source image is resampled at that location. Forward mapping in image warping is performed by (13), in which, in each iteration, a pixel location in the source image is mapped to a location in the destination image by a pair of mapping functions that give the location of each pixel in the destination image.

Fig. 4 Algorithm 4: image retargeting using warping

4.1 Developing the operators

A total of six multi-operators are proposed in this section; they are developed and used to generate results for testing and comparison.
The proposed multiOps (MOs) are tested on three types of images, and the results are presented in the subsequent sections. In this paper, the proposed multi-operators are denoted using the annotations given in (14).

5 Objective function for resizing

In this section, the objective function is defined. It is used to optimise the image resizing process and to compare and evaluate the resized images generated by the different MOs explained in the previous section. Two methods are used in this objective function, viz. patch-based bidirectional image Euclidean distance (IMED) and the image dominant colour descriptor (DCD). The function is given in (15). In this equation, F is the original image while S is the new image with a user-specified size; γ is a user-specified coefficient with a value between 0 and 1; the distance function measures the distance between F and S; a second term represents the similarity of the DCDs; and f represents the energy of the pixel from which seams are bypassed in our algorithm.

In computer vision, Euclidean distance is the most common image distance measure: images are transformed into vectors according to the intensity of each pixel, and the difference in grey level is compared pixel by pixel. A small variation between similar images can result in a large Euclidean distance. To calculate the IMED, two steps are performed. In the first step, a linear transformation is applied to the images; then the traditional Euclidean distance is calculated between the transformed images. IMED can easily be embedded in many existing pattern classifiers, such as PCA and SVMs. Based on the prior knowledge that pixels positioned near one another have a small change in grey levels, IMED determines the correlation among pixels only according to the distance among pixels on the image lattice. The distance function DIE is calculated using (16).
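As an illustration of how lattice distance enters the measure, the sketch below computes an image Euclidean distance between two equally sized grey-level patches. It assumes the Gaussian weighting commonly used for IMED, in which the product of two pixel differences is weighted by exp(−d²/(2σ²))/(2πσ²) for pixels a lattice distance d apart. The function name and the `sigma` width parameter are illustrative, and the paper's patch-based bidirectional combination is not reproduced.

```python
import numpy as np

def imed(patch_u, patch_v, sigma=1.0):
    """Image Euclidean distance between two equally sized grey-level patches."""
    u = np.asarray(patch_u, dtype=float)
    v = np.asarray(patch_v, dtype=float)
    if u.shape != v.shape or u.ndim != 2:
        raise ValueError("patches must be 2D arrays of the same shape")
    rows, cols = u.shape
    ys, xs = np.mgrid[0:rows, 0:cols]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    # Squared lattice distance between every pair of pixel positions.
    sq_dist = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)
    diff = (u - v).ravel()
    # Quadratic form over all pixel pairs; clamp tiny negative rounding errors.
    return float(np.sqrt(max(diff @ weights @ diff, 0.0)))
```

The pairwise weight matrix has one entry per pair of pixels, so this direct form is only practical for small patches.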
In this equation, U is a patch in the original image, V is a patch in the new image, and D is the distance. The image Euclidean distance between two image patches is obtained using (17). In this equation, sij is the spatial relationship between pixels, and pi and qi are the pixels in patches U and V, respectively, of an n × m image. Equation (18) gives the spatial relationship sij between pixels pi and qj. In this equation, δ is the width parameter, and the remaining term is the spatial distance between pi and qj on the image lattice; x and y are the locations of the pixels (pi and qj) in the image patch.

The DCD scheme is used in image retrieval to represent multimedia content, and the related distance measures have become essential tools. The similarity of the DCD is an important term; the descriptor delivers an effective, compact, and intuitive description of the representative colours in an image or region [44]. Statistical properties of the dominant colour values, such as variance and distribution, can be specified by the DCD. Users can search and retrieve images in a database based on dominant colour values. The DCD is defined in (19). The descriptor comprises the percentage of pixels (pi) corresponding to the ith dominant colour, the colour variance (ci) representing the variation of the dominant colour, the dominant colour vi, the number of dominant colours n, and the spatial coherency Sc. Consider two DCDs (K1 and K2) as given in (20) and (21). The dissimilarity between the two images with respect to their DCDs is obtained using (22), where D is the distance between the two images with respect to the DCDs and ΔDC is the dominant colour difference. The dominant colour difference ΔDC is obtained with the distance function in (23), which involves the maximum distance between two colours that are considered similar; of the two parameters in (23), one lies between 10 and 20 and the other between 1.0 and 1.5.
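A dissimilarity of this kind can be sketched with the quadratic-form distance commonly associated with the MPEG-7 dominant colour descriptor, which is assumed here rather than taken from the paper's equations. In this form, D² = Σ p1i² + Σ p2j² − Σ Σ 2 aij p1i p2j, where aij = 1 − dij/dmax when the colour distance dij is at most a threshold Td and 0 otherwise, with dmax = α·Td. The names `t_d` and `alpha` and their default values are assumptions chosen inside the ranges quoted above; colour variance and spatial coherency are ignored.

```python
import numpy as np

def dcd_dissimilarity(colours1, pct1, colours2, pct2, t_d=15.0, alpha=1.2):
    """Squared dissimilarity between two dominant colour descriptors.

    colours*: (n, 3) arrays of dominant colours; pct*: matching pixel fractions.
    """
    c1 = np.asarray(colours1, dtype=float)
    c2 = np.asarray(colours2, dtype=float)
    p1 = np.asarray(pct1, dtype=float)
    p2 = np.asarray(pct2, dtype=float)
    # Euclidean distance between every colour of descriptor 1 and descriptor 2.
    d = np.linalg.norm(c1[:, None, :] - c2[None, :, :], axis=2)
    d_max = alpha * t_d
    # Similarity coefficient: positive only for colours closer than t_d.
    a = np.where(d <= t_d, 1.0 - d / d_max, 0.0)
    d2 = (p1 ** 2).sum() + (p2 ** 2).sum() - 2.0 * (a * np.outer(p1, p2)).sum()
    return float(max(d2, 0.0))
```

Two identical descriptors whose colours are well separated give a value of 0, and two descriptors with no similar colours give the sum of their squared percentages.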
The similarity between two DCDs is found using (24). These objective functions give the amount of distortion remaining in the retargeted images, using the global colour map and distance, which can affect the visual appearance and appeal of the image. The function plays an important role in preserving the salient features of the image during retargeting. It also helps to preserve small objects in the image which could otherwise be removed or neglected during retargeting.

6 Results and discussion

This section presents testing results that give a clear picture of the selection of the optimised MO amongst the others. The system used for evaluation has Intel Core i5-8250U clock
