Contour Based Retrieval for Plant Species

Authors: Komal Asrani, Renu Jain

Journal: International Journal of Image, Graphics and Signal Processing (IJIGSP)

Published in: Vol. 5, No. 9, 2013.

Open access

Recognizing a plant in dense vegetation is tedious. We recognize a plant on the basis of its size, leaves, flowers, fruits, etc. The leaf is the part of the plant that can be found in almost all seasons, so most of the time a plant has to be recognized on the basis of its leaf. When dealing with a leaf, however, it is important to consider the finer details of the contour representing its shape. We are building a system that holds a database of leaves of different plants and, given a leaf, finds the plant to which it may belong. In this paper, we present the results of the tangential angle approach used for retrieval. A database of around one thousand leaves of different plants has been created. Each leaf image is preprocessed to extract its boundary. The tangential angle approach, which captures the angular details of the shape boundary, is then applied. We tested the method on around 1000 leaves and, on that basis, calculated recall, precision and error rate to measure its effectiveness.


Keywords: Image Retrieval, Shape Based Retrieval, Content Based Retrieval, Feature Vector

Short URL: https://sciup.org/15013029

IDR: 15013029

I. INTRODUCTION

With the rapid growth in the number of digital images, image retrieval has become an active topic of research. Users provide an image as a query and expect a set of similar images based on certain parameters. In a content based image retrieval system, several parameters can be used to represent the image, such as shape, color, texture and color layout. Among these, shape is considered an important low-level image feature and provides a powerful cue for similarity matching. However, various issues need to be considered when dealing with shape, so that the extraction process can recognize similarly shaped objects in spite of differences in orientation, size and position. Thus, a good shape representation and retrieval system should have the following properties: (1) Each shape should have a unique representation in terms of feature vectors, which should be invariant to scale, rotation and translation. (2) A shape represented in the form of feature vectors should occupy minimum memory space. (3) The feature vectors representing the shape should reduce the complexity of the representation. Typically, an image retrieval system consists of two stages: the first represents images in the form of feature vectors and the second defines the similarity between the query and the database collection.

In general, two-dimensional shape descriptions can be divided into two broad categories: contour based and region based. The contour based approach considers only the boundary details of the shape and assumes that object boundaries can be modeled through a continuous function. It is effective for overall recognition of the object. Region based approaches consider the internal details of the object along with the boundary details. Hence, they provide a meaningful description when objects have a similar spatial distribution of pixels, by using the geometric properties of the shape [1]. This method is more reliable when dealing with complex images and when there is a lot of intra-class variation.

A considerable amount of work has been done on image retrieval for applications such as fingerprint recognition, face recognition, shoeprint recognition and trademark recognition. A number of researchers have considered the problem of content based image retrieval. Some of the widely used contour based approaches are chain code [26], Fourier transform [22], shape signature [23], polygonal approximation [27], curvature scale space [25], contour saliences [24], the Delaunay triangulation method [29,30], shock graph [31], wavelet transform [32,33], medial axis transform [34], deformable templates [35,36] and moment invariants [37]. Chalechale et al. [2] consider the shape in the form of pixels and analyse the pixel density using an angular decomposition approach, but it is difficult to identify an image merely on the basis of pixel density, as this can give incorrect results. The edge histogram descriptor (EHD) was proposed in the visual part of the MPEG-7 standard [3][4]. The 2D Fourier transform in polar coordinates is employed for shape description in [5]. Edge Pixel Neighborhood Information (EPNI) employs the neighborhood structure of edge pixels to define an extended feature vector. The vector is used for measuring the similarity between the query and an arbitrary model image, but it is not rotation invariant [6].

Extensive work has been done in the field of plant recognition. Abbasi et al. [8] and Mokhtarian et al. [9] proposed a curvature scale space (CSS) image to represent leaf shapes for Chrysanthemum variety classification. Wang et al. [10][11] described a method which combines different features based on the centroid-contour distance curve and adopted a fuzzy integral for leaf image retrieval. Yu and Wang [12] proposed decomposing the binary image into convex subparts, with the shape of each subpart represented in terms of basic structural elements. Although the approach was geometrically invariant and robust to noise, it approximated the input shape coarsely. Du et al. [16] generated a polygonal representation of leaves. Yang et al. [13] proposed a hierarchical decomposition descriptor that extracts features by recursively estimating the bounding box of the filled binary image and checking whether the corresponding bounding box is similar to the image or not. Hearn [15] proposed a combination of Fourier analysis and Procrustes analysis to perform plant species identification on a database of 2420 leaves of 151 plant species. Nam et al. [17] proposed a new shape representation scheme based on the Minimum Perimeter Polygon algorithm, which reduces the number of points required for matching. Pan et al. [18] extracted the skeletal features representing the shape using the medial axis transform and then transformed the features into a string of symbols. Belongie et al. [19] proposed a descriptor named shape context, which captures the distance of a point to all the other points representing the shape. Shock trees were used to represent shape, in an approach based on sub-graph isomorphism [20]. Zheng et al. [38] proposed perceptual shape parsing and grouping, which defines the shape in long and short vector form. Wei et al. [39] proposed two-component feature matching in terms of global and local descriptors. For the global descriptors, Zernike functions are used, and local feature extraction uses the curvature of each boundary point. Arica et al. [40] proposed the Beam Angle Statistics (BAS) descriptor and tested its effectiveness on MPEG-7. Li and Lin [41] presented a novel shape representation, elliptical shape coding, which represents the shape in the form of periodic signals. Two shapes are aligned by convolving the corresponding periodic signals.

In this paper, we propose identifying plant species based on the shape features of the leaf. The leaf is chosen for plant recognition because it is very easy to scan and does not lose its details when captured as a two-dimensional object. According to Wang and Zheng [14], it is difficult to analyse the shape and structure of a flower because flowers have a complex three-dimensional structure. Moreover, leaves are available throughout the seasons, whereas flowers are available only in their blooming season. The leaf is therefore used for computer aided plant identification. This paper focuses on identifying the plant using shape, as shape features are considered to be extremely powerful for recognition. Visual shape descriptors provide effective results if the query image and database images are captured against a white background. In the case of leaves, the color feature cannot be used because most leaves are green. Although the texture of leaves varies, using it requires complex computation. Therefore, we have tried to recognize the plant species based on the shape of the leaf using the tangential angle approach, assuming that the leaf image has been taken against a homogeneous background.

The paper is organized as follows. The details of the proposed approach are discussed in Section II. Section III presents and analyses the experimental results. Section IV concludes the paper and suggests new directions.

II. FEATURE EXTRACTION AND SHAPE REPRESENTATION

To represent the shape of a leaf, we first extract its edges by image processing and then generate the feature vectors from them.

A. Leaf Image Processing

The leaf images are scanned using a flatbed scanner against a plain background and are then processed to extract the feature vectors. There is no restriction on the dimensions of the image while scanning. Once the images are scanned, they are scaled while maintaining the aspect ratio, as the aspect ratio is very important for identifying the leaf. After scaling, the image is processed with ImageJ to extract the edges: the colored leaf image is converted to a binary image and then the edges are detected. Fig. 1 shows the steps involved in preprocessing the leaf. The boundary image of the leaf is then processed to extract the coordinates of the leaf boundary; this step has been implemented using NetBeans. The coordinates are read in counterclockwise direction and stored in the database as feature vectors. The database used is Oracle 10g. The coordinate values are then read back and the tangential angle approach is applied. Fig. 2 shows the sequence of steps followed for generating the feature vectors of the leaf.

Figure 1. Stages involved in Leaf image Preprocessing

Figure 2. Steps for generating feature vectors for a leaf image
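The preprocessing steps described above can be illustrated with a small Python sketch. It is not the authors' ImageJ/NetBeans pipeline: the function names, the fixed threshold of 128, the assumption of a dark leaf on a light background, and the ordering of boundary points by polar angle about the centroid (which is only adequate for roughly star-shaped outlines) are all assumptions made for illustration. The scaling step is omitted.

```python
import math

def binarize(gray, threshold=128):
    """Turn a grayscale image (rows of 0-255 ints) into a 0/1 mask.
    Assumption: the leaf is darker than the plain background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def boundary_pixels(mask):
    """Return the (x, y) leaf pixels that touch the background or the image border."""
    h, w = len(mask), len(mask[0])
    pts = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                outside = not (0 <= nx < w and 0 <= ny < h)
                if outside or not mask[ny][nx]:
                    pts.append((x, y))
                    break
    return pts

def order_counterclockwise(pts):
    """Sort boundary points by polar angle about their centroid.
    The y difference is taken as (cy - y) because image rows grow downward,
    so the resulting order is counterclockwise as seen on screen."""
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    return sorted(pts, key=lambda p: math.atan2(cy - p[1], p[0] - cx))
```

A real leaf outline with deep lobes would need a proper contour-following algorithm instead of the centroid-angle ordering used here.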

B. Tangential Angle Approach

This method describes the boundary of the shape by its turning function. The function expresses the boundary as the tangent angle between consecutive boundary points, measured in radians. This helps in tracking the deviations along the boundary with reference to the previous coordinate. Mathematically, the tangential angle is calculated as follows:

$$\tan\theta = \frac{y_{n+1} - y_n}{x_{n+1} - x_n} \qquad (1)$$

where $(x_{n+1}, y_{n+1})$ and $(x_n, y_n)$ are consecutive coordinate values read in counterclockwise direction. Once the angles for the complete feature vector are generated, they are stored in the database as an array. The size of the array is dynamic because it depends on the number of points required to represent the leaf boundary.
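As a minimal sketch of equation (1), the following Python function computes one angle per pair of consecutive boundary points. The function name is hypothetical, and two choices are assumptions rather than details given in the paper: `math.atan2` is used instead of a plain arctangent so that vertical segments (where the denominator is zero) are handled, which also means the angles fall in (-π, π] rather than (-π/2, π/2); and the contour is treated as closed, so the last point is paired with the first.

```python
import math

def tangential_angles(points):
    """Angle in radians of the segment joining each pair of consecutive points.

    points: list of (x, y) boundary coordinates in counterclockwise order.
    Assumption: the contour is closed, so the last point connects to the first.
    """
    angles = []
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        # atan2 avoids a division by zero when x1 == x0 (assumption, see lead-in)
        angles.append(math.atan2(y1 - y0, x1 - x0))
    return angles
```

For a unit square traversed counterclockwise, the four segments point right, up, left and down, so the angles are 0, π/2, π and -π/2.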

C. Image Matching and Similarity

Once the feature vectors are generated and stored in the database, the next stage of image retrieval is image matching and similarity measurement. A distance metric is used to retrieve similar images when a user provides a query to match against the database of images. Various techniques are available for measuring the similarity between images, such as the Minkowski form, quadratic form, Mahalanobis distance, Euclidean distance and city block distance [21]. In this paper, we have used the Euclidean distance, also referred to as the L2 distance, to measure the distance between the query and the database images. For the query image q and a database image p, the distance d between them can be defined as:

$$d(q, p) = \sqrt{\sum_{i=1}^{N} (q_i - p_i)^2} \qquad (2)$$

where $q_i$ and $p_i$ are the i-th components of the two feature vectors and $N$ is the number of components compared.

Thus, using equation (2), an array of distances is generated which represents the dissimilarity between the query and the database image. The values are sorted and the minimum distance defines the closest match/ best match.
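The matching step can be sketched in Python as follows. This is an illustration rather than the authors' implementation: the function names and the dictionary layout of the database are hypothetical, and the sketch assumes that all feature vectors have already been brought to the same length, since the paper does not say how arrays of different sizes are compared.

```python
import math

def euclidean_distance(q, p):
    """L2 distance between two equal-length feature vectors."""
    if len(q) != len(p):
        # Assumption: vectors are resampled to a common length beforehand.
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((qi - pi) ** 2 for qi, pi in zip(q, p)))

def rank_matches(query, database):
    """Return (name, distance) pairs sorted so the closest match comes first.

    database: dict mapping an image name to its feature vector.
    """
    scored = [(name, euclidean_distance(query, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1])
```

With made-up vectors, `rank_matches([0, 0], {"a": [3, 4], "b": [1, 0]})` places `"b"` (distance 1) ahead of `"a"` (distance 5).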

D. Effectiveness Measure

Once the leaf image is defined in the form of angular deviations, it is important to measure the effectiveness of the approach. Here recall, precision and error rate are calculated for this purpose. Recall is defined as the ratio of the number of relevant retrieved images to the number of all relevant images [7].

$$\text{Recall} = \frac{\text{Number of relevant retrieved images}}{\text{Number of all relevant images}} \qquad (3)$$

Precision is defined as the ratio of the number of relevant retrieved images to the total number of retrieved images [28].

$$\text{Precision} = \frac{\text{Number of relevant retrieved images}}{\text{Total number of retrieved images}} \qquad (4)$$

Error rate is defined as the ratio of the number of non-relevant images retrieved to the total number of images retrieved.

$$\text{Error Rate} = \frac{\text{Number of irrelevant images retrieved}}{\text{Total number of images retrieved}} \qquad (5)$$
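Equations (3) to (5) can be computed for a single query as in the Python sketch below. The function name and the use of image identifiers are hypothetical, and the sketch assumes that the retrieved list contains no duplicates. Under these definitions the error rate is simply one minus the precision.

```python
def retrieval_metrics(retrieved, relevant):
    """Return (recall, precision, error_rate) for one query.

    retrieved: identifiers of the images returned by the system.
    relevant:  identifiers of all images that truly match the query.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    if not retrieved_set or not relevant_set:
        raise ValueError("both retrieved and relevant must be non-empty")
    hits = len(retrieved_set & relevant_set)          # relevant retrieved images
    recall = hits / len(relevant_set)                 # equation (3)
    precision = hits / len(retrieved_set)             # equation (4)
    error_rate = (len(retrieved_set) - hits) / len(retrieved_set)  # equation (5)
    return recall, precision, error_rate
```

For example, if four images are retrieved and two of them are among three relevant images, recall is 2/3 while precision and error rate are both 0.5.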

III. EXPERIMENTAL RESULTS

We created a database of one thousand different leaves of various plants by scanning them against a white background. The scanned color images are converted to binary images and their edges are generated using ImageJ. The tangential angle approach has been implemented in Java using NetBeans. The array of angles representing the boundary is generated and stored as feature vectors in Oracle. Fig. 3 shows a snapshot of the scanned plant leaves.

Figure 3. Subset of scanned leaf image database

Fig. 4 shows the snapshot of the plant leaves after they had been processed to generate edges using ImageJ.

Figure 4. Subset of boundaries extracted for leaf image database

Fig. 5 shows the user interface that takes the input from the user and processes it using the tangential angle approach.

Once the leaf is processed by the tangential angle approach, the feature vector generated for the Ashok leaf is as follows:

1.3111666, 1.3149878, 1.3187003, 1.3223089, 1.3258177, 1.3292307, 1.2545102, 1.259798, 1.2649175, 1.2698761, 1.2746812, 1.2793396, 1.2838576, 1.2882414, 1.2924967, 1.2076494, 1.2141762, 1.2204821, 1.2265776, 1.2324727, 1.1684753, 1.1760052, ………

Fig. 6 shows the results obtained by executing the query for the 'Ashok' leaf.

Figure 6. Images retrieved by the tangential method for the query image shown on the left

Table 1 shows the results generated after execution of the query for different leaves. Fig. 7 shows Recall/Precision/Error Rate plot for results obtained by executing queries as tabulated in Table 1.

Figure 5. Snapshot of the application used for query handling

Figure 7. Recall / Precision / Error Rate representation for query leaf images

IV. CONCLUSIONS AND FUTURE WORK

The tangential angle approach presented in this paper makes it possible to measure similarity based on the shape boundary. It is used to extract features that are invariant to scale, rotation and translation. Experimental results show an average precision of 31% and an average recall of 55%. The main advantage of this approach is its invariance to scale, rotation and position. Its computational complexity is also very low. The major drawback identified is that angle details alone are not sufficient to represent the complete image. To improve the retrieval performance, the approach needs to be refined by considering more details of the boundary, such as magnitude, in addition to the tangential angle details. Additional parameters representing the shape also need to be identified so that the feature vector corresponding to the shape is more representative and meaningful.

Table 1. Results generated for subset of query images

Query	Recall	Precision	Error Rate
Ashok	0.67	0.31	0.71
Bakeina	0.575	0.25	0.75
Balahaal	0.485	0.25	0.75
Berry	0.6	0.375	0.625
Bhas	0.47	0.4	0.6
Bhel	0.53	0.125	0.875
Cauliflower	0.6	0.375	0.625
Chilli	0.67	0.285	0.71
Cucumber	0.43	0.167	0.83
Groundnut	0.5	0.2	0.8
Guyiya	0.428	0.375	0.625
Imli	0.567	0.2	0.8
Kathaal	0.5	0.333	0.874
Mango	0.67	0.5	0.5
Neem	0.5	0.375	0.63
Potato	0.6	0.375	0.63
Orange	0.528	0.210	0.714
Akebia	0.612	0.289	0.815
Alumroot	0.592	0.189	0.621
Amiga	0.629	0.176	0.592
Bellflower	0.541	0.152	0.723
Birch	0.494	0.261	0.615
Cabbage	0.595	0.312	0.582
Chestnut	0.423	0.248	0.519
Cocoa	0.428	0.195	0.725
Delband	0.717	0.278	0.656
Gendaa	0.535	0.189	0.521

REFERENCES

[1] S. Loncaric, "A Survey of Shape Analysis Techniques," Pattern Recognition, vol. 34, no. 8, pp. 983-1001, August 1998.
[2] A. Chalechale, G. Naghdy and A. Mertins, "Sketch-Based Image Matching Using Angular Partitioning," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 35, no. 1, pp. 28-41, Jan. 2005.
[3] T. Sikora, "The MPEG-7 Visual Standard for Content Description: An Overview," IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 6, June 2001.
[4] C. S. Won, D. K. Park and S. Park, "Efficient use of MPEG-7 edge histogram descriptor," ETRI Journal, vol. 24, no. 1, pp. 23-30, Feb. 2002.
[5] D. Zhang and G. Lu, "Generic Fourier descriptor for shape-based image retrieval," in Proc. IEEE Int. Conf. Multimedia and Expo, vol. 1, pp. 425-428, 2002.
[6] A. Chalechale and A. Mertins, "An Abstract Image Representation Based on Edge Pixel Neighborhood Information (EPNI)," in Proc. EurAsia-ICT, pp. 67-74, 2002.
[7] G. Lu, Multimedia Database Management Systems, Artech House Publishers, Boston, 1999.
[8] S. Abbasi, F. Mokhtarian and J. Kittler, "Reliable classification of Chrysanthemum leaves through curvature scale space," ICSSRC97, pp. 284-295, 1997.
[9] F. Mokhtarian, S. Abbasi and J. Kittler, "Efficient and robust retrieval by shape content through curvature scale space," Proceedings of the International Workshop on Image Databases and Multimedia Search, pp. 35-42, 1996.
[10] Z. Wang, Z. Chi and D. Feng, "Fuzzy integral for leaf image retrieval," Proceedings of Fuzzy System, 1:372-377, 2002.
[11] Z. Wang, Z. Chi and D. Feng, "Shape based leaf image retrieval," Image Signal Process, 150:34-43, 2003.
[12] L. Yu and R. Wang, "Shape representation based on mathematical morphology," Pattern Recognition 26 (2005) 1354-1362.
[13] M. Yang, G. Qiu, Y. Huang and D. Elliman, "Near duplicate image recognition and content based image retrieval using adaptive hierarchical geometric centroids," Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), Hong Kong, China, 2006, pp. 958-961.
[14] Aiang-Kui, Chun-Hou Zhang, Xiao Feng Wang and Feng Yan Lin, "Recognition of Plant Species using support vector machine," ICIC 2008, CCIS 15, pp. 192-199, 2008.
[15] D. J. Hearn, "Shape analysis for automated identification of plants from images of leaves," Taxon 58 (2009), pp. 934-954.
[16] Ji-Xiang Du, De-Shuang Huang, Xiao-Feng Wang and Xiao Gu, "Computer-Aided Plant Species Identification (CAPSI) Based on Leaf Shape Matching Technique," Transactions of the Institute of Measurement and Control, 28 (2006), pp. 275-284.
[17] Yunyoung Nam, Eenjun Hwang and Dongyoon Kim, "CLOVER: A Mobile Content-Based Leaf Image Retrieval System," ICADL 2005, LNCS 3815, pp. 139-148, 2005.
[18] Pan Hongfei, Liang Dong, Tang Jun, Wang Nian and Li Wei, "Shape Recognition and Retrieval Based on Edit Distance and Dynamic Programming," Tsinghua Science and Technology, ISSN 1007-0214, 11/16, pp. 739-745, vol. 14, no. 6, December 2009.
[19] S. Belongie, J. Malik and J. Puzicha, "Shape matching and object recognition using shape context," IEEE Trans. Pattern Anal. Mach. Intell., 24(4): 509-522, 2002.
[20] K. Siddiqi and B. B. Kimia, "A shock grammar for recognition," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, San Francisco, CA, USA, 1996, pp. 507-513.
[21] Zhong Li, Qiaolin Ding and Weihua Zhang, "A comparative study of different distances for similarity estimation," R. Chen (Ed.), ICICIS 2011, Part I, CCIS 134, pp. 483-488, 2011.
[22] E. Persoon and King-Sun Fu, "Shape Discrimination Using Fourier Descriptors," IEEE Trans. on Systems, Man and Cybernetics, vol. SMC-7(3): 170-179, 1977.
[23] E. R. Davies, Machine Vision: Theory, Algorithms, Practicalities, Academic Press, 1997.
[24] R. da S. Torres and A. X. Falcão, "Contour salience descriptors for effective image retrieval and analysis," Image and Vision Computing 25 (2007) 3-13.
[25] H. Muller, W. Muller, D. M. Squire, S. M. Maillet and T. Pun, "Performance evaluation in content-based image retrieval: overview and proposals," Pattern Recognition Letters 22 (2001) 593-601.
[26] H. Freeman, "On the encoding of arbitrary geometric configurations," IRE Transactions on Electronic Computers EC-10 (1961) 260-268.
[27] U. Ramer, "An iterative procedure for the polygonal approximation of plane curves," Computer Graphics and Image Processing, 1:244-256, 1972.
[28] Y. Liu, D. Zhang, G. Lu and W. Ma, "A survey of content based image retrieval with high-level semantics," Journal of Pattern Recognition, vol. 40, pp. 262-282, Nov. 2007.
[29] Y. Tao and W. I. Grosky, "Delaunay triangulation for image object indexing: a novel method for shape representation," in Proceedings of the 7th SPIE Symposium on Storage and Retrieval for Image and Video Databases, San Jose, CA, pp. 631-642, January 1999.
[30] Y. Tao and W. I. Grosky, "Object-based image retrieval using point feature maps," in Proceedings of the International Conference on Database Semantics (DS-8), Rotorua, New Zealand, pp. 59-73, January 1999.
[31] T. B. Sebastian, P. N. Klein and B. B. Kimia, "Recognition of shapes by editing their shock graphs," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26(5), pp. 550-571, 2004.
[32] Mahmoud I. Khalil and Mohamed M. Bayoumi, "A Dyadic Wavelet Affine Invariant Function for 2D Shape Recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, pp. 1152-1164, 2001.
[33] Guangyi Chen and Tien D. Bui, "Invariant Fourier-wavelet descriptor for pattern recognition," Pattern Recognition, vol. 32, no. 7, pp. 1083-1088, July 1999.
[34] S. Peleg and A. Rosenfeld, "A min-max medial axis transformation," IEEE Trans. Pattern Anal. Mach. Intell., 1981, 3, pp. 208-210.
[35] A. K. Jain and A. Vailaya, "Shape-based retrieval: a case study with trademark image database," Pattern Recognition, 1998, 31(9), pp. 1369-1390.
[36] B. M. Mehtre, M. S. Kankanhalli and W. F. Lee, "Shape measures for content based image retrieval: a comparison," Inf. Process. Manage., 1997, 33(3), pp. 319-337.
[37] Y. Rui, T. S. Huang and S. F. Chang, "Image retrieval: current techniques, promising directions and open issues," Journal of Visual Communication and Image Representation, vol. 10, no. 4, pp. 39-62, April 1999.
[38] Xiaofen Zheng, Sherrill-Mix and Gao, "Perceptual Shape Based Natural Image Representation and Retrieval," International Conference on Semantic Computing, 2007.
[39] Chia-Hung Wei, Yue Li, Wing-Yin Chau and Chang-Tsun Li, "Trademark image retrieval using synthetic features for describing global shape and interior structure," Pattern Recognition, vol. 42, no. 3, pp. 386-394, March 2009.
[40] Arica and Vural, "BAS: A Perceptual Shape Descriptor based on the Beam Angle Statistics," Pattern Recognition Letters, 24(9-10): 1627-1639, June 2003.
[41] Jia Ming Li and Wei Yang Lin, "Efficient Shape Retrieval using Elliptical Shape Coding," Asian Journal of Health and Information Sciences, vol. 3, nos. 1-4, pp. 101-109, 2008.
