Bolle et al.

(54) PRODUCE RECOGNITION SYSTEM

(75) Inventors: Rudolf M. Bolle, Bedford Hills; Jonathan H. Connell, Cortlandt Manor; Norman Haas, Mount Kisco, all of N.Y.; Rakesh Mohan, Stamford, Conn.; Gabriel Taubin, Hartsdale, N.Y.

(73) Assignee: International Business Machines Corporation, Armonk, N.Y.
`
`56
`
`(21) Appl. No. 235,834
`22 Filed:
`Apr. 29, 1994
`51) Int. Cl. ...
`... G06K 9146; G06K 9/66
`52 U.S. Cl. ...................... 382/190; 382/110; 382/164;
`382/165; 382/170; 382/173
`58) Field of Search ..................................... 382/l 10, 164,
`382/165, 170, 173, 190, 199, 181
`References Cited
`U.S. PATENT DOCUMENTS
3,770,111  11/1973  Greenwood et al. ........ 250/227
4,106,628   8/1978  Warkentin et al. ........ 209/74
4,515,275   5/1985  Mills et al. ............ 209/585
4,534,470   8/1985  Mills ................... 209/558
4,574,393   3/1986  Blackwell et al. ........ 364/526
4,718,089   1/1988  Hayashi et al. .......... 382/91
4,735,323   5/1988  Okada et al. ............ 209/582
5,020,675   6/1991  Cowlin et al. ........... 209/538
5,060,290  10/1991  Kelly et al. ............ 382/110
5,085,325   2/1992  Jones et al. ............ 209/580
5,164,795  11/1992  Conway .................. 356/407
5,253,302  10/1993  Massen .................. 382/165
`FOREIGN PATENT DOCUMENTS
`3044268 2/1991 Japan.
`5063968 3/1993 Japan.
`OTHER PUBLICATIONS
M. J. Swain & D. H. Ballard, "Color Indexing," Int. Journal of Computer Vision, vol. 7, No. 1, pp. 11-32, 1991.
M. Miyahara & Y. Yoshida, "Mathematical Transform of (R,G,B) color data to Munsell (H,V,C) color data," SPIE vol. 1001, Visual Communications and Image Processing, 1988, pp. 650-657.
`
US005546475A

(11) Patent Number: 5,546,475
(45) Date of Patent: Aug. 13, 1996
`
L. van Gool, P. Dewaele, & A. Oosterlinck, "Texture Analysis anno 1983," Computer Vision, Graphics, and Image Processing, vol. 29, 1985, pp. 336-357.
T. Pavlidis, "A Review of Algorithms for Shape Analysis," Computer Graphics and Image Processing, vol. 7, 1978, pp. 243-258.
S. Marshall, "Review of Shape Coding Techniques," Image and Vision Computing, vol. 7, No. 4, Nov. 1989, pp. 281-294.
S. Mersch, "Polarized Lighting for Machine Vision Applications," Proc. of RI/SME Third Annual Applied Machine Vision Conf., Feb. 1984, pp. 40-54, Schaumburg.
B. G. Batchelor, D. A. Hill & D. C. Hodgson, "Automated Visual Inspection," IFS (Publications) Ltd. UK, North-Holland (A div. of Elsevier Science Publishers BV), 1985, pp. 39-178.
`
Primary Examiner-Leo Boudreau
Assistant Examiner-Phuoc Tran
Attorney, Agent, or Firm-Louis J. Percello
`
(57) ABSTRACT
The present system and apparatus uses image processing to recognize objects within a scene. The system includes an illumination source for illuminating the scene. By controlling the illumination source, an image processing system can take a first digitized image of the scene with the object illuminated at a higher level and a second digitized image with the object illuminated at a lower level. Using an algorithm, the object(s) image is segmented from a background image of the scene by a comparison of the two digitized images taken. A processed image (that can be used to characterize features) of the object(s) is then compared to stored reference images. The object is recognized when a match occurs. The system can recognize objects independent of size and number and can be trained to recognize objects that it was not originally programmed to recognize.
`
`32 Claims, 16 Drawing Sheets
`
`
`
[Front-page figure: FIG. 1, block diagram of the recognition apparatus]
`Patent Owner’s Ex. 2005, Page 1 of 29
`
`
`
[Sheet 1 of 16, FIG. 1: block diagram of the recognition apparatus: camera, light source, frame grabber, computer with memory storage and algorithms, human decision making, weighing device, interactive output device, and training module]
`
`
`
[Sheet 2 of 16, FIG. 2: flow chart of the recognition method: imaging the target object; segmenting the target object image (220); computing one or more target object features (230); characterizing the target object feature(s); normalizing the characterizations (250); comparing the normalized characterization to a reference to recognize (260), subject to storage criteria (255)]
`
`
`
[Sheet 3 of 16, FIG. 3: segmenting a scene into first and second images (object and background)]
`
`
`
[Sheet 4 of 16, FIG. 4: block diagram of the segmenting apparatus: background, control, output device, computer 140, algorithms 200 and 220]
`
`
`
[Sheet 5 of 16, FIG. 5: flow chart of segmentation: acquire light image (510), acquire dark image (520), compare on a pixel-by-pixel basis (530); brighter pixels are assigned to the object image, others to the background image; a further brightness test separates translucent from opaque object regions]
`
`
`
[Sheet 6 of 16, FIG. 6: characterizing features: segmenting (220), computing feature F1 (e.g., Hue), histogramming (640)]
`
`
`
[Sheet 7 of 16, FIG. 7: normalization: segmenting (220), histogramming (640), normalization (750) of feature characterizations]
`
`
`
[Sheet 8 of 16, FIG. 8: comparing (840) a normalized target histogram to stored reference histograms]
`
`
`
[Sheet 9 of 16, FIG. 9: training flow chart (910): segmenting (220), histogramming (640), normalization (750); if the normalized histogram is not recognized (260) and meets the storage criteria, it is stored]
`
`
`
[Sheet 10 of 16, FIG. 10: multiple features of an object being extracted and compared (840), with memory storage 144]
`
`
`
[Sheet 11 of 16, FIG. 11: texture feature: segmenting (220), texture computation, histogramming, normalization]
`
`
`
[Sheet 12 of 16, FIG. 12: boundary shape feature: segmenting (220), boundary extraction, boundary shape computation, histogramming, length normalization. FIG. 13: weighing device connected to computer 140]
`
`
`
[Sheet 13 of 16, FIG. 14: segmented object image with two distinct regions]
`
`
`
[Sheet 14 of 16, FIG. 15: human interface presenting an ordered ranking of the most likely identities]
`
`
`
[Sheet 15 of 16, FIG. 16: browsing interface with category keys (Red, Green, Yellow, Round, Straight, Leafy) and stored icon images (e.g., Peppers, Potatoes, Red Del. Apple, Gala Apple)]
`
`
`
[Sheet 16 of 16, FIG. 17: embodiment using a weighing device and storage to price objects]
`
`
`
`5,546,475
`
`PRODUCE RECOGNITION SYSTEM
`
`FIELD OF THE INVENTION
This invention relates to the field of recognizing (i.e., identifying, classifying, grading, and verifying) objects using computerized optical scanning devices. More specifically, the invention is a trainable system and method relating to recognizing bulk items using image processing.
`
`BACKGROUND OF THE INVENTION
Image processing systems exist in the prior art for recognizing objects. Often these systems use histograms to perform this recognition. One common histogram method either develops a gray scale histogram or a color histogram from a (color) image containing an object. These histograms are then compared directly to histograms of reference images. Alternatively, features of the histograms are extracted and compared to features extracted from histograms of images containing reference objects.
The reference histograms or features of these histograms are typically stored in computer memory. The prior art often performs these methods to verify that the target object in an image is indeed the object that is expected, and, possibly, to grade/classify the object according to the quality of its appearance relative to the reference histogram. An alternative purpose could be to identify the target object by comparing the target image object histogram to the histograms of a number of reference images of objects.
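The direct histogram comparison described above can be sketched as follows, using histogram intersection (the similarity measure of the Swain and Ballard reference cited on the front page) as one common choice. The reference library, labels, and bin counts here are illustrative, not taken from the patent.

```python
import numpy as np

def histogram_intersection(target, reference):
    """Similarity of two histograms: the sum of bin-wise minima.

    Both histograms are assumed to be 1-D arrays of bin counts
    normalized to sum to 1, so the score lies in [0, 1].
    """
    return float(np.minimum(target, reference).sum())

def best_match(target, references):
    """Return the label of the reference histogram most similar to target."""
    return max(references, key=lambda label: histogram_intersection(target, references[label]))

# Hypothetical reference library of normalized histograms.
refs = {
    "apple":  np.array([0.7, 0.2, 0.1]),
    "banana": np.array([0.1, 0.2, 0.7]),
}
target = np.array([0.6, 0.3, 0.1])
print(best_match(target, refs))  # apple
```

A perfect match scores 1.0; disjoint histograms score 0.0, which is why the measure tolerates partial occlusion better than a bin-by-bin difference.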
In this description, identifying is defined as determining, given a set of reference objects or classes, which reference object the target object is or which reference class the target object belongs to. Classifying or grading is defined as determining that the target object is known to be a certain object and/or that the quality of the object is some quantitative value. Here, one of the classes can be a "reject" class, meaning that either the quality of the object is too poor, or the object is not a member of the known class. Verifying, on the other hand, is defined as determining that the target is known to be a certain object or class and simply verifying this to be true or false. Recognizing is defined as identifying, classifying, grading, and/or verifying.
Bulk items include any item that is sold in bulk in supermarkets, grocery stores, retail stores or hardware stores. Examples include produce (fruits and vegetables), sugar, coffee beans, candy, nails, nuts, bolts, general hardware, parts, and packaged goods.
In image processing, a digital image is an analog image from a camera that is converted to a discrete representation by dividing the picture into a fixed number of locations called picture elements and quantizing the value of the image at those picture elements into a fixed number of values. The resulting digital image can be processed by a computer algorithm to develop other images. These images can be stored in memory and/or used to determine information about the imaged object. A pixel is a picture element of a digital image.
Image processing and computer vision is the processing by a computer of a digital image to modify the image or to obtain from the image properties of the imaged objects such as object identity, location, etc.
A scene contains one or more objects that are of interest and the surroundings which also get imaged along with the objects. These surroundings are called the background. The
`
`10
`
`15
`
`25
`
`35
`
`40
`
`45
`
`50
`
`55
`
`O
`
`65
`
`2
background is usually further away from the camera than the object(s) of interest.
Segmenting (also called figure/ground separation) is separating a scene image into separate object and background images. Segmenting refers to identifying those image pixels that are contained in the image of the object versus those that belong to the image of the background. The segmented object image is then the collection of pixels that comprises the object in the original image of the complete scene. The area of a segmented object image is the number of pixels in the object image.
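The two-image segmentation summarized in the abstract (acquiring one image with the scene lit at a higher level and one at a lower level, then comparing them) can be sketched roughly as follows. The threshold value and the tiny example arrays are illustrative assumptions, not values specified by the patent.

```python
import numpy as np

def segment_by_illumination(light_img, dark_img, threshold=30):
    """Separate object pixels from background by comparing two images.

    light_img, dark_img: 2-D grayscale arrays of the same scene, one taken
    with the controlled light at a higher level and one at a lower level.
    Pixels that brighten markedly belong to the (nearby) object; pixels
    that change little belong to the (distant) background.
    """
    diff = light_img.astype(int) - dark_img.astype(int)
    object_mask = diff > threshold
    return object_mask

# Illustrative 2x2 scene: the left column is a nearby object, the right
# column is distant background that barely responds to the light.
scene_light = np.array([[200, 40], [210, 35]])
scene_dark  = np.array([[ 90, 38], [100, 30]])
print(segment_by_illumination(scene_light, scene_dark))
```

The returned boolean mask is the segmented object image; its True count is the object area used later for normalization.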
Illumination is the light that illuminates the scene and objects in it. Illumination of the whole scene directly determines the illumination of individual objects in the scene and therefore the reflected light of the objects received by imaging apparatus such as a video camera.
Ambient illumination is illumination from any light source except the special lights used specifically for imaging an object. For example, ambient illumination is the illumination due to light sources occurring in the environment such as the sun outdoors and room lights indoors.
Glare or specular reflection is the high amount of light reflected off a shiny (specular, exhibiting mirror-like, possibly locally, properties) object. The color of the glare is mostly that of the illuminating light (as opposed to the natural color of the object).
A feature of an image is defined as any property of the image which can be computationally extracted. Features typically have numerical values that can lie in a certain range, say, R0-R1. In the prior art, histograms are computed over a whole image or windows (sub-images) in an image. A histogram of a feature of an image is a numerical representation of the distribution of feature values over the image or window. A histogram of a feature is developed by dividing the feature range, R0-R1, into M intervals (bins) and computing the feature for each image pixel. Simply counting how many image or window pixels fall in each bin gives the feature histogram.
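The bin-counting procedure just described can be sketched as follows, leaning on NumPy's histogram routine; the feature values and bin count in the example are illustrative.

```python
import numpy as np

def feature_histogram(feature_values, r0, r1, m):
    """Histogram a per-pixel feature over the range [r0, r1] in m bins.

    feature_values: array of the feature computed at each pixel (e.g.,
    hue). The range r0-r1 is divided into m equal-width bins and the
    histogram is the per-bin pixel count.
    """
    counts, _ = np.histogram(feature_values, bins=m, range=(r0, r1))
    return counts

# Four pixels of a scalar feature in [0, 1], histogrammed into 4 bins.
pixels = np.array([0.1, 0.2, 0.3, 0.8])
print(feature_histogram(pixels, 0.0, 1.0, 4))  # [2 1 0 1]
```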
Image features include, but are not limited to, color and texture. Color is a two-dimensional property, for example Hue and Saturation or other color descriptions (explained below) of a pixel, but is often disguised as a three-dimensional property, i.e., the amount of Red, Green, and Blue (RGB). Various color descriptions are used in the prior art, including (1) the RGB space; (2) the opponent color space; (3) the Munsell (H,V,C) color space; and, (4) the Hue, Saturation, and Intensity (H,S,I) space. For the latter, similar to the Munsell space, Hue refers to the color of the pixel (from red, to green, to blue), Saturation is the "deepness" of the color (e.g., from greenish to deep saturated green), and Intensity is the brightness, or what the pixel would look like in a gray scale image.
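One common textbook formulation of the RGB-to-HSI conversion described above can be sketched as follows; the exact formulas used by any particular system (including this one) may differ, so treat this as an illustration of the idea rather than the patent's conversion.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert RGB (each in [0, 1]) to Hue, Saturation, Intensity.

    Intensity is the mean of the three channels, saturation measures how
    far the color is from gray, and hue is the angle (in degrees) around
    the color circle starting at red.
    """
    i = (r + g + b) / 3.0
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    h = 0.0 if den == 0 else math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    if b > g:  # hue lies in the lower half of the color circle
        h = 360.0 - h
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))  # pure red: hue 0, full saturation
```

Note that a fully saturated red and a dim red share the same hue, which is exactly why hue histograms are less sensitive to lighting level than raw RGB histograms.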
Texture, on the other hand, is a visual image feature that is much more difficult to capture computationally and is a feature that cannot be attributed to a single pixel but is attributed to a patch of image data. The texture of an image patch is a description of the spatial brightness variation in that patch. This can be a repetitive pattern (of texels), as the pattern on an artichoke or pineapple, or can be more random, like the pattern of the leaves of parsley. These are called structural textures and statistical textures, respectively. There exists a wide range of textures, ranging from the purely deterministic arrangement of a texel on some tesselation of the two-dimensional plane, to "salt and pepper" white noise. Research on image texture has been going on for over thirty years, and computational measures have
`
been developed that are one-dimensional or higher-dimensional. However, in the prior art, histograms of texture features are not known to the inventors.
Shape of some boundary in an image is a feature of multiple boundary pixels. Boundary shape refers to local features, such as curvature. An apple will have a roughly constant curvature boundary, while a cucumber has a piece of low curvature, a piece of low negative curvature, and two pieces of high curvature (the end points). Other boundary shape measures can be used.
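A minimal sketch of a boundary-curvature measure of the kind described is to approximate curvature at each boundary point by the turning angle between the incoming and outgoing edge directions; a circle gives roughly constant angles, while a cucumber-like outline mixes near-zero and sharp turns. The polygon and sign convention here are illustrative, not the patent's.

```python
import math

def turning_angles(boundary):
    """Discrete curvature estimate along a closed boundary.

    boundary: list of (x, y) points in order around the contour. The
    turning angle at each point is the signed change in edge direction,
    wrapped into (-pi, pi] so left and right turns keep their sign.
    """
    n = len(boundary)
    angles = []
    for k in range(n):
        (x0, y0), (x1, y1), (x2, y2) = boundary[k - 1], boundary[k], boundary[(k + 1) % n]
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        d = a2 - a1
        while d <= -math.pi:
            d += 2 * math.pi
        while d > math.pi:
            d -= 2 * math.pi
        angles.append(d)
    return angles

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(turning_angles(square))  # four equal right-angle turns
```

Histogramming these angles gives a boundary-shape histogram in the same spirit as the color and texture histograms above.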
Some prior art uses color histograms to identify objects. Given an (R,G,B) color image of the target object, the color representation used for the histograms is the opponent color: rg = R - G, by = 2*B - R - G, and wb = R + G + B. The wb axis is divided into 8 sections, while the rg and by axes are divided into 16 sections. This results in a three-dimensional histogram of 2048 bins. This system matches target image histograms to 66 pre-stored reference image histograms. The set of 66 pre-stored reference image histograms is fixed, and therefore it is not a trainable system, i.e., unrecognized target images in one instance will not be recognized in a later instance.
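The opponent-color binning described above can be sketched as follows. The axis ranges chosen for the bins are assumptions (derived from 8-bit RGB), since the text only gives the number of sections per axis.

```python
import numpy as np

def opponent_histogram(rgb_pixels):
    """3-D opponent-color histogram of the kind described above.

    rgb_pixels: (N, 3) array of 8-bit R, G, B values. Each pixel maps to
    rg = R - G, by = 2B - R - G, wb = R + G + B, and the three axes are
    binned into 16, 16, and 8 sections respectively: 16 * 16 * 8 = 2048
    bins in total.
    """
    rgb = rgb_pixels.astype(int)
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    rg = r - g            # range -255 .. 255
    by = 2 * b - r - g    # range -510 .. 510
    wb = r + g + b        # range 0 .. 765
    hist, _ = np.histogramdd(
        np.stack([rg, by, wb], axis=1),
        bins=(16, 16, 8),
        range=((-255, 256), (-510, 511), (0, 766)),
    )
    return hist  # shape (16, 16, 8)

pixels = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
h = opponent_histogram(pixels)
print(h.shape, int(h.sum()))  # (16, 16, 8) 3
```

Because rg and by cancel out overall brightness, this representation is somewhat less sensitive to illumination level than raw RGB, which is part of its appeal in the cited work.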
U.S. Pat. No. 5,060,290 to Kelly and Klein discloses the grading of almonds based on gray scale histograms. Falling almonds are furnished with uniform light and pass by a linear camera. A gray histogram, quantized into 16 levels, of the image of the almond is developed. The histogram is normalized by dividing all bin counts by 1700, where 1700 pixels is the size of the largest almond expected. Five features are extracted from this histogram: (1) gray value of the peak; (2) range of the histogram; (3) number of pixels at peak; (4) number of pixels in bin to the right of peak; and, (5) number of pixels in bin 4. Through lookup tables, an eight digit code is developed and if this code is in a library, the almond is accepted. The system is not trainable. The appearances of almonds of acceptable quality are hard-coded in the algorithm and the system cannot be trained to grade almonds differently by showing new instances of almonds.
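The five features extracted by the almond grader can be sketched as follows. The function name, dictionary layout, and exact bin conventions are illustrative; the cited patent only enumerates the five quantities.

```python
import numpy as np

def almond_features(gray_histogram, max_pixels=1700):
    """The five histogram features described for the almond grader.

    gray_histogram: 16-bin gray-level histogram of a segmented almond
    image. Counts are first normalized by max_pixels, the size of the
    largest almond expected.
    """
    h = np.asarray(gray_histogram, dtype=float) / max_pixels
    nonzero = np.nonzero(h)[0]
    peak = int(np.argmax(h))
    return {
        "peak_gray_value": peak,                                         # (1)
        "range": int(nonzero[-1] - nonzero[0]) if nonzero.size else 0,   # (2)
        "pixels_at_peak": h[peak],                                       # (3)
        "pixels_right_of_peak": h[peak + 1] if peak + 1 < h.size else 0.0,  # (4)
        "pixels_in_bin_4": h[4],                                         # (5)
    }
```

In the cited system these five values feed lookup tables that produce the eight digit acceptance code.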
U.S. Pat. No. 4,735,323 to Okada et al. discloses a mechanism for aligning and transporting an object to be inspected. The system more specifically relates to grading of oranges. The transported oranges are illuminated with a light within a predetermined wavelength range. The light reflected is received and converted into an electronic signal. A level histogram divided into 64 bins is developed, where
`
`4
`the object is in the image and not obscured by other objects),
`(3) there is little difference in illumination of the scene of
`which the images (reference and targel images) are taken
`from which the reference object histograms and larget object
`histograms arc developed, and (4) the object can bc casily
`segmented out from the background or there is relatively
`little distraction in the background. Under these conditions,
`comparing a target object image histogram with reference
`object image histograms has been achieved in numerous
`ways in the prior art.
`
`STATEMENT OF PROBLEMS WITH THE
`PRIOR ART
`Some prior art matching systems and methods, claim to bc
`robust to distractions in the background, variation in view
`point, occlusion, and varying image resolution. However, in
`some of this prior art, lighting conditions are not controlled.
`The systems fail when the color of the illumination for
`obtaining the reference object histograms is different from
`the color of the illumination when obtaining thc largct object
`image histogram. The RGB values of an image point in an
`image are very dependent on thc color of the illumination
`(even though humans have little difficulty naming the color
`given the whole image). Consequently the color histogram
`of an image can change dramatically when the color of the
`illumination (light frequency distribution) changes. Further
`more, in these prior art systems the objects arc not seg
`ment cd from the background, and, therefore, the histograms
`of the images arc not area normalized. This means the
`objects in target images have to be the same size as the
`objccts in the rcference images for accurate recognition
`becausc variations of the object size with respect to the pixel
`size can significantly change the color histogram. It also
`means that the parts of the image that correspond to the
`background have to be achromatic (c.g. black), or, at least,
`or a coloring not prescnt in the object, or they will signifi
`cantly perturb the derived image color histogram.
`Prior art such as that disclosed in U.S. Pat. No. 5,060,290
`fail if the size of the almonds in the image is drastically
`different than expected. Again, this is becausc thc system
`does not explicitly separatic thc object from its background.
`This system is used only for grading almonds: it can not
`dislinguish an almond from (say) a peanut.
`Similarly, prior art such as that disclosed in U.S. Pat. No.
`4,735,323 only recognizes different grades of oranges. A
`reddish grapefruit might very well be deemed a very large
`orange. Thc system is not designed to operate with more
`than one class of fruit at a time and thus can makc do with
`weak features such as the ratio of grecn to white reflectivity.
`In summary, much of the prior art in the agricultural
`arena, typified by U.S. Pat. Nos. 4,735,323 and 5,060,290, is
`concerned with classifyinglgrading produce items. This
`prior art can only classifyfidentify objects/products/produce
`if they pass a scanner one object at a time. It is also required
`that the range of sizes (from smallest to largest possible
`object size) of the object/product/produce be known before
`hand. These systems will fail if more than one item is
`scanncod at the same time, or to be more precise, if more than
`onc object appears at a scanning position at the same time.
`Further, the prior art often requires carcfully engineered
`and expensive mechanical environment with carefully con
`trolled lighting conditions where the items are transported to
`predefined spatial locations. These apparatuses arc designed
`specifically for onc type of shapcd object (round, oval, etc.)
`and are impossible or, at lcast, not casily modified to deal
`
`35
`
`45
`
`Level=(the intensity of totally reflected light)/(the intensity of
`green light reflected by an orange)
`The median, N, of this histogram is determined and is
`considered as representing the color of an orange. Based on
`N, the orange coloring can be classified into four gradcs of
`"excellent,"good."fair' and "poor,"or can be gradcd finer.
`The systems is not trainable, in that the appearance of the
`different grades of oranges is hard-coded into the algorithms,
`The use of gray scale and color histograms is a very
`effective method for grading or verifying objects in an
`image. The main reason for this is that a histogram is very
`compact representation of a referencc object that does not
`depend on the location or orientation of the object in the
`image.
`However, for image histogram-based recognition to work,
`certain conditions have to be satisfied. It is required that: (1)
`the size of the object in the image is roughly known, (2)
`there is relatively little occlusion of the object (i.e., most of
`
`50
`
`55
`
`60
`
`65
`
`Patent Owner’s Ex. 2005, Page 19 of 29
`
`
`
with other object types. The shape of the objects inspires the means of object transportation, and it is impossible or difficult for the transport means to transport different object types. This is especially true for oddly shaped objects like broccoli or ginger. This, and the use of features that are specifically selected for the particular objects, does not allow the prior art to distinguish between types of produce.
Additionally, none of the prior art are trainable systems where, through human or computer intervention, new items are learned or old items discarded. That is, the systems cannot be taught to recognize objects that were not originally programmed in the system or to stop recognizing objects that were originally programmed in the system.
One area where the prior art has failed to be effective is in produce check-out. The current means and methods for checking out produce pose problems. Affixing (PLU - price lookup) labels to fresh produce is disliked by customers and produce retailers/wholesalers. Pre-packaged produce items are disliked, because of increased cost of packaging, disposal (solid waste), and inability to inspect produce quality in pre-packaged form.
The process of produce check-out has not changed much since the first appearance of grocery stores. At the point of sale (POS), the cashier has to recognize the produce item, weigh or count the item(s), and determine the price. Currently, in most stores the latter is achieved by manually entering the non-mnemonic PLU code that is associated with the produce. These codes are available at the POS in the form of a printed list or in a booklet with pictures.
`Multiple problems arise from this process of produce
`check-out:
(1) Losses incurred by the store (shrinkage). First, a cashier may inadvertently enter the wrong code number. If this is to the advantage of the customer, the customer will be less motivated to bring this to the attention of the cashier. Second, for friends and relatives, the cashier may purposely enter the code of a lower-priced produce item (sweethearting).
(2) Produce check-out tends to slow down the check-out process because of produce identification problems.
(3) Every new cashier has to be trained on produce names, produce appearances, and PLU codes.
`
`5,546,475
`
`6
`level. Using an algorithm, the object(s) imagc is novelly
`segmented from a background image of the scene by a
`comparison of the two digitizcd images taken. A processed
`image (that can be used to characterize features) of the
`object(s) is then compared to stored reference images. The
`object is recognized when a match occurs.
`Processed images of an unrecognized object can be
`labeled with identity of object and stored in memory, based
`on certain criteria, so that the unrecognized object will be
`recognize when it is imaged in the future. In this novel way,
`the invention is taught to recognize previously unknown
`objects.
`Recognition of the object is independent of the size or
`number of the objects because the object image is novelly
`normalized bcfore it is compared to the reference images.
`Optionally, use interfaces and apparatus that determines
`other features of the object (like weight) can bc used with the
`system.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`FIG. 1 is a block diagram of the onc preferred embodi
`ment of the present system.
`FIG. 2 is a flow chart showing on preferred embodiment
`of the present method for recognizing objects.
`FIG. 3 illustrates segmenting a scene into an object image
`and a background image.
`FIG. 4 is a block diagram of a preferred embodiment of
`apparatus for segmenting images and recognizing object in
`images.
`FIG. 5 is a flow chart of a preferred method for segmcnt
`ing target object images.
`FIG. 6 is a flow chart showing a preferred method of
`characterizing reference ot target object feature(s).
`FIG. 7 is a flow chart showing a preferred method for
`(area/length) normalization of object feature(s) character
`ization.
`FIG. 8 illustrates the comparison of an arcallength nor
`malized target object characterization to one or more arca
`normalized reference object characterizations.
`FIG. 9 is a flow chart showing a preferred (algorithmic)
`method of training the present apparatus to recognize new
`images.
`FIG. 10 is a block diagram showing multiple features of
`an object being extracted.
`FIG. 11 is a flow chart showing the histogramming and
`normalizing of the feature of texture.
`FIG. 12 is a flow chart showing the histogramming and
`normalizing of the feature of boundary shape.
`FIG. 13 is block diagram showing a weighing device.
`FIG. 14 shows an image where the segmented objcct has
`two distinct regions determined by segmenting the object
`image and where these regions are incorporated in rccogni
`tion algorithms.
`FIG. 15 shows a human interface to the present apparatus
`which presents an ordered ranking of the most likely iden
`litics of the produce bcing imaged.
`FIG. 16 shows a means for human determination of the
`identity of object(s) by browsing through subset(s) of all the
`previously installed stored icon images, and the means by
`which the subscts are selected.
`FIG. 17 is a preferred embodiment of the present inven
`tion using object weight to price object(s).
`
`OBJECTS OF THE INVENTION
`An object of this invention is an improved apparatus and
`method for recognizing objects such as produce.
`An object of this invention is an improved trainable
`apparatus and method for recognizing objects such as pro
`duce.
`Anothcrobjcct of this invention is an improved apparatus
`and method for recognizing and pricing objects such as
`produce at the point of sale or in the produce department.
`A further object of this invention is an improved means
`and method of user interface for automated produce identi
`fication, such as, produce.
`
`45
`
`50
`
`55
`
`SUMMARY OF THE INVENTION
`Thc present invention is a system and apparatus that uses
`image processing to recognize objects within a scene. The
`system includes an illumination source for illuminating the
`scene. By controlling the illumination source, an image
`processing system can take a first digitized image of the
`scene with the object illuminated at a higher level and a
`second digitized image with the object illuminated at a lower
`
`65
`
`
`
`
`7
`DETALED DESCRIPTION OF THE
`INVENTION
The apparatus 100 shown in FIG. 1 is one preferred embodiment of the present invention that uses image processing to automatically recognize one or more objects 131.
A light source 110 with a light frequency distribution that is constant over time illuminates the object 131. The light is non-monochromatic and may include infra-red or ultra-violet frequencies. Light being non-monochromatic and of a constant frequency distribution ensures that the color appearance of the objects 131 does not change due to light variations between different images taken and that stored images of a given object can be matched to images taken of that object at a later time. The preferred lights are flash tubes Mouser U-4425, or two GE cool-white fluorescent bulbs (22 Watts and 30 Watts), GE FC8T9-CW and GE FC12T9-CW, respectively. Such light sources are well known.
A video input device 120 is used to convert the reflected light rays into an image. Typically this image is two-dimensional. A preferred video input device is a color camera but any device that converts light rays into an image can be used. These cameras would include CCD cameras and CID cameras. The color camera output can be RGB, HSI, YC, or any other representation of color. One preferred camera is a Sony card-camera CCB-C35YC or Sony XC-999. Video input devices like this 120 are well known.
Color images are the preferred sensory modality in this invention. However, other sensor modalities are possible, e.g., infra-red and ultra-violet images, smell/odor (measurable, e.g., with a mass spectrometer), thermal decay properties, ultra-sound and magnetic resonance images, DNA, fundamental frequency, stiffness and hardness. These modalities can be enabled with known methods of illuminating, measuring, or taking samples of the object 131 and with a compatible imaging device 120 for creating the image.
The object 131 is the object being imaged and recognized by the system 100. The object 131 can comprise one or more items. Although it is preferred that objects 131 be of one type (variety), e.g., one or more apples, the items can be of different types, e.g., a cereal box (Object A) and an apple (Object B). System 100 will then recognize objects either as (1) Object A, (2) Object B, (3) both Object A and Object B, or (4) reject the objects as unrecognizable. The object(s) can be virtually anything that can be imaged by the system 100; however, preferred objects 131 are bulk items including produce (fruits and vegetables), hardware, boxed goods, etc.
A calculating device 140, typically a computer 140, is used to process the image generated by the video input device 120 and digitized (to be compatible with the computer 140) by a frame grabber 142.
The processing is performed by an algorithm 200. Other calculating devices 140 include personal computers and workstations. The calculating device 140 can also be one or more digital signal processors, either stand-alone or installed in a computer. It can also be any special hardware capable of implementing the algorithm 200. A preferred embodiment is a Datatranslation DSP board DT2878 coupled to a Datatranslation DT2871 frame grabber board residing in an IBM ValuePoint computer, or in the IBM 4690 series of POS Cash Registers. The frame grabber 142 is a device that digitizes the image signal from the camera 120. If the camera 120 is a digital camera th