Toyoda
(10) Patent No.: US 6,452,348 B1
(45) Date of Patent: Sep. 17, 2002
(54) ROBOT CONTROL DEVICE, ROBOT CONTROL METHOD AND STORAGE MEDIUM
(75) Inventor: Takashi Toyoda, Tokyo (JP)
(73) Assignee: Sony Corporation, Tokyo (JP)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 09/724,988
(22) Filed: Nov. 28, 2000
(30) Foreign Application Priority Data
Nov. 30, 1999 (JP) ........................................... 11-340466
(51) Int. Cl.7 .................................................. H02K 7/14
(52) U.S. Cl. ........................... 318/3; 318/632; 700/259; 434/308
(58) Field of Search ..................... 318/3, 632; 446/268, 279, 280, 298, 299, 330; 381/110; 700/258, 259; 901/46, 47; 434/308; 463/35, 39
(56) References Cited

U.S. PATENT DOCUMENTS

4,717,364 A   1/1988  Furukawa ................... 446/175
5,493,185 A   2/1996  Mohr et al. ................ 318/3
5,832,189 A  11/1998  Tow ........................ 901/47
6,160,986 A  12/2000  Gabai et al. ............... 434/308

* cited by examiner
Primary Examiner-Khanh Dang
(74) Attorney, Agent, or Firm-Frommer Lawrence & Haug LLP; William S. Frommer; Gordon Kessler

(57) ABSTRACT
A robot control device for controlling a robot having substantial entertainment value is disclosed. A sensor signal processor recognizes the voice of a user, sets an association between the voice recognition result and an action of the robot, and registers the association in a behavior association table of a behavior association table memory. A behavior decision unit decides which action the robot is to take based on the behavior association table.

8 Claims, 9 Drawing Sheets
[Sheet 1 of 9, FIG. 1: external perspective view of one embodiment of the robot.]
[Sheet 2 of 9, FIG. 2: block diagram of the internal construction of the robot: microphone 9, CCD camera 8, pressure sensor 10, and rotary encoders 12_1 ... 12_N feeding controller 11, which drives actuators (motors) 7_1 ... 7_N. Also shown (FIG. 3): hardware construction of the controller 11: CPU 20, program memory 21, RAM 22, non-volatile memory 23, interface circuit 24 (from microphone, camera, pressure sensor, and rotary encoder), and motor driver 25 (to motors).]
[Sheet 3 of 9, FIG. 4: functional block diagram of the controller 11: microphone 9, CCD camera 8, and pressure sensor 10 (and the rotary encoders) feed a sensor signal processor 31, whose output goes to an emotion/instinct model unit 32 and a behavior decision unit 33 containing a behavior model memory 33A and a behavior association table memory 33B; the behavior decision unit 33 feeds a posture transition unit 34, which feeds a data control unit 35 driving motors 7_1 ... 7_N.]
[Sheet 4 of 9, FIG. 5: stochastic automaton serving as the behavioral model. FIG. 6: behavior association table (e.g., the word "Hey" associated with actions such as "walking forward" and "looking up").]
[Sheet 5 of 9, FIG. 7: voice recognition module: voice data enters a feature parameter extractor 41, whose output goes to a matching unit 42 that references an acoustic model memory 43, a dictionary memory 44, and a grammar memory 45, and outputs the voice recognition result.]
[Sheet 6 of 9, FIG. 8: flow diagram of the voice recognition process: S1 extract feature parameter; S2 perform matching; S3 unknown word? (if yes, go to S5); S4 output word as a result of voice recognition; S5 output phonological information.]
[Sheet 7 of 9, FIG. 9: flow diagram of a first embodiment of the behavior learning process: S11 receive voice recognition result; S12 unknown word? (yes: S13 register unknown word in table); decide and perform action; S15 voice recognition result? (no, until time up); S17 perform assessment; S18 based on assessment, modify score of behavior responsive to word in accordance with voice recognition result.]
[Sheet 8 of 9, FIG. 10: flow diagram of a second embodiment of the behavior learning process: S21 decide and perform action; S22 voice recognition result?; S23 time up?; S24 unknown word? (yes: S25 register unknown word in table); S26 increase score of behavior responsive to word in accordance with voice recognition result.]
[Sheet 9 of 9, FIG. 11: flow diagram of a third embodiment of the behavior learning process: S31 enable posture setting; S32 posture modified? (no, until time up at S33); S34 register action in response to modified posture in table and behavior model; S35 voice recognition result?; S37 unknown word? (yes: register unknown word in table); time up?; increase score of behavior responsive to word in accordance with voice recognition result; S40 disable posture setting; end.]
ROBOT CONTROL DEVICE, ROBOT CONTROL METHOD AND STORAGE MEDIUM

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a robot control device, a robot control method, and a storage medium, and, more particularly, to a robot control device and a robot control method for controlling a robot with which an individual enjoys a training process like training an actual pet, such as a dog or a cat, and to a storage medium for storing a software program for the robot control method.

2. Description of the Related Art

A number of (stuffed) toy robots are commercially available which act in response to the pressing of a touch switch or to a voice of an individual having an intensity above a predetermined level. In the context of the present invention, the toy robots include stuffed toy robots.

In such conventional robots, the relationship between the pressing of the touch switch or the input of the voice and the action (behavior) of the robot is fixed, and a user cannot modify the behavior of the robot to the user's preference. The robot merely repeats the same action several times, and the user may grow tired of the toy. The user thus cannot enjoy a learning process of the robot in the same way as a dog or cat may learn tricks.

SUMMARY OF THE INVENTION

Accordingly, it is an object of the present invention to provide a robot which offers substantial entertainment value.

An aspect of the present invention relates to a robot control device for controlling the action of a robot, and includes a voice recognition unit for recognizing a voice, a control unit for controlling a drive unit that drives the robot for action, and a setting unit for setting an association between the voice recognition result provided by the voice recognition unit and the behavior of the robot.

The control unit may decide an action for the robot to take, and control the drive unit to drive the robot to perform the decided action, wherein the setting unit sets an association between the decided action and the voice recognition result obtained immediately subsequent to the decided action taken by the robot.

The robot control device preferably includes an assessment unit for assessing a voice recognition result obtained subsequent to the first voice recognition result provided by the voice recognition unit, wherein the control unit controls the drive unit to drive the robot to perform a predetermined action in response to the first voice recognition result, and wherein the setting unit sets an association between the predetermined action and the first voice recognition result in accordance with the assessment result of the next voice recognition result.

The setting unit preferably registers an association between the voice recognition result and the action of the robot in an association table that associates a word, which the voice recognition unit receives for voice recognition, with the action of the robot.

When the voice recognition result provided by the voice recognition unit indicates that the word is an unknown one, the setting unit preferably registers the unknown word in the association table, and preferably registers an association between the registered unknown word and the action of the robot.
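By way of illustration, a minimal sketch of the unknown-word registration just described is given below. The class and method names (AssociationTable, register_unknown_word, associate) and the zero-initialized scores are illustrative assumptions, not taken from the patent.

```python
class AssociationTable:
    """Maps each recognized word to per-action correlation scores."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.scores = {}  # word -> {action: correlation score}

    def register_unknown_word(self, word):
        # A word the recognizer flagged as unknown enters the table
        # with all-zero scores until an association is learned.
        if word not in self.scores:
            self.scores[word] = {action: 0 for action in self.actions}

    def associate(self, word, action, delta):
        # Strengthen (or weaken) the association between a word and an action.
        self.register_unknown_word(word)
        self.scores[word][action] += delta


table = AssociationTable(["walk forward", "look up", "shake hands", "bite"])
table.register_unknown_word("fetch")          # unknown word enters the table
table.associate("fetch", "walk forward", 10)  # later tied to an action
```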
Preferably, the robot control device further includes a posture detector for detecting a posture of the robot, wherein the setting unit sets an association between the voice recognition result of the voice recognition unit and an action which the robot needs to take to reach the posture detected by the posture detector.

Preferably, the control unit controls the drive unit in accordance with the association set between the action of the robot and the voice recognition result of the voice recognition unit.

Another aspect of the present invention relates to a robot control method for controlling the action of a robot, and includes a voice recognition step of recognizing a voice, a control step of controlling a drive unit that drives the robot for action, and a setting step of setting an association between the voice recognition result provided in the voice recognition step and the action of the robot.

Yet another aspect of the present invention relates to a storage medium for storing computer-executable code for controlling the action of a robot, and the computer-executable code performs a voice recognition step of recognizing a voice, a control step of controlling a drive unit that drives the robot for action, and a setting step of setting an association between the voice recognition result provided in the voice recognition step and the action of the robot.

In accordance with the present invention, the drive unit is controlled to drive the robot for action while the voice is being recognized, and an association is set between the voice recognition result and the behavior of the robot.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an external perspective view showing one embodiment of the robot of the present invention;
FIG. 2 is a block diagram showing the internal construction of the robot;
FIG. 3 is a block diagram showing the hardware construction of a controller;
FIG. 4 is a functional block diagram of the functions performed when the controller executes programs;
FIG. 5 shows a stochastic automaton as a behavioral model;
FIG. 6 shows a behavior association table;
FIG. 7 is a block diagram showing the construction of a voice recognition module that performs voice recognition in a sensor signal processor;
FIG. 8 is a flow diagram illustrating the operation of the voice recognition module;
FIG. 9 is a flow diagram illustrating a first embodiment of the behavior learning process of a behavior decision unit;
FIG. 10 is a flow diagram illustrating a second embodiment of the behavior learning process of the behavior decision unit; and
FIG. 11 is a flow diagram illustrating a third embodiment of the behavior learning process of the behavior decision unit.
DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 is an external perspective view showing one embodiment of a robot of the present invention, and FIG. 2 shows the electrical construction of the robot.

In this embodiment, the robot models a dog. A head unit 3 is connected to a torso unit 2 at the forward end thereof, and foot units 6A through 6D, respectively composed of thighs 4A through 4D and heels 5A through 5D, are connected to the torso unit 2 on the side walls at the front and the back thereof. A tail 1 is connected to the back end of the torso unit 2.

Motors 7_1, 7_2, ..., 7_N serving as actuators are respectively arranged at the joints between the tail 1 and the torso unit 2, between the head unit 3 and the torso unit 2, between each of the thighs 4A through 4D and the torso unit 2, and between the thighs 4A through 4D and the respective heels 5A through 5D. With the motors 7_1, 7_2, ..., 7_N turning, the tail 1 and the head unit 3 are rotated about each of three axes, i.e., the x, y, and z axes; the thighs 4A through 4D are rotated about each of two axes, i.e., the x and y axes; and the heels 5A through 5D are rotated about a single axis, i.e., the x axis. In this way, the robot takes a variety of actions.
The head unit 3 contains a CCD (Charge-Coupled Device) camera 8, a microphone 9, and a pressure sensor 10 at predetermined positions thereof. The torso unit 2 houses a controller 11. The CCD camera 8 picks up a picture of the surroundings of the robot, including the user. The microphone 9 picks up ambient sounds, including the voice of the user. The pressure sensor 10 detects pressure applied to the head unit 3 by the user or other objects. The controller 11 thus receives the image of the surroundings taken by the CCD camera 8, the ambient sound picked up by the microphone 9, and the pressure applied to the head unit 3 by the user, as image data, sound data, and pressure data, respectively.
Rotary encoders 12_1, 12_2, ..., 12_N are respectively arranged for the motors 7_1, 7_2, ..., 7_N at the respective articulation points. The rotary encoders 12_1, 12_2, ..., 12_N respectively detect the angles of rotation of the rotary shafts of the respective motors 7_1, 7_2, ..., 7_N. The detected angles of rotation are fed to the controller 11 as angle data.
The controller 11 determines the posture of the robot and the situation surrounding the robot based on the image data from the CCD camera 8, the sound data from the microphone 9, the pressure data from the pressure sensor 10, and the angle data from the rotary encoders 12_1, 12_2, ..., 12_N. The controller 11 then decides an action to take next in accordance with a preinstalled control program and, based on the decision, drives any of the motors 7_1, 7_2, ..., 7_N as required.

The robot thus acts in a self-controlled fashion by moving the tail 1, the torso unit 2, and the foot units 6A through 6D to a desired state.
FIG. 3 shows the construction of the controller 11 of FIG. 2.

The controller 11 includes a CPU (Central Processing Unit) 20, a program memory 21, a RAM (Random Access Memory) 22, a non-volatile memory 23, an interface circuit (I/F) 24, and a motor driver 25. All of these components are interconnected via a bus 26.
The CPU 20 controls the behavior of the robot by executing a control program stored in the program memory 21. The program memory 21 is an EEPROM (Electrically Erasable Programmable Read-Only Memory), and stores the control program executed by the CPU 20 and required data. The RAM 22 temporarily stores data needed by the CPU 20 in operation. The non-volatile memory 23, as will be discussed later, stores an emotion/instinct model, a behavioral model, a behavior association table, etc., which must be retained throughout power interruptions. The interface circuit 24 receives data supplied by the CCD camera 8, the microphone 9, the pressure sensor 10, and the rotary encoders 12_1 through 12_N, and sends the data to the CPU 20. Under the control of the CPU 20, the motor driver 25 feeds drive signals to the motors 7_1 through 7_N.

The CPU 20 in the controller 11 controls the robot in accordance with the functional block diagram shown in FIG. 4, by executing the control program stored in the program memory 21.
FIG. 4 thus illustrates the functions of the controller 11.

A sensor signal processor 31 recognizes external stimulation acting on the robot and the surroundings of the robot, and feeds these data of the external stimulation and the surroundings to an emotion/instinct model unit 32 and a behavior decision unit 33.
The emotion/instinct model unit 32 manages an emotion model and an instinct model respectively expressing the state of the emotion and the instinct of the robot. In response to the output of the sensor signal processor 31 and the output of the behavior decision unit 33, the emotion/instinct model unit 32 modifies the parameters defining the emotion model and the instinct model, thereby updating the state of the emotion and the instinct of the robot.
The behavior decision unit 33 contains a behavior model memory 33A and a behavior association table memory 33B, and decides a next behavior to be taken by the robot based on the contents of these memories, the output of the sensor signal processor 31, and the emotion model and the instinct model managed by the emotion/instinct model unit 32. The behavior decision unit 33 then feeds the information of the behavior (hereinafter referred to as behavior information) to a posture transition unit 34.
In order to cause the robot to behave in accordance with the behavior information supplied by the behavior decision unit 33, the posture transition unit 34 calculates control data, such as the angles of rotation and rotational speeds of the motors 7_1 through 7_N, and outputs the control data to a data control unit 35.

The data control unit 35 drives the motors 7_1 through 7_N in response to the control data coming from the posture transition unit 34.
The sensor signal processor 31 in the controller 11 thus constructed recognizes a particular external state, a particular action taken by the user, and an instruction given by the user based on the image data supplied by the camera 8, the voice data provided by the microphone 9, and the pressure data output by the pressure sensor 10. The recognition result is then output to the emotion/instinct model unit 32 and the behavior decision unit 33.
The sensor signal processor 31 performs image recognition based on the image data provided by the camera 8. For example, the sensor signal processor 31 recognizes that there is a pole or a wall, and then feeds the recognition result to the emotion/instinct model unit 32 and the behavior decision unit 33. The sensor signal processor 31 also processes the pressure data from the pressure sensor 10. For example, when the pressure sensor 10 detects a pressure of short duration at a level higher than a predetermined threshold, the sensor signal processor 31 recognizes that the robot is being "beaten or chastised". When the pressure sensor detects a pressure of long duration at a level lower than a predetermined threshold, the sensor signal processor 31 recognizes that the robot is being "stroked or praised". The sensor signal processor 31 then feeds the recognition result to the emotion/instinct model unit 32 and the behavior decision unit 33.
The emotion/instinct model unit 32 manages the emotion model expressing emotional states, such as "joy", "sadness", "anger", etc., and the instinct model expressing "appetite", "sleepiness", "exercise", etc.
The emotion model and the instinct model express the states of the emotion and instinct of the robot as integers ranging from zero to 100, for example. The emotion/instinct model unit 32 updates the values of the emotion model and instinct model in response to the output of the sensor signal processor 31 and the output of the behavior decision unit 33, with elapsed time taken into consideration. The emotion/instinct model unit 32 feeds the values of the updated emotion model and instinct model (the states of the emotion and the instinct of the robot) to the behavior decision unit 33.
The states of the emotion and instinct of the robot change in response to the output of the behavior decision unit 33, as discussed below.
The behavior decision unit 33 supplies the emotion/instinct model unit 32 with the behavior information of the behavior the robot took in the past or is currently taking (for example, "the robot looked away or is looking away"). Now, when the robot, already in anger, is stimulated by the user, the robot may take an action of "looking away" in response. In this case, the behavior decision unit 33 supplies the emotion/instinct model unit 32 with the behavior information of "looking away".
Generally speaking, an action expressing discontent in anger, such as the action of looking away, may somewhat calm down anger. The emotion/instinct model unit 32 therefore decreases the value of the emotion model representing "anger" (down to a smaller degree of anger) when the behavior information of "looking away" is received from the behavior decision unit 33.
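As a concrete illustration of this update, the sketch below decreases an integer "anger" value when the behavior information "looking away" arrives. The 0-to-100 range comes from the text; the decrement amount and function name are invented for the example.

```python
# Illustrative emotion-model update; the value range (0-100) is from the
# text, while the decrement of 10 is an assumption made for this sketch.
EMOTIONS = {"joy": 50, "sadness": 50, "anger": 80}

def on_behavior_info(behavior):
    # Feedback from the behavior decision unit can calm an emotion:
    # expressing discontent ("looking away") lowers the "anger" value.
    if behavior == "looking away":
        EMOTIONS["anger"] = max(0, EMOTIONS["anger"] - 10)

on_behavior_info("looking away")
print(EMOTIONS["anger"])  # 70: a smaller degree of anger
```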
The behavior decision unit 33 decides a next action to take based on the recognition result of the sensor signal processor 31, the output of the emotion/instinct model unit 32, elapsed time, the memory content of the behavior model memory 33A, and the memory content of the behavior association table memory 33B. The behavior decision unit 33 then feeds the behavior information representing the action to the emotion/instinct model unit 32 and the posture transition unit 34.
The behavior model memory 33A stores a behavioral model that defines the behavior of the robot. The behavior association table memory 33B stores an association table that associates the voice recognition result of the voice input to the microphone 9 with the behavior of the robot.
The behavioral model is formed of a stochastic automaton shown in FIG. 5. In the stochastic automaton shown here, a behavior is expressed by any node (state) among NODE_0 through NODE_M, and a transition of behavior is expressed by an arc ARC_m0m1 representing a transition from a node NODE_m0 to another node NODE_m1 (which may be the original node) (m0, m1 = 0, 1, ..., M).

The arc ARC_m0m1, representing the transition from the node NODE_m0 to the node NODE_m1, has a transition probability P_m0m1, and the probability of node transition, namely, the transition of behavior, is determined, in principle, based on the corresponding transition probability.

Referring to FIG. 5, for simplicity, the stochastic automaton having (M+1) nodes includes arcs ARC_1 through ARC_M respectively extending from node NODE_0 to the other nodes NODE_1 through NODE_M.
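The probability-weighted node transition can be illustrated with the short sketch below; the node names and probability values are invented for the example, and only the selection rule follows the text.

```python
import random

# Minimal sketch of the stochastic automaton of FIG. 5: nodes (behaviors)
# and transition probabilities P[m0][m1] on the arcs. Names are illustrative.
P = {
    "stand": {"stand": 0.2, "walk": 0.5, "sit": 0.3},
    "walk":  {"stand": 0.4, "walk": 0.4, "sit": 0.2},
    "sit":   {"stand": 0.6, "walk": 0.1, "sit": 0.3},
}

def next_behavior(current):
    # Pick the next node according to the transition probabilities of the
    # arcs leaving the current node (a node may transition to itself).
    nodes = list(P[current])
    weights = [P[current][n] for n in nodes]
    return random.choices(nodes, weights=weights)[0]

print(next_behavior("stand"))
```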
As shown in FIG. 6, the behavior association table registers the association between each word obtained as a result of voice recognition and an action to be taken by the robot. The table shown in FIG. 6 lists, as an integer correlation score, the association between a voice recognition result and a behavior. Specifically, the integer representing the degree of association between the voice recognition result and the behavior is the correlation score. When a voice recognition result is obtained, the robot changes the probability, or the degree of frequency, of a behavior depending on the correlation score.

When the voice recognition result is "Hey" in the behavior association table in FIG. 6, the degrees of frequency of the actions of "walking forward" and "biting" (each having a nonzero correlation score) taken by the robot are respectively increased by correlation scores of 10 and 20. When the voice recognition result is "come over here", the degree of frequency of the action of "walking forward" (having a nonzero correlation score) is increased by a correlation score of 60. When the voice recognition result is "shake hands", the degree of frequency of the action of "looking up" (having a nonzero correlation score) is increased by a correlation score of 20, and at the same time, the degree of frequency of the action of "shaking hands" is increased by a correlation score of 70.
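The following sketch illustrates one plausible way such correlation scores could bias action selection. The scores of 10, 20, 60, and 70 come from the text above, while the baseline weights and the additive weighting scheme are assumptions made for the example.

```python
# Sketch of how a correlation score from FIG. 6 could bias behavior selection;
# the weighting scheme is assumed, the correlation scores are from the text.
BASE_FREQUENCY = {"walk forward": 30, "look up": 30, "bite": 30, "shake hands": 30}
CORRELATION = {  # word -> {action: correlation score}
    "Hey": {"walk forward": 10, "bite": 20},
    "come over here": {"walk forward": 60},
    "shake hands": {"look up": 20, "shake hands": 70},
}

def biased_weights(word):
    # Increase the degree of frequency of each action that has a nonzero
    # correlation score for the recognized word.
    scores = CORRELATION.get(word, {})
    return {a: BASE_FREQUENCY[a] + scores.get(a, 0) for a in BASE_FREQUENCY}

print(biased_weights("Hey"))  # walk forward: 40, bite: 50, others unchanged
```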
The behavior decision unit 33, in principle, determines which node to transition to from the node corresponding to the current behavior in the stochastic automaton serving as the behavioral model (see FIG. 5), based on the values of the emotion model and instinct model of the emotion/instinct model unit 32, elapsed time, and the recognition result of the sensor signals provided by the sensor signal processor 31, in addition to the transition probability set for the arc extending from the current node. The behavior decision unit 33 then supplies the emotion/instinct model unit 32 and the posture transition unit 34 with the behavior information representing the behavior corresponding to the node reached by the node transition (also referred to as a post-node-transition action).
Depending on the values of the emotion model and instinct model, the behavior decision unit 33 transitions to a different node even if the sensor signal processor 31 outputs the same external recognition results.
Suppose, for example, that the output of the sensor signal processor 31 indicates that the palm of a hand is stretched out in front of the robot. When the emotion model of "anger" indicates that the robot is "not angry" and the instinct model of "appetite" indicates that the robot is not hungry, the behavior decision unit 33 decides to drive the robot to shake hands as a post-node-transition action, in response to the stretched palm.
Similarly, suppose that the output of the sensor signal processor 31 again indicates that the palm of the hand is stretched out in front of the robot. When the emotion model of "anger" indicates that the robot is "not angry" but the instinct model of "appetite" indicates that the robot is hungry, the behavior decision unit 33 decides to lick the palm of the hand as a post-node-transition action.
Again, suppose that the output of the sensor signal processor 31 indicates that the palm of the hand is stretched out in front of the robot. When the emotion model of "anger" indicates that the robot is "angry", the behavior decision unit 33 decides to drive the robot to abruptly look away, as a post-node-transition action, regardless of the value of the instinct model of "appetite".
When the recognition result of the sensor output provided by the sensor signal processor 31 indicates that a voice of the user has been recognized, the behavior decision unit 33 determines which node to transition to from the node for the current behavior, based on the correlation scores, registered in the behavior association table (see FIG. 6) in the behavior association table memory 33B, of the behaviors indicated by the voice recognition result. The behavior decision unit 33 then supplies the emotion/instinct model unit 32 and the posture transition unit 34 with the behavior information indicating the behavior (post-node-transition action) corresponding to the decided node. In this way, the robot behaves differently depending on the correlation scores of the behaviors in accordance with the voice recognition result.
Upon receiving a predetermined trigger, the behavior decision unit 33 transitions to a node in the behavior model, thereby deciding a post-node-transition action to take. Specifically, the behavior decision unit 33 decides a post-node-transition action to take when a predetermined time has elapsed since the robot started the current action, when the sensor signal processor 31 outputs a particular recognition result such as a voice recognition result, or when the value of the emotion model or the instinct model of the emotion/instinct model unit 32 rises above a predetermined threshold.
Based on the behavior information provided by the behavior decision unit 33, the posture transition unit 34 generates posture transition information for transitioning from a current posture to a next posture, and outputs the posture transition information to the data control unit 35. Specifically, the posture transition unit 34 recognizes the current posture based on the outputs of the rotary encoders 12_1 through 12_N, calculates the angles of rotation and rotational speeds of the motors 7_1 through 7_N for the robot to take an action (a post-node-transition action) corresponding to the behavior information from the behavior decision unit 33, and then outputs them as the posture transition information to the data control unit 35.
The data control unit 35 generates drive signals for driving the motors 7_1 through 7_N in accordance with the posture transition information from the posture transition unit 34, and supplies the motors 7_1 through 7_N with the drive signals. The robot thus takes the post-node-transition action accordingly.
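The following is a minimal sketch of the posture-transition calculation described above, computing a target angle and rotational speed per joint before handing control data to the motor driver. The joint names, the numbers, and the linear constant-speed rule are all illustrative assumptions, not taken from the patent.

```python
# Hypothetical posture-transition sketch: for each motor joint, derive the
# target rotation angle and a rotational speed, i.e. the control data that
# the posture transition unit would pass to the data control unit.
def posture_transition(current_angles, target_angles, duration_s):
    """Return {joint: (target angle, rotational speed)} per motor."""
    control = {}
    for joint, target in target_angles.items():
        delta = target - current_angles[joint]              # rotation needed
        control[joint] = (target, abs(delta) / duration_s)  # constant speed
    return control

current = {"neck": 0.0, "tail": 10.0}   # from the rotary encoders
target = {"neck": 30.0, "tail": 0.0}    # posture for the decided action
print(posture_transition(current, target, duration_s=2.0))
```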
FIG. 7 is a functional block diagram of a portion of the sensor signal processor 31 shown in FIG. 4, which is hereinafter referred to as a voice recognition module and performs voice recognition in response to voice data from the microphone 9.

The voice recognition module recognizes a voice input to the microphone 9 using a continuous HMM (Hidden Markov Model), and outputs voice recognition results.
A feature parameter extractor 41 receives the voice data from the microphone 9. The feature parameter extractor 41 performs MFCC (Mel-Frequency Cepstrum Coefficient) analysis on the voice data input thereto on a frame-by-frame basis. The MFCC analysis result is output to a matching unit 42 as a feature parameter (feature vector). As feature parameters, the feature parameter extractor 41 may further extract a linear prediction coefficient, a cepstrum coefficient, a line spectrum pair, and the power in every predetermined frequency band (the output of a filter bank).
Using the feature parameters from the feature parameter extractor 41, the matching unit 42 recognizes the voice input to the microphone 9 based on the continuous HMM method while referencing an acoustic model memory 43, a dictionary memory 44, and a grammar memory 45 as necessary.

The acoustic model memory 43 stores an acoustic model that represents acoustic features, such as the phonemes and syllables, in a voice to be recognized. Since voice recognition is here carried out using the continuous HMM method, an HMM is employed. The dictionary memory 44 stores a dictionary of words which contains information on the pronunciation (phonological information) of each word to be recognized. The grammar memory 45 stores a grammar which describes how the words registered in the dictionary of the dictionary memory 44 are chained. The grammar may be a context-free grammar, or a rule based on word-chain probabilities (N-gram).
The matching unit 42 produces an acoustic model of a word (a word model) by connecting acoustic models stored in the acoustic model memory 43 while referencing the dictionary in the dictionary memory 44. The matching unit 42 further connects several word models by referencing the grammar stored in the grammar memory 45, and processes the connected word models through the continuous HMM method based on the feature parameters, thereby recognizing the voice input to the microphone 9. Specifically, the matching unit 42 detects the word model having the highest score (likelihood) given the time-series feature parameters output by the feature parameter extractor 41, and outputs the word (or word chain) corresponding to that word model. The voice recognition result of the matching unit 42 is output to the emotion/instinct model unit 32 and the behavior decision unit 33 as the output of the sensor signal processor 31.
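A conceptual sketch of this matching step is given below: each word model is scored against the time-series feature parameters, and the highest-scoring word is output. The per-frame scoring function is a stand-in for the continuous-HMM forward computation, not an implementation of it, and all names are illustrative.

```python
import math

def log_likelihood(word_model, features):
    # Placeholder scoring: a real system runs the HMM forward algorithm over
    # the MFCC frames; here each model just scores frames independently.
    return sum(word_model(frame) for frame in features)

def recognize(features, word_models):
    # Return the word (or word chain) whose model scores highest.
    best_word, best_score = None, -math.inf
    for word, model in word_models.items():
        score = log_likelihood(model, features)
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# Toy usage: each "model" prefers frames near a characteristic value.
models = {"hey": lambda f: -abs(f - 1.0), "sit": lambda f: -abs(f - 2.0)}
print(recognize([1.1, 0.9], models))  # "hey"
```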
The operation of the voice recognition module shown in FIG. 7 is now discussed with reference to the flow diagram shown in FIG. 8.

A voice is input to the microphone 9. The digital voice data responsive to the voice is fed from the microphone 9 to the sensor signal processor 31, and then to the feature parameter extractor 41.
In step S1, the feature parameter extractor 41 subjects the voice data to acoustic analysis, thereby extracting a time-series feature parameter. The feature parameter is then fed to the matching unit 42. In step S2, the matching unit 42 subjects the voice input to the microphone 9 to voice recognition through the continuous HMM method, using the feature parameter from the feature parameter extractor 41.

In step S3, the matching unit 42 determines whether the voice recognition result indicates an unknown word (an unknown word chain), namely, a word not registered in the dictionary.