`
`Introduction to Speech Recognition
`
`high-level descriptions depend entirely upon the amount of work a developer puts
`into factoring the application.
`
`To write scripts, users must have AppleScript—aware applications and must know the
`scripting commands for those applications.
`
`To record scripts, users must have applications that can convert actions into commands
`as users perform them. Moreover, they’ll probably have to look at the recorded script to
`determine whether the application recorded all their actions.
`
As shipped, the Macintosh Quadra 840AV and Macintosh Centris 660AV are not supplied
with scriptable or recordable applications. Some third-party applications are currently
available. The Apple Scriptable Text Editor is recordable, and Excel and FileMaker® Pro
are scriptable. The AppleScript system will become more useful as more and more
applications support it.
`
`QuicKeys
`
QuicKeys is good for recording low-level events and thus for handling simple
interactions with most applications. It suffers from many of the same problems as the
original MacroMaker from Apple:
`
■ It usually just replays users' actions exactly (users see the interface flying by as if they
were doing it).

■ Since it's just replaying low-level events, many of its commands break down if the
position of the underlying object changes.

■ It lacks the full expressive qualities of AppleScript; it's really its own language, but
one lacking sophisticated conditionals, loops, procedure calls, and so on.
`
`To write scripts, users must know the QuicKeys language supported by the OSA
`component so that they can change volatile commands such as Drag and Click At to
`more stable commands where possible. The Macintosh Quadra 840AV and Macintosh
`Centris 660AV system software supports QuicKeys, so users can create new macros. CE
`Software also provides a set of example macros written with QuicKeys. The QuicKeys
`scripting language may not include the full power of the “normal” QuicKeys system; for
`example, QuicKeys extensions, which circumvent the interface and set values directly,
`may not be supported.
`
`
`
`User Requirements
`
As with AppleScript, users of the Speech Macro Editor will have to be fairly sophisticated
`to be able to write and edit scripts; the majority of users will have to use prewritten
`scripts. Recording should allow less experienced users to create voice macros, but
`recording must be viewed as a shortcut for typing a new script; further editing will
`probably be required.
`
`Speech Macro Editor
`
`327
`
`PUMA EXHIBIT 2005
`
`Page 351 of 500
`
`PART 8 OF 10
`
`
`
`
`CHAPTER 7
`
`Introduction to Speech Recognition
`
`Since the SME isn’t trying to reproduce the full suite of scripting tools being developed
`for AppleScript (no debugging, no access to help on scripting commands, and so on), its
`users will need to know how to find the answers to these questions:
`
■ Is the application AppleScript aware? Is it also recordable? What scripting commands
does it provide?

■ What scripting commands does the script system provide?
`
`The script editor provided with AppleScript has facilities for developing more
`complicated scripts than are possible with SME and includes complete debugging and
`error reporting features.
`
`Using the Speech Macro Editor
`
`The Speech Macro Editor is an application in the Extras folder on the hard disk. Here’s
`how to start it:
`
`1. Open the Extras folder.
`
`2. Open the Speech Macro Editor by double-clicking its icon.
`When the Speech Macro Editor starts up, by default it automatically opens the Speech
`Macros document from the Extensions folder. The document window for Speech
`Macros lists all the speech macros it contains. Initially, the SME does not select an item
`in the list. A typical Speech Macros document window is shown in Figure 7-5.
`
`Figure 7-5
`
`Typical Speech Macro document window
`
Speech Macros

Close all subfolders          Finder
Find the original item        Finder
Throw these away              Finder
Turn off the Trash warning    Finder
Turn on file sharing          Finder
Macro 1                       MS Word
Macro 2                       MS Word
Macro 3                       MS Word
Macro 1                       Studio/32
Macro 2                       Studio/32
Macro 3                       Studio/32
`
`IMPORTANT
`
Speech macros can exist in any SME document. The Speech Macros
document is just the default document shipped with the computer. An
SME document must be in the Extensions folder or the System Folder
for it to become part of the current grammar. ▲
`
`
`Recording a New Macro
`
`To record a new macro, follow the steps below. As an example, we’ll create a speech
`macro for copying the first item in the Scrapbook.
`
`1. Choose Create New Macro from the Macro menu.
`
A blank macro window appears onscreen. The insertion point is set in the Name field.
`
2. Type the phrase that you want to speak.
`It’s best for the name to be a phrase rather than a single word. Recognition works
`faster and more accurately if the differences among names are more pronounced. Also
`note that, unlike most speech recognition technologies, Casper can recognize
`continuous speech.
`
3. From the Context pop-up menu, choose the context in which you'll be able to speak
the new macro.
`
`In this case, you want to have this macro available at all times, so change the context
`to Anywhere.
`
Figure 7-6 shows these three steps performed in a New Macro window.
`
`
`
`4. Choose a script language for recording the macro.
`The choice of script system determines what applications (and events in those
`applications) are recordable. Users need to understand the benefits and limitations of
`a particular choice here. Since the system and Finder won’t support AppleScript,
`change the script system to QuicKeys.
`
`Figure 7-6
`
`Typical New Macro window
`
`
`
[The New Macro window contains a Name field (here showing a phrase such as
"Copy from Scrapbook"), a Context pop-up menu, and a Script pop-up menu.]
`
`
`
`Note
`
If users open the SME on a system that's not running AppleScript, they
can only edit scripts. The Record, Stop, and Play buttons and the Script
pop-up menu will not be available; these items will be dimmed. ◆
`
`
5. Click the Record button.
`
The button "locks" in place, and a small recording icon blinks over the Apple menu
while the user is in record mode. The icon that appears depends on the script system
(AppleScript displays a small cassette, QuicKeys a small microphone).
`
6. Switch to the application and perform the desired actions.
`In this example, pull down the Apple menu, choose Scrapbook, choose Copy, and
`close the window.
`
`IMPORTANT
`
Script systems may handle the posting of commands differently.
For example, AppleScript sends commands to the SME after
each one occurs; QuicKeys returns an entire script after the user
stops recording. ▲
`
`7. Return to the SME by clicking an open SME window or by choosing SME from the
`Application menu, then click Stop.
`Wait until the recording icon stops blinking. The script appears in the script area of
`the window.
`
`
8. Click the close box to save the new macro.
`
`Renaming a Macro
`
`To change the name of a macro, follow these steps:
`
1. Select the macro you want to edit and choose Edit Macro from the Macro menu, or
double-click the macro name.

A macro window appears.

2. Type the new name in the text field.

3. Click the close box.
`
`The window disappears, the name and context items in the list of macros change, and
`the list is sorted.
`
`Saving Macro Changes
`
At any point, the user can save changes to an SME document by choosing one of the
Save commands in the File menu. These commands are available from the main
document window or any of the macro windows. If a save command is chosen when a
macro window is active, the SME saves the entire document in which that macro resides.
`
`The SME displays the standard Save Changes dialog box if the user closes a document
`window without having first saved changes.
`
`Loading Macros
`
`Casper loads macros at the following times:
`
■ When users turn speech on from the Speech Setup control panel, Casper loads rules
from any SME document that is in the Extensions folder or the System Folder on the
startup disk. Casper also loads any speech rules documents found in either of these
two locations.
`
`
■ When users make changes to any of the SME documents loaded from the Extensions
folder or the System Folder, Casper reloads the changed SME documents when the
user saves the document. Casper should acknowledge that it's reloading the macros
(so that users know it's happening) by posting a message to the feedback window.
`
■ Users who place new SME documents or new speech rules documents in the
Extensions folder or the System Folder must stop and restart Casper to load the new
documents. Casper keeps track only of the documents it loaded when starting up.
`
■ If an application contains speech rules in its resource fork, they will be loaded when
the application is launched. For further information, see "Speech Rules Files," in
Chapter 8.
`
`Built-in Speech Rules and Grammar
`
`Speech rules are structures used to define how words can be strung together for speech
`recognition. They are discussed in detail in Chapter 8, “Speech Rules.” Since many
`commands (such as those that choose menu items) are required in all applications, a
`standard set of rules is built into Casper to provide a robust set of standard commands.
`Many menu functions are common across a wide variety of applications, and most
`applications will also use Finder-type commands to access the Apple menu items.
`
`
In English, grammar rules define the subject-verb-object sequence. A similar
sequence must be identified explicitly for the speech recognizer. For example:
`
`"Open Chooser”
`
`"Open the Chooser”
`
`"Open menu item Chooser"
`
`could all be used to open the Chooser control panel. All of the acceptable word strings
`must be defined in order for the Speech Monitor to select the correct command. If the
`user says "Chooser open,” the rules in this example will not recognize that statement as
an acceptable command. If the word string "Chooser open" is added to the rules, then
Casper will recognize it as an acceptable command.
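Using the speech rule notation described in Chapter 8, a single rule accepting all three phrasings might be sketched as follows. The action here simply speaks an acknowledgment with the documented acknowledge saying command; a real rule would instead contain a script that actually opens the Chooser:

%rule
Open Chooser
Open the Chooser
Open menu item Chooser
%action
acknowledge saying "Now opening the Chooser"
%end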
`
`In the Macintosh Quadra 840AV and Macintosh Centris 660AV speech recognition
`software, all menu items and dialog box buttons are controllable by speech. Use the
`following command forms:
`
`I "Open AppleMenuItem,” where AppleMenuItem is any item within the Apple menu—
`for example, "Open Alarm Clock”
`
`“Switch to ProcessMenuItem,” where ProcessMenuItem is the name of any process—for
`example, "Switch to Finder”
`
`A new Speakable Items folder exists in the Apple Menu Items folder in the Macintosh
`Quadra 840AV and Macintosh Centris 660AV system software. Any item or alias to an
`item within it will be speakable. Some aliases to standard items are installed
`automatically, such as Open System Folder. The phrase to speak these items is the same
`as the name given to the item. AppleScript (or QuicKeys) items can be placed in the
`
`
`Speakable Items folder as well. This folder is not dynamically updated at present, so
`speech must be shut down and restarted to load any new items placed in it.
`
`Here are some sample Finder phrases:
`
■ "Hello"

■ "What time is it"

■ "What day is it"

■ "close window"

■ "close all windows" (available only when the Finder is frontmost)

■ "zoom window"

■ "is file sharing on"

■ "start file sharing"

■ "stop file sharing"

■ "shut speech off"
`
`Here are sample printing phrases:
`
`I "Print from n to n”
`
`I "Print from page 11 to m”
`
`I "Print n copies”
`
`"Print page 11”
`
`Performance
`
`Casper’s speech recognition goal is a minimum in—grammar error rate for a typical
`task in a low-noise environment. In-grammar error rate is the number of times the
`speech recognition software does not respond as intended when a defined command
`is spoken. All of the variables listed in the next section affect the ability of the system
`to recognize speech.
`
`"Print page 11 to m,” where n and m are numbers from 1 through 99. (This works in all
`applications that use Cmd-P to print.)
`
`Performance
`
`Real-Time Response
`
`Response time is a function of several variables:
`
■ Clear pronunciation. The search tends to be faster if the utterance is spoken clearly in
North American English.

■ Grammar complexity. The higher the number of possible word phrases in the speech
rules, the longer the search and the higher the error rate.
`
`
■ Word complexity. The choice of words can affect the duration of the search;
similar-sounding words are harder to distinguish.

■ Extraneous noise. Additional noise affects the quality of the input and potentially
increases the search time as the noise increases.

■ Room acoustics. Bad acoustics may degrade system performance, including response
time, from one acoustic environment to the next.

■ Environmental adaptation. This algorithm adapts to changing room conditions and
background noise; after every five utterances, the environmental adaptation is
updated.
`
`Types of Errors
`
`Taken as a group, the rules for a specific application form the grammar for that applica-
`tion. The recognition search returns the best match from the available grammar.
`
One type of error occurs when the search results are too uncertain, in which case the
speech recognizer rejects the sentence as unrecognizable. Another type of error occurs
when the utterance is matched to an incorrect in-grammar sentence, so the speech
recognizer responds although no such command was given.
`
`Apple’s naming conventions for speech recognition responses, both correct and
`erroneous, are shown in Table 7-1.
`
`
`For the in-grammar case the first item is the nonerror response. For the out-of-grammar
`case, the first two items are the nonerror responses. There are several reasons why a
`phrase might not be properly recognized—for example, unclear pronunciation,
`background noise, or bad room acoustics.
`
`
`
Table 7-1    Grammatical naming conventions

In-grammar:
■ Correct recognition
■ Incorrect recognition

Out-of-grammar:
■ Correct rejection
■ Correct detection of new word
■ Incorrect rejection (correct words/grammar not identified)
■ Incorrect recognition (through substitution or insertion)
`
`Acceptable Limits or Constraints
`
`The system is constrained to North American adult English used in grammatically
`simple sentences. Note that this also implies a limited vocabulary.
`
`The system accuracy will typically drop during changing environmental conditions.
`Adaptation takes approximately five utterances.
`
The speech recognition software currently understands only clearly spoken English
words. The user must articulate all words and sentences clearly.
`
`
`Note
`
The speech recognizer cannot recognize most nonstandard English
words. A nonstandard word could be any word formed by
concatenating words, using abbreviations, or taking other shortcuts,
which typically results in many ways to say the same word. The current
recognizer accepts only one pronunciation for a word, with only small
variations from that pronunciation. As an example, a made-up word
used as a filename may not be recognizable. Abbreviated forms of words
(such as MPW) are not typically recognizable as words. ◆
`
`
`
`
CHAPTER 8
`
`Speech Rules
`
`
`
`
This chapter describes how the speech recognition software in the Macintosh
Quadra 840AV and Macintosh Centris 660AV uses speech rules to interpret and respond
to the user's utterances. It also describes the CompileRules tool, available with the
Macintosh Programmer's Workshop (MPW), which compiles the rule source files
into resources. Read Chapter 7, "Introduction to Speech Recognition," before reading
this chapter.
`
`Overview
`
`At the heart of Apple’s speech recognition system is a data structure called a speech rule.
A speech rule is a word or a sentence that is defined to perform an action within the
`current computer environment. Each rule performs a unique function depending on the
`words spoken. An application’s grammar is derived from the set of speech rules and the
`current context.
`
A rule can include variables in locations that may match more than a single word. A
word within a sentence that can be replaced by other words is called a category.
A category can consist of an individual word or another category. When it is a predefined
`category, the acceptable words are listed in that category. For example, <number> can
`be a number from 1 to 9. A <ten> is defined as a number in the tens location, plus a
`<number> or a 0. A <hundred> is defined as a number in the hundreds location, plus a
`<ten> or a 0, plus a <number> or a 0. This process can be continued to make up any
`arbitrarily large number. In each case the category is made up of previously defined
`categories, except for <number>, which is a list of individual words.
`
`
`In its simplest form, a speech rule maps some spoken utterance to a value or an action.
`When the speech recognition software detects that the user has uttered the phrase, the
`corresponding value is computed or an action is performed. Here is an example of a
`simple speech rule:
`
%rule
bold
%action
tell application "MyApp"
    set style of selection to bold
end
%end
`
`The effect of this rule is that whenever the user says “bold,” the application named
`MyApp changes the selected text to bold. The %rule clause signals the beginning of a
`new speech rule; the line containing bold contains the phrase that should be recognized;
`the %action clause signals the beginning of the action part of the rule; the lines from
`tell to end contain the script that should be executed when the rule’s phrase is
`recognized; and the %end clause signals the end of the rule.
`
`
`A speech rule can have any number of phrases—for example:
`
%rule
bold
change to bold
bold this
make it bold
%action
tell application "MyApp"
    set style of selection to bold
end
%end
`
`
This is a valid speech rule, the effect of which would be to cause MyApp to change the
style of the selection when any of the specified phrases is recognized.

Note
Avoid using the same speech string twice. If two speech commands are
identical, Casper will use only the first macro it finds. The second macro
will be ignored. ◆

One problem with the foregoing rule is that it causes the MyApp application to change
styles, no matter what application is currently active. So, if MacWrite® is the active
application and the user says "bold," MyApp will change styles (or worse, if MyApp is
not running, it will be launched and then it will change styles). One way around this
problem is to specify that the rule should be active only when MyApp is the active
application:

%rule
bold
%context application "MyApp"
%action
tell application "MyApp"
    set style of selection to bold
end
%end

In this case, the speech recognizer listens for the phrase bold only when the MyApp
application is active. Spoken commands that make sense only when a particular
application or window is active can be marked in this fashion.
`
As you begin to build up larger vocabularies for your computer, you will want to avoid
having to enumerate every utterance that the system should recognize. Speech rules can
be used to construct entire grammars of what the user can say. For example, let's say you
`
`
`want to define a rule that allows the user to change the selection to any style, without
`having to list every utterance separately:
`
%define style
bold
italic
underline
%end
`
`The foregoing is called a category rule. It is similar to the command rule in that it defines
`a set of phrases that the user might say. However, it does not specify an action. Instead,
`this rule defines a token, <style>, which can be used in other rules instead of directly
`enumerating the category’s phrases.
`
%rule
<style>
change to <style>
%action
tell application "MyApp"
    set style of selection to ...
end
%end
`
`Defining the rules this way lets you specify the syntax of the style command itself
`separately from the syntax for the style names. Other commands can also refer to the
`<style> category.
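For example, a second command rule could reference the same <style> category without repeating the style names; the phrase wording below is invented for illustration, and the ellipsis again stands for the actual style:

%rule
make it <style>
make this <style>
%action
tell application "MyApp"
    set style of selection to ...
end
%end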
`
`
Note that in the action for the preceding rule, an ellipsis (...) was used in place of the
actual style. The initial example used a constant style, but in this case, the actual style
depends on which style the user says. There is a way to pass that information from the
category rule to the command rule, by attaching a script fragment to each phrase. The
script fragment returns a value representing the meaning of that phrase; for example:
`
%define style
plain     ; {meaning: plain}
bold      ; {meaning: bold}
italic    ; {meaning: italic}
%end
`
`For each phrase, the text to the left of the semicolon defines what the user can say, and
`the text to the right of the semicolon is the AppleScript expression. This technique allows
`
`
`the rule writer to assign a meaning to each of the possible phrases that the user may
`utter. The rule that uses this meaning, then, looks like this:
`
%rule
<s:style>
change to <s:style>
%action
tell application "MyApp"
    set style of selection to meaning of s
end
%end
`
Every reference to a category whose value is needed should be preceded with a variable
name. When the subsequent script is executed, the variable will be bound to the value
returned by the category rule. For example, if the user says "change to bold," the style
category matches the word bold, producing as its value the Apple event record
{meaning: bold}. The above command rule then matches the entire utterance,
executing its script with the variable s bound to the value produced by the corresponding
category rule. Finally, the expression meaning of s retrieves the style constant from the
meaning record.

Note the use of the meaning property to access the value computed by the category.
Whenever a phrase's script is evaluated, the value returned is always coerced into an
Apple event record. In the example just given, a record was used as the value of each
of the category's phrases. Since it was already a record, it was used as is. If the value is
any other data type, it is stored as the meaning property of an Apple event record,
and the record is used as the returned value. For example, the following two phrases
are equivalent:

one     ; 1
one     ; {meaning: 1}

Thus, when accessing the value bound to a variable in a category reference, it is usually
necessary to get its meaning property.
`
Here is another example of using categories to define a grammar for numeric digits:

%define digit
one      ; 1
two      ; 2
three    ; 3
...
%end

This category defines a simple grammar that will recognize a single spoken digit and
return the numeric value of that digit. A script can access the value returned by a
category by preceding the category reference with a variable name:

%rule
what is <n:digit> plus <m:digit>
%action
set x to (meaning of n) + (meaning of m)
%end

Using the techniques described so far, you can define a category for recognizing whole
numbers less than 100. First, define a rule to recognize the tens words:

%define tens
twenty   ; 20
thirty   ; 30
...
ninety   ; 90
%end

This rule is exactly like the definition of digit just given. Next, define a rule to recognize
the teens words:

%define teens
ten      ; 10
eleven   ; 11
...
nineteen ; 19
%end
`
`
Finally, define the rule that combines all the parts and returns the correct value for
multiword phrases:

%define hundred
<n:digit>            ; n
<n:teens>            ; n
<n:tens>             ; n
<n:tens> <d:digit>   ; (meaning of n) + (meaning of d)
%end
`
`For the first three phrases, the value returned is simply the value recognized by the
`subordinate category. For example, a single digit is a valid number to be recognized by
`this rule, and its value is simply the value returned by the digit rule. In the case of the
`fourth phrase, you want to recognize spoken numbers such as twenty-five. The script for
`this phrase essentially computes the meaning of speaking these two words in sequence.
`This is a trivial example, but the general mechanism is a powerful one that can be used
`to associate meaning with a wide variety of spoken commands.
`
`
`Note that in the fourth phrase above you did not write n + d as the script. This is because
`the values bound to n and d are Apple event records, not numbers. In the previous cases,
`you were simply passing on the values, so you could leave them as records; but when
`you want to do arithmetic, you need to access the meaning explicitly. An equivalent, if
`more verbose, expression is the following:
`
<n:tens> <d:digit>   ; {meaning: (meaning of n) + (meaning of d)}
`
`Sometimes the meaning of an utterance resides simply in the words spoken. For
`example, consider a phone-dialing application in which you want to acknowledge that
`the spoken command is being carried out:
`
%define name
John Doe      ; {phone: "555-7442"}
Bob Strong    ; {phone: "555-3295"}
%end

%rule
call <n:name>
%action
dial (phone of n)
acknowledge saying "Now dialing"
%end
`
`
In this example, each person's phone number is attached as part of the meaning
structure. Notice that a different property is used. This works fine; you can use any
properties that you want as long as you return a record. With speech recognition,
however, there is always the possibility of a mistaken recognition. It would be better
to tell the user the name of the person that the system is dialing, so that if it fails to
recognize correctly, the user has a chance to hang up before the call goes through.
You could attach the person's name to the meaning structure:

John Doe      ; {phone: "555-7442", name: "John Doe"}

However, this would be redundant. As a convenience, the Speech Monitor always adds
an utterance property to the value generated by a phrase script. The value assigned to
this property is a string containing the words that were matched by the category rule. So,
you can rewrite the action script of your phone-dialing rule as follows:

%rule
call <n:name>
%action
dial (phone of n)
acknowledge saying "Now dialing " & (utterance of n)
%end
`
`Speech Rules Files
`
`Speech rules are data structures that determine how spoken commands are interpreted.
`They are stored as resources either in speech rules files or in the resource fork of an
`application. When speech is started, the System Folder and the Extensions folder are
`scanned for speech rules files. Any speech rules files found in these two locations are
`scanned, and the rules in those files become active and are used for spoken command
`recognition. Rules resources present in an application are loaded when an application is
`launched.
`
`There are actually two different file types used for speech rules files: one for speech rules
`files proper and one for macro files. A speech macro is a simplified kind of speech
`rule that can be created with the application Speech Macro Editor. Internally, these files
`have identical formats, and the speech recognition system does not distinguish between
`the two.
`
`The CompileRules MPW tool is used to generate rules files or rule resources from text
`files. The syntax to invoke CompileRules is
`
CompileRules [options] input-file
`
`
`Any number of input files may be specified. The valid options are as follows:
`
-b
    Causes all scripts in the file to be precompiled and stored in their
    binary format. If this option is not specified, the rules will be
    compiled by the Speech Monitor at run time, on demand.

-base integer
    Causes rule resources to be numbered beginning at the specified ID.
    If this option is not specified, resource IDs begin at 0. This option is
    useful to prevent resource ID collisions when the rule resources are
    going to be installed in an application file. The rule compiler
    currently generates resources of types 'rule', 'glob', and 'scpt'.

-c creator
    Specifies a creator for the output file. If not specified, the creator is
    set to '????'.

-category category-name
    Used in conjunction with the -generate option to cause phrases to
    be generated by a particular category. If this option is not used, then
    phrases are generated from the set of all possible commands.

-generate all
-generate integer
    Prints to standard output a list of utterances generated by the
    grammar defined by the input files. If all is specified, then all
    possible utterances are listed. Otherwise, the number of utterances
    specified by integer is printed out, generated at random according
    to the method defined by the -method option.

-method total | phrase | mixed
    Specifies the method of generating random utterances. The total
    method generates each utterance from the total set of possible
    utterances with equal probability. The phrase method generates
    utterances such that each phrase from a rule is equally likely to be
    chosen. The mixed method uses the phrase method to choose a
    top-level command at random and then uses the total method to
    expand any categories contained in that utterance. The default
    method is mixed.

-o output-file
    Designates the output file. If this option is not specified, the input
    files are read and checked for correct syntax, but no output
    is created.

-p
    Causes informative progress messages to be written to standard
    output as the rules are compiled.

-unique
    Used in conjunction with the -generate option in order to force all
    generated utterances to be unique.
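Putting several options together, a hypothetical MPW invocation (the file names here are invented for illustration) might compile a rules file with precompiled scripts and resource IDs beginning at 128, then print 20 unique sample utterances from the resulting grammar:

CompileRules -b -base 128 -p -o "Speech Rules" MyRules.text
CompileRules -generate 20 -unique MyRules.text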
`
`The CompileRules tool must be run on a system that contains the scripting systems to
`be compiled (that is, AppleScript). Errors in the speech rules file will result in messages
`being written to standard output with the error and line of the file where each error
`occurred. The format for the text file is given in the next section.
`
`
`There are two kinds of speech rules: command rules and category rules. Command rules
`are like speech macros; in fact, speech macros are instances of command rules. They
`cause a specified action to occur when the Speech Monitor hears a particular phrase.
`Every command rule has the following parts:
`
■ A list of phrases, each of which defines a phrase that the user may utter to cause the
action below to be carried out. Each phrase has an optional script that can define a
semantic value to be associated with the phrase and can be accessed in the action's
script. The phrase itself consists of a list of tokens that are references to either words
or categories. Words are like terminals in a grammar, and categories are like
nonterminals.

■ An optional context that defines when the command rule is active. If the context is
empty, the rule is always active.

■ An optional condition, which is a script that determines whether or not the rule should
be considered active. This is like the context, except that it is evaluated rather than
constant, and it is evaluated after the utterance has been recognized. It is useful for
resolving ambiguities when more than one rule has matched the user's utterance.

■ An optional Boolean acknowledge flag that causes the command to be acknowledged in
the standard (nonverbal) way. If it is desired to provide verbal acknowledgment, then
the flag should be false, and the acknowledge AppleScript command should be
used. Normally this flag is used for very short commands, such as menu items, dialog
box buttons, and so on.

■ An optional target clause, which indicates a default target for the condition and action
scripts. If no target is specified, the default target for the scripts is the Speech Monitor
itself. The target can be changed in the script by using the tell clause of AppleScript.

■ An action, which is a script that is executed when one of the rule's phrases has been
uttered by the user and recognized. The action's script may refer to variable bindings
created by any of