`
`
`
`
`
`
`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`
`___________
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`
`___________
`
`CaptionCall, LLC
`
`Petitioner
`
`v.
`
`Ultratec, Inc.
`
`Patent Owner
`
`
`
`U.S. Patent No. 10,742,805
`Filing Date: April 3, 2017
`Grant Date: August 11, 2020
`
`Title: SEMIAUTOMATED RELAY METHOD AND APPARATUS
`
`___________
`
`IPR2021-01337
`
`
`
`
`
`PETITION FOR INTER PARTES REVIEW
`
`CONTENTS
`
`I. INTRODUCTION
`
`II. MANDATORY NOTICES, STANDING, AND FEES
`
`    A. Real Party-In-Interest under 37 C.F.R. § 42.8(b)(1)
`    B. Related Matters under 37 C.F.R. § 42.8(b)(2)
`    C. Lead and Backup Counsel under 37 C.F.R. § 42.8(b)(3)
`    D. Service Information under 37 C.F.R. § 42.8(b)(4)
`    E. Certification of Grounds for Standing under 37 C.F.R. § 42.104(a)
`    F. Fees Under 37 C.F.R. § 42.103(a)
`
`II. STATEMENT OF PRECISE RELIEF REQUESTED
`
`III. OVERVIEW OF THE ’805 PATENT
`
`    A. Disclosure
`    B. Person of Ordinary Skill in the Art (POSITA)
`    C. Claim Construction
`    D. Prosecution
`
`IV. SUMMARY OF THE PRIOR ART
`
`        1. Cloran teaches a hybrid transcription system using automatic
`           speech recognition and human call analysts
`        2. Madhavapeddl teaches using a data connection rather than an
`           audio channel to transmit higher fidelity audio for transcription
`        3. Geer teaches a system that displays a confidence level of
`           correctness in a transcription of a telephone call
`        4. Jaggi teaches tools to assist a transcriptionist, such as
`           presenting word options after typing a letter, and presenting
`           word options from which the transcriptionist may select
`        5. Carus teaches a system where an ASR transcript is re-transcribed
`           when the quality is too poor instead of simply undergoing error
`           correction
`
`V. THE CHALLENGED CLAIMS ARE UNPATENTABLE
`
`    A. Ground 1: Claims 1-4 and 7-11 are unpatentable as obvious under §
`       103 over Cloran
`
`        1. Claim 1
`        2. Claim 2
`        3. Claim 3
`        4. Claim 4
`        5. Claim 7
`        6. Claim 8
`        7. Claim 9
`        8. Claim 10
`        9. Claim 11
`
`    B. Ground 2: Claims 2 and 3 are unpatentable as obvious under 35
`       U.S.C. § 103 over Cloran in view of Madhavapeddl
`
`        1. A POSITA would have been motivated to combine Cloran
`           with the concept in Madhavapeddl of providing high-fidelity
`           audio over a data connection
`        2. Claim 2
`        3. Claim 3
`
`    C. Ground 3: Claim 4 is unpatentable as obvious under 35 U.S.C. § 103
`       over Cloran in view of Carus
`
`        1. A POSITA would have been motivated to combine Cloran
`           with the teachings in Carus of correcting or re-transcribing the
`           output of an ASR based on the confidence of the ASR
`        2. Claim 4
`
`    D. Ground 4: Claim 5 is unpatentable as obvious under 35 U.S.C. § 103
`       over Cloran in view of Geer
`
`        1. A POSITA would have been motivated to combine Cloran
`           with the teachings in Geer of indicating the confidence in an
`           ASR generated transcription
`        2. Claim 5
`
`    E. Ground 5: Claims 8 and 9 are unpatentable as obvious under § 103
`       over Cloran in view of Jaggi
`
`        1. A POSITA would have been motivated to combine Cloran
`           with the transcription assistive features of Jaggi
`        2. Claim 8
`        3. Claim 9
`
`VI. CONCLUSION
`
`EXHIBITS
`
`Exhibit No.   Description
`
`1001          U.S. Patent No. 10,742,805 to Engelke et al. (“the ’805 Patent”)
`1002          Prosecution History of U.S. Patent No. 10,742,805
`1003          Declaration of Mr. Benedict Occhiogrosso (“Occhiogrosso”)
`1004          U.S. Patent Application Publication No. US 2010/0063815 A1 to
`              Cloran et al. (“Cloran”)
`1005          U.S. Patent Application Publication No. 2013/0045720 A1 to
`              Madhavapeddl et al. (“Madhavapeddl”)
`1006          U.S. Patent Application Publication No. 2010/0030738 A1 to Geer
`              (“Geer”)
`1007          U.S. Patent Application Publication No. 2012/0016671 A1 to Jaggi et
`              al. (“Jaggi”)
`1008          U.S. Patent Application Publication No. 2006/0026003 A1 to Carus et
`              al. (“Carus”)
`1009          The Effect of Bandwidth on Speech Intelligibility, Jeff Rodman,
`              Polycom Whitepaper, January 16, 2003.
`1010          Claim Language and Reference Numbers of U.S. Patent No.
`              10,742,805
`1011          Final Written Decision in IPR2013-00288 (Paper 63) concerning
`              U.S. Patent No. 8,379,801 (“288 FWD”)
`1012          Final Written Decision in IPR2015-01889 (Paper 119) concerning
`              U.S. Patent No. 9,131,045 (“1889 FWD”)
`1013          Curriculum Vitae of Benedict J. Occhiogrosso
`
`I. INTRODUCTION
`
`Petitioner CaptionCall, LLC (“CaptionCall”) requests inter partes review
`
`(“IPR”) under 35 U.S.C. §§ 311-319 and 37 C.F.R. § 42.100 et seq. of claims 1-5
`
`and 7-11 of U.S. Patent No. 10,742,805 (“the ’805 Patent”).
`
`The ’805 Patent “relates to relay systems for providing voice-to-text
`
`captioning for hearing impaired users and more specifically to a relay system that
`
`uses automated voice-to-text captioning software to transcribe voice-to-text.” ’805
`
`Patent, 1:22-25. The purported improvement claimed in the ’805 Patent relates to an
`
`alleged recognition “that a hybrid semi-automated system can be provided where,
`
`when acceptable accuracy can be achieved using automated transcription software,
`
`the system can automatically use the transcription software to transcribe [hearing
`
`user] HU voice messages to text and when accuracy is unacceptable, the system can
`
`patch in a human [call assistant] CA to transcribe voice messages to text.” Id. at
`
`3:66-4:5. However, the alleged “recognition” and systems that used that concept
`
`were well known in the prior art well before the earliest priority date of the ’805
`
`Patent, as explained in the grounds below. As such, the Board should find the
`
`challenged claims obvious.
`
`II. MANDATORY NOTICES, STANDING, AND FEES
`
`A. Real Party-In-Interest under 37 C.F.R. § 42.8(b)(1)
`
`CaptionCall is a wholly owned subsidiary of Sorenson Communications, LLC
`
`(“Sorenson”). CaptionCall and Sorenson are the real parties-in-interest.
`
`B. Related Matters under 37 C.F.R. § 42.8(b)(2)
`
`The ’805 patent claims priority to U.S. Patent No. 10,389,876 and U.S.
`
`Provisional Patent App. No. 61/946,072, so a decision in this proceeding may affect
`
`that patent.
`
`Also, U.S. Patent App. No. 16/911,691 claims priority to the ’805 patent, so
`
`a decision in this proceeding may affect that application.
`
`C. Lead and Backup Counsel under 37 C.F.R. § 42.8(b)(3)
`
`Petitioner provides the following designations of counsel:
`
`Lead Counsel        Back-Up Counsel     Back-Up Counsel
`
`Adam F. Smoot       Brian Parke         B. Lance Jensen
`Reg. No. 63,433     Reg. No. 59,266     Reg. No. 68,022
`asmoot@mabr.com     bparke@mabr.com     ljensen@mabr.com
`
`D. Service Information under 37 C.F.R. § 42.8(b)(4)
`
`A Power of Attorney accompanies this Petition pursuant to 37 C.F.R. §
`
`42.10(b). Please address all correspondence to:
`
`Maschoff Brennan
`1389 Center Drive, Suite 300
`Park City, UT 84098
`Phone: 435.252.1360
`Fax: 435.252.1361
`
`
`Petitioner consents to electronic service by email at:
`
`asmoot@mabr.com
`bparke@mabr.com
`ljensen@mabr.com
`anixon@mabr.com
`
`
`E. Certification of Grounds for Standing under 37 C.F.R. § 42.104(a)
`
`Petitioner hereby certifies that the ’805 Patent is available for IPR and that
`
`Petitioner is not barred or estopped from challenging the claims of the ’805 Patent
`
`because: (1) Petitioner is not the owner of the ’805 Patent; (2) Petitioner has not filed a
`
`civil action challenging the validity of any claim of the ’805 Patent; and (3) Petitioner
`
`has not been served with a complaint alleging infringement of the ’805 Patent.
`
`Additionally, it has been at least nine months since the issuing of the ’805 Patent
`
`(the nine months expiring May 11, 2021). See 35 U.S.C. § 311(c)(1). The ’805 Patent
`
`granted August 11, 2020 from U.S. Application No. 15/477,958 (“the ’958
`
`Application”), filed on April 3, 2017, which claims priority through U.S. Application
`
`No. 14/632,257 (“the ’257 Application”) to U.S. Provisional Application No.
`
`61/946,702 (“the ’702 Application”), filed on February 28, 2014. The ’702 Application,
`
`the ’257 Application, and the ’958 Application have virtually identical specifications.
`
`Because the earliest priority claim is February 28, 2014 (after March 16, 2013), the ’805
`
`Patent is a first-inventor-to-file patent (see 37 CFR § 42.102(a)(2)).
`
`F. Fees Under 37 C.F.R. § 42.103(a)
`
`Petitioner authorizes the Office to charge any additional fees due in connection
`
`with this petition to Deposit Account No. 50-5394.
`
`II. STATEMENT OF PRECISE RELIEF REQUESTED
`
`Petitioner requests IPR of claims 1-5 and 7-11 of the ’805 Patent on the
`
`following grounds:[1]
`
`Ground   Proposed Statutory Rejections for the ’805 Patent
`
`1        Claims 1-4 and 7-11 are obvious under 35 U.S.C. § 103 over Cloran[2]
`2        Claims 2 and 3 are obvious under 35 U.S.C. § 103 over Cloran in
`         view of Madhavapeddl[3]
`3        Claim 4 is obvious under § 103 over Cloran in view of Carus[4]
`4        Claim 5 is obvious under § 103 over Cloran in view of Geer[5]
`5        Claims 8 and 9 are unpatentable under § 103 over Cloran in view of
`         Jaggi[6]
`
`[1] The earliest priority of the ’805 Patent is February 28, 2014.
`
`[2] Cloran was published March 11, 2010; therefore it is prior art under 35
`U.S.C. § 102(a)(1).
`
`[3] Madhavapeddl was published February 21, 2013; therefore it is prior art
`under 35 U.S.C. § 102(a)(1).
`
`[4] Carus was published February 2, 2006; therefore it is prior art under 35
`U.S.C. § 102(a)(1).
`
`[5] Geer was published February 4, 2010; therefore it is prior art under 35
`U.S.C. § 102(a)(1).
`
`[6] Jaggi was published January 9, 2012; therefore it is prior art under 35
`U.S.C. § 102(a)(1).
`
`III. OVERVIEW OF THE ’805 PATENT
`
`A. Disclosure
`
`As stated above, the ’805 Patent relates to “a hybrid semi-automated system
`
`. . . where, when acceptable accuracy can be achieved using automated transcription
`
`software, the system can automatically use the transcription software to transcribe
`
`[hearing user] HU voice messages to text and when accuracy is unacceptable, the
`
`system can patch in a human [call assistant] CA to transcribe voice messages to
`
`text.” ’805 Patent, 3:66-4:5. See Occhiogrosso, ¶¶ 29-35.
`
`Independent claim 1 of the ’805 Patent includes a feature to “generat[e] . . .
`
`first text captions from the [hearing user’s] voice signal using the [automatic speech
`
`recognition] engine,” and “automatically determin[e] . . . whether the generated first
`
`text captions meet a first accuracy threshold.” The claim includes optional treatment
`
`depending on the first accuracy threshold: “when the first text captions meet the first
`
`accuracy threshold,” the claim recites “sending the first text captions to an assisted
`
`user’s (AU’s) communications device for display.” In contrast, “when the first text
`
`captions fail to meet the first accuracy threshold,” the hearing user’s voice signal is
`
`presented to a human call assistant to generate second text captions, which are sent
`
`to the assisted user for display. Occhiogrosso, ¶¶ 34-42.
`
`B. Person of Ordinary Skill in the Art (POSITA)
`
`The ’805 Patent relates generally to telephone communications and text
`
`transcriptions thereof. A person of ordinary skill in the art (POSITA) of the ’805
`
`Patent, at the time of the alleged invention, would have had a bachelor’s degree in
`
`electrical engineering (or electrical and computer engineering) and a few years of
`
`experience with telephone communication system architecture, design and
`
`implementation, digitization of voice, and/or traditional relay systems. A person
`
`with less technical education but more experience, or vice versa, would have also
`
`met this standard. Occhiogrosso, ¶¶ 43-46.
`
`C. Claim Construction
`
`Petitioner has adopted the ordinary and customary meaning as understood by
`
`a POSITA for the claim terms. 37 C.F.R. § 42.100(b). Petitioner expressly reserves
`
`the right to adopt alternative claim constructions in other proceedings based on
`
`positions taken by Patent Owner. To the extent the Board determines that any
`
`particular term needs construction aside from recognizing that the ordinary meaning
`
`applies, Petitioner reserves the right to propose a particular construction at an
`
`appropriate time during the proceeding, such as in Petitioner’s Reply.
`
`D. Prosecution
`
`During prosecution of the ’805 Patent, Cloran was submitted in an IDS on June 9,
`
`2007 (along with one hundred forty-five pages’ worth of other references, see Ex.
`
`1002, p. 120 (Cloran) and pp. 112-257 (the eighteen IDSs submitted on June 9)) and
`
`is cited on the face of the patent. However, the Examiner never relied on Cloran in
`
`a rejection or otherwise mentioned it during prosecution, even though Cloran teaches
`
`nearly all of the elements of the challenged claims, as explained below.
`
`The Board should decline to exercise its discretion under § 325(d) and should
`
`proceed to institution based on the merits. In particular, when considering the Becton
`
`Dickinson factors (see Becton, Dickinson & Co. v. B. Braun Melsungen AG,
`
`IPR2017-01586, Paper 8 at 17–18 (PTAB Dec. 15, 2017) (precedential)), there is
`
`every reason to proceed to institution. With respect to factors a) and b), the claims
`
`were repeatedly rejected as being anticipated by U.S. Patent No. 9,336,689 to
`
`Romriell. See, e.g., Ex. 1002, pp. 317-25, 542-51, and 598-603. To finally overcome
`
`the reference, Patent Owner argued that “[Automatic Speech Recognition] ASR
`
`engines do not automatically identify an accuracy level of generated text or compare
`
`a text accuracy level to an accuracy threshold of any type for any reason. Instead,
`
`ASR engine simply generate a sequence of hypotheses for each word in a voice
`
`signal where each is considered 100% accurate and there is no comparison
`
`whatsoever to an accuracy threshold.” Id. at pp. 571-73 (emphasis in original).
`
`However, as explained below in Sections IV and V, those very features were
`
`taught in Cloran well before the date of alleged invention of the ’805 Patent.
`
`With respect to factors c) and d), Cloran is cited on the face of the ’805 Patent
`
`as having been considered by the Examiner;[7] however, Cloran was never the basis
`
`for a rejection, and thus the arguments and manner in which Petitioner relies on
`
`Cloran have not been considered by the Office previously. With respect to factors e)
`
`and f), even if it were argued that the Examiner had substantively considered Cloran
`
`and still allowed the case, the Examiner would have acted in error as explained below
`
`in Sections IV and V.
`
`Therefore, there is every reason to proceed to institution, and Petitioner
`
`respectfully requests the Board refrain from exercising discretion under § 325(d) to
`
`deny institution.
`
`
`[7] None of the other references relied upon in this Petition appear to have
`
`been before the Examiner.
`
`IV. SUMMARY OF THE PRIOR ART
`
`1. Cloran teaches a hybrid transcription system using
`automatic speech recognition and human call analysts
`
`Cloran discloses a system for providing real-time transcriptions. See, e.g.,
`
`Cloran, Abstract. For example, the system can be used to provide a real-time
`
`transcription to participants in a conference call. The general system 100 is described
`
`with respect to Figure 1. The system 100 connects users 110, 120, and 130 with a
`
`central system 140 and connects the users 110, 120, and 130 together by way of
`
`their terminal equipment 112, 122, and 132. See, e.g., id. at ¶ [0010] and Figure 1.
`
`Participants have a voice connection to the central system, and may also have a
`
`data connection, such as through a laptop (e.g., the user 120 has a voice
`
`connection 125 through their cell phone 124 and a data connection 123 through
`
`their laptop 126). Id. See Occhiogrosso, ¶¶ 49-50.
`
`Figures 3 and 4 of Cloran illustrate systems 300 and 400 and examples of
`
`flows through the central system 140 of Figure 1 based on various “modes” of
`
`operation of the central system 140. For example, Cloran shows the various voice
`
`connections and data connection of Figure 1 flowing into more detailed components
`
`of the central system 140 in Figures 3 and 4. See, e.g., id. at ¶¶ [0016] (“[s]ystem
`
`300 includes media server 310, into which voice connections 115, 125, and others
`
`run”); see also id. at ¶¶ [0027], [0030]. The system 300 of Figure 3 illustrates flows
`
`in which one or more human analysts 340 are involved in a “verbatim interpreting”
`
`fidelity mode, or in a “text interpreting” fidelity mode. See, e.g., id. at ¶ [0029]
`
`(Figure 3 illustrating “verbatim interpreting” and “text interpreting” fidelity modes).
`
`The system 400 of Figure 4 illustrates flows in which an automatic transcription of
`
`a conference call is generated in a third fidelity mode: “automatic transcription.” Id.
`
`at ¶ [0030] (Figure 4 illustrating an “automatic transcription” fidelity mode).
`
`Additionally, Cloran discloses embodiments that utilize both automatic
`
`transcription and human analysts, or in other words, the elements from both Figure
`
`3 and Figure 4. See, e.g., id. at ¶ [0032]. See Occhiogrosso, ¶¶ 51-52.
`
`Additionally, Cloran teaches that the various disclosed devices (such as those
`
`in Figures 1, 3, and/or 4) can be implemented by computing devices such as those
`
`illustrated in Figure 2. See, e.g., id. at ¶ [0012] (“The computers used as servers,
`
`clients, resources, interface components, and the like for the various embodiments
`
`described herein generally take the form shown in FIG. 2.”). See Occhiogrosso, ¶¶
`
`51-52.
`
`In system 400, voice connections 115, 125 between the terminal equipment
`
`112, 122 and the system 400 provide the voice audio of a conference call between
`
`the users 110, 120, which includes the speech of the conference call, to the media
`
`server 410. See, e.g., id. at ¶¶ [0027], [0030], and [0032]. The media server 410
`
`detects and, if necessary, digitizes the audio. Id. at ¶ [0030]. The digitized audio is
`
`provided to the automatic transcription subsystem 430. Id. The automatic
`
`transcription subsystem 430 generates a transcription of the digitized audio and an
`
`indication of the confidence of the transcription. Id. As the digitized audio is
`
`transcribed, the real-time transcription is provided to the back office 450. Id. The
`
`back office 450 provides the transcription to the web portal 480 that updates displays
`
`of client devices associated with the users 110, 120 with the real-time transcription.
`
`Id. at ¶¶ [0027] and [0033]. Updating the display of the client devices allows the
`
`users 110, 120 to view the real-time transcript while the conference call is occurring.
`
`Id. See Occhiogrosso, ¶¶ 51-53.
`
`Cloran further teaches that when a level of confidence in the automatically
`
`generated transcription is below a particular threshold, the automatic transcription
`
`subsystem 430 does not generate a transcript of the digitized audio from the voice
`
`connections 115, 125. Id. at ¶ [0032]. Rather, the audio from the voice connections
`
`115, 125 is provided to a human analyst 340 that repeats the speech. Id. The audio
`
`of the human analyst 340 repeating the speech from the call is provided as an input
`
`into the automatic transcription subsystem 430. Id. The automatic transcription
`
`subsystem 430 then generates a transcription of the audio of the human analyst 340
`
`and the transcription is provided to the displays of the client devices associated with
`
`the users 110, 120. See, e.g., id. at ¶¶ [0027], [0030], and [0032]. See Occhiogrosso,
`
`¶ 54.
`
`Cloran further teaches that “[i]f the system again fails to transcribe the text
`
`with a certain level of confidence, other methods are used for the transcription of
`
`that audio chunk as are described herein.” Id. at ¶ [0032]. Thus, Cloran teaches that
`
`a transcript of a telephone call is generated first with automatic speech recognition.
`
`If the level of confidence in the accuracy of the automatically generated transcription
`
`is below a threshold, the system incorporates a human analyst 340 to repeat the
`
`words of the telephone call and performs automatic speech recognition of the speech
`
`of the human analyst 340. If the system still fails to achieve the desired accuracy,
`
`the system reverts to another mode, such as verbatim transcription. Id. at ¶¶ [0032]
`
`and [0029]. See Occhiogrosso, ¶ 55.
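
For illustration only, the escalation sequence summarized above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not code from Cloran: the names `transcribe_chunk`, `revoice`, and `verbatim`, the `Result` type, and the 0.8 threshold value are all hypothetical, and the treatment of a confidence exactly equal to the threshold is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical signature: an ASR function returns (text, confidence in [0, 1]).
Asr = Callable[[bytes], Tuple[str, float]]


@dataclass
class Result:
    text: str
    mode: str  # which stage produced the text


def transcribe_chunk(
    audio: bytes,
    asr: Asr,
    revoice: Callable[[bytes], bytes],
    verbatim: Callable[[bytes], str],
    threshold: float = 0.8,  # assumed value; the source only says "a threshold"
) -> Result:
    # Stage 1: automatic transcription of the call audio.
    text, confidence = asr(audio)
    if confidence >= threshold:
        return Result(text, "automatic")

    # Stage 2: a human analyst repeats the speech, and ASR runs on the
    # analyst's audio instead of the original call audio.
    text, confidence = asr(revoice(audio))
    if confidence >= threshold:
        return Result(text, "revoiced")

    # Stage 3: fall back to another mode, such as verbatim human transcription.
    return Result(verbatim(audio), "verbatim")
```

In this sketch the human analyst is involved only when the automatic pass falls below the confidence threshold, which mirrors the order of operations described in the paragraph above.
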
`
`Cloran additionally teaches that errors in the transcription can be identified
`
`by humans, the system 140, or the call participants. Id. at ¶ [0034]. After
`
`identification of the errors, the errors can be corrected. Id. Specifically, Cloran
`
`teaches that human analysts 340 are used to correct automatically generated
`
`transcriptions. Id. at ¶ [0052]. Cloran also teaches that the analysts 340 use a spell
`
`checker and/or auto-complete features. Id. at ¶ [0054]. The issued corrections are
`
`provided as updates to the displays of client devices in substantially real-time. Id. at
`
`¶ [0035]. See Occhiogrosso, ¶ 56.
`
`Cloran also recognizes that audio capture parameters, such as parameters of
`
`the audio capture device, affect the ability of the transcription system to
`
`transcribe the audio accurately. Id. at ¶ [0026]. In particular, Cloran describes that
`
`these parameters are automatically adjusted so that the transcription of the audio is
`
`more successful. Id. See Occhiogrosso, ¶ 57.
`
`2. Madhavapeddl teaches using a data connection rather than an
`audio channel to transmit higher fidelity audio for transcription
`
`Madhavapeddl relates to a system that provides a transcript of audio to a user,
`
`such as by transcribing voicemails. Madhavapeddl, Abstract. Furthermore,
`
`Madhavapeddl teaches that by using a data channel rather than only an audio channel
`
`to transmit audio, higher fidelity audio can be recorded and transmitted to be used in
`
`generating the transcript. See, e.g., id. at ¶¶ [0016], [0025]. See Occhiogrosso, ¶¶ 58-59.
`
`3. Geer teaches a system that displays a confidence level of
`correctness in a transcription of a telephone call
`
`Geer relates to a system that provides a transcript of audio to a user, including
`
`for telephone calls. Geer, Abstract and ¶ [0128]. Geer teaches that in generating the
`
`transcript, other information may be presented to the user, such as a timestamp and who
`
`is speaking. See, e.g., id. at ¶¶ [0138]-[0140]. One of the pieces of information that can
`
`be presented is a “confidence level of correctness,” such as “using yellow for medium-
`
`level-of-confidence words, and red for low-level-of-confidence words.” Id. at ¶¶
`
`[0145]-[0147]; see also ¶¶ [0138], [0162]. See Occhiogrosso, ¶¶ 60-61.
`
`4. Jaggi teaches tools to assist a transcriptionist, such as
`presenting word options after typing a letter, and presenting word
`options from which the transcriptionist may select
`
`Jaggi relates to a system that transcribes audio using an automatic speech
`
`recognition (ASR) word-lattice. Jaggi, Abstract. In particular, Jaggi teaches that a
`
`transcriptionist can type letters and/or select from options presented to the
`
`transcriptionist. See, e.g., id. at ¶ [0050] (“[A]s soon as the transcriptionist plays the first
`
`audio segment in step 102 and enters the first character of a word in step 104, all words
`
`starting with that character within the ASR word lattice are identified in step 106 and
`
`prompted to the user as word choices in step 108 as a prompt list and in step 109 as
`
`graphic prompt. In step 108, the LM (language model) probabilities of these words are
`
`used to rank the words in the prompt list which is displayed to the transcriptionist. In
`
`step 109 the LM probabilities of these words and subsequent words are displayed to the
`
`transcriptionist . . . . At this point, the transcriptionist either chooses an available word
`
`or types out the word if none of the alternatives were acceptable.”); see also Figures 7
`
`and 8 (reproduced and annotated below).
`
`[Jaggi, Figures 7 and 8, annotated to identify the “Word Choices” and the “Graphic Prompt”]
`
`See Occhiogrosso, ¶¶ 62-63.
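
As an illustration only, the prompt-list behavior quoted above (identify lattice words that start with the typed character, then rank them by language-model probability) can be sketched as follows. This is a minimal sketch and not code from Jaggi: the function name `prompt_list`, the dictionary representation of the word lattice, the `limit` parameter, and the alphabetical tie-break are assumptions made for the example.

```python
from typing import Dict, List


def prompt_list(
    lattice_probs: Dict[str, float],  # hypothetical: lattice word -> LM probability
    typed_prefix: str,
    limit: int = 5,                   # assumed cap on the number of prompts shown
) -> List[str]:
    """Return lattice words that start with the typed prefix, most probable first."""
    prefix = typed_prefix.lower()
    matches = [w for w in lattice_probs if w.lower().startswith(prefix)]
    # Rank by descending probability; break ties alphabetically so the
    # ordering is deterministic.
    matches.sort(key=lambda w: (-lattice_probs[w], w))
    return matches[:limit]
```

A transcriptionist who types the first character of a word would, under these assumptions, either select one of the returned words or keep typing if none of them is acceptable.
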
`
`5. Carus teaches a system where an ASR transcript is re-transcribed when the
`quality is too poor instead of simply undergoing error correction
`
`Carus relates to a system that predicts accuracy of text generated by an ASR
`
`process to determine how best to process the text. Carus, Abstract. For example, Carus
`
`teaches that “[i]n some cases, the error rate of a[n automated speech] recognizer may
`
`be too high and the amount of editing required for a given document with a low
`
`recognition accuracy may require more effort, time, and cost to edit than if the given
`
`document had been transcribed by a human transcriptionist in the first place.” Id. at
`
`¶ [0004]. To address that problem, the system of Carus uses the predicted quality of
`
`the text that an ASR system generates from audio to route the text to a
`
`transcriptionist either to perform corrections or to completely re-transcribe the
`
`audio. See, e.g., id. at ¶ [0031] and Figure 1. See Occhiogrosso, ¶¶ 64-65.
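
For illustration only, the routing decision described above can be sketched as a single comparison. This is a minimal sketch and not code from Carus: the function name `route_document`, the 0-to-1 accuracy scale, and the 0.85 cutoff are hypothetical values chosen for the example.

```python
def route_document(predicted_accuracy: float, edit_threshold: float = 0.85) -> str:
    """Choose how a transcriptionist handles an ASR draft.

    Both the accuracy scale (0.0 to 1.0) and the default threshold are
    assumptions made for this example.
    """
    if not 0.0 <= predicted_accuracy <= 1.0:
        raise ValueError("predicted_accuracy must be between 0.0 and 1.0")
    if predicted_accuracy >= edit_threshold:
        # The draft is predicted to be good enough that correcting it is
        # cheaper than starting over.
        return "edit"
    # The draft is predicted to be too poor to be worth editing, so the
    # audio is transcribed from scratch.
    return "retranscribe"
```
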
`
`V. THE CHALLENGED CLAIMS ARE UNPATENTABLE
`
`Pursuant to 37 C.F.R. § 42.104(b)(4)-(5), Claims 1-5 and 7-11 are
`
`unpatentable for the reasons set forth in detail below.
`
`A. Ground 1: Claims 1-4 and 7-11 are unpatentable as obvious under
`§ 103 over Cloran
`
`Petitioner provides claim charts demonstrating that each limitation in claims
`
`1-4 and 7-11 is rendered obvious by Cloran.
`
`1. Claim 1
`
`Cloran teaches or suggests every recitation of claim 1 as shown and explained
`
`below.
`
`Claim 1 uses the terms “hearing user” and “assisted user.” Cloran does not
`
`specifically refer to the participants of the conference call using the transcription as
`
`“assisted” users. The ’805 Patent uses the term “assisted user” generally to refer to
`
`any person receiving text during a call (i.e., receiving textual “assistance”). See, e.g.,
`
`’805 Patent, 1:49-60. In a few locations, the ’805 Patent refers to the “assisted user”
`
`as “hearing impaired.” See, e.g., ’805 Patent, 1:37-48. However, in describing the
`
`assisted user, the ’805 Patent uses an “e.g.” in articulating the relationship between
`
`an assisted user and a person who is “hearing impaired.” Id. at 1:37-39. Therefore,
`
`the term “assisted user” is not limited to a person who is “hearing impaired.” In
`
`Cloran, participants of the conference call (such as the participant 110) are receiving
`
`a text transcription of the conference call in addition to enjoying the audio of the
`
`conference call, thus the participants are receiving assistance. Therefore, those
`
`participants are “assisted” users. Lastly, even if the term “assisted user” does mean
`
`that the assisted user has a hearing impairment, the hearing level of the person
`
`utilizing the device would not change the operation of the system. See Occhiogrosso,
`
`¶¶ 72-74, 79.
`
`Additionally, Cloran does not specifically refer to the participants of the
`
`conference call not using the transcription as “hearing” users. When describing a
`
`“hearing user,” the ’805 Patent defines the “hearing user” as the person with whom
`
`the “assisted user” is communicating. ’805 Patent, 1:67-2:4. In Cloran, participants
`
`of a conference call (such as the participant 130 who only uses their telephone 134)
`
`communicate with other participants of the conference call (such as the participant
`
`110) that are receiving a text transcription of the conference call in addition to
`
`enjoying the audio of the conference call. Thus, these participants in Cloran are
`
`“hearing” users. See Occhiogrosso, ¶¶ 72-74, 79.
`
`Furthermore, while Cloran does not speak to the level of hearing of the
`
`participants, if the term “hearing” users suggests the participants have some level of
`
`hearing, a POSITA would recognize that the participants in a conference call over a
`
`“voice connection” (such as the participant 130 who only uses their telephone 134)
`
`would presumably have at least some level of hearing. Thus, the disclosure of Cloran
`
`teaches or suggests that the participant 130 of the conference call is a “hearing user.”
`
`See Occhiogrosso, ¶¶ 72-74, 79.
`
`Claim
`Limitations
`1[p] A
`captioning
`system
`comprising:
`
`1[a] one or more
`processors; and
`
`1[b1] a memory
`having stored
`thereon software
`such that, when
`
`
`
`Exhibit 1004 – Cloran
`
`Cloran discloses a system 100/300/400 that provides a real-
`time transcription of a call. See, e.g., ¶ [0011] (“Generally,
`participants 110, 120, and 130 conduct the voice portion of a
`conference call using techniques that will be understood by
`those skilled in the art. While the call is in progress, us