throbber
Timing Analysis of Keystrokes and Timing Attacks on SSH
`
`David Wagner
`Dawn Xiaodong Song
`University of California, Berkeley
`
`Xuqing Tian
`
`Abstract
`
`1 Introduction
`
`SSH is designed to provide a secure channel between
`two hosts. Despite the encryption and authentication
`mechanisms it uses, SSH has two weakness: First, the
`transmitted packets are padded only to an eight-byte
`boundary (if a block cipher is in use), which reveals the
`approximate size of the original data. Second, in inter-
`active mode, every individual keystroke that a user types
`is sent to the remote machine in a separate IP packet im-
`mediately after the key is pressed, which leaks the inter-
`keystroke timing information of users’ typing. In this
`paper, we show how these seemingly minor weaknesses
`result in serious security risks.
`
`First we show that even very simple statistical tech-
`niques suffice to reveal sensitive information such as the
`length of users’ passwords or even root passwords. More
`importantly, we further show that by using more ad-
`vanced statistical techniques on timing information col-
`lected from the network, the eavesdropper can learn sig-
`nificant information about what users type in SSH ses-
`sions.
`In particular, we perform a statistical study of
`users’ typing patterns and show that these patterns re-
`veal information about the keys typed. By developing a
`Hidden Markov Model and our key sequence prediction
`algorithm, we can predict key sequences from the inter-
`keystroke timings. We further develop an attacker sys-
`tem, Herbivore , which tries to learn users’ passwords by
`monitoring SSH sessions. By collecting timing informa-
`tion on the network, Herbivore can speed up exhaustive
`search for passwords by a factor of 50. We also propose
`some countermeasures.
`
`In general our results apply not only to SSH, but also
`to a general class of protocols for encrypting interactive
`traffic. We show that timing leaks open a new set of
`security risks, and hence caution must be taken when
`designing this type of protocol.
`
` This research was supported in part by the Defense Advanced Re-
`search Projects Agency under DARPA contract N6601-99-28913 (un-
`der supervision of the Space and Naval Warfare Systems Center San
`Diego) and by the National Science foundation under grants FD99-
`79852 and CCR-0093337.
`
`Just a few years ago, people commonly used astonish-
`ingly insecure networking applications such as tel-
`net, rlogin, or ftp, which simply pass all confi-
`dential information, including users’ passwords, in the
`clear over the network. This situation was aggravated
`through broadcast-based networks that were commonly
`used (e.g., Ethernet) which allowed a malicious user to
`eavesdrop on the network and to collect all communi-
`cated information [CB94, GS96].
`
`Fortunately, many users and system administrators have
`become aware of this issue and have taken counter-
`measures. To curb eavesdroppers, security researchers
`designed the Secure Shell (SSH), which offers an en-
`crypted channel between the two hosts and strong au-
`thentication of both the remote host and the user [Yl¨o96,
`SSL01, YKS 00b]. Today, SSH is quite popular, and it
`has largely replaced telnet and rlogin.
`
`Many users believe that they are secure against eaves-
`droppers if they use SSH. Unfortunately, in this paper
`we show that despite state-of-the-art encryption tech-
`niques and advanced password authentication protocols
`[YKS 00a], SSH connections can still leak significant
`information about sensitive data such as users’ pass-
`words. This problem is particularly serious because it
`means users may have a false confidence of security
`when they use SSH.
`
`In particular we identify that two seemingly minor weak-
`nesses of SSH lead to serious security risks. First, the
`transmitted packets are padded only to an eight-byte
`boundary (if a block cipher is in use). Therefore an
`eavesdropper can easily learn the approximate length of
`the original data. Second, in interactive mode, every
`individual keystroke that a user types is sent to the re-
`mote machine in a separate IP packet immediately af-
`ter the key is pressed (except for some meta keys such
`Shift or Ctrl). We show in the paper that this prop-
`erty can enable the eavesdropper to learn the exact length
`of users’ passwords. More importantly, as we have veri-
`fied, the time it takes the operating system to send out the
`packet after the key press is in general negligible com-
`paring to the inter-keystroke timing. Hence an eaves-
`
`ASSA ABLOY Ex. 1008 - Page 1
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01045 - U.S. Patent No. 9,269,208
`
`

`

`dropper can learn the precise inter-keystroke timings of
`users’ typing from the arrival times of packets.
`
`Experience shows that users’ typing follows stable pat-
`terns1. Many researchers have proposed to use the du-
`ration of key strokes and latencies between key strokes
`as a biometric for user authentication [GLPS80, UW85,
`LW88, LWU89, JG90, BSH90, MR97, RLCM98,
`MRW99]. A more challenging question which has not
`yet been addressed in the literature is whether we can
`use timing information about key strokes to infer the key
`sequences being typed. If we can, can we estimate quan-
`titatively how many bits of information are revealed by
`the timing information? Experience seems to indicate
`that the timing information of keystrokes reveals some
`information about the key sequences being typed. For
`example, we might have all experienced that the elapsed
`time between typing the two letters “er” can be much
`smaller than between typing “qz”. This observation is
`particularly relevant to security. Since as we show the
`attacker can get precise inter-keystroke timings of users’
`typing in a SSH session by recording the packet arrival
`times, if the attacker can infer what users type from the
`inter-keystroke timings, then he could learn what users
`type in a SSH session from the packet arrival times.
`
`In this paper we study users’ keyboard dynamics and
`show that the timing information of keystrokes does leak
`information about the key sequences typed. Through
`more detailed analysis we show that the timing informa-
`tion leaks about 1 bit of information about the content
`per keystroke pair. Because the entropy of passwords
`is only 4–8 bits per character, this 1 bit per keystroke
`pair information can reveal significant information about
`the content typed. In order to use inter-keystroke tim-
`ings to infer keystroke sequences, we build a Hidden
`Markov Model and develop a n-Viterbi algorithm for the
`keystroke sequence inference. To evaluate the effective-
`ness of the attack, we further build an attacker system,
`Herbivore, which monitors the network and collects tim-
`ing information about keystrokes of users’ passwords.
`Herbivore then uses our key sequence prediction algo-
`rithm for password prediction. Our experiments show
`that, for passwords that are chosen uniformly at random
`with length of 7 to 8 characters, Herbivore can reduce the
`cost of password cracking by a factor of 50 and hence
`speed up exhaustive search dramatically. We also pro-
`pose some countermeasures to mitigate the problem.
`
`We emphasize that the attacks described in this paper are
`a general issue for any protocol that encrypts interactive
`traffic. For concreteness, we study primarily SSH, but
`these issues affect not only SSH 1 and SSH 2, but also
`
`1In this paper we only consider users who are familiar with key-
`board typing and use touch typing.
`
`any other protocol for encrypting typed data.
`
`In Section 2
`The outline of this paper is as follows.
`we discuss in more details about the vulnerabilities
`of SSH and various simple techniques an attacker can
`use to learn sensitive information such as the length
`of users’ passwords and the inter-keystroke timings of
`users’ passwords typed.
`In Section 3 we present our
`statistical study on users’ typing patterns and show that
`inter-keystroke timings reveal about 1 bit of information
`per keystroke pair. In Section 4 we describe how we can
`infer key sequences using a Hidden Markov Model and
`a n-Viterbi algorithm. In Section 5 we describe the de-
`sign, development and evaluation of an attacker system,
`Herbivore, which learns users’ passwords by monitoring
`SSH sessions. We propose countermeasures to prevent
`these attacks in Section 7, and conclude in Section 8.
`
`2 Eavesdropping SSH
`
`The Secure Shell SSH [SSL01, YKS  00b] is used to en-
`crypt the communication link between a local host and a
`remote machine. Despite the use of strong cryptographic
`algorithms, SSH still leaks information in two ways:
`
`First, the transmitted packets are padded only to an
`eight-byte boundary (if a block cipher is in use),
`which leaks the approximate size of the original
`data.
`
`in interactive mode, every individual
`Second,
`keystroke that a user types is sent to the remote
`machine in a separate IP packet immediately after
`the key is pressed (except for some meta keys such
`Shift or Ctrl). Because the time it takes the op-
`erating system to send out the packet after the key
`press is in general negligible comparing to the inter-
`keystroke timing (as we have verified), this also
`enables an eavesdropper to learn the precise inter-
`keystroke timings of users’ typing from the arrival
`times of packets.
`
`The first weakness poses some obvious security risks.
`For example, when one logs into a remote site R in
`SSH, all the characters of the initial login password
`are batched up, padded to an eight-byte boundary if a
`block cipher is in use, encrypted, and transmitted to R.
`Due to the way padding is done, an eavesdropper can
`learn one bit of information on the initial login pass-
`word, namely, whether it is at least 7 characters long
`or not. The second weakness can lead to some potential
`anonymity risks since, as many researchers have found
`previously, inter-keystroke timings can reveal the iden-
`
`ASSA ABLOY Ex. 1008 - Page 2
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01045 - U.S. Patent No. 9,269,208
`
`

`

`SSH
`Server B
`
`,I
`
`20
`
`l
`
`20
`
`"Password: "
`j~ ,U
`A
`
`28
`
`jl
`
`~
`
`jl
`
`Prompt
`
`N
`
`time
`
`-
`-
`
`20
`
`"s"
`
`20
`
`'
`
`Client
`Host A
`Figure 1: The traffic signature associated with running SU in a SSH session. The numbers in the figure are the size
`(in bytes) of the corresponding packet payloads.
`
`,,
`
`20
`
`,,
`
`20 20 20 20 20
`
`20
`
`"u"
`
`Return
`
`"J""u""l"
`
`"i"
`
`"a"
`
`Return
`
`,,
`
`time
`
`-
`
`tity of the user [GLPS80, UW85, LW88, LWU89, JG90,
`BSH90, MR97, RLCM98, MRW99].
`
`In this section, we show that several simple and practical
`attacks exploiting these two weaknesses. In particular,
`an attacker can identify which transmitted packets corre-
`spond to keystrokes of sensitive data such as passwords
`in a SSH session. Using this information, the attacker
`can easily find out the exact length of users’ passwords
`and even the precise inter-keystroke timings of the typed
`passwords. Learning the exact length of users’ pass-
`words allows eavesdroppers to target users with short
`passwords. Learning the inter-keystroke timing infor-
`mation of the typed passwords allows eavesdroppers to
`infer the content of the passwords as we will show in
`Section 3 and 4.
`
`Traffic Signature Attack We can often exploit prop-
`erties of applications to identify which packets corre-
`spond to the typing of a password. Consider, for in-
`stance, the SU command. Assume the user has already
`established a SSH connection from local host A to re-
`mote host B. When the user types the command SU
`in the established SSH connection A
`B, we obtain a
`peculiar traffic signature as shown in Figure 1.
`If the
`SSH session uses SSH 1.x2 and a block cipher such
`as DES for the encryption [NBS77, NIS99], as is com-
`mon, then the local host A sends three 20-byte pack-
`ets: “s”, “u”, “Return”. The remote host B echoes the
`“s” and “u” in two 20-byte packets and sends a 28-byte
`packet for the “Password: ” prompt. Then A sends 20-
`byte packets, one for each of the password characters,
`without receiving any echo data packets. B then sends
`some final packets containing the root prompt if SU suc-
`ceeds, otherwise some failure messages. Thus by check-
`ing the traffic against this “su” signature, the attacker
`can identify when the user issues the SU command and
`
`2The attack also works when ssh 2.x is in use. Only the packet
`sizes are slightly different.
`
`hence learn which packets correspond to the password
`keystrokes. Note that similar techniques can be used to
`identify when users type passwords to authenticate to
`other applications such as PGP [Zim95] in a SSH ses-
`sion.
`
`Multi-User Attack Even more powerful attacks exist
`when the attacker also has an account on the remote
`machine where the user is logging into through SSH.
`For example, the process status command ps can list
`all the processes running on a system. This allows the
`attacker to observe each command that any user is run-
`ning. Again, if the user is running any command that re-
`quires a password input (such as su or pgp) the attacker
`can identify the packets corresponding to the password
`keystrokes.
`
`Nested SSH Attack Assume the user has already es-
`tablished a SSH session between the local host A and
`remote host B. Then the user wants to open another SSH
`session from B to another remote host C as shown in Fig-
`ure 2. In this case, the user’s password for C is transmit-
`ted, one keystroke at a time, across the SSH-encrypted
`B from the user to B, even though the SSH
`link A
`client on machine B patiently waits for all characters of
`the password before it sends them all in one packet to
`host C for authentication (as designed in the SSH proto-
`col [YKS 00a]). It is easy to identify such a nested SSH
`connection using techniques developed by Zhang and
`Paxson [ZP00b, ZP00a]. Hence in this case the eaves-
`dropper can easily identify the packets corresponding to
`the user’s password on link A B, and from this learn
`the length and the inter-keystroke timings of the users’
`password on host C.
`
`ASSA ABLOY Ex. 1008 - Page 3
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01045 - U.S. Patent No. 9,269,208
`
`

`

`Our focus on passwords creates many challenges. Pass-
`words are entered very differently from other text: pass-
`words are typed frequently enough that, for many users,
`the keystroke pattern is memorized and often typed al-
`most without conscious thought. Furthermore, well-
`chosen passwords should be random and have little or
`no structure (for instance, they should not be based on
`dictionary words). As a consequence, naive measure-
`ments of keystroke timings will not be representative of
`how users type passwords unless great care is taken in
`the design of the experimental methodology.
`
`Our experimental methodology is carefully designed to
`address these issues. Due to security and privacy consid-
`erations, we chose not to gather data on real passwords;
`therefore, we have chosen a data collection procedure
`intended to mimic how users type real passwords. A
`conservative method is to pick a random password for
`the user (where each character of the password is cho-
`sen uniformly at random from a set of 10 letter keys and
`5 number keys, independently of all other characters in
`the password), have the user practice typing this pass-
`word many times without collecting any measurements,
`and then measure inter-keystroke timing information on
`this password once the user has had a chance to practice
`it at length.
`
`However, we found that, when the goal is to try to
`identify potentially relevant timing properties (rather
`than verify conjectured properties), this conservative ap-
`proach is inefficient. In particular, users typically type
`passwords in groups of 3–4 characters, with fairly long
`pauses between each group. This distorts the digraph
`statistics for the pair of characters that spans the group
`boundary and artificially inflates the variance of our
`measurements. As a result we would need to collect
`a great deal of data for many random passwords be-
`fore this effect would average out. In addition, it takes
`quite a while for users to become familiar with long ran-
`dom passwords. This makes the conservative approach a
`rather blunt tool for understanding inter-keystroke statis-
`tics.
`
`Fortunately, there is a less costly way to gather inter-
`keystroke timing statistics: we gather training data on
`each pair of characters
`as typed in isolation. We
`ka
`kb
`pick a character pair and ask the user to type this pair 30–
`40 times, returning to the home row each time between
`repetitions. For each user, we repeat this for many pos-
`sible pairs (142 pairs, in our experiments) and we gather
`data on inter-keystroke timings for each such pair. We
`collected the latency of each character pair measurement
`and computed the mean value and the standard devia-
`tion. In our experience, this gives better results.
`
`SSH2
`
`password
`
`B
`
`password
`
`SSH1
`
`A
`
`eavesdrop
`
`C
`
`Adversary
`
`Figure 2: The nested SSH attack.
`
`3 Statistical Analysis of Inter-keystroke
`Timings
`
`As a first study towards inferring key sequences from
`timing information, we develop techniques for statistical
`analysis of the inter-keystroke timings. In this section,
`we first describe how we collect training data and show
`some simple timing characteristics of character pairs.
`We then show how we model the inter-keystroke timing
`of a given character pair as a Gaussian distribution. We
`then describe how to estimate quantitatively the amount
`of information about the character pair that one can learn
`using the inter-keystroke timing information. Denote the
`set of character pairs of interest as Q, and let Q denote
`the cardinality of the set Q.
`
`3.1 Data Collection
`
`The two keystrokes of a pair of characters
`gen-
`kb
`ka
`erates four events: the press of ka, the release of ka, the
`press of kb, and the release of kb. However, because
`only key presses (not key releases) trigger packet trans-
`mission, an eavesdropper can only learn timing informa-
`tion about the key-press events. Since the main focus of
`our study is in the scenario where an adversary learns
`timing information on keystrokes by simply monitoring
`the network, we focus only on key-press events. The
`time difference between two key presses is called the la-
`tency between the two keystrokes. We also use the term
`inter-keystroke timing to refer to the latency between two
`keystrokes.
`
`In order to characterize how much information is leaked
`by inter-keystroke timings, we have performed a number
`of empirical tests to measure the typing patterns of real
`users. Because passwords are probably the most sen-
`sitive data that a user will ever type, we focus only on
`information revealed about passwords (rather than other
`forms of interactive traffic).
`
`ASSA ABLOY Ex. 1008 - Page 4
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01045 - U.S. Patent No. 9,269,208
`
`
`
`
`
`
`
`

`

`15
`
`10
`
`5
`
`Frequency
`
`15
`
`10
`
`5
`
`Frequency
`
`0
`
`300
`200
`100
`0
`Inter−keystroke Timing for v−o (milliseconds)
`
`0
`
`300
`200
`100
`0
`Inter−keystroke Timing for v−b (milliseconds)
`
`Figure 3: The distribution of inter-keystroke timings for two sample character pairs.
`
`As an example, Figure 3 shows the latency histogram
`of two sample character pairs. The left model corre-
`v, o
`, and the
`sponds to the latency between the pair
`right model corresponds to
`v, b
`. We can see that the
`latency between
`v, o
`is clearly shorter than the la-
`tency between
`v, b
`, and the latency distributions of
`these two sample character pairs are almost entirely non-
`overlapping.
`
`The optimized data collection approach gives us a more
`efficient way to study fine-grained details of inter-
`keystroke statistics without requiring collecting an enor-
`mous amount of data. We used data collected in this way
`to quickly identify plausible conjectures, develop poten-
`tial attacks, and to train our attack models. As far as
`we are aware, collecting data on keystroke pairs in iso-
`lation does not seem to bias the data in any obvious way.
`Nonetheless, we also validate all our results using the
`conservative measurement method (see Section 5).
`
`3.2 Simple Timing Characteristics
`
`Next, we divide the test character pairs into five cate-
`gories, based on whether they are typed using the same
`hand, the same finger, and whether they involve a num-
`ber key:
`
` Two letter keys typed with alternating hands, i.e.,
`
`one with left hand and one with right hand;
`
` Two characters containing one letter key and one
`number key typed with alternating hands;
`
` Two letter keys, both typed with the same hand but
`with two different fingers;
`
` Two letter keys typed with the same finger of the
`same hand;
`
` Two characters containing one letter key and one
`number key, both typed with the same hand.
`
`Figure 4 shows the histogram of latency distribution of
`character pairs for each category. We split the whole la-
`tency range into six bins as shown in the x-axis. Within
`each category, we put each character pair into the cor-
`responding bin if its mean latency value is within the
`range of the bin. Each bar in the histogram of a cate-
`gory represents the ratio of the number of character pairs
`in the associated bin over the total number of character
`pairs in the category.3 We can see that all the character
`pairs that are typed using two different hands take less
`than 150 milliseconds, while pairs typed using the same
`hand and particularly the same finger take substantially
`longer. Character pairs that alternate between one letter
`key and one number key, but are typed using the same
`3Hence the sum of all bars within one category is 1.
`
`ASSA ABLOY Ex. 1008 - Page 5
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01045 - U.S. Patent No. 9,269,208
`
`
`
`
`
`
`
`
`
`

`

`Histogram of the latency of character pairs
`
`Two letter keys, alternating hands
`A letter and a number, alternating hands
`Two letters, same hand, different fingers
`Two letters, same finger
`A letter and a number, same hand
`
`< 100
`
`100-150
`
`150-200
`
`200-250
`
`250-300
`
`> 300
`
`Latency (milliseconds)
`
`1
`
`0.9
`
`0.8
`
`0.7
`
`0.6
`
`0.5
`
`0.4
`
`0.3
`
`0.2
`
`0.1
`
`0
`
`Ratio of character pairs
`
`Figure 4: Inter-keystroke timings for character pairs in five different categories. Note that some bars at some positions
`disappear because the corresponding height is zero.
`
`hand, take the longest time to type. This is simply be-
`cause two hands offer a certain amount of parallelism,
`while character pairs typed with one hand require a cer-
`tain degree of sequential movements and hence tend to
`take longer. This is especially obvious in the case of one
`letter and one number pairs typed using one hand. They
`in general require more hand movement and hence the
`longest time.4
`
`So, if the attacker observes a character pair typed with
`latency more than 150 milliseconds, he can guess with
`high probability of success that the character pair is not
`typed using two different hands and hence can infer
`about 1 bit of information about the content of the char-
`acter pair. Because the 142 character pairs are formed
`from randomly selected letter keys and number keys,
`they seem likely to form a representative sample of the
`whole keyboard. Hence this simple classification ex-
`tends to the whole keyboard, and already indicates that
`the inter-keystroke timing leaks substantial information
`about what is typed.
`
`The properties described above are unlikely to be ex-
`haustive. For instance, earlier work on timing attacks
`on multi-user machines suggested that inter-keystroke
`timings may additionally reveal which characters in the
`
`4Note that here we only consider users that use touch typing.
`
`password are upper-case [Tro98].
`
`3.3 Gaussian Modeling
`
`From the plot of the latency distribution of a given char-
`acter pair, such as the ones shown in Figure 3, we can see
`that the latency between the two key strokes of a given
`character pair forms a Gaussian-like unimodal distribu-
`tion. Hence a natural assumption (which is confirmed
`by our empirical observations) is that the probability of
`the latency y between two keystrokes of a character pair
`Q, Pr
`, forms a univariate Gaussian distribution
`q
`y q
`m q
`s q
`, meaning
`
`
`
`Pr
`
`1
`
`2
`
`m q
`2s 2
`q
`
`e
`
`  y
`
`y q
` 2ps
`
`q
`where m q is the mean value of the latency for character
`pair q and s q is the standard deviation. Given a set of
`training data
`N, where qi is the i-th charac-
`qi
`yi
`1
`i
`
`ter pair and yi is the corresponding latency in the data
`

`m q
`s q
`collection, we can derive the parameters
`q
`Q
`
`based on maximum likelihood estimation, i.e., we com-
` 

`pute the mean and the standard deviation for each char-
`acter pair.
`
`Figure 5 shows the estimated Gaussian models of the la-
`tencies of the 142 character pairs. Our empirical result
`
`ASSA ABLOY Ex. 1008 - Page 6
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01045 - U.S. Patent No. 9,269,208
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`

`

`shows that most of the latencies of the character pairs
`lie between 50 and 250 milliseconds. The average of
`the standard deviation of the 142 character pairs is about
`30 milliseconds. The graph also indicates that the la-
`tency distributions of the character pairs severely over-
`lap, which means the inference of character pairs using
`just latency information is a challenging task.
`
`3.4
`
`Information Gain Estimation
`
`Q Pr
`
`where Pr
`
`H1
`
`y0 
`
`We would like to estimate quantitatively how much
`information the latency information reveals about the
`character pairs typed. This will be an upper bound
`of how much information an attacker can extract from
`the timing information using any particular method.
`We estimate it by computing the information gain in-
`duced by the latency information. If we select a char-
`acter pair uniformly at random from the character-
`pair space, and if the attacker does not get any addi-
`tional information, the entropy of the probability dis-
`tribution of character pairs to the attacker is H0
`q
`  (cid:229) q
`If the attacker learns
`Q Pr
`log2 Pr
`log2 Q 
`q
`q
` 
`the latency y0 between the two keystrokes of the char-
`acter pair, the estimated entropy of the probability dis-
`tribution of character pairs to the attacker is H1
`q y
`y0  (cid:229) q
`log2 Pr
`q y0
`q y0 
`q y0 
`Pr y0
`Pr q
`q
`and Pr
`is computed using the
`y0 q
`  (cid:229)
`
`Pr q
`Q Pr y0
`q
`q
` 
`
`Gaussian distribution obtained in the parameter estima-
`tion phase in the previous subsection. The information
`gain induced by the observation of latency y0 is the dif-
`ference between the two entropies, H0
`q
`q y
`
`Using the parameter estimation of the 142 character
`pairs obtained in the previous section, we can compute
`and H0
`as shown in Fig-
`q y
`q y
`H1
`y0
`H1
`y0
`q
`
`ure 6(a) and Figure 6(b).
`The estimated information gain, also called mutual in-
`formation, is I
`q; y
`Pr
`y0

`q y
`q
`q
`H0
`H1
`
`
`y0  (cid:229) q
` 
` 
`where Pr
`Q Pr
`Pr
`q
`dy0
`y0 q
`q y
` 
`From the numerical computation we obtain I
`q; y
`1
`2.
` 
`This means the estimated information gain available
`2 bits per charac-
`from latency information is about 1
`ter pair when the character pair has uniform distribution.
`Hence the attacker could potentially extract 1
`2 bits of
`information per character pair by using the latency in-
`formation in this case. Because the character pairs in
`our experiments are selected uniformly at random from
`all letter and number keys, we expect that they will be
`representative of the whole keyboard. Intuitively, Fig-
`ure 5 is a sufficiently-large random sampling of a much
`denser graph containing the latency distributions of all
`possible character pairs. More detailed analysis shows
`that the estimated information gain computed using 142
`sample character pairs is a good estimate of the infor-
`
`H1
`
`y0
`
`
`
`H0
`
`0.035
`
`0.03
`
`0.025
`
`0.02
`
`0.015
`
`0.01
`
`0.005
`
`Probability
`
`0
`
`0
`
`50
`
`100
`
`150
`
`200
`Latency (millisecond)
`
`250
`
`300
`
`350
`
`400
`
`Figure 5: Estimated Gaussian distributions of all 142
`character pairs collected from a user.
`
`7
`
`6.5
`
`6
`
`5.5
`
`5
`
`4.5
`
`Entropy (bits)
`
`4
`
`0
`
`50
`
`100
`
`150
`Latency (milliseconds)
`
`200
`
`250
`
`300
`
`(a) Entropy of character pairs given a latency obser-
`vation
`
`50
`
`100
`
`150
`Latency (milliseconds)
`
`200
`
`250
`
`300
`
`3.5
`
`3
`
`2.5
`
`2
`
`1.5
`
`Information Gain (bits)
`
`1
`
`0.5
`
`0
`
`(b) Information gain induced by a latency observation
`
`Figure 6: Entropy and information gain as a function of
`the inter-keystroke latency.
`
`ASSA ABLOY Ex. 1008 - Page 7
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01045 - U.S. Patent No. 9,269,208
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`

`

`mation gain when the character-pair space includes all
`letter and number character pairs. This estimate is com-
`parable to the back-of-the-envelope calculation in Sec-
`tion 3.2 based on our classification into five categories
`of keystroke pairs.
`
`Because the entropy of written English is so low (about
`0.6–1.3 bits per character [Sha50]), the 1
`2-bit informa-
`tion gain per character pair leaked through the latency in-
`formation seems to be significant. 5 For example, we can
`expect that users’ PGP passphrases will often contain
`only 1 bit of entropy per character. Hence the latency in-
`formation may reveal significant information about PGP
`passphrases.
`
`The information gain curve in Figure 6(b) shows a con-
`vex shape. Note that latencies greater than 175 millisec-
`onds are relatively rare; however, whenever we see such
`a long time between keystrokes, we learn a lot of in-
`formation about what was typed, because there are not
`many possibilities that would lead to such a large la-
`tency. The character pairs that take longer than 175
`milliseconds to type are mostly pairs containing number
`keys or pairs typed with one finger. Hence this analysis
`suggests that passwords containing number keys or char-
`acter pairs that are typed with one finger are particularly
`vulnerable to such timing attacks.
`
`Another interesting observation is that the mean of the
`standard deviations of the character pairs is only about
`30 milliseconds as shown in our experiments, while the
`standard deviation of round-trip time on the Internet in
`many cases is less than 10 milliseconds [Bel93]. There-
`fore even when the attacker is far from the SSH client
`host, he can still get sufficiently-precise inter-keystroke
`timing information. This makes the timing attack even
`more severe.
`
`4 Inferring Character Sequences From
`Inter-Keystroke Timing Information
`
`In this section, we describe how we can infer charac-
`ter sequences using the latency information. In partic-
`ular, we model the relationship of latencies and charac-
`ter sequences as a Hidden Markov Model [RN95]. We
`extend the standard Viterbi algorithm to an n-Viterbi al-
`gorithm that outputs the n most likely candidate char-
`acter sequences. We further estimate how many bits of
`information about the real character sequence this algo-
`
`5Note that the 1
`2-bit information gain is estimated for the case of
`randomly selected passwords where the sequence of characters have
`a uniform distribution. However, this is not the case for texts. More
`careful calculation is needed to estimate the information gain in the
`case of natual text.
`
`rithm extracts from the latency information and show it
`is nearly optimal.
`
`4.1 Hidden Markov Model
`
`In general, a Markov Model is a way of describing a
`finite-state stochastic process with the property that the
`probability of transitioning from the current state to an-
`other state depends only on the current state, not on any
`prior state of the process [RN95]. In a Hidden Markov
`Model (HMM), the current state of the process cannot be
`directly observed. Instead, some outputs from the state
`are observed, and the probability distribution of possible
`outputs given the state is dependent only on the state.
`Using a HMM, one can infer information about the prior
`path the process has taken from the sequence of observed
`outputs of the states, and efficient algorithms are known
`for working with HMM’s. Because of this, HMM’s have
`been widely used in areas such as speech recognition and
`text modeling.
`
`In our setting, we consider each character pair of inter-
`est as a hidden (non-observable) state, and the latency
`between the two keystrokes of the character pair as the
`output observation from the character-pair state. Each
`state corresponds to a pair of characters, so that the typ-
`KT , is a process that
`ing of a character sequence K0
`    
`goes through T states, q1
`1
`qT , where qt
`t T
`    
`represents the t-th character pair
`typed. Let
`Kt
`Kt
`1
`denote the observed latency of state qt.
`1
`t T
`yt
`Then we model the typing of a character sequence as a
`HMM. This means we make two assumptions. First, the
`probability of transition from the current state to another
`state is only dependent on the current state, not on the
`prior path of the process. If the character sequence is a
`password chosen uniformly at random, this assumption
`obviously hol

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket