by William Shoaff with lots of help
You can download a postscript version of this file (which is prettier) at
One way to compute an algorithm's average case time complexity is to partition the sample space of problem instances into disjoint sets such that solving any instance in a given set takes the same number of steps.
Suppose the number of steps ranges from j = 0 to j = m; that is, some instances may require no work and the hardest instances require m steps. Let $P_j$ denote the probability that an instance takes j steps. Then we can define the average case time complexity $T_a(n)$ of a problem with input size n as the weighted sum:
$$T_a(n) = \sum_{j=0}^{m} j \, P_j$$
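As a minimal sketch of this weighted sum, the Python function below takes a list whose entry j is the probability of needing j steps and returns the average. The function name `average_steps` and the three-entry distribution are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def average_steps(probabilities):
    """Return the sum over j of j * P_j, where probabilities[j] is P_j."""
    if sum(probabilities) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(j * p for j, p in enumerate(probabilities))


# Hypothetical distribution: 0 steps with probability 1/4,
# 1 step with probability 1/2, 2 steps with probability 1/4.
example = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
print(average_steps(example))  # 1
```

Exact fractions are used instead of floats so that the check that the probabilities sum to 1 is not affected by rounding.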
To see how this formula applies, consider the pattern matching problem (PMP) where the pattern p, of length m, is from a k symbol alphabet. There are $k^m$ possible patterns and p is one of them. It helps to keep an example in mind, so let {a, b, c} be the alphabet (k = 3) and let p = abac be one of the $3^4 = 81$ possible patterns.
Now suppose we are trying to match p[0..m - 1] against text t[i..i + m - 1]. There may be
1 unsuccessful compare (p[0] ≠ t[i]), or
1 successful compare followed by an unsuccessful compare (p[0] = t[i] and p[1] ≠ t[i + 1]), or
2 successful compares followed by an unsuccessful compare (p[0] = t[i] and p[1] = t[i + 1] and p[2] ≠ t[i + 2]),
and so on, up to m - 1 successful compares followed by 1 unsuccessful compare, or m successful compares. We want to compute the probabilities for each of these cases.
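The cases above can be sketched as a short Python function that counts the compares made at one text position, scanning left to right and stopping at the first mismatch. The name `compares_at` is hypothetical, and the sketch assumes the window t[i..i + m - 1] lies entirely inside the text.

```python
def compares_at(pattern, text, i):
    """Count character compares when matching pattern against text[i:i+m].

    Compares proceed left to right and stop at the first mismatch,
    so the result is between 1 and m for a non-empty pattern.
    """
    count = 0
    for j in range(len(pattern)):
        count += 1  # one compare of pattern[j] with text[i + j]
        if pattern[j] != text[i + j]:
            break
    return count


print(compares_at("abac", "bbbb", 0))  # 1: mismatch on the first compare
print(compares_at("abac", "acaa", 0))  # 2: one match, then a mismatch
print(compares_at("abac", "abab", 0))  # 4: three matches, then a mismatch
print(compares_at("abac", "abac", 0))  # 4: four matches
```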
The first compare is unsuccessful k - 1 out of k times. Thus the probability of exactly one (unsuccessful) compare is $P_1 = (k - 1)/k$. For our example, $P_1 = (3 - 1)/3 = 2/3$, which corresponds to the 54 out of 81 four-letter strings that start with b or c.
The first compare is successful 1 out of k times and the second compare is unsuccessful k - 1 out of k times. Thus the probability of exactly two compares (one successful followed by an unsuccessful one) is $P_2 = (k - 1)/k^2$. For our example, $P_2 = 2/9$, which corresponds to the 18 out of 81 four-letter strings that start with aa or ac.
The first and second compares are successful 1 out of $k^2$ times and the third compare is unsuccessful k - 1 out of k times. Thus the probability of exactly three compares (two successful followed by one unsuccessful) is $P_3 = (k - 1)/k^3$. For our example, $P_3 = 2/27$, which corresponds to the 6 out of 81 four-letter strings that start with abb or abc.
The first three compares are successful 1 out of $k^3$ times, and the fourth compare is then unsuccessful k - 1 out of k times. All four compares are successful 1 out of $k^4$ times. Thus the probability of exactly four compares (3 successful and 1 unsuccessful, or 4 successful) is $P_4 = (k - 1)/k^4 + 1/k^4 = 1/k^3$. For our example, $P_4 = 3/81 = 1/27$, which corresponds to the 2 out of 81 four-letter strings abaa and abab, plus the one string abac where 4 successful compares occur.
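These counts can be checked by brute force. The sketch below enumerates every string of length m over the alphabet, counts the compares needed against the pattern, and tallies the results. It assumes, as the discussion above implicitly does, that each of the possible strings is equally likely; the name `compare_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def compare_distribution(pattern, alphabet):
    """Map each compare count to its probability over all equally likely windows."""
    m = len(pattern)
    counts = Counter()
    for window in product(alphabet, repeat=m):
        j = 0
        while j < m and pattern[j] == window[j]:
            j += 1  # j successful compares so far
        # A mismatch costs one extra compare; a full match costs m in total.
        compares = j + 1 if j < m else m
        counts[compares] += 1
    total = len(alphabet) ** m
    return {c: Fraction(n, total) for c, n in sorted(counts.items())}


dist = compare_distribution("abac", "abc")
for compares, probability in dist.items():
    print(compares, probability)
# 1 2/3
# 2 2/9
# 3 2/27
# 4 1/27
```

The four probabilities sum to 1, as they must, since every string falls into exactly one of the four cases.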
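Thus, for a fixed text position i, the average number of compares is
$$\sum_{j=1}^{m} j \, P_j = \sum_{j=1}^{m-1} \frac{j\,(k-1)}{k^j} + \frac{m}{k^{m-1}}$$
With some careful algebraic manipulation, one can derive the formula
$$\frac{1 - k^{-m}}{1 - k^{-1}}$$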
which never gets bigger than 2 for any alphabet with k ≥ 2 symbols. That is, at any position in the text, the average number of compares is less than two. It follows that the average total number of compares over a text of length n is at most 2n, and this bound is independent of the alphabet size and pattern length.
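The bound can be checked numerically. The sketch below assumes the per-position probabilities derived above, namely $(k - 1)/k^j$ for j < m compares and $1/k^{m-1}$ for m compares, and assumes the closed form is $(1 - k^{-m})/(1 - k^{-1})$, which is what summing the geometric series of "at least j compares" probabilities gives. Both function names are hypothetical.

```python
from fractions import Fraction


def average_compares_sum(k, m):
    """Average compares at one position, computed as the sum of j * P_j."""
    total = sum(Fraction(j * (k - 1), k ** j) for j in range(1, m))
    return total + Fraction(m, k ** (m - 1))  # the j = m term


def average_compares_closed(k, m):
    """Assumed closed form (1 - k^-m) / (1 - k^-1)."""
    return (1 - Fraction(1, k ** m)) / (1 - Fraction(1, k))


# The two expressions agree and stay below 2 on a small grid of k and m.
for k in range(2, 6):
    for m in range(1, 9):
        value = average_compares_sum(k, m)
        assert value == average_compares_closed(k, m)
        assert value < 2

print(average_compares_closed(3, 4))  # 40/27
```

For the running example (k = 3, m = 4) the average is 40/27, about 1.48 compares per text position.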