Problem 7.2
Quicksort with equal element values
The analysis of the expected running time of randomized quicksort in section 7.4.2 assumes that all element values are distinct. In this problem. we examine what happens when they are not.
- Suppose that all element values are equal. What would be randomized quick-sort's running time in this case?
- The
PARTITION
procedure returns an index $q$ such that each element of $A[p \ldots q - 1]$ is less than or equal to $A[q]$ and each element of $A[q + 1 \ldots r]$ is greater than $A[q]$. Modify thePARTITION
procedure to produce a procedurePARTITION'(A, p, r)
which permutes the elements of $A[p \ldots r]$ and returns two indices $q$ and $t$ where $p \le q \le t \le r$, such that:Like
- all elements of $A[q \ldots t]$ are equal,
- each element of $A[p \ldots q - 1]$ is less than $A[q]$, and
- each element of $A[t + 1 \ldots r]$ is greater than $A[q]$.
PARTITION
, yourPARTITION'
procedure should take $\Theta(r - p)$ time.- Modify the
RANDOMIZED-QUICKSORT
procedure to callPARTITION'
, and name the new procedureRANDOMIZED-QUICKSORT'
. Then modify theQUICKSORT
procedure to produce a procedureQUICKSORT'(p, r)
that callsRANDOMIZED-PARTITION'
and recurses only on partitions of elements not know to be equal to each other.- Using
QUICKSORT'
, how would you adjust the analysis of section 7.4.2 to avoid the assumption that all elements are distinct?
Running time
It will be $\Theta(n^2)$, because each split will be (n-1)-to-1 (see exercise 7.1-2).
Implementation
The code is below.
PARTITION'
is very similar to PARTITION
, except that after it completes
arranging the elements around a pivot $q$, it moves all elements $t > q: A[t] =
x$ right after $q$. That way we get a chuck of equal elements after the pivot.
The procedure makes another pass at the array, which is at most $n$ more time and becuse $\Theta(n) + \Theta(n) = \Theta(n) = \Theta(r - p)$ we fulfill the condition.
Analysis
The analysis does not change much. Section 7.4.2 uses the knowledge that the
elements are distinct in order to determine when two elements cannot be
compared. It will still be true that in any interval $Z_{ij}$, two elements
will get compared only if $z_i$ or $z_j$ gets picked as a pivot first. This
would not hold with PARTITION'
if there are repeated elements.
Note that with this implementation, the number of comparisons increases, but only by a constant factor. The results from the analysis are the same.
C code
#include <stdlib.h> #define EXCHANGE(a, b) tmp = a; a = b; b = tmp; typedef struct { int q; int t; } pivot_t; pivot_t partition(int[], int, int); pivot_t randomized_partition(int[], int, int); void quicksort(int A[], int p, int r) { if (p < r - 1) { pivot_t pivot = randomized_partition(A, p, r); quicksort(A, p, pivot.q); quicksort(A, pivot.t, r); } } pivot_t randomized_partition(int A[], int p, int r) { int i = rand() % (r - p) + p, tmp; EXCHANGE(A[i], A[r-1]); return partition(A, p, r); } pivot_t partition(int A[], int p, int r) { int x = A[r - 1], q = p, t, tmp; for (int i = p; i < r - 1; i++) { if (A[i] < x) { EXCHANGE(A[q], A[i]); q++; } } for (t = q; t < r && A[t] == x; t++); for (int i = r - 1; i >= t; i--) { if (A[i] == x) { EXCHANGE(A[t], A[i]); t++; } } pivot_t result = {q, t}; return result; }