Problem 7.6

Fuzzy sorting of intervals

Consider the problem in which we do not know the numbers exactly. Instead, for each number, we know an interval on the real line to which it belongs. That is, we are given $n$ closed intervals of the form $[a_i, b_i]$, where $a_i \le b_i$. We wish to fuzzy-sort these intervals, i.e., to produce a permutation $\langle i_1, i_2, \ldots, i_n \rangle$ of the intervals such that for $j = 1, 2, \ldots, n$, there exists $c_j \in [a_{i_j}, b_{i_j}]$ satisfying $c_1 \le c_2 \le \cdots \le c_n$.

Design a randomized algorithm for fuzzy-sorting $n$ intervals. Your algorithm should have the general structure of an algorithm that quicksorts the left endpoints (the $a_i$ values), but it should take advantage of overlapping intervals to improve the running time. (As the intervals overlap more and more, the problem of fuzzy-sorting the intervals becoes progressively easier. Your algorithm should take advantage of such overlapping, to the extend that it exists).

Argue that your algorithm runs in expected time $\Theta(n\lg{n})$ in general, but runs in expected time $\Theta(n)$ when all of the intervals overlap (i.e., when there exists a value $x$ such that $x \in [a_i, b_i]$ for all $i$). Your algorithm should not be checking for this case explicitly; rather, its performance should naturally improve as the amount of overlap increases.

The algorithm

The approach is very similar to problem 7.2. After we (randomly) choose a pivot interval, we check if it intersects with the other intervals. More precisely, we accumulate an intersection of the pivot and the other intervals. Afterwards we use this interval for comparison instead of the pivot.

When comparing, we can treat intervals containing the intersection as equal to each other. Thus after we have arranged all the smaller intervals on the left of the pivot, we can put all the equal ones immediatelly to the right of the pivot. Like in problem 7.2, we return two points (an interval) to use as a for recursive calls.

Even if partition does (worst-case) three passes over the array, it is still linear.

Expected time

If we assume that no two intervals have intersections, the analysis is identical to quicksort. If, however, all the intervals share a common point, the partitioning function would solve it in one go.

C code

#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    int left;
    int right;
} interval;

bool intersects(interval a, interval b) { return a.left <= b.right && b.left <= a.right; }
bool before(interval a, interval b)     { return a.right < b.left; }
bool after(interval a, interval b)      { return a.left > b.right; }

#define EXCHANGE(a, b) tmp = a; a = b; b = tmp;

interval partition(interval A[], int p, int r) {
    int pick, s, t, i;
    interval intersection, tmp;

    // Pick a random interval as a pivot
    pick = p + rand() % (r - p);
    EXCHANGE(A[pick], A[r-1]);
    intersection = A[r-1];

    // Find an intersection of the pivot and other intervals
    for (i = p; i < r - 1; i++) {
        if (intersects(intersection, A[i])) {
            if (A[i].left > intersection.left)
                intersection.left = A[i].left;
            if (A[i].right < intersection.right)
                intersection.right = A[i].right;
        }
    }

    // Classic partition around the intersection
    for (i = s = p; i < r - 1; i++) {
        if (before(A[i], intersection)) {
            EXCHANGE(A[i], A[s]);
            s++;
        }
    }
    EXCHANGE(A[r-1], A[s]);

    // Group intervals including the intersection
    for (t = s + 1, i = r - 1; t <= i;) {
        if (intersects(A[i], intersection)) {
            EXCHANGE(A[t], A[i]);
            t++;
        } else {
            i--;
        }
    }

    return (interval) {s, t};
}

void fuzzy_sort(interval array[], int p, int r) {
    if (p < r - 1) {
        interval pivot = partition(array, p, r);
        fuzzy_sort(array, p, pivot.left);
        fuzzy_sort(array, pivot.right, r);
    }
}