This problem is CLR problem 9-2, part c.
Hint: Use select() to find the k-th smallest of the x's for various k's. Use binary search to home in on the k you want.
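One way to read the hint is sketched below in C++. This is only a sketch under stated assumptions, not a worked solution: std::nth_element stands in for the book's select() (it is linear on average, not in the worst case), the name weightedMedianSelect and the Item typedef are made up for illustration, and the x values are assumed distinct with positive weights summing to 1. Each round selects the middle-ranked item of the current candidate range, adds up the weights to its left, and keeps the half that must contain the weighted median, so the ranges shrink geometrically.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<double, double> Item;  // (x, w); hypothetical name

// Hypothetical helper: weighted (lower) median via repeated selection.
// Assumes distinct x values and positive weights that sum to 1.
double weightedMedianSelect(std::vector<Item> d) {
    assert(!d.empty());
    auto byX = [](const Item& a, const Item& b) { return a.first < b.first; };
    std::size_t lo = 0, hi = d.size();  // the answer has sorted rank in [lo, hi)
    double below = 0.0;                 // total weight of the items ranked before lo
    while (hi - lo > 1) {
        std::size_t mid = lo + (hi - lo) / 2;
        // Stand-in for select(): puts the item of rank mid at d[mid],
        // with all smaller items of the range to its left.
        std::nth_element(d.begin() + lo, d.begin() + mid, d.begin() + hi, byX);
        double left = 0.0;
        for (std::size_t i = lo; i < mid; ++i) left += d[i].second;
        if (below + left >= 0.5) {
            hi = mid;        // the running sum reaches 1/2 before rank mid
        } else {
            below += left;   // everything left of mid is too light
            lo = mid;
        }
    }
    return d[lo].first;
}
```

For example, with x values 1 through 5 and weights 0.125, 0.125, 0.25, 0.25, 0.25, the sketch returns 3: the weights below 3 sum to 0.25 and the weights above it sum to 0.5.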
As possibly useful material for this problem, and as an example of how I would write up exercise solutions, below is my solution to parts a and b of problem 9-2. I haven't restated the problem, so this will not make sense without reading 9-2 at the same time.
Let A be a dataset containing the elements xi paired with associated weights wi, which sum to 1. Let n be the number of elements.
In this discussion I will use 1-based indexing so i runs from 1 to n and also the rank of an xi runs from 1 (smallest) to n (largest).
Part (a):
If xj is the (lower) median of the x's, then
k = ⌊((n-1)/2)⌋ of the x's are less than it,
and
K = ⌈((n-1)/2)⌉ are greater than it.
We must show that the sum of the k weights of the lesser x's is less than 1/2, and the sum of the K weights of the greater x's is no more than 1/2. Since all the weights are 1/n, the sum of any m weights is m/n.
If n is odd then k = K = (n-1)/2, so the sums are k/n = K/n = 1/2 - 1/(2n). [corrected Sept 29 from "... 1/2 - (1/2)n."] Both are strictly less than 1/2, meeting the requirement.
If n is even then k+1 = K = n/2. The sum of the smaller is k/n = 1/2 - 1/n, which is less than 1/2, and the sum of the larger is K/n = 1/2, which is no more than 1/2, as required.
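The two cases above can also be checked mechanically. The following is a small sketch, not part of the written solution; the function name lowerMedianIsWeightedMedian is made up for illustration. It uses integer arithmetic only, rewriting k/n < 1/2 as 2k < n and K/n <= 1/2 as 2K <= n.

```cpp
#include <cassert>

// Hypothetical check for part (a): with all weights equal to 1/n, do the
// counts below and above the lower median satisfy the two conditions?
bool lowerMedianIsWeightedMedian(int n) {
    assert(n >= 1);
    int k = (n - 1) / 2;  // floor((n-1)/2): number of x's less than the median
    int K = n - 1 - k;    // ceil((n-1)/2): number of x's greater than the median
    // k/n < 1/2 and K/n <= 1/2, multiplied through by 2n to stay in integers.
    return 2 * k < n && 2 * K <= n;
}
```

Calling it for n = 5 gives k = K = 2 (sums of 2/5 on each side), and for n = 6 gives k = 2, K = 3 (sums of 1/3 and 1/2); both calls return true.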
Part (b): [with corrections of Sept 30]
Create a datatype consisting of (x, w) pairs. Such pairs can be compared according to their x field, but when moved around, the w value goes along with the x value. For instance, as sketched in this C++ pseudocode:
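    #include <algorithm>
    #include <utility>
    #include <vector>
    using namespace std;

    // Let T be the type of the xi and R be the type of the
    // numeric weights, wi (such as double). An item is a pair<T, R>.

    // Define a less-than predicate function object on items:
    // items are compared by their x field only.
    template <class T, class R>
    struct less_item {
        bool operator()(const pair<T, R>& a, const pair<T, R>& b) const {
            return a.first < b.first;
        }
    };

    template <class T, class R>
    T weightedMedian(vector<pair<T, R> > D) {
        // 1. Sort the dataset D according to the less_item compare function.
        sort(D.begin(), D.end(), less_item<T, R>());
        // 2. Find where the weighted sum reaches 1/2.
        //    (0.5 rather than 1/2, which is 0 in integer arithmetic.)
        R sum = D[0].second;
        int i;
        for (i = 1; sum < 0.5; ++i)
            sum += D[i].second;
        // sum now covers D[0..i-1], so the item of rank i is D[i-1].
        return D[i - 1].first;
    }

For any k, let sum_k denote the total weight of the k smallest items in sorted order, so that sum_1 is the initial value of sum and sum_k is its value once D[k-1] has been added. The loop stops at the first i with sum_i >= 1/2, so we have that sum_(i-1) < 1/2 <= sum_i for the rank i of the returned value. The first inequality says that the sum of the weights of the smaller elements is less than 1/2. It follows from the second inequality that the sum of the weights of the larger elements is 1 - sum_i <= 1/2. This demonstrates that the returned value is the weighted median.

The cost of the algorithm is the cost of sorting, O(n*lg(n)), plus O(n) for step 2. Thus the overall cost is O(n*lg(n)).

Remark: the loop terminates after no more than n iterations, because the sum of all n weights is 1, which exceeds the threshold of 1/2 in the stopping condition.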