The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
arr = [2,3,4]
, the median is 3
.arr = [2,3]
, the median is (2 + 3) / 2 = 2.5
.Implement the MedianFinder class:
MedianFinder()
initializes the MedianFinder
object.void addNum(int num)
adds the integer num
from the data stream to the data structure.double findMedian()
returns the median of all elements so far. Answers within 10-5
of the actual answer will be accepted.
Example 1:
Input ["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"] [[], [1], [2], [], [3], []] Output [null, null, null, 1.5, null, 2.0] Explanation MedianFinder medianFinder = new MedianFinder(); medianFinder.addNum(1); // arr = [1] medianFinder.addNum(2); // arr = [1, 2] medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2) medianFinder.addNum(3); // arr[1, 2, 3] medianFinder.findMedian(); // return 2.0
Constraints:
-105 <= num <= 105
findMedian
.5 * 104
calls will be made to addNum
and findMedian
.
Follow up:
[0, 100]
, how would you optimize your solution?99%
of all integer numbers from the stream are in the range [0, 100]
, how would you optimize your solution?This problem can be efficiently solved using two heaps: a min-heap and a max-heap. The min-heap (minQ
) will store the larger half of the numbers added, and the max-heap (maxQ
) will store the smaller half. This setup allows us to quickly find the median.
Algorithm:
Initialization: Create an empty min-heap (minQ
) and an empty max-heap (maxQ
).
addNum(num)
:
num
to the max-heap (maxQ
).maxQ
becomes greater than the size of minQ
by more than 1, move the largest element from maxQ
to minQ
.minQ
becomes greater than the size of maxQ
, move the smallest element from minQ
to maxQ
. This ensures that the heaps remain balanced, with their sizes differing by at most 1.findMedian()
:
minQ
and maxQ
are equal, the median is the average of the top elements of both heaps.minQ
is greater than the size of maxQ
, the median is the top element of minQ
.Time Complexity:
addNum(num)
: O(log n), where n is the number of elements added so far. Heap operations (insertion and deletion) take logarithmic time.findMedian()
: O(1). Accessing the top element of a heap is a constant-time operation.Space Complexity: O(n), as we store all the numbers added in the heaps.
Code Examples:
The code examples below demonstrate the implementation in various programming languages. Note that the specific heap implementation might vary depending on the language's standard library (e.g., PriorityQueue
in Java, heapq
in Python).
Python:
import heapq
class MedianFinder:
def __init__(self):
self.min_heap = [] # Min-heap to store larger half
self.max_heap = [] # Max-heap to store smaller half
def addNum(self, num: int) -> None:
heapq.heappush(self.max_heap, -num) # Use negative for max-heap
if len(self.max_heap) > len(self.min_heap) + 1:
heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))
if len(self.min_heap) > len(self.max_heap):
heapq.heappush(self.max_heap, -heapq.heappop(self.min_heap))
def findMedian(self) -> float:
if len(self.min_heap) == len(self.max_heap):
return (-self.max_heap[0] + self.min_heap[0]) / 2
else:
return -self.max_heap[0]
Java:
import java.util.PriorityQueue;
import java.util.Collections;
class MedianFinder {
private PriorityQueue<Integer> minHeap;
private PriorityQueue<Integer> maxHeap;
public MedianFinder() {
minHeap = new PriorityQueue<>();
maxHeap = new PriorityQueue<>(Collections.reverseOrder());
}
public void addNum(int num) {
maxHeap.offer(num);
if (maxHeap.size() > minHeap.size() + 1) {
minHeap.offer(maxHeap.poll());
}
if (minHeap.size() > maxHeap.size()) {
maxHeap.offer(minHeap.poll());
}
}
public double findMedian() {
if (maxHeap.size() == minHeap.size()) {
return (double)(maxHeap.peek() + minHeap.peek()) / 2;
} else {
return (double)maxHeap.peek();
}
}
}
(Other languages would follow a similar structure, using their respective heap implementations.)
This approach provides an efficient and elegant solution to the problem of finding the median from a data stream. The use of two heaps keeps the data balanced and allows for logarithmic time complexity for addNum
operations.