Python - Find Median in a Stream (Running Median Problem)
Problem Statement
Given a stream of numbers, we need to find the median of the numbers at any given time.
The median is the middle element when numbers are sorted.
If the count of numbers is odd, the median is the middle element.
If the count of numbers is even, the median is the average of the two middle elements.
Example
Input: [5, 15, 1, 3]
Output: [5, 10.0, 5, 4.0]
Explanation:
Median after 5 → 5
Median after 15 → (5 + 15)/2 = 10.0
Median after 1 → 5
Median after 3 → (3 + 5)/2 = 4.0
Approach: Min Heap + Max Heap (O(log N) per insertion)
Algorithm:
Use two heaps:
Max Heap (left) → Stores the smaller half of the numbers.
Min Heap (right) → Stores the larger half of the numbers.
Balancing:
If left has more elements, move the largest element from left to right.
If right has more elements, move the smallest element from right to left.
Finding Median:
If left has more elements → Median = max(left)
If both heaps have equal elements → Median = (max(left) + min(right)) / 2
Implementation
import heapq
class MedianFinder:
def __init__(self):
self.left = [] # Max Heap (negative values to simulate max heap)
self.right = [] # Min Heap
def add_num(self, num):
heapq.heappush(self.left, -num) # Push into max heap (invert sign)
# Balance: Ensure max(left) ≤ min(right)
if self.left and self.right and (-self.left[0] > self.right[0]):
heapq.heappush(self.right, -heapq.heappop(self.left))
# Balance size: left can have at most one extra element
if len(self.left) > len(self.right) + 1:
heapq.heappush(self.right, -heapq.heappop(self.left))
elif len(self.right) > len(self.left):
heapq.heappush(self.left, -heapq.heappop(self.right))
def find_median(self):
if len(self.left) > len(self.right):
return float(-self.left[0])
return (-self.left[0] + self.right[0]) / 2.0
# Example Usage
stream = [5, 15, 1, 3]
median_finder = MedianFinder()
for num in stream:
median_finder.add_num(num)
print(median_finder.find_median()) # Output: 5, 10.0, 5, 4.0