Interview Question Series | Season 3

Problem Overview

You are designing a data structure that receives numbers one by one from a stream.

At any moment, you must be able to return the median of all numbers seen so far.

Definition of median:

  • If the count is odd → the middle value

  • If the count is even → the average of the two middle values

This problem is frequently asked by Google because it tests data structures, balancing logic, and real-time computation.


Example

Operations:

addNum(1)
addNum(2)
findMedian()
addNum(3)
findMedian()

State evolution:

[1]           → median = 1
[1, 2]        → median = (1 + 2) / 2 = 1.5
[1, 2, 3]     → median = 2

Approach 1: Sort After Every Insertion

Idea:

  • Keep all numbers in a list

  • Sort the list after each insertion

  • Pick the middle value

Disadvantages:

  • Sorting costs O(n log n)

  • Too slow for up to 50,000 operations

  • Repeated sorting wastes time

This approach is unacceptable for large data streams.


Approach 2: Keep a Sorted Data Structure

Idea:

  • Insert numbers in sorted order

  • Use binary search to find the correct position

Disadvantages:

  • Insertion costs O(n) due to shifting elements

  • Still too slow overall

This improves sorting but still fails performance constraints.


Best Approach: Two Heaps Strategy

Core Idea

Split the numbers into two halves:

  • Left half → smaller numbers

  • Right half → larger numbers

Maintain the following rule:

  • Left side has equal or one more element than the right side

  • All elements in left ≤ all elements in right

This allows the median to be read instantly.


Data Structures Used

  • Max-heap (left)
    Stores the smaller half, largest element on top

  • Min-heap (right)
    Stores the larger half, smallest element on top


Insertion Logic

When a new number arrives:

  1. If it is smaller than or equal to the top of the left heap → push to left

  2. Otherwise → push to right

  3. Rebalance:

    • Left heap can have at most one extra element

    • If imbalance happens, move elements between heaps


Median Calculation

  • If both heaps have the same size
    → median is the average of both top elements

  • If left heap has one extra element
    → median is the top of left heap

This works because left always contains the middle element.


Why This Works

  • Insertion takes O(log n)

  • Median retrieval takes O(1)

  • No sorting required

  • Works perfectly for streaming data

This is the optimal solution.


Time and Space Complexity

  • addNum → O(log n)

  • findMedian → O(1)

  • Space → O(n)


Final Code (Two Heaps)

class MedianFinder {
    priority_queue<int> left;
    priority_queue<int, vector<int>, greater<int>> right;

public:
    MedianFinder() {}

    void addNum(int num) {
        if (left.empty() || num <= left.top())
            left.push(num);
        else
            right.push(num);

        if (left.size() > right.size() + 1) {
            right.push(left.top());
            left.pop();
        } else if (right.size() > left.size()) {
            left.push(right.top());
            right.pop();
        }
    }

    double findMedian() {
        if (left.size() == right.size())
            return (left.top() + right.top()) / 2.0;
        return left.top();
    }
};
Scroll to Top