Kth Largest Element in a Stream

You are part of a university admissions office and need to keep track of the kth highest test score from applicants in real-time. This helps to determine cut-off marks for interviews and admissions dynamically as new applicants submit their scores.

You are tasked to implement a class which, for a given integer k, maintains a stream of test scores and continuously returns the kth highest test score after a new score has been submitted. More specifically, we are looking for the kth highest score in the sorted list of all scores.

Implement the KthLargest class:

  • KthLargest(int k, int[] nums) Initializes the object with the integer k and the stream of test scores nums.
  • int add(int val) Adds a new test score val to the stream and returns the element representing the kth largest element in the pool of test scores so far.
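To make the definition concrete, here is a minimal brute-force sketch of what "kth largest" means in this problem: sort all scores in descending order and take position k, counting duplicates as separate entries. The helper name kth_largest_naive is hypothetical and used only for illustration. Re-sorting on every call is far too slow for a real stream.

```python
from typing import List


def kth_largest_naive(scores: List[int], k: int) -> int:
    """Return the kth largest score; duplicates count as separate entries."""
    # Sort in descending order and pick the element at 1-based position k.
    return sorted(scores, reverse=True)[k - 1]


# First add call of Example 1: the stream [4, 5, 8, 2] receives the score 3.
scores = [4, 5, 8, 2]
scores.append(3)
print(kth_largest_naive(scores, 3))  # 4

# First add call of Example 2: the repeated 7s each occupy their own position.
print(kth_largest_naive([7, 7, 7, 7, 8, 3, 2], 4))  # 7
```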

 

Example 1:

Input:
["KthLargest", "add", "add", "add", "add", "add"]
[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]

Output: [null, 4, 5, 5, 8, 8]

Explanation:

KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
kthLargest.add(3); // return 4
kthLargest.add(5); // return 5
kthLargest.add(10); // return 5
kthLargest.add(9); // return 8
kthLargest.add(4); // return 8

Example 2:

Input:
["KthLargest", "add", "add", "add", "add"]
[[4, [7, 7, 7, 7, 8, 3]], [2], [10], [9], [9]]

Output: [null, 7, 7, 7, 8]

Explanation:

KthLargest kthLargest = new KthLargest(4, [7, 7, 7, 7, 8, 3]);
kthLargest.add(2); // return 7
kthLargest.add(10); // return 7
kthLargest.add(9); // return 7
kthLargest.add(9); // return 8

 

Constraints:

  • 0 <= nums.length <= 10^4
  • 1 <= k <= nums.length + 1
  • -10^4 <= nums[i] <= 10^4
  • -10^4 <= val <= 10^4
  • At most 10^4 calls will be made to add.

Solution Explanation: 703. Kth Largest Element in a Stream

This problem requires maintaining a data stream and efficiently retrieving the kth largest element after each addition. The optimal solution utilizes a min-heap (priority queue) data structure.

Approach:

  1. Initialization (__init__ or constructor): The KthLargest class is initialized with an integer k (the rank of the element to track) and an initial array nums. We create an empty min-heap (min_q in the Python code) and add each element of nums to it. Crucially, the heap never holds more than k elements: whenever adding an element pushes its size above k, the smallest element (the root of the min-heap) is removed, so only the k largest elements seen so far are kept.

  2. Adding an Element (add): The add method pushes the new element val onto the min-heap. Again, if the heap's size exceeds k, the smallest element is removed. The method then returns the root of the heap. Because the heap holds the k largest elements seen so far, its smallest element is the kth largest overall. The constraint k <= nums.length + 1 guarantees that the heap holds exactly k elements by the time the first add call returns.
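The two steps above can be followed on Example 1 with a small tracing sketch. The function trace_heap is a hypothetical helper, not part of the solution; it replays the push-then-trim logic and records which scores the heap keeps after each add call.

```python
import heapq
from typing import List, Tuple


def trace_heap(k: int, nums: List[int], vals: List[int]) -> List[Tuple[int, List[int], int]]:
    """Replay the approach and return (val, kept scores, kth largest) per add call."""
    heap: List[int] = []

    def push_and_trim(x: int) -> None:
        heapq.heappush(heap, x)
        if len(heap) > k:
            heapq.heappop(heap)  # discard the smallest so only k scores remain

    for x in nums:  # step 1: initialization
        push_and_trim(x)

    steps = []
    for v in vals:  # step 2: each add call
        push_and_trim(v)
        steps.append((v, sorted(heap), heap[0]))
    return steps


for val, kept, kth in trace_heap(3, [4, 5, 8, 2], [3, 5, 10, 9, 4]):
    print(f"add({val}): kept={kept}, kth largest={kth}")
# add(3): kept=[4, 5, 8], kth largest=4
# add(5): kept=[5, 5, 8], kth largest=5
# add(10): kept=[5, 8, 10], kth largest=5
# add(9): kept=[8, 9, 10], kth largest=8
# add(4): kept=[8, 9, 10], kth largest=8
```

The kept list is sorted only for display; the heap itself guarantees only that its smallest element sits at index 0.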

Time Complexity Analysis:

  • __init__ / constructor: The constructor calls add once for each element of nums, and each call performs at most one insertion and one removal on a heap of at most k + 1 elements, which costs O(log k). The total is therefore O(n log k), where n is the length of nums. (Building a heap from all n elements at once takes O(n) time, but this solution inserts the elements one at a time and does not rely on that.)

  • add: Each call to add involves inserting an element (O(log k)) and potentially removing the smallest element (O(log k)). Therefore, the time complexity of add is O(log k).
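As an aside on the O(n) heap-construction bound, the constructor could instead heapify the whole array and then pop until only k elements remain. The following is a sketch of that variant, not the solution analyzed here; the class name KthLargestHeapify is hypothetical. Its constructor costs O(n + (n - k) log n), which can beat O(n log k) when k is close to n and is worse when k is small.

```python
import heapq
from typing import List


class KthLargestHeapify:
    """Variant sketch: build the heap in bulk, then trim it down to k elements."""

    def __init__(self, k: int, nums: List[int]):
        self.k = k
        self.heap = list(nums)     # copy so the caller's list is not modified
        heapq.heapify(self.heap)   # O(n) bulk construction
        while len(self.heap) > k:  # n - k pops, each O(log n)
            heapq.heappop(self.heap)

    def add(self, val: int) -> int:
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, val)
        else:
            # Push val, then pop the smallest, in a single heap operation.
            heapq.heappushpop(self.heap, val)
        return self.heap[0]


kth = KthLargestHeapify(3, [4, 5, 8, 2])
print([kth.add(v) for v in [3, 5, 10, 9, 4]])  # [4, 5, 5, 8, 8]
```

The add method remains O(log k) in this variant, so the difference is confined to initialization.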

Space Complexity Analysis:

The space complexity is O(k) because the min-heap stores at most k elements.

Code Examples (with explanations inline):

The provided code solutions in Python, Java, C++, Go, TypeScript, and JavaScript all follow this approach. Let's analyze a representative example (Python):

import heapq  # Import the heapq module for heap operations
from typing import List  # Needed for the List[int] annotation

class KthLargest:
    def __init__(self, k: int, nums: List[int]):
        self.k = k
        self.min_q = []  # Initialize an empty min-heap
        for x in nums:
            self.add(x)  # Add initial numbers to heap, trimming to size k

    def add(self, val: int) -> int:
        heapq.heappush(self.min_q, val)  # Push the new value onto the heap
        if len(self.min_q) > self.k:  # If heap size exceeds k
            heapq.heappop(self.min_q)  # Remove the smallest element
        return self.min_q[0]  # Return the kth largest (smallest in min-heap)
 

The other languages use similar logic with their respective priority queue implementations. The core idea remains the same: use a min-heap to efficiently maintain and access the kth largest element in a stream.