Given a data stream input of non-negative integers a1, a2, ..., an
, summarize the numbers seen so far as a list of disjoint intervals.
Implement the SummaryRanges
class:
SummaryRanges()
Initializes the object with an empty stream.void addNum(int value)
Adds the integer value
to the stream.int[][] getIntervals()
Returns a summary of the integers in the stream currently as a list of disjoint intervals [starti, endi]
. The answer should be sorted by starti
.
Example 1:
Input ["SummaryRanges", "addNum", "getIntervals", "addNum", "getIntervals", "addNum", "getIntervals", "addNum", "getIntervals", "addNum", "getIntervals"] [[], [1], [], [3], [], [7], [], [2], [], [6], []] Output [null, null, [[1, 1]], null, [[1, 1], [3, 3]], null, [[1, 1], [3, 3], [7, 7]], null, [[1, 3], [7, 7]], null, [[1, 3], [6, 7]]] Explanation SummaryRanges summaryRanges = new SummaryRanges(); summaryRanges.addNum(1); // arr = [1] summaryRanges.getIntervals(); // return [[1, 1]] summaryRanges.addNum(3); // arr = [1, 3] summaryRanges.getIntervals(); // return [[1, 1], [3, 3]] summaryRanges.addNum(7); // arr = [1, 3, 7] summaryRanges.getIntervals(); // return [[1, 1], [3, 3], [7, 7]] summaryRanges.addNum(2); // arr = [1, 2, 3, 7] summaryRanges.getIntervals(); // return [[1, 3], [7, 7]] summaryRanges.addNum(6); // arr = [1, 2, 3, 6, 7] summaryRanges.getIntervals(); // return [[1, 3], [6, 7]]
Constraints:
0 <= value <= 104
3 * 104
calls will be made to addNum
and getIntervals
.102
calls will be made to getIntervals
.
Follow up: What if there are lots of merges and the number of disjoint intervals is small compared to the size of the data stream?
This problem requires designing a data structure that efficiently handles adding numbers to a stream and returning the summarized disjoint intervals. The optimal approach utilizes a TreeMap
(or SortedDict
in Python) to store the intervals. A TreeMap
maintains sorted order, making it easy to find overlapping intervals.
Algorithm:
Initialization: The SummaryRanges
class is initialized with an empty TreeMap
(or SortedDict
). The key in the map is the starting number of an interval, and the value is a 2-element array/list representing the [start, end]
of the interval.
addNum(val)
: This method adds a number val
to the stream. The core logic involves finding the appropriate position of val
within the existing intervals in the TreeMap
:
Find Adjacent Intervals: Using floorKey()
and ceilingKey()
(or equivalent methods in Python's SortedDict
), we locate the intervals immediately before and after val
in the sorted map.
Merge Overlapping Intervals: If val
falls between existing intervals (l
and r
) and creates an overlap, merge them into a single, larger interval.
Extend Existing Intervals: If val
is adjacent to an existing interval (either before or after), extend that interval to include val
.
Add New Interval: If val
doesn't overlap or extend any existing interval, add a new entry [val, val]
to the TreeMap
.
getIntervals()
: This method simply retrieves and returns the intervals stored in the TreeMap
as a 2D array (or list of lists). Because the TreeMap
is sorted by the start of each interval, the output is already sorted as required.
Time Complexity Analysis:
addNum(val)
: The floorKey()
and ceilingKey()
operations on a TreeMap
have a time complexity of O(log n), where n is the number of intervals. All other operations within addNum
(comparisons, merges, updates) are O(1). Therefore, the overall time complexity of addNum
is O(log n).
getIntervals()
: Iterating through the TreeMap
and copying its values takes O(n) time, where n is the number of intervals. Therefore, the time complexity of getIntervals
is O(n).
Space Complexity Analysis:
The space complexity is O(n) in the worst case, where n is the number of intervals. This is because we store all the intervals in the TreeMap
. In the best-case scenario (where the numbers are added such that no merging happens), the space complexity would be O(n).
Code Examples:
The provided code examples demonstrate the implementation in Python, Java, and C++. They all follow the same algorithm using their respective sorted map implementations. The use of SortedDict
in python adds some complexity because it is a third party library and not the in built library. If you prefer to use built-in libraries, you can use a list and sort it before returning the intervals but this would change the time complexity of the getIntervals
function to O(nlogn).
The core idea is to leverage the efficient sorted map data structures to minimize the time spent searching for intervals, leading to a logarithmic time complexity for adding numbers. The space used scales linearly with the number of disjoint intervals.