{x}
blog image

Find the Duplicate Number

Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] inclusive.

There is only one repeated number in nums, return this repeated number.

You must solve the problem without modifying the array nums and using only constant extra space.

 

Example 1:

Input: nums = [1,3,4,2,2]
Output: 2

Example 2:

Input: nums = [3,1,3,4,2]
Output: 3

Example 3:

Input: nums = [3,3,3,3,3]
Output: 3

 

Constraints:

  • 1 <= n <= 105
  • nums.length == n + 1
  • 1 <= nums[i] <= n
  • All the integers in nums appear only once except for precisely one integer which appears two or more times.

 

Follow up:

  • How can we prove that at least one duplicate number must exist in nums?
  • Can you solve the problem in linear runtime complexity?

Solution Explanation: Find the Duplicate Number

This problem asks to find the duplicate number in an array of integers where each integer is within the range [1, n] (inclusive), and the array has n+1 elements. The constraint is to solve this without modifying the input array and using only constant extra space.

The most efficient solution uses Binary Search. Here's a breakdown:

Core Idea:

The solution leverages the fact that if a number x is the duplicate, there must be more numbers less than or equal to x than x itself. This allows us to use binary search to efficiently find the duplicate.

Algorithm:

  1. Initialization: Set left to 1 and right to n (the upper bound of the range).

  2. Binary Search Iteration: While left is less than right:

    • Calculate the middle point mid as (left + right) / 2.
    • Count the number of elements in the array that are less than or equal to mid. Let's call this count count.
    • Decision:
      • If count > mid, it means there are more numbers less than or equal to mid than the possible unique values (1 to mid). This implies the duplicate must be within the range [1, mid], so we update right to mid.
      • Otherwise, count <= mid, meaning the duplicate is in the range [mid + 1, n], so we update left to mid + 1.
  3. Return Result: Once the loop terminates (left == right), left (or right) will hold the duplicate number.

Time Complexity: O(n log n)

The binary search performs log n iterations. In each iteration, we need to scan the entire array (size n) to count the elements. Therefore, the overall time complexity is O(n log n).

Space Complexity: O(1)

The algorithm uses only a few variables to store left, right, mid, and count. It does not use any additional data structures that scale with the input size; therefore, the space complexity is constant.

Code Examples (Python):

import bisect
 
class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        def count_less_equal(x):
            return sum(1 for num in nums if num <= x)
 
        left, right = 1, len(nums) - 1
        while left < right:
            mid = (left + right) // 2
            if count_less_equal(mid) > mid:
                right = mid
            else:
                left = mid + 1
        return left
 
#Alternative using bisect_left for slightly improved performance in python
 
class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        def f(x: int) -> bool:
            return sum(v <= x for v in nums) > x
 
        return bisect_left(range(len(nums)), True, key=f)
 

The code in other languages (Java, C++, Go, TypeScript, Rust, Javascript) provided earlier follow the same logic with minor syntactic differences for each language. They all maintain the same O(n log n) time and O(1) space complexities.