Given an integer array nums and two integers k and p, return the number of distinct subarrays which have at most k elements divisible by p.
Two arrays nums1 and nums2 are said to be distinct if:
They are of different lengths, or
There exists at least one index i where nums1[i] != nums2[i].
A subarray is defined as a non-empty contiguous sequence of elements in an array.
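The distinctness rule can be written out directly. The following is a minimal sketch, where are_distinct is a hypothetical helper name and arrays of different lengths are assumed to count as distinct:

```python
from typing import Sequence


def are_distinct(a: Sequence[int], b: Sequence[int]) -> bool:
    """Return True if a and b differ in length or at some index."""
    if len(a) != len(b):
        return True
    # Same length: distinct only if some position holds different values.
    return any(x != y for x, y in zip(a, b))


print(are_distinct([2, 3], [2, 3]))     # False: same length, same elements
print(are_distinct([2, 3], [2, 3, 3]))  # True: different lengths
print(are_distinct([3, 2], [2, 3]))     # True: differ at index 0
```

Under this rule, two occurrences of the same values at different positions of nums are the same subarray for counting purposes.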
Example 1:
Input: nums = [2,3,3,2,2], k = 2, p = 2
Output: 11
Explanation: The elements at indices 0, 3, and 4 are divisible by p = 2. The 11 distinct subarrays which have at most k = 2 elements divisible by 2 are: [2], [2,3], [2,3,3], [2,3,3,2], [3], [3,3], [3,3,2], [3,3,2,2], [3,2], [3,2,2], and [2,2]. Note that the subarrays [2] and [3] occur more than once in nums, but they should each be counted only once. The subarray [2,3,3,2,2] should not be counted because it has 3 elements that are divisible by 2.
Example 2:
Input: nums = [1,2,3,4], k = 4, p = 1
Output: 10
Explanation: All elements of nums are divisible by p = 1. Also, every subarray of nums will have at most 4 elements that are divisible by 1. Since all subarrays are distinct, the total number of subarrays satisfying all the constraints is 10.
Constraints:
1 <= nums.length <= 200
1 <= nums[i], p <= 200
1 <= k <= nums.length
Follow up:
Can you solve this problem in O(n^2) time complexity?
This problem asks for the number of distinct subarrays containing at most k elements divisible by p. The key challenge lies in efficiently handling the distinctness constraint and the condition on divisible elements.
Solution 1: Enumeration + String Hashing
This approach uses a brute-force enumeration of all possible subarrays and a string hashing technique to efficiently check for distinctness.
Approach:
Nested Loops for Subarray Generation: The outer loop iterates through each possible starting index i of a subarray. The inner loop iterates through all possible ending indices j (from i to the end of the array).
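Counting Divisible Elements: Inside the inner loop, we keep a running count of the elements in the current subarray (nums[i:j+1]) that are divisible by p. As soon as the count exceeds k, we stop extending the subarray, because every longer subarray starting at i would exceed k as well.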
Distinctness Check with Hashing: To avoid counting identical subarrays more than once, we use double hashing. Two polynomial rolling hashes, with different bases and moduli, are updated in O(1) time as each element is appended. The pair of hash values serves as a compact fingerprint of the subarray and is stored in a set, which discards duplicates automatically. A single hash function can map two different subarrays to the same value; requiring both hashes to match makes such a collision far less likely, though not impossible.
Return the Count: Finally, the size of the set (which represents the number of distinct subarrays satisfying the condition) is returned.
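The rolling-hash step can be illustrated in isolation. The sketch below uses one of the two base and modulus pairs from the solution code (131 and 10**9 + 7); full_hash and extend_hash are hypothetical helper names introduced only for this illustration:

```python
BASE = 131        # one of the two bases used in the solution code
MOD = 10**9 + 7   # the matching modulus


def full_hash(seq):
    """Polynomial hash of seq computed from scratch in O(len(seq))."""
    h = 0
    for x in seq:
        h = (h * BASE + x) % MOD
    return h


def extend_hash(h, x):
    """Extend an existing hash by one appended element in O(1)."""
    return (h * BASE + x) % MOD


nums = [2, 3, 3, 2, 2]

# Build the hash of nums[1:4] = [3, 3, 2] one element at a time.
h = 0
for x in nums[1:4]:
    h = extend_hash(h, x)

print(h == full_hash([3, 3, 2]))                     # True
print(full_hash(nums[0:1]) == full_hash(nums[3:4]))  # True: both are [2]
```

Because the hash depends only on the values and their order, the same subarray found at two different positions produces the same fingerprint, which is what lets the set count it once.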
Time Complexity: O(n^2), where n is the length of the input array. This is because of the nested loops that enumerate all possible subarrays.
Space Complexity: O(n^2) in the worst case. This is because the set storing the hash values of distinct subarrays could potentially store up to O(n^2) elements if all subarrays are distinct.
Code (Python):
from typing import List


class Solution:
    def countDistinct(self, nums: List[int], k: int, p: int) -> int:
        s = set()
        n = len(nums)
        base1, base2 = 131, 13331  # Two different bases for double hashing
        mod1, mod2 = 10**9 + 7, 10**9 + 9  # Two different moduli
        for i in range(n):
            h1 = h2 = cnt = 0  # Hash values and divisible-element count for start i
            for j in range(i, n):
                cnt += nums[j] % p == 0  # Adds 1 when nums[j] is divisible by p
                if cnt > k:
                    break  # Longer subarrays starting at i would also exceed k
                h1 = (h1 * base1 + nums[j]) % mod1  # First rolling hash
                h2 = (h2 * base2 + nums[j]) % mod2  # Second rolling hash
                s.add((h1 << 32) | h2)  # Pack both hashes into one integer key
        return len(s)
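One way to gain confidence in the hashed count is to compare it with exact keys on small inputs. The sketch below repeats the same enumeration and additionally stores each subarray as a tuple; count_both is a hypothetical name, and the tuple set is used only as a slow reference, not as part of the solution above:

```python
from typing import List, Tuple


def count_both(nums: List[int], k: int, p: int) -> Tuple[int, int]:
    """Return (hashed_count, exact_count) for the same enumeration."""
    base1, base2 = 131, 13331
    mod1, mod2 = 10**9 + 7, 10**9 + 9
    hashed, exact = set(), set()
    n = len(nums)
    for i in range(n):
        h1 = h2 = cnt = 0
        for j in range(i, n):
            cnt += nums[j] % p == 0
            if cnt > k:
                break
            h1 = (h1 * base1 + nums[j]) % mod1
            h2 = (h2 * base2 + nums[j]) % mod2
            hashed.add((h1, h2))
            exact.add(tuple(nums[i:j + 1]))  # O(n) per key, reference only
    return len(hashed), len(exact)


print(count_both([2, 3, 3, 2, 2], 2, 2))  # (11, 11)
print(count_both([1, 2, 3, 4], 4, 1))     # (10, 10)
```

If the two counts ever differ, the hashed set has merged two different subarrays, which is the collision case the double hash is meant to make unlikely.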
Solution 2 (Simpler, less efficient):
This solution is conceptually simpler but less efficient. It builds a string representation of each subarray and stores it in a set to check for distinctness. It is slower because building and hashing each string key takes time proportional to the subarray length, whereas the rolling hashes in Solution 1 are updated in O(1) per element.
Approach:
Nested loops to generate subarrays: Same as in Solution 1.
Counting divisible elements: Same as in Solution 1.
Subarray representation: The subarray is converted into a string using commas as separators. This is then added to the set to ensure distinctness.
Time Complexity: O(n^3), because building and hashing each string key takes O(n) time inside the nested loops.
Space Complexity: O(n^2) because of the set that could potentially store all the distinct subarrays in the worst case.
Code (Python):
from typing import List


class Solution:
    def countDistinct(self, nums: List[int], k: int, p: int) -> int:
        n = len(nums)
        s = set()
        for i in range(n):
            cnt = 0
            t = ""  # String representation of the current subarray
            for x in nums[i:]:
                cnt += x % p == 0  # Adds 1 when x is divisible by p
                if cnt > k:
                    break
                t += str(x) + ","  # The comma keeps e.g. [1, 2] and [12] apart
                s.add(t)
        return len(s)
In summary: Solution 1 (double hashing) is the preferred approach due to its better time complexity, while Solution 2 is a simpler but slower alternative. Both handle the distinct-subarray requirement, but the hashing in Solution 1 is probabilistic: there is a tiny chance that two different subarrays produce the same pair of hash values, which double hashing makes far less likely but does not rule out.
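For the follow-up, a collision-free alternative that is not part of the two solutions above is to insert the subarrays into a trie, so that each newly created node corresponds to one subarray not seen before. The following is a minimal sketch under that assumption; count_distinct_trie is a hypothetical name, and nested dictionaries stand in for trie nodes:

```python
from typing import Dict, List


def count_distinct_trie(nums: List[int], k: int, p: int) -> int:
    """Count distinct subarrays with at most k multiples of p using a trie."""
    root: Dict[int, dict] = {}
    count = 0
    n = len(nums)
    for i in range(n):
        node = root      # restart at the root for each starting index
        divisible = 0
        for j in range(i, n):
            if nums[j] % p == 0:
                divisible += 1
            if divisible > k:
                break
            if nums[j] not in node:
                node[nums[j]] = {}  # new node: this subarray is new
                count += 1
            node = node[nums[j]]
    return count


print(count_distinct_trie([2, 3, 3, 2, 2], 2, 2))  # 11
print(count_distinct_trie([1, 2, 3, 4], 4, 1))     # 10
```

Each inner-loop step performs one dictionary lookup, so the expected running time is O(n^2), and the trie holds at most O(n^2) nodes.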