{x}
blog image

Count Subarrays With More Ones Than Zeros

Solution Explanation for Counting Subarrays with More Ones Than Zeros

This problem asks to count the number of subarrays within a binary array (nums containing only 0s and 1s) where the number of 1s exceeds the number of 0s. The solution presented utilizes a Binary Indexed Tree (BIT) for efficient counting. Let's break down the approach and the code.

Approach: Prefix Sum and Binary Indexed Tree

  1. Transforming the Problem: Instead of directly comparing ones and zeros, we convert the problem to a sum calculation. We treat each 0 as -1 and each 1 as 1. Now, the condition "more 1s than 0s" becomes "the sum of the subarray is greater than 0".

  2. Prefix Sum: A prefix sum array stores the cumulative sum up to each index. We don't explicitly create a prefix sum array; instead, we maintain the current prefix sum (s) iteratively.

  3. Binary Indexed Tree (BIT): The BIT is a data structure that efficiently handles cumulative frequency updates and queries. In this case, we use it to count the number of times each prefix sum occurs.

  4. Algorithm:

    • Initialize a BIT with a size large enough to accommodate all possible prefix sums (the size of nums +1 is sufficient to handle negative sums). Initially, the prefix sum 0 has a count of 1 (the empty subarray).
    • Iterate through nums:
      • Update the prefix sum s.
      • Query the BIT to find the number of previous prefix sums that are strictly less than the current s. This represents the number of subarrays ending at the current index that have more 1s than 0s. Add this count to the total count (ans).
      • Update the BIT to increment the count of the current prefix sum s.
    • Finally, return ans (modulo 109 + 7 to handle large results).

Time and Space Complexity Analysis

  • Time Complexity: O(n log n). The dominant operations are the BIT updates and queries, each taking O(log n) time. We perform these operations for each element in nums (n elements), leading to O(n log n) overall.
  • Space Complexity: O(n). The BIT requires O(n) space to store cumulative frequencies, and other variables use constant space.

Code Explanations (Python3 example)

class BinaryIndexedTree:
    # ... (Binary Indexed Tree implementation - see code above) ...
 
class Solution:
    def subarraysWithMoreZerosThanOnes(self, nums: List[int]) -> int:
        n = len(nums)
        base = n + 1  # Adjust for negative prefix sums
        tree = BinaryIndexedTree(n + base)  # Initialize BIT
        tree.update(base, 1)  # Empty subarray has sum 0
        mod = 10**9 + 7
        ans = s = 0
        for x in nums:
            s += x or -1  # Treat 0 as -1, 1 as 1
            ans += tree.query(s - 1 + base)  # Query for prefix sums < s
            ans %= mod
            tree.update(s + base, 1)  # Update BIT with current prefix sum
        return ans

The Python code first defines a BinaryIndexedTree class (implementation detailed in the original response). The subarraysWithMoreZerosThanOnes function then uses this BIT to efficiently count subarrays as described in the algorithm above. The base variable handles potential negative prefix sums by shifting the indices in the BIT.

The other languages (Java, C++, Go, TypeScript) follow a very similar structure, adapting the BIT implementation and syntax to each language's specifics but maintaining the core algorithm. The SortedList solution offers an alternative approach with a different time complexity. However, the BIT approach provides a balance of efficiency and clarity for this problem.