{x}
blog image

Reduce Array Size to The Half

You are given an integer array arr. You can choose a set of integers and remove all the occurrences of these integers in the array.

Return the minimum size of the set so that at least half of the integers of the array are removed.

 

Example 1:

Input: arr = [3,3,3,3,5,5,5,2,2,7]
Output: 2
Explanation: Choosing {3,7} will make the new array [5,5,5,2,2] which has size 5 (i.e equal to half of the size of the old array).
Possible sets of size 2 are {3,5},{3,2},{5,2}.
Choosing set {2,7} is not possible as it will make the new array [3,3,3,3,5,5,5] which has a size greater than half of the size of the old array.

Example 2:

Input: arr = [7,7,7,7,7,7]
Output: 1
Explanation: The only possible set you can choose is {7}. This will make the new array empty.

 

Constraints:

  • 2 <= arr.length <= 105
  • arr.length is even.
  • 1 <= arr[i] <= 105

Solution Explanation: Reduce Array Size to The Half

This problem asks us to find the minimum number of unique elements to remove from an array such that at least half of the array's elements are removed. The optimal approach is a greedy algorithm leveraging frequency counting and sorting.

Approach:

  1. Frequency Counting: First, we need to determine how many times each element appears in the input array arr. We can efficiently do this using a hash map (dictionary in Python, Map in JavaScript, etc.) or an array if the elements are within a known range. This step counts the occurrences of each unique element.

  2. Sorting by Frequency: Next, we sort the unique elements based on their frequencies in descending order. Elements that appear more frequently are more valuable to remove because they contribute to removing a larger portion of the array with fewer unique element removals.

  3. Greedy Selection: We iterate through the sorted frequencies. In each iteration, we remove the most frequent element. We keep track of the total number of elements removed (m). We continue this process until m is at least half the size of the original array.

  4. Return the Count: The number of iterations it took to reach or exceed half the array's size represents the minimum number of unique elements needed to remove to satisfy the problem's condition. This is our final answer.

Time and Space Complexity:

  • Time Complexity: O(n log n), dominated by the sorting step (if using a comparison-based sorting algorithm). The frequency counting step is O(n) in the average case for hash maps.

  • Space Complexity: O(n) in the worst case to store the frequencies of elements. If the input array has many unique elements, the hash map or frequency array could potentially use linear space.

Code Implementation (Python):

from collections import Counter
 
class Solution:
    def minSetSize(self, arr: List[int]) -> int:
        n = len(arr)
        counts = Counter(arr)  # Count element frequencies
        frequencies = list(counts.values())
        frequencies.sort(reverse=True)  # Sort frequencies in descending order
 
        removed_count = 0
        unique_elements_count = 0
        for freq in frequencies:
            removed_count += freq
            unique_elements_count += 1
            if removed_count >= n // 2:  # Check if at least half removed
                break
 
        return unique_elements_count
 

Code Implementation (Java):

import java.util.*;
class Solution {
    public int minSetSize(int[] arr) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : arr) {
            counts.put(num, counts.getOrDefault(num, 0) + 1);
        }
 
        List<Integer> frequencies = new ArrayList<>(counts.values());
        Collections.sort(frequencies, Collections.reverseOrder());
 
        int removedCount = 0;
        int uniqueElementsCount = 0;
        for (int freq : frequencies) {
            removedCount += freq;
            uniqueElementsCount++;
            if (removedCount >= arr.length / 2) {
                break;
            }
        }
        return uniqueElementsCount;
    }
}

Code Implementation (C++):

#include <iostream>
#include <vector>
#include <map>
#include <algorithm>
 
using namespace std;
 
class Solution {
public:
    int minSetSize(vector<int>& arr) {
        map<int, int> counts;
        for (int num : arr) {
            counts[num]++;
        }
 
        vector<int> frequencies;
        for (auto const& [key, val] : counts) {
            frequencies.push_back(val);
        }
        sort(frequencies.begin(), frequencies.end(), greater<int>());
 
        int removedCount = 0;
        int uniqueElementsCount = 0;
        for (int freq : frequencies) {
            removedCount += freq;
            uniqueElementsCount++;
            if (removedCount >= arr.size() / 2) {
                break;
            }
        }
        return uniqueElementsCount;
    }
};

The other languages (Go, TypeScript etc.) would follow a very similar structure, adapting the data structures and syntax appropriately. The core algorithmic idea remains consistent across all implementations.