{x}
blog image

H-Index

Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper, return the researcher's h-index.

According to the definition of h-index on Wikipedia: The h-index is defined as the maximum value of h such that the given researcher has published at least h papers that have each been cited at least h times.

 

Example 1:

Input: citations = [3,0,6,1,5]
Output: 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively.
Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Example 2:

Input: citations = [1,3,1]
Output: 1

 

Constraints:

  • n == citations.length
  • 1 <= n <= 5000
  • 0 <= citations[i] <= 1000

274. H-Index

This problem asks to find the h-index of a researcher given an array of citations for their papers. The h-index is the maximum value h such that the researcher has at least h papers with at least h citations each.

We'll explore three solutions with varying time and space complexities.

Solution 1: Sorting

This approach sorts the citations in descending order. Then, it iterates through the sorted array. For each index h, it checks if the citation at that index (citations[h-1]) is greater than or equal to h. If it is, then h is the h-index because at least h papers have at least h citations.

Time Complexity: O(n log n) due to sorting. Space Complexity: O(log n) for sorting (in-place sorting algorithms like merge sort or quicksort have logarithmic space complexity).

Code (Python):

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        citations.sort(reverse=True)
        for h in range(len(citations), 0, -1):
            if citations[h - 1] >= h:
                return h
        return 0

Code (Java):

import java.util.Arrays;
 
class Solution {
    public int hIndex(int[] citations) {
        Arrays.sort(citations);
        int n = citations.length;
        for (int h = n; h > 0; --h) {
            if (citations[n - h] >= h) {
                return h;
            }
        }
        return 0;
    }
}

Solution 2: Counting Sort

This method uses a counting array to count the frequency of citations. It then iterates through the counting array from the end, accumulating the count of papers with at least h citations. If the accumulated count is greater than or equal to h, then h is the h-index.

Time Complexity: O(n) - Single pass through citations and a single pass through the counting array. Space Complexity: O(n) to store the counting array.

Code (Python):

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        n = len(citations)
        cnt = [0] * (n + 1)
        for x in citations:
            cnt[min(x, n)] += 1
        s = 0
        for h in range(n, -1, -1):
            s += cnt[h]
            if s >= h:
                return h

Code (Java):

class Solution {
    public int hIndex(int[] citations) {
        int n = citations.length;
        int[] cnt = new int[n + 1];
        for (int x : citations) {
            cnt[Math.min(x, n)]++;
        }
        int s = 0;
        for (int h = n; h >= 0; h--) {
            s += cnt[h];
            if (s >= h) return h;
        }
        return 0;
    }
}

This approach uses binary search to efficiently find the h-index. It leverages the fact that if there's an h-index of h, then all values less than h are also valid h-indices (though not the maximum h-index).

Time Complexity: O(n log n) because of the repeated counting in the binary search loop. Each counting step takes O(n) time. Space Complexity: O(1) - Constant extra space.

Code (Python):

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        l, r = 0, len(citations)
        while l < r:
            mid = (l + r + 1) // 2
            if sum(1 for x in citations if x >= mid) >= mid:
                l = mid
            else:
                r = mid - 1
        return l

Code (Java):

class Solution {
    public int hIndex(int[] citations) {
        int l = 0, r = citations.length;
        while (l < r) {
            int mid = (l + r + 1) / 2;
            int count = 0;
            for (int citation : citations) {
                if (citation >= mid) {
                    count++;
                }
            }
            if (count >= mid) {
                l = mid;
            } else {
                r = mid - 1;
            }
        }
        return l;
    }
}

In summary, the counting sort approach (Solution 2) offers the best time complexity (O(n)) if space is not a major constraint. If space is a concern or the input size is very large, the sorting approach (Solution 1) might be preferable due to its better space efficiency. Binary Search (Solution 3) provides a compromise but doesn't offer significant performance advantages over the other methods in practice.