{x}
blog image

Reorder Data in Log Files

You are given an array of logs. Each log is a space-delimited string of words, where the first word is the identifier.

There are two types of logs:

  • Letter-logs: All words (except the identifier) consist of lowercase English letters.
  • Digit-logs: All words (except the identifier) consist of digits.

Reorder these logs so that:

  1. The letter-logs come before all digit-logs.
  2. The letter-logs are sorted lexicographically by their contents. If their contents are the same, then sort them lexicographically by their identifiers.
  3. The digit-logs maintain their relative ordering.

Return the final order of the logs.

 

Example 1:

Input: logs = ["dig1 8 1 5 1","let1 art can","dig2 3 6","let2 own kit dig","let3 art zero"]
Output: ["let1 art can","let3 art zero","let2 own kit dig","dig1 8 1 5 1","dig2 3 6"]
Explanation:
The letter-log contents are all different, so their ordering is "art can", "art zero", "own kit dig".
The digit-logs have a relative order of "dig1 8 1 5 1", "dig2 3 6".

Example 2:

Input: logs = ["a1 9 2 3 1","g1 act car","zo4 4 7","ab1 off key dog","a8 act zoo"]
Output: ["g1 act car","a8 act zoo","ab1 off key dog","a1 9 2 3 1","zo4 4 7"]

 

Constraints:

  • 1 <= logs.length <= 100
  • 3 <= logs[i].length <= 100
  • All the tokens of logs[i] are separated by a single space.
  • logs[i] is guaranteed to have an identifier and at least one word after the identifier.

Solution Explanation

This problem requires sorting log files based on specific rules. The solution involves categorizing logs into letter-logs and digit-logs and then sorting them accordingly.

Approach:

The core idea is to use a custom comparison function (or lambda in Python) within a sorting algorithm. This function will determine the order of two log entries based on whether they are letter-logs or digit-logs and their content.

  1. Categorization: We identify whether a log is a letter-log or a digit-log based on the second word (after the identifier). If the second word's first character is an alphabet, it's a letter-log; otherwise, it's a digit-log.

  2. Sorting:

    • Letter-logs: These are sorted lexicographically first by their content (all words after the identifier), and if the contents are the same, then by their identifier.
    • Digit-logs: These maintain their original order from the input array.
  3. Combining: Finally, the sorted letter-logs and the original order of digit-logs are combined, ensuring letter-logs come before digit-logs.

Time Complexity Analysis:

The dominant operation is the sorting of the logs array. The provided solutions use built-in sorting functions (like sorted() in Python, Arrays.sort() in Java, or logs.sort() in Rust and TypeScript), which typically have a time complexity of O(N log N), where N is the number of logs. The comparison function within the sort has a time complexity of O(M) in the worst case for letter logs (where M is the maximum length of a log's content), but this is dwarfed by the O(N log N) of the sort itself. Therefore, the overall time complexity is O(N log N).

Space Complexity Analysis:

The space complexity depends on the implementation of the sorting algorithm. In-place sorting algorithms would have a space complexity of O(1) (constant space), while merge sort (used by some implementations) might require O(N) extra space. The extra space used by our comparison function is constant, O(1), as it just stores some pointers or temporary string variables. Thus, the overall space complexity is either O(1) or O(N), depending on the sorting algorithm used by the specific language's implementation.

Code Explanation (Python):

The Python solution uses the sorted() function with a custom key function (cmp). The cmp function returns a tuple. The first element of the tuple is used for primary sorting: 0 for letter logs, 1 for digit logs. Then, for letter-logs, it uses the content and identifier for secondary and tertiary sorting respectively. This cleverly leverages Python's tuple comparison rules to achieve the required sorting order in a concise way.

Code Explanation (Java):

The Java solution employs Arrays.sort() with a custom comparator. The cmp method compares two log strings. The logic is similar to Python's: it determines the log type, and sorts letter-logs lexicographically first by content, then by identifier.

Code Explanation (TypeScript):

The TypeScript solution uses logs.sort() with an inline comparison function. It checks for digit logs and sorts letter logs using content1 and content2.

Code Explanation (Rust):

The Rust solution utilizes logs.sort_by() with a closure as the comparator. Similar logic to the other languages is used to categorize logs and order them according to the problem's requirements. The then() method chains comparisons, ensuring that identifier sorting happens only when content is equal.

In summary, all solutions efficiently address the problem using a custom comparison function within a standard sorting algorithm, resulting in a time complexity of O(N log N) and a space complexity of O(1) or O(N) depending on the underlying sorting algorithm.