{x}
blog image

Lexicographically Smallest Equivalent String

You are given two strings of the same length s1 and s2 and a string baseStr.

We say s1[i] and s2[i] are equivalent characters.

  • For example, if s1 = "abc" and s2 = "cde", then we have 'a' == 'c', 'b' == 'd', and 'c' == 'e'.

Equivalent characters follow the usual rules of any equivalence relation:

  • Reflexivity: 'a' == 'a'.
  • Symmetry: 'a' == 'b' implies 'b' == 'a'.
  • Transitivity: 'a' == 'b' and 'b' == 'c' implies 'a' == 'c'.

For example, given the equivalency information from s1 = "abc" and s2 = "cde", "acd" and "aab" are equivalent strings of baseStr = "eed", and "aab" is the lexicographically smallest equivalent string of baseStr.

Return the lexicographically smallest equivalent string of baseStr by using the equivalency information from s1 and s2.

 

Example 1:

Input: s1 = "parker", s2 = "morris", baseStr = "parser"
Output: "makkek"
Explanation: Based on the equivalency information in s1 and s2, we can group their characters as [m,p], [a,o], [k,r,s], [e,i].
The characters in each group are equivalent and sorted in lexicographical order.
So the answer is "makkek".

Example 2:

Input: s1 = "hello", s2 = "world", baseStr = "hold"
Output: "hdld"
Explanation: Based on the equivalency information in s1 and s2, we can group their characters as [h,w], [d,e,o], [l,r].
So only the second letter 'o' in baseStr is changed to 'd', the answer is "hdld".

Example 3:

Input: s1 = "leetcode", s2 = "programs", baseStr = "sourcecode"
Output: "aauaaaaada"
Explanation: We group the equivalent characters in s1 and s2 as [a,o,e,r,s,c], [l,p], [g,t] and [d,m], thus all letters in baseStr except 'u' and 'd' are transformed to 'a', the answer is "aauaaaaada".

 

Constraints:

  • 1 <= s1.length, s2.length, baseStr <= 1000
  • s1.length == s2.length
  • s1, s2, and baseStr consist of lowercase English letters.

Solution Explanation

This problem asks to find the lexicographically smallest equivalent string based on equivalence relations defined by two strings s1 and s2. The solution leverages the Union-Find (Disjoint-Set) data structure to efficiently manage equivalence classes.

Approach:

  1. Union-Find Initialization: A parent array p of size 26 is created to represent the equivalence classes for lowercase English letters (a-z). Initially, each letter is its own parent (representing disjoint sets).

  2. Building Equivalence Classes: The algorithm iterates through s1 and s2. For each pair of characters s1[i] and s2[i], their corresponding indices (0-25) are used to find their respective root parents using the find() function (path compression optimization is used for efficiency). The roots of the sets are then merged – the parent of the root with the larger index is set to the parent of the root with the smaller index. This ensures that all equivalent characters end up in the same set.

  3. Finding the Smallest Equivalent String: The algorithm iterates through baseStr. For each character, it finds its root parent using find(). The character corresponding to this root parent (the lexicographically smallest character in its equivalence class) is appended to the result string.

  4. Return Result: The function returns the constructed string which contains the lexicographically smallest equivalent string of baseStr.

Time Complexity Analysis:

  • Union-Find Initialization: O(1) - A constant-time operation creating the parent array.
  • Building Equivalence Classes: O(N*α(N)), where N is the length of s1 and s2, and α(N) is the inverse Ackermann function. In practice, α(N) is very small and can be considered a constant. The overall time complexity becomes effectively O(N). Path compression within the find() function ensures near-constant-time performance.
  • Finding Smallest Equivalent String: O(M), where M is the length of baseStr. The loop iterates through each character once, and find() takes effectively constant time (due to path compression).
  • Total Time Complexity: O(N + M). The dominant factor is the lengths of the input strings.

Space Complexity Analysis:

  • Union-Find Array: O(1) - The parent array has a fixed size of 26.
  • Result String: O(M) - The space used to store the resulting string is proportional to the length of baseStr.
  • Total Space Complexity: O(M)

Code Explanation (Python):

The Python code directly implements the above steps. The find() function performs path compression for efficiency in Union-Find. The main function iterates through the baseStr and efficiently finds the smallest equivalent character using the pre-computed parent array.

Code Explanation (Java, C++, Go):

The Java, C++, and Go implementations follow the same logic as the Python code, but use the respective language's syntax and data structures. The core Union-Find logic remains identical, ensuring the same time and space complexity.