You are given two strings of the same length s1
and s2
and a string baseStr
.
We say s1[i]
and s2[i]
are equivalent characters.
s1 = "abc"
and s2 = "cde"
, then we have 'a' == 'c'
, 'b' == 'd'
, and 'c' == 'e'
.Equivalent characters follow the usual rules of any equivalence relation:
'a' == 'a'
.'a' == 'b'
implies 'b' == 'a'
.'a' == 'b'
and 'b' == 'c'
implies 'a' == 'c'
.For example, given the equivalency information from s1 = "abc"
and s2 = "cde"
, "acd"
and "aab"
are equivalent strings of baseStr = "eed"
, and "aab"
is the lexicographically smallest equivalent string of baseStr
.
Return the lexicographically smallest equivalent string of baseStr
by using the equivalency information from s1
and s2
.
Example 1:
Input: s1 = "parker", s2 = "morris", baseStr = "parser" Output: "makkek" Explanation: Based on the equivalency information in s1 and s2, we can group their characters as [m,p], [a,o], [k,r,s], [e,i]. The characters in each group are equivalent and sorted in lexicographical order. So the answer is "makkek".
Example 2:
Input: s1 = "hello", s2 = "world", baseStr = "hold" Output: "hdld" Explanation: Based on the equivalency information in s1 and s2, we can group their characters as [h,w], [d,e,o], [l,r]. So only the second letter 'o' in baseStr is changed to 'd', the answer is "hdld".
Example 3:
Input: s1 = "leetcode", s2 = "programs", baseStr = "sourcecode" Output: "aauaaaaada" Explanation: We group the equivalent characters in s1 and s2 as [a,o,e,r,s,c], [l,p], [g,t] and [d,m], thus all letters in baseStr except 'u' and 'd' are transformed to 'a', the answer is "aauaaaaada".
Constraints:
1 <= s1.length, s2.length, baseStr <= 1000
s1.length == s2.length
s1
, s2
, and baseStr
consist of lowercase English letters.This problem asks to find the lexicographically smallest equivalent string based on equivalence relations defined by two strings s1
and s2
. The solution leverages the Union-Find (Disjoint-Set) data structure to efficiently manage equivalence classes.
Approach:
Union-Find Initialization: A parent
array p
of size 26 is created to represent the equivalence classes for lowercase English letters (a-z). Initially, each letter is its own parent (representing disjoint sets).
Building Equivalence Classes: The algorithm iterates through s1
and s2
. For each pair of characters s1[i]
and s2[i]
, their corresponding indices (0-25) are used to find their respective root parents using the find()
function (path compression optimization is used for efficiency). The roots of the sets are then merged – the parent of the root with the larger index is set to the parent of the root with the smaller index. This ensures that all equivalent characters end up in the same set.
Finding the Smallest Equivalent String: The algorithm iterates through baseStr
. For each character, it finds its root parent using find()
. The character corresponding to this root parent (the lexicographically smallest character in its equivalence class) is appended to the result string.
Return Result: The function returns the constructed string which contains the lexicographically smallest equivalent string of baseStr
.
Time Complexity Analysis:
parent
array.s1
and s2
, and α(N) is the inverse Ackermann function. In practice, α(N) is very small and can be considered a constant. The overall time complexity becomes effectively O(N). Path compression within the find()
function ensures near-constant-time performance.baseStr
. The loop iterates through each character once, and find()
takes effectively constant time (due to path compression).Space Complexity Analysis:
parent
array has a fixed size of 26.baseStr
.Code Explanation (Python):
The Python code directly implements the above steps. The find()
function performs path compression for efficiency in Union-Find. The main function iterates through the baseStr
and efficiently finds the smallest equivalent character using the pre-computed parent
array.
Code Explanation (Java, C++, Go):
The Java, C++, and Go implementations follow the same logic as the Python code, but use the respective language's syntax and data structures. The core Union-Find logic remains identical, ensuring the same time and space complexity.