Given the root
of a binary tree, return all duplicate subtrees.
For each kind of duplicate subtrees, you only need to return the root node of any one of them.
Two trees are duplicate if they have the same structure with the same node values.
Example 1:
Input: root = [1,2,3,4,null,2,4,null,null,4] Output: [[2,4],[4]]
Example 2:
Input: root = [2,1,1] Output: [[1]]
Example 3:
Input: root = [2,2,2,3,null,3,null] Output: [[2,3],[3]]
Constraints:
[1, 5000]
-200 <= Node.val <= 200
This problem asks to find all duplicate subtrees within a given binary tree. Two trees are considered duplicates if they have the same structure and node values. The solution should return a list containing the root node of each unique duplicate subtree.
The most efficient approach uses Depth-First Search (DFS) and a hash table (or map) to identify duplicate subtrees.
Tree Serialization: We need a way to represent each subtree as a unique string. A common technique is to serialize the tree using a pre-order traversal, representing each node as node.val,left_subtree_serialization,right_subtree_serialization
. This string uniquely identifies the subtree's structure and values.
DFS Traversal: Perform a DFS traversal of the binary tree. For each node, recursively serialize its subtree using the method above.
Hash Table (Map): Use a hash table (map) to store the serialized strings as keys and their counts as values. For each serialized string encountered during DFS, increment its count in the hash table. If the count becomes 2, it indicates a duplicate subtree, and its root node is added to the result list.
Time Complexity: O(N), where N is the number of nodes in the tree. The DFS traversal visits each node once, and serialization/hash table operations take constant time per node.
Space Complexity: O(N) in the worst case, to store the serialized strings in the hash table. In the best case, if there are no duplicates, the space complexity is O(log N) (for balanced trees) or O(N) (for skewed trees) to maintain the recursion stack during DFS.
# Definition for a binary tree node.
# class TreeNode:
# def __init__(self, val=0, left=None, right=None):
# self.val = val
# self.left = left
# self.right = right
class Solution:
def findDuplicateSubtrees(self, root: TreeNode) -> List[TreeNode]:
count = collections.Counter() # Hash table to store serialized subtrees and counts
result = [] # list to store the root of the duplicate subtrees
def dfs(node):
if not node:
return "#" # Empty subtree represented by "#"
subtree_str = f"{node.val},{dfs(node.left)},{dfs(node.right)}" # subtree serialization using preorder traversal
count[subtree_str] += 1
if count[subtree_str] == 2: # If count becomes 2, it's a duplicate
result.append(node)
return subtree_str
dfs(root)
return result
The Python code efficiently implements the described approach. The dfs
function recursively serializes subtrees and updates the count
dictionary. The result
list stores the root nodes of duplicate subtrees as soon as they are identified.
The approach is similar in other languages; the primary difference lies in the syntax for hash tables/maps and string manipulation. The provided code examples in Java, C++, Go, TypeScript, and Rust showcase these adaptations. All of them adhere to the same algorithmic strategy.