
Transpose File

Given a text file file.txt, transpose its content.

You may assume that each row has the same number of columns, and each field is separated by the ' ' character.

Example:

If file.txt has the following content:

name age
alice 21
ryan 30

Output the following:

name alice ryan
age 21 30

Solution Explanation for Transpose File

The problem requires transposing a text file in which every line has the same number of space-separated fields. The solution uses awk, which splits each line into fields automatically, so the script only has to collect those fields column by column.

Approach

The awk script iterates through the input file line by line (NR represents the current line number). For each line, it iterates through the fields (NF represents the number of fields in the current line).

  • Initialization: On the first line (NR == 1), it sets each res[i] to the i-th field, $i. awk creates the res array and its elements on first assignment, so no explicit declaration is needed.

  • Concatenation: For subsequent lines (NR > 1), it appends the current field's value ($i) to the corresponding element in the res array, preceded by a space to maintain separation.

  • Output: After processing all lines (END block), it iterates through the res array and prints each element to the standard output, effectively creating the transposed output.
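The three steps above can be traced on the sample input. The sketch below is illustrative only: the function name trace_transpose and the debug printf are additions for this walkthrough, not part of the original solution, and the if/else is condensed into a ternary.

```shell
# Build the sample input from the problem statement.
printf 'name age\nalice 21\nryan 30\n' > file.txt

# trace_transpose is a hypothetical helper: it runs the same accumulation
# logic and prints the contents of res[] after each input line.
trace_transpose() {
  awk '
  {
    for (i = 1; i <= NF; i++) {
      res[i] = (NR == 1) ? $i : res[i] " " $i
    }
    printf "after line %d:", NR
    for (i = 1; i <= NF; i++) printf " res[%d]=\"%s\"", i, res[i]
    print ""
  }
  ' "$1"
}

trace_transpose file.txt
```

On the sample file this prints:

after line 1: res[1]="name" res[2]="age"
after line 2: res[1]="name alice" res[2]="age 21"
after line 3: res[1]="name alice ryan" res[2]="age 21 30"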

Code Explanation (Awk)

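awk '
{
  # Runs once per input line; NF is the number of fields on that line.
  for (i=1; i<=NF; i++) {
    if(NR == 1) {
      # First line: start each output row with its first value.
      res[i] = $i
    } else {
      # Later lines: append the field to the row for column i.
      res[i] = res[i]" "$i
    }
  }
}
END {
  # NF still holds the field count of the last line read.
  for (i=1;i<=NF;i++) {
    print res[i]
  }
}
' file.txt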
  • awk '...' file.txt: This invokes the awk interpreter with the provided script and specifies file.txt as the input file.

  • { ... }: This is an awk block that executes for each line of the input file.

  • for (i=1; i<=NF; i++): This loop iterates through each field in the current line. NF is a built-in variable in awk that holds the number of fields in the current record (line).

  • if(NR == 1) { res[i] = $i } else { res[i] = res[i]" "$i }: This is the core logic. If it's the first line, it initializes the res[i] array element with the value of the i-th field. Otherwise, it appends the current field's value to res[i] with a space as a separator.

  • END { ... }: This awk block is executed after processing all lines of the input file.

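  • for (i=1;i<=NF;i++) { print res[i] }: This loop prints each element of res on its own line, so each input column becomes one output row. Inside the END block, NF keeps the value it had for the last line read (POSIX specifies this), which equals the column count because every row has the same number of fields.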
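As a usage example, here is a minimal sketch of a variant that records the column count itself rather than reading NF inside END. The function name transpose and the variable ncols are assumptions made for illustration; the accumulation logic is otherwise the same as in the script above.

```shell
# Build the sample input from the problem statement.
printf 'name age\nalice 21\nryan 30\n' > file.txt

# transpose is a hypothetical wrapper around the awk script.
transpose() {
  awk '
  {
    # Remember the widest row seen, so END does not depend on NF.
    if (NF > ncols) ncols = NF
    for (i = 1; i <= NF; i++) {
      res[i] = (NR == 1) ? $i : res[i] " " $i
    }
  }
  END {
    for (i = 1; i <= ncols; i++) print res[i]
  }
  ' "$1"
}

transpose file.txt
```

For the sample file the output is "name alice ryan" followed by "age 21 30". Tracking ncols explicitly is a design choice that keeps the END loop correct on awk implementations that do not preserve NF after the last record.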

Time and Space Complexity

  • Time Complexity: O(M*N), where M is the number of lines and N is the number of fields per line. The script iterates through each cell of the input matrix (file).

  • Space Complexity: O(M*N), where M is the number of lines and N is the number of fields per line. The res array has only N elements, but each element accumulates one value from every line, so by the END block the array holds the entire file contents. Memory use is therefore proportional to the size of the input, not just to the number of columns.
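A rough way to check how memory use scales is to total the characters held in res once all lines have been read. The helper name stored_chars is hypothetical, and the character count is only a proxy for awk's real memory use.

```shell
# Build the sample input from the problem statement.
printf 'name age\nalice 21\nryan 30\n' > file.txt

# stored_chars is a hypothetical helper: it sums the lengths of all
# strings accumulated in res[] by the time END runs.
stored_chars() {
  awk '
  {
    for (i = 1; i <= NF; i++) {
      res[i] = (NR == 1) ? $i : res[i] " " $i
    }
  }
  END {
    total = 0
    for (i = 1; i <= NF; i++) total += length(res[i])
    print total
  }
  ' "$1"
}

stored_chars file.txt   # characters held in res for the sample: 24
wc -c < file.txt        # size of the input in bytes, for comparison
```

For the sample, res holds 24 characters against a 26-byte input: everything except the newlines, which are replaced by one fewer separator per column. Adding more lines to the file grows the total accordingly.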

The awk solution is compact because field splitting, associative arrays, and the END block are all built in. The trade-off is that no output can be written until the last line has been read, so the script buffers the transposed rows in memory.