What defines a "column" in bash? In awk?

I was looking through this question: Bash - Take the nth column in a text file

I want to create a function that writes to a text file that I can parse using the method above. So, for example, I want my function to write "dates" in the first column, "ID" in the second column, and "addresses" in the third column. Then, once I have that, the user can, for example, see if a specific id is present in the file by querying the second column and then looking at each item. The user can do this using the method discussed in the question above.

What defines a column? Is it just a space separator? Or is it a tab?

If I want to output this information as above, what should the code that writes to the file look like? So far I have:

cat "$DATE $ID $ADDRESS \n" > myfile.data

      



3 answers


In bash, unlike awk, columns are separated by the characters in IFS.

That is, if you set:

IFS=$'\t'

      

... then the columns, as understood by bash builtins such as read first second rest, will be tab-separated. On the output side, printf '%s\n' "${array[*]}" will print the elements of the array named array, separated by the first character of IFS.

The default value of IFS is equivalent to $' \t\n', that is, space, tab, and newline.
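As a minimal sketch of both directions (the sample record and the variable names line, fields, and joined are made up for illustration), this splits one tab-separated line with read and then joins the pieces again with "${fields[*]}". It assumes bash, since it relies on arrays and here-strings:

```shell
#!/usr/bin/env bash
# Hypothetical sample record: date, id, address separated by tabs.
line=$'2024-01-15\tid42\t12 Main Street'

# IFS is changed only for this one read; spaces inside the address are kept.
IFS=$'\t' read -r date id address <<< "$line"

# "${fields[*]}" joins the elements with the first character of IFS.
# The subshell keeps the IFS change from leaking into the rest of the script.
fields=( "$date" "$id" "$address" )
joined=$(IFS=$'\t'; printf '%s' "${fields[*]}")

printf 'id=%s address=%s\n' "$id" "$address"
```

Setting IFS as a prefix to read, or inside a subshell, avoids having to restore the default value afterwards.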




To write a file with a delimiter of your choice and (presumably) more than one line (replace the while read with however you actually obtain your data, or use only the inner part of the loop if you're writing just one line):

while read -r date id address; do
  printf '%s\t' "$date" "$id" "$address" >&3; printf '\n' >&3
done 3>filename

      

... or, if you don't want the trailing tab that the version above leaves at the end of each line:

IFS=$'\t' # use a tab as the field separator for output
while IFS=$' \t\n' read -r date id address; do
  entry=( "$date" "$id" "$address" )
  printf '%s\n' "${entry[*]}" >&3
done 3>filename

      

Putting 3>filename outside the loop is significantly more efficient than putting >>filename on each line, which reopens the output file once per line.
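To tie this back to the question, here is one way the writing function and the id lookup could fit together. This is only a sketch: write_record and has_id are hypothetical helper names, the records are invented, and the file is a temporary one created with mktemp. File descriptor 3 is opened once with exec, written to several times, then closed.

```shell
#!/usr/bin/env bash
datafile=$(mktemp)

# Open the output file once on fd 3 (append mode) instead of once per record.
exec 3>>"$datafile"

# Hypothetical helper: write one tab-separated record (date, id, address) to fd 3.
write_record() {
  printf '%s\t%s\t%s\n' "$1" "$2" "$3" >&3
}

# Hypothetical helper: exit status 0 if column 2 of some line equals the given id.
# Appending "" to both sides forces a string comparison rather than a numeric one.
has_id() {
  awk -F'\t' -v wanted="$2" '$2 "" == wanted "" { found = 1 } END { exit !found }' "$1"
}

write_record 2024-01-15 id42 '12 Main Street'
write_record 2024-01-16 id7 '9 Elm Road'

exec 3>&-   # close fd 3 when done writing

if has_id "$datafile" id42; then
  echo 'id42 is present'
fi
```

Using a tab as the delimiter lets the address contain spaces without being split into extra columns.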



If you are going to use awk, the columns are separated by a field separator. For details, see FS in man awk.

Most tools support some way to change the column separator:

cut -d
sort -t
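For instance, with a tab-separated file, both tools can be pointed at the tab explicitly. This is a small sketch with invented sample data; the $'\t' quoting assumes bash:

```shell
#!/usr/bin/env bash
datafile=$(mktemp)
# printf reuses the format for each group of three arguments, so this writes two lines.
printf '%s\t%s\t%s\n' 2024-01-16 id7 '9 Elm Road' 2024-01-15 id42 '12 Main Street' > "$datafile"

# cut: -d sets the delimiter, -f selects the field (tab is already cut's default).
ids=$(cut -d $'\t' -f 2 "$datafile")

# sort: -t sets the separator, -k 1,1 sorts on the first field only.
sorted=$(sort -t $'\t' -k 1,1 "$datafile")

printf '%s\n' "$ids"
```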

      



bash itself uses the IFS variable (Internal Field Separator) for word splitting.

cat expects a file as an argument. To display a string, use echo.



If we're talking about awk, then whitespace is the default column delimiter: fields are split on runs of spaces and tabs.

It is easy to change what is used as the "field separator" (FS) when awk parses the file: awk -F',' '{print $2}' will use a comma as the separator (note: it does not handle quoting and similar things the way a CSV parser would).
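A short sketch of setting the separator before awk reads any input, either with -F or in a BEGIN block (the sample data is invented). Assigning FS inside the main action is generally too late for the line being processed, so the first line may still be split on whitespace:

```shell
#!/usr/bin/env bash
# Hypothetical comma-separated sample data, two records.
csv=$(printf '%s\n' '2024-01-15,id42,12 Main Street' '2024-01-16,id7,9 Elm Road')

# Separator given on the command line: every line is split on commas.
with_flag=$(printf '%s\n' "$csv" | awk -F',' '{ print $2 }')

# Same effect with a BEGIN block, which runs before the first line is read.
with_begin=$(printf '%s\n' "$csv" | awk 'BEGIN { FS = "," } { print $2 }')

printf '%s\n' "$with_flag"
```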

To write to a file, I would use echo together with the >> redirection operator.

>> appends, while > overwrites the file. echo -e makes echo interpret \n and similar escape sequences.

So the command will be:

echo -e "$DATE $ID $ADDRESS \n" >> myfile.data
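One thing to watch: echo already ends its output with a newline, so the explicit \n produces an extra blank line after every record. A possible alternative, sketched here with made-up values for DATE, ID, and ADDRESS and a temporary file, is printf with tab separators:

```shell
#!/usr/bin/env bash
# Made-up values for the variables used in the question.
DATE=2024-01-15
ID=id42
ADDRESS='12 Main Street'
datafile=$(mktemp)

# The echo form emits two lines per record: the data and an empty line.
echo_lines=$(echo -e "$DATE $ID $ADDRESS \n" | awk 'END { print NR }')

# printf writes exactly one tab-separated line and leaves no trailing space.
printf '%s\t%s\t%s\n' "$DATE" "$ID" "$ADDRESS" >> "$datafile"
record=$(cat "$datafile")

printf 'echo wrote %s lines\n' "$echo_lines"
```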

      
