Find words starting with "t", followed by a vowel, with a total length of 4

I have a file that contains over 300 words. I need to find the words that start with "t", continue with a vowel, and have a total length of 4. I also need to convert the file to a format where each line has one word.

tr -s "[[:blank:]]" "\n" < file | grep .

      

With this I can format the file, but I cannot figure out how to select the words that meet the above requirement. I am stuck :/

For example, I have a file that contains "tera train chair tola mourn". I need to format this file like this:

tera  
train  
chair  
tola  
mourn

      

and find the ones that start with "t" and continue with a vowel with a total length of 4. So it should be like this:

tera 
tola

      



2 answers


You can use grep for this. If you only want to match the first word of each line:

grep -Eow '^t[aeiou]\S{2}' file > formatted_file

      

If the word must make up the entire line:

grep -Eow '^t[aeiou]\S{2}$' file > formatted_file

      



  • ^ anchors the match at the beginning of a line.
  • t matches exactly the letter "t".
  • [aeiou] matches any one of the characters between [ and ].
  • \S{2} matches exactly two non-whitespace characters (\S is a GNU grep extension).
  • $ matches the end of the line.
  • -E enables extended regular expressions, which the {2} syntax requires.
  • -w makes grep match whole words only, which effectively limits the match to exactly the number of characters specified in the pattern.
  • -o prints only the matched text (in this case your 4-letter word) instead of the whole line.

EDIT

You can also use the -i option if you want grep to ignore case (match both upper and lower case).
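To tie the answer back to the question, here is a minimal sketch that chains the asker's tr step with a whole-line grep. The function name four_letter_t_words is hypothetical, and the pattern assumes the remaining two characters should be letters ([[:alpha:]] instead of \S), which the question does not state explicitly.

```shell
# Hypothetical helper: reads text on stdin, prints matching words one per line.
four_letter_t_words() {
    # Step 1: turn every run of blanks into a single newline (one word per line).
    # Step 2: keep lines that are exactly "t", a vowel, then two letters.
    tr -s '[:blank:]' '\n' | grep -E '^t[aeiou][[:alpha:]]{2}$'
}

# Sample input taken from the question.
printf '%s\n' 'tera train chair tola mourn' | four_letter_t_words
```

Because the words are already split one per line, the pattern can be anchored with ^ and $ and needs neither -o nor -w. Add -i to the grep call if upper-case words should match as well.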



The following Perl one-liner prints all matching words on a single line:

perl -nle 'push @A,$_ for /\bt[aeiou][a-z]{2}\b/gi;END{print"@A"}' <file

      

It is not clear whether a single line of input can contain several words, or whether all words of the output must be on one line. To print each match on its own line instead:



perl -nle 'print for /\bt[aeiou][a-z]{2}\b/gi' <file
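For comparison, the same one-match-per-line behavior can be sketched with grep alone, without splitting the input first. The function name t_words_per_line is hypothetical, and the -o and -w options are assumed to be available (they are in GNU and BSD grep, but are not required by POSIX).

```shell
# Hypothetical helper: reads text on stdin, where a line may hold several words.
t_words_per_line() {
    # -E extended regex, -i ignore case, -o print only the match,
    # -w require the match to be a whole word.
    grep -Eiow 't[aeiou][[:alpha:]]{2}'
}

# The second line includes "to a", which must not be reported as a word.
printf '%s\n' 'Tera train' 'chair tola mourn to a' | t_words_per_line
```

Using [[:alpha:]] for the last two characters keeps the match from running across a space, so a pair of short words such as "to a" is not picked up as one 4-character match.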

      

If the input already has one word per line, grep alone is enough:

grep -i '^t[eaiou][a-z][a-z]$' <file

      
