Grep matching a specific position in strings using words from another file

Question


I have 2 files

file1:

12342015010198765hello
12342015010188765hello
12342015010178765hello

Each line contains fields in fixed positions; for example, positions 13-17 hold the account_id.

file2:

98765
88765

which contains a list of account_ids.

In Korn shell, I want to print the lines from file1 whose positions 13-17 match one of the account_ids in file2.

I cannot simply do

grep -f file2 file1

because an account_id in file2 may match other fields at different positions.

I tried using a pattern like this in file2:

^.{12}98765.*

but it doesn't work.


unix shell grep

asinkxcoswt 10 jul. At 4:33 am


2 answers

Using sed with extended regex:

sed -r 's@.*@/^.{12}&/p@' file2 |sed -nr -f- file1

Using a basic regex:

sed 's@.*@/^.\\{12\\}&/p@' file2 | sed -n -f- file1

Explanation:

sed -r 's@.*@/^.{12}&/p@' file2

will generate output:

/^.{12}98765/p
/^.{12}88765/p

which is then used as the sed script for the second sed after the pipe, which outputs:

12342015010198765hello
12342015010188765hello


Jahid 10 jul. 15 at 5:28 am


John1024 · Accepted Answer · 2015-07-10T04:41:46+0000

Using awk

$ awk 'NR==FNR{a[$1]=1;next;} substr($0,13,5) in a' file2 file1
12342015010198765hello
12342015010188765hello

How it works

NR==FNR{a[$1]=1;next;}

FNR is the number of lines read so far from the current file, and NR is the total number of lines read so far. Thus, if FNR==NR, we are reading the first file, which is file2.

Each identifier in file2 is stored as a key in the array a. Then we skip the rest of the commands and move on to the next line.

substr($0,13,5) in a

If we reach this command, we are working on the second file, file1.

This condition is true if the 5-character substring starting at position 13 is in the array a. If the condition is true, awk performs the default action, which is to print the line.
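If the field position might change, the same idea can be written with the offset and width passed in as awk variables. This is only a sketch: the names pos, width and ids are made up for illustration, and the temporary directory just recreates the sample files from the question.

```shell
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 12342015010198765hello 12342015010188765hello 12342015010178765hello > "$dir/file1"
printf '%s\n' 98765 88765 > "$dir/file2"

# pos and width are hypothetical names for the field's start and length.
result=$(awk -v pos=13 -v width=5 '
    NR == FNR { ids[$1]; next }      # first file (file2): remember each account_id as a key
    substr($0, pos, width) in ids    # second file (file1): print lines whose field is a known id
' "$dir/file2" "$dir/file1")

printf '%s\n' "$result"
rm -r "$dir"
```

With the sample data this should print the two lines whose account_id is 98765 or 88765. Note that the NR==FNR test assumes file2 is not empty; with an empty file2 the first block would run on file1 instead.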

Using grep

You mentioned trying

grep '^.{12}98765.*' file2

This uses extended regex syntax, which means that -E is required. Also, there is no value in matching .* at the end: it will always match. Thus, try:

$ grep -E '^.{12}98765' file1
12342015010198765hello

To get both lines:

$ grep -E '^.{12}[89]8765' file1
12342015010198765hello
12342015010188765hello

This works because [89]8765 happens to match exactly the IDs of interest in file2. Of course, the awk solution provides more flexibility in which IDs to match.
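To avoid hand-writing a character class, the anchored patterns could also be generated from file2 and handed to grep with -E -f. The following is a rough sketch, assuming the IDs contain no regex metacharacters; the patterns file name and the temporary directory are made up for illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 12342015010198765hello 12342015010188765hello 12342015010178765hello > "$dir/file1"
printf '%s\n' 98765 88765 > "$dir/file2"

# Prefix every id with ^.{12} so it can only match starting at position 13.
# "patterns" is a hypothetical file name.
sed 's/^/^.{12}/' "$dir/file2" > "$dir/patterns"

# -E enables the {12} interval syntax; -f reads one pattern per line.
matches=$(grep -E -f "$dir/patterns" "$dir/file1")

printf '%s\n' "$matches"
rm -r "$dir"
```

This is essentially the pattern-file approach from the question with -E added, so it keeps grep as the matching tool while still pinning each ID to the fixed position.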
