Grep: find lines where some unknown character occurs only once

I have a list of hex strings. For example:

0b 5a 3f 5a 7d d0 5d e6 2b c4 7e 7d c2 c0 e6 9a 
84 bd aa 74 f3 85 da 9d ac b6 e0 b6 62 0f b5 d5
c0 b0 f5 60 02 8b 1c a4 41 7c 53 f2 85 20 a0 d1
...

      

I am trying to use grep to find all lines that contain a character occurring only once in that line.

For example: there is only one 'd' in the third line.

I tried this but it doesn't work:

egrep '^.*([a-f0-9])[^\1]*$'

      



3 answers


This can be done with a regular expression, but the pattern has to be verbose:
it needs one alternative per character and cannot be generalized.

 # ^(?:[^a]*a[^a]*|[^b]*b[^b]*|[^c]*c[^c]*|[^d]*d[^d]*|[^e]*e[^e]*|[^f]*f[^f]*|[^0]*0[^0]*|[^1]*1[^1]*|[^2]*2[^2]*|[^3]*3[^3]*|[^4]*4[^4]*|[^5]*5[^5]*|[^6]*6[^6]*|[^7]*7[^7]*|[^8]*8[^8]*|[^9]*9[^9]*)$

 ^ 
 (?:
      [^a]* a [^a]* 
   |  [^b]* b [^b]* 
   |  [^c]* c [^c]* 
   |  [^d]* d [^d]* 
   |  [^e]* e [^e]* 
   |  [^f]* f [^f]* 

   |  [^0]* 0 [^0]* 
   |  [^1]* 1 [^1]* 
   |  [^2]* 2 [^2]* 
   |  [^3]* 3 [^3]* 
   |  [^4]* 4 [^4]* 
   |  [^5]* 5 [^5]* 
   |  [^6]* 6 [^6]* 
   |  [^7]* 7 [^7]* 
   |  [^8]* 8 [^8]* 
   |  [^9]* 9 [^9]* 
 )
 $ 
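
For a quick check with grep itself, the same alternation can be generated in a loop rather than typed out. The sketch below is a translation to POSIX ERE for grep -E, so it swaps the non-capturing (?:...) group for a plain group; the names pattern and lines_with_unique_hex are made up for illustration and are not part of the answer above.

```shell
#!/bin/sh
# Build "[^0]*0[^0]*|[^1]*1[^1]*|..." for all sixteen hex digits.
pattern=''
for c in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do
    pattern="${pattern:+$pattern|}[^$c]*$c[^$c]*"
done
# Anchor the alternation; grep -E has no (?:...), so a plain group is used.
pattern="^($pattern)\$"

# Hypothetical helper: print the input lines in which at least one
# hex digit occurs exactly once.
lines_with_unique_hex() {
    grep -E "$pattern" "$@"
}

# Usage with the three sample lines from the question.
lines_with_unique_hex <<'EOF'
0b 5a 3f 5a 7d d0 5d e6 2b c4 7e 7d c2 c0 e6 9a
84 bd aa 74 f3 85 da 9d ac b6 e0 b6 62 0f b5 d5
c0 b0 f5 60 02 8b 1c a4 41 7c 53 f2 85 20 a0 d1
EOF
```

All three sample lines are printed, since each contains at least one hex digit that occurs exactly once (f, c and d respectively, among others).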

      

To detect which character is the unique one, put capture groups around the letters and digits
and use a branch reset group:



 ^ 
 (?|
      [^a]* (a) [^a]* 
   |  [^b]* (b) [^b]* 
   |  [^c]* (c) [^c]* 
   |  [^d]* (d) [^d]* 
   |  [^e]* (e) [^e]* 
   |  [^f]* (f) [^f]* 

   |  [^0]* (0) [^0]* 
   |  [^1]* (1) [^1]* 
   |  [^2]* (2) [^2]* 
   |  [^3]* (3) [^3]* 
   |  [^4]* (4) [^4]* 
   |  [^5]* (5) [^5]* 
   |  [^6]* (6) [^6]* 
   |  [^7]* (7) [^7]* 
   |  [^8]* (8) [^8]* 
   |  [^9]* (9) [^9]* 
 )
 $ 

      

This is the output:

 **  Grp 0 -  ( pos 0 , len 50 ) 
0b 5a 3f 5a 7d d0 5d e6 2b c4 7e 7d c2 c0 e6 9a 

 **  Grp 1 -  ( pos 7 , len 1 ) 
f  

-----------------------

 **  Grp 0 -  ( pos 50 , len 51 ) 

84 bd aa 74 f3 85 da 9d ac b6 e0 b6 62 0f b5 d5

 **  Grp 1 -  ( pos 77 , len 1 ) 
c  

-----------------------

 **  Grp 0 -  ( pos 101 , len 51 ) 

c0 b0 f5 60 02 8b 1c a4 41 7c 53 f2 85 20 a0 d1

 **  Grp 1 -  ( pos 148 , len 1 ) 
d  
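
The capture can be reproduced from a shell with Perl, which supports the (?| ... ) branch reset construct. This is a minimal sketch, assuming perl 5.10 or later is installed; the PATTERN variable and the unique_char helper are hypothetical names, and the output format (character, space, line) is chosen here for brevity rather than taken from the tool output above.

```shell
#!/bin/sh
# Build "[^a]*(a)[^a]*|[^b]*(b)[^b]*|..." in the same order as the pattern above.
alts=''
for c in a b c d e f 0 1 2 3 4 5 6 7 8 9; do
    alts="${alts:+$alts|}[^$c]*($c)[^$c]*"
done
# (?| ... ) is a branch reset group: every alternative's capture is group 1.
PATTERN="^(?|$alts)\$"
export PATTERN

# Hypothetical helper: prefix each matching line with the captured character.
unique_char() {
    perl -ne 'print "$1 $_" if /$ENV{PATTERN}/' "$@"
}

# Usage with the three sample lines from the question.
unique_char <<'EOF'
0b 5a 3f 5a 7d d0 5d e6 2b c4 7e 7d c2 c0 e6 9a
84 bd aa 74 f3 85 da 9d ac b6 e0 b6 62 0f b5 d5
c0 b0 f5 60 02 8b 1c a4 41 7c 53 f2 85 20 a0 d1
EOF
```

For the three sample lines the prefixes are f, c and d, agreeing with the Grp 1 values above. Because the alternatives are tried in order, only the first unique character in that order is reported for each line.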

      



I don't know how to do this with a regex. However, you can use this silly awk script:

awk -F '' '{delete a;for(i=1;i<=NF;i++){a[$i]++};for(i in a){if(a[i]==1){print;next}}}' input

      



The script counts the number of occurrences of each character in the line. At the end of the line, it checks all the totals and prints the line if at least one of them equals 1.
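
Two caveats about the one-liner: splitting on an empty field separator (-F '') is left unspecified by POSIX, so it depends on the awk in use, and spaces are tallied like any other character. The variant below is a sketch that sidesteps both by walking the line with substr and skipping spaces; it also reports which characters are unique. The function name unique_chars and the output format are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for each input line, print the characters that occur
# exactly once (in order of appearance), then ": " and the line itself.
unique_chars() {
    awk '{
        split("", count)                 # clear the tally for this line
        n = length($0)
        for (i = 1; i <= n; i++) {
            ch = substr($0, i, 1)
            if (ch != " ") count[ch]++   # ignore the separating spaces
        }
        found = ""
        for (i = 1; i <= n; i++) {       # second pass keeps input order
            ch = substr($0, i, 1)
            if (count[ch] == 1) found = found ch
        }
        if (found != "") print found ": " $0
    }' "$@"
}

# Usage: only the second line has characters that occur once.
printf '%s\n' 'aa bb aa bb' 'ab ba cd' | unique_chars
```

With this input the only output line is "cd: ab ba cd"; the first line is skipped because every character in it repeats.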



Here is a code snippet that uses several shell tools besides grep. It reads the input line by line and creates a frequency table for each line. When it finds an element with a frequency of 1, it outputs the unique character and the entire line.

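# read -r and IFS= keep each line intact; the file is redirected instead of piped through cat
while IFS= read -r line ; do 
     export line ; 
     # one character per line, sorted and counted; awk prints the first one whose count is 1
     printf '%s\n' "$line" | grep -o . | sort | uniq -c | \
         awk '$1 == 1 {print $2 ":" ENVIRON["line"] ; exit }' ; 
done < input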

      

Please note: if you are only interested in the letters, you can replace grep -o . with grep -o "[a-f]"
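
To see what the pipeline feeds into awk, it can help to print the frequency table for a single line. This is a small sketch, assuming a grep that supports -o (GNU and BSD grep do); char_frequencies is a hypothetical helper name, and the final awk only normalises the padding that uniq -c adds.

```shell
#!/bin/sh
# Hypothetical helper: print "char=count" for every hex digit in the argument.
# grep -o emits one hex digit per line (spaces are dropped), sort groups
# identical digits together, and uniq -c prefixes each with its count.
char_frequencies() {
    printf '%s\n' "$1" |
        grep -o '[0-9a-f]' |
        sort |
        uniq -c |
        awk '{ print $2 "=" $1 }'
}

# Usage with the third sample line from the question.
char_frequencies 'c0 b0 f5 60 02 8b 1c a4 41 7c 53 f2 85 20 a0 d1'
```

Among the lines printed for the third sample line is d=1, the character singled out in the question.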
