How do I find non-printable characters in a file?

I am trying to find non-printable characters in a data file on Unix. Here is my code:

#!/bin/ksh
SRCFILE='/data/temp1.dat'

while IFS= read -r line
do
    len=${#line}
    i=0
    while (( i < len ))
    do
        c=${line:$i:1}
        # printf with a leading quote yields the character's numeric code
        if (( $(printf '%d' "'$c") > 127 ))
        then
            print -r -- "$line"
            break
        fi
        (( i += 1 ))
    done
done < "$SRCFILE"

The code does not work. Please help me find a solution.

+5




3 answers


You can use grep to find non-printable characters in a file. Something like the following finds all non-printable ASCII and all non-ASCII characters:

grep -P -n "[\x00-\x1F\x7F-\xFF]" input_file

-P gives you the more powerful Perl-compatible regular expressions (PCRE), and -n shows you line numbers.
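A quick way to check this on a throwaway file (the file name and contents here are made up for illustration):

```shell
# hypothetical test file: line 2 contains a literal BEL byte (0x07)
printf 'clean line\nbad\aline\n' > /tmp/np_test.dat

# -P enables PCRE; -n prefixes each match with its line number
grep -P -n '[\x00-\x1F\x7F-\xFF]' /tmp/np_test.dat
```

Only line 2 is reported, with its line number prepended.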



If your grep doesn't support PCRE, you can use Perl to do this instead:

perl -ne '$x++;if($_=~/[\x00-\x1F\x7F-\xFF]/){print"$x:$_"}' input_file

      

+6




You can try something like this:



grep '[^[:print:]]' filePath
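One caveat worth knowing: [:print:] is locale-dependent, and it excludes tabs, so a tab counts as a match. A quick check in the C locale (the file name and contents are illustrative):

```shell
# hypothetical test file: line 2 contains a tab, which [:print:] excludes
printf 'ok line\nbad\tline\n' > /tmp/np_print.dat

# in the C locale, [:print:] is exactly the printable ASCII range 0x20-0x7E
LC_ALL=C grep -n '[^[:print:]]' /tmp/np_print.dat
```

Only line 2 is reported. If you only care about bytes above 0x7F, this approach will also flag tabs, which may or may not be what you want.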

      

0




This may sound trite, but I like od for this. Depending on what you are doing, you may want something suitable for printing arbitrary characters. The awk code is not very elegant, but it is flexible if you are looking for specific bytes; the purpose here is simply to show the use of od. Note awk's quirks with comparisons and whitespace (the trailing spaces appended below force string comparison).

od -A n -t x1z filename |
awk '{
    p = 0
    if (NF > 16) {                       # 16 hex bytes plus the trailing ASCII column
        for (i = 1; i <= 16; i++) {
            if ($i != "0d" && $i != "0a") {
                if ($i" " < "20 " || $i" " > "7f ") { print $i; p = 1 }
            }
        }
        if (p == 1) print $0
    }
}' | more

0








