Bash: how to extract numbers preceded by _ and followed by .

I have the following format for filenames: filename_1234.svg

How can I get the number that comes after an underscore and before a period? The number between the _ and .svg can be from one to four digits.

I tried:

width=${fileName//[^0-9]/}

      

but if the filename itself also contains a number, this returns all of the digits in the filename, e.g. for:

file6name_1234.svg

      

I found solutions for two underscores (and split them into an array), but I'm looking for a way to check both the underscore and the period.

+3




6 answers


You can use simple parameter expansion with substring removal to trim from the right up to and including the '.', and then trim from the left up to and including the '_', leaving the number you want, for example:

$ width=filename_1234.svg; val="${width%.*}"; val="${val##*_}"; echo $val
1234

      

Note: # trims from the left up to the first occurrence of the pattern (shortest match), and ## trims up to the last occurrence (longest match). % and %% work the same way from the right side.
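To make the difference between the four operators concrete, here is a small sketch. The variable name f and the sample filename with two underscores and two periods are assumptions chosen for illustration; they do not come from the question.

```shell
#!/usr/bin/env bash
# Hypothetical sample name with two underscores and two periods.
f="my_file_1234.tar.gz"

echo "${f#*_}"    # shortest match removed from the left:  file_1234.tar.gz
echo "${f##*_}"   # longest match removed from the left:   1234.tar.gz
echo "${f%.*}"    # shortest match removed from the right: my_file_1234.tar
echo "${f%%.*}"   # longest match removed from the right:  my_file_1234
```

For the filenames in the question, % (shortest match from the right) and ## (longest match from the left) are the combination that isolates the number.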

Clarifications:

  • width=filename_1234.svg - width contains your filename
  • val="${width%.*}" - val contains filename_1234
  • val="${val##*_}" - finally, val contains 1234

Of course, there is no need to use a temporary variable such as val if your intent is for width to hold the number. I only used a temporary variable to protect the original contents of width from being modified. If you want the final number in width, just replace val with width everywhere above and operate directly on width.

Note 2: using shell features such as parameter expansion avoids spawning the separate subshell and the separate process that are needed when you call a utility like sed, grep or awk (or anything else that is not part of the shell, for that matter).
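As a sketch of what that can look like in practice, the two expansions can be wrapped in a function that stores its result in a variable instead of being called through $(...), so that neither a subshell nor an external process is involved. The function name extract_number, the use of REPLY for the result, and the sample filenames are assumptions for illustration.

```shell
#!/usr/bin/env bash
# extract_number (hypothetical helper): stores the text between the last
# underscore and the last period of $1 in the global variable REPLY.
extract_number() {
    local name=${1%.*}   # drop the extension, e.g. filename_1234
    REPLY=${name##*_}    # drop everything up to the last underscore
}

for f in filename_1234.svg file6name_7.svg icon_6_42.svg; do
    extract_number "$f"  # plain function call: no $(...) and no fork
    echo "$f -> $REPLY"
done
```

Writing the result to a variable is a design choice here: capturing the output with num=$(extract_number "$f") would work too, but command substitution itself runs in a subshell.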

+1




Try using the following code:

filename="filename_6_1234.svg"
if [[ "$filename" =~ ^(.*)_([^.]*)\..*$ ]];
then
    echo "${BASH_REMATCH[0]}" #will display 'filename_6_1234.svg'
    echo "${BASH_REMATCH[1]}" #will display 'filename_6'
    echo "${BASH_REMATCH[2]}" #will display '1234'
fi

      



Explanation:

  • =~ : the Bash operator for regular-expression matching
  • ^(.*)_([^.]*)\..*$ : we are looking for any characters, followed by an underscore, followed by any characters other than a period, followed by a period and an extension. We create 2 capture groups: one for what comes before the last underscore, and one for what comes between that underscore and the period
  • BASH_REMATCH : an array containing the whole match (index 0) and the captured groups (index 1 and up)
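The pattern above accepts any characters between the underscore and the period. If you also want to enforce the "one to four digits" requirement from the question, a stricter variation might look like the sketch below. The variable name re, the exact pattern, and the sample filenames are assumptions, not part of the answer.

```shell
#!/usr/bin/env bash
# Assumed stricter pattern: an underscore, 1 to 4 digits, a period, then
# an extension that itself contains no period.
re='_([0-9]{1,4})\.[^.]+$'

for filename in file6name_1234.svg filename_6_12.svg filename_12345.svg notes.svg; do
    if [[ $filename =~ $re ]]; then
        echo "$filename: ${BASH_REMATCH[1]}"
    else
        echo "$filename: no match"
    fi
done
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of =~ avoids the quoting pitfalls of writing the regular expression inline.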
+2




Another way

[akshay@localhost tmp]$ filename=file1b2aname_1234.svg
[akshay@localhost tmp]$ after=${filename##*_}
[akshay@localhost tmp]$ echo ${after//[^0-9]}
1234

      

Using awk

[akshay@localhost tmp]$ awk -F'[_.]' '{print $2}' <<< "$filename"
1234
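Note that $2 relies on the name containing exactly one underscore before the number. If the name may contain more underscores, a variation that counts fields from the end could look like this sketch. The sample filename is an assumption, and the approach presumes the name ends in a single extension.

```shell
#!/usr/bin/env bash
filename="file_6_name_1234.svg"
# Split on _ or . gives the fields: file 6 name 1234 svg.
# The number is the second-to-last field, i.e. $(NF-1).
awk -F'[_.]' '{print $(NF-1)}' <<< "$filename"
```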

      

+1




I would use

sed 's!_! !g' | awk '{print "_" $NF}' 

      

to go from filename_1234.svg to _1234.svg then

sed 's!svg!!g' 

      

to get rid of the extension.
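As written, these commands leave the underscore and the period in place: filename_1234.svg ends up as _1234. rather than 1234. If an external tool is preferred, one possible alternative is a single sed substitution with a capture group. This is a sketch of a different technique, not the pipeline above, and it assumes a sed that supports -E for extended regular expressions (GNU and BSD sed both do).

```shell
#!/usr/bin/env bash
filename="file6name_1234.svg"
# Keep only the digits between the last underscore and the extension.
# If the name does not match, sed prints the input line unchanged.
num=$(printf '%s\n' "$filename" | sed -E 's/^.*_([0-9]+)\.[^.]*$/\1/')
echo "$num"
```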

0




If you set IFS, you can use the Bash built-in read.

This splits the filename on underscores and periods and stores the result in an array a.

IFS='_.' read -a a <<<'file1b2aname_1234.svg'

      

And this takes the second-to-last element of the array.

echo ${a[-2]}
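Putting the two steps together, a sketch might look like the following. The array name parts and the sample filename with an extra underscore are assumptions. The index is computed from the array length rather than written as -2, since older Bash releases (3.2, for example) do not accept negative subscripts.

```shell
#!/usr/bin/env bash
filename="file1b2aname_6_1234.svg"
# IFS is changed only for the duration of this one read command.
IFS='_.' read -r -a parts <<< "$filename"
# parts is now: file1b2aname 6 1234 svg
# Second-to-last element, computed from the array length.
echo "${parts[${#parts[@]}-2]}"
```

Because the element is counted from the end, additional underscores earlier in the name do not affect the result.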

      

0




There is also a solution using cut:

name="file6name_1234.svg"
num=$(echo "$name" | cut -d '_' -f 2 | cut -d '.' -f 1)
echo "$num"

      

-d specifies the delimiter.

-f selects the desired field.

I don't know anything about performance, but it's simple to understand and easy to maintain.

0

