Extract date from filename in unix using shell scripts

Question

I am working on a shell script. I want to extract the date from the filename.

File name: abcd_2014-05-20.tar.gz

I want to extract the date from it: 2014-05-20

+3

linux unix bash shell grep

priyanka 12 nov. At 7:04 am

6 answers

Using awk with a custom field separator, it's pretty straightforward:

echo 'abcd_2014-05-20.tar.gz' | awk -F '[_.]' '{print $2}'
2014-05-20
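If the filename is in a variable rather than typed literally, the same awk call can be wrapped in a small function. This is only a sketch: the name date_from_name is made up for illustration, and it assumes the date is always the second field when the name is split on "_" and ".".

```shell
# Hypothetical helper (not from the answer above): split the name on
# "_" or "." and print the second field.
date_from_name() {
    printf '%s\n' "$1" | awk -F '[_.]' '{print $2}'
}

date_from_name 'abcd_2014-05-20.tar.gz'      # prints 2014-05-20
date_from_name 'ab_cd_2014-05-20.tar.gz'     # prints cd, not the date
```

The second call shows the limitation: an extra underscore in the prefix shifts the fields, so this only suits names with a fixed layout.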

+9

anubhava 12 nov. '14 at 7:15

Use grep:

$ ls -1 abcd_2014-05-20.tar.gz | grep -oP '[\d]+-[\d]+-[\d]+'
2014-05-20

-o tells grep to print only the matching part of the line.

-P interprets the pattern as a Perl regex.

[\d]+-[\d]+-[\d]+ matches three runs of one or more digits separated by dashes, which corresponds to your date.

+5

chaos 12 nov. 14 at 7:12

I would use a regex with the grep command, depending on how your filename is generated.

If your date always comes after a "_" character, I would use something like this:

ls -l | grep '_[REGEXP]'

Where REGEXP is a regular expression matching your date format.

Take a look here http://www.linuxnix.com/2011/07/regular-expressions-linux-i.html

+1

acostela 12 nov. '14 at 7:16

Several ways to do this:

echo abcd_2014-05-20.tar.gz | sed -n 's/.*_\(.*\)\.tar\.gz/\1/p'

sed will extract the date and print it out.

Another way:

filename=abcd_2014-05-20.tar.gz
temp=${filename#*_}
date=${temp%.tar.gz}

Here temp will contain the part of the filename after the first "_", i.e. 2014-05-20.tar.gz. You can then extract the date by removing the .tar.gz from the end.
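The two expansions can also be wrapped in a function so they work on a full path. This is a rough sketch, assuming names of the form prefix_DATE.tar.gz with a single underscore before the date; extract_date is a made-up name, and plain sh has no local variables, so name and rest are globals here.

```shell
# Hypothetical helper: pure parameter expansion, no external commands.
extract_date() {
    name=${1##*/}                    # drop any leading directory part
    rest=${name#*_}                  # drop everything up to the first "_"
    printf '%s\n' "${rest%.tar.gz}"  # drop the trailing .tar.gz
}

extract_date '/backups/abcd_2014-05-20.tar.gz'   # prints 2014-05-20
```

Since no process is spawned for sed or awk, this tends to be the cheapest option inside a loop over many files.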

+1

SMA 12 nov. '14 at 7:17

Here are some examples.

Using the cut command (cut is very readable, much like the awk command):

echo "abcd_2014-05-20.tar.gz" | cut -d "_" -f2 | cut -d "." -f1

Output:

2014-05-20

Using the grep command:

echo "abcd_2014-05-20.tar.gz" | grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}"

Output:

2014-05-20

Another advantage of using the grep command is that it can also extract multiple dates, for example:

echo "ab2014-15-12_cd_2014-05-20.tar.gz" | grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}"

Output:

2014-15-12
2014-05-20
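Note that the first match above, 2014-15-12, has month 15, so the pattern only checks the shape of the date. If that matters, the regex can be tightened a little. The sketch below is one possible way, with a made-up function name plausible_dates; it limits the month to 01-12 and the day to 01-31, but still does not know how many days each month has.

```shell
# Hypothetical helper: reads text on stdin and prints date-like
# substrings whose month is 01-12 and whose day is 01-31.
plausible_dates() {
    grep -Eo '[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])'
}

echo "ab2014-15-12_cd_2014-05-20.tar.gz" | plausible_dates   # prints 2014-05-20
```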

0

Balaji reddy 12 nov. 14 at 14:38

Skynet · Accepted Answer · 2014-11-12T07:26:40+0000

echo abcd_2014-05-20.tar.gz |grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}'

Output:

2014-05-20

grep reads its input from echo via stdin; you can also use the cat command if those lines are in a file.

-E: Interpret PATTERN as an extended regular expression.

-o: Show only the part of the matching line that matches PATTERN.

[[:digit:]]: Matches a single digit.

{N}: Matches exactly N occurrences of the preceding item, i.e. 4 digits for the year and 2 each for the month and day.

Most importantly, the date is extracted without relying on any separators like "_" and ".", which is why this is the most flexible solution.
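To illustrate that flexibility, the same pattern can be fed several differently shaped names at once. This is just a sketch: dates_in is a made-up function name, and the extra filenames are invented for the example.

```shell
# Hypothetical helper: reads names on stdin, prints every YYYY-MM-DD
# shaped substring, one per line. Lines without a date print nothing.
dates_in() {
    grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}'
}

printf '%s\n' 'abcd_2014-05-20.tar.gz' 'logs-2015-01-31.zip' 'notes.txt' | dates_in
# prints:
# 2014-05-20
# 2015-01-31
```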
