Regex to extract file extension from url
I'm looking for a regular expression that will match .js
in the following URI:
/foo/bar/file.js?cache_key=123
I am writing a function that tries to determine what kind of file is being passed as a parameter. In this case, the file ends with the extension .js and is a JavaScript file. I'm working with PHP and preg_match, so I assume I need a PCRE-compliant regex. Ultimately I will extend this expression to check several types of files passed as URIs, not only js but possibly css, images, etc.
You can use a combination of pathinfo and a regex. pathinfo alone will give you the extension plus the query string (js?cache_key=123), and you can remove ?cache_key=123 with a regex that matches ? and everything after it:
$url = '/foo/bar/file.js?cache_key=123';
echo preg_replace("#\?.*#", "", pathinfo($url, PATHINFO_EXTENSION)) . "\n";
Output:
js
Input:
$url = 'my_style.css?cache_key=123';
Output:
css
Obviously, if you need the leading dot (.), it's trivial to prepend it to the file extension string.
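As an alternative to stripping the query string with a regex, PHP's built-in parse_url can isolate the path before pathinfo sees it. The following is a minimal sketch, not part of the original answer; the function name extensionFromUrl is hypothetical, and it returns an empty string when no path or extension is found.

```php
<?php
// Hypothetical helper: returns the extension of the path part of a URL,
// or an empty string if there is none.
function extensionFromUrl(string $url): string
{
    // PHP_URL_PATH drops the query string and fragment.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        // parse_url returns null or false when no path can be extracted.
        return '';
    }
    return pathinfo($path, PATHINFO_EXTENSION);
}

echo extensionFromUrl('/foo/bar/file.js?cache_key=123') . "\n"; // js
echo extensionFromUrl('my_style.css?cache_key=123') . "\n";     // css
```

Because the query string is removed before the extension is read, a dot inside the query string (for example ?v=1.2) should not affect the result.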
ETA: If you want to use only a regex, this will do the trick:
function parseurl($url) {
    # takes the last dot it can find and grabs the text after it
    echo preg_replace("#(.+)?\.(\w+)(\?.+)?#", "$2", $url) . "\n";
}
parseurl('my_style.css');
parseurl('my_style.css?cache=123');
parseurl('/foo/bar/file.js?cache_key=123');
parseurl('/my.dir.name/has/dots/boo.html?cache=123');
Output:
css
css
js
html
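One caveat with the greedy pattern above: because it looks for the last dot anywhere in the string, a dot inside the query string wins, so '/foo/file.js?v=1.2' yields 2 rather than js. A possible variation, sketched here under the assumption that only the part before the first ? or # should be searched, uses preg_match and returns the value instead of echoing it. The name regexExtension is hypothetical.

```php
<?php
// Hypothetical helper: regex-only extraction that ignores the query string
// and fragment. Returns null when the path has no extension.
function regexExtension(string $url): ?string
{
    // [^?#]* keeps the search inside the path; the greedy match then
    // backtracks to the last dot in the path that is followed only by
    // word characters up to the end of the path.
    if (preg_match('~^[^?#]*\.(\w+)(?:[?#].*)?$~', $url, $m)) {
        return $m[1];
    }
    return null;
}

echo regexExtension('/my.dir.name/has/dots/boo.html?cache=123') . "\n"; // html
echo regexExtension('/foo/file.js?v=1.2') . "\n";                       // js
```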
Code:
$input_line = '/foo/bar/file.js?cache_key=123';
// let's grab the part between the filename and the ?
preg_match("/\w+\/\w+\/\w+(.*)\?/", $input_line, $matches);
print_r($matches);
// $matches[1] holds the captured extension, including the dot
echo $matches[1];
Output
Array
(
[0] => foo/bar/file.js?
[1] => .js
)
.js
If you know the extensions in advance (the "whitelisting" method), you can switch from matching everything with (.*) to matching specific extensions with /.*\.(js|jpg|jpeg|png|gif)/:
preg_match("/.*\.(js|jpg|jpeg|png|gif)/", $input_line, $matches);
echo $matches[1]; // js
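Since the question mentions eventually distinguishing js, css, images and so on, the whitelist can be tied to a lookup table. The sketch below is illustrative only: the function name fileTypeFromUrl, the type labels, and the list of extensions are all assumptions. It also adds a lookahead so that the extension must be followed by ?, # or the end of the string, which keeps .json from being reported as js (the unanchored whitelist pattern above would match it).

```php
<?php
// Hypothetical helper: maps a URL to a coarse file type via an extension
// whitelist. Returns null for extensions that are not on the list.
function fileTypeFromUrl(string $url): ?string
{
    // Illustrative whitelist; the labels are arbitrary.
    $types = [
        'js'   => 'javascript',
        'css'  => 'stylesheet',
        'jpg'  => 'image',
        'jpeg' => 'image',
        'png'  => 'image',
        'gif'  => 'image',
    ];

    // Build \.(js|css|...) and require ?, # or end of string right after it.
    $pattern = '~\.(' . implode('|', array_keys($types)) . ')(?=[?#]|$)~i';

    if (preg_match($pattern, $url, $m)) {
        return $types[strtolower($m[1])];
    }
    return null;
}

echo fileTypeFromUrl('/foo/bar/file.js?cache_key=123') . "\n"; // javascript
var_dump(fileTypeFromUrl('/foo/bar/data.json'));               // NULL
```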