PHP filename encoding conversion error

I am trying to rename the files in a folder from PHP. This mostly works, but I'm having trouble with accented characters.

An example of a filename with accented characters is ÅRE_GRÖN.JPG. I would like to rename this file to ARE_GRON.JPG.

If I read the files like this:

<?php
$path = __DIR__;
$dir_handle = opendir($path);

while ($file = readdir($dir_handle)) {
    echo $file . "\n";
}

closedir($dir_handle);

      

... the page displays the garbled name ÅRE_GRÖN.JPG.

If I add header('Content-Type: text/html; charset=UTF-8'); to the beginning of my script, it displays the correct filename, but rename() doesn't seem to have any effect.

Here's what I've tried:

while ($file = readdir($dir_handle)) {
    rename($file, str_replace('Ö', 'O', $file)); # No effect
}

      

Where am I going wrong?


Let me know if you think I'm using the wrong tool for the job. If anyone knows how to do this with a Bash script, please show me; I have no Bash experience myself.



2 answers


I figured out how to do it.

I first ran the filename through urlencode(). This converts the string:

MÖRKGRÅ.JPG

      

to the URL-encoded form:

MO%CC%88RKGRA%CC%8A.JPG

      

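The encoded form shows why the plain replacement had no effect: the name returned by readdir() holds each accented letter as a base letter followed by a separate combining mark (%CC%88 is U+0308, the combining diaeresis; %CC%8A is U+030A, the combining ring above), not as a single precomposed character such as Ö.

Then I ran str_replace() on the URL-encoded string, passing the search and replacement values as arrays. I only needed this for a few Swedish characters, so my solution looked like this: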



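<?php

header('Content-Type: text/html; charset=UTF-8');

$path = __DIR__;

$dir_handle = opendir($path);

// URL-encoded decomposed forms of Å, Ä and Ö, and their ASCII replacements.
$search = array('A%CC%8A', 'A%CC%88', 'O%CC%88');
$replace = array('A', 'A', 'O');

while (false !== ($file = readdir($dir_handle))) {
    if ($file === '.' || $file === '..') {
        continue;
    }
    // Decode again so that any other encoded characters are restored.
    $new = urldecode(str_replace($search, $replace, urlencode($file)));
    rename($path . '/' . $file, $path . '/' . $new);
}

closedir($dir_handle);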

      

The task is completed :)


I've since realized that this approach is more flexible than I expected. Running urlencode() in another script gave me slightly different output (HTML entities such as &Aring;, with + for spaces), but the arrays are easy to adapt:

$search = array('%26Aring%3B', '%26Auml%3B', '%26Ouml%3B', '+');
$replace = array('A', 'A', 'O', '_');
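Before choosing search strings, it can help to see which bytes a given filename really contains. A minimal sketch for doing that from a shell, assuming a POSIX sh with od and tr available; hex_bytes is a hypothetical helper name, not something from the original script:

```shell
# Hypothetical helper: print the bytes of a string as one run of hex digits.
hex_bytes() {
    # od dumps one hex pair per byte; tr removes od's spacing and line breaks.
    printf '%s' "$1" | od -An -tx1 | tr -d '[:space:]'
    echo
}

# "O" followed by U+0308 (combining diaeresis), the decomposed form.
hex_bytes "$(printf 'O\314\210')"   # prints 4fcc88

# U+00D6, the single precomposed character.
hex_bytes "$(printf '\303\226')"    # prints c396
```

The two forms look identical on screen but differ in their bytes, so a search string has to be in the same form as the name on disk.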

      



If you have a limited number of characters that you want to replace, you can do so with

for f in *; do mv "$f" "${f//Ö/O}" 2> /dev/null; done
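That substitution can miss when the name on disk holds O followed by a combining mark instead of a single Ö, as the %CC%88 sequences in the other answer suggest. Here is a rough sketch that strips those marks by their UTF-8 bytes instead, written for a POSIX shell with sed. strip_marks and rename_decomposed are hypothetical helper names, and only U+0308 and U+030A are handled:

```shell
# Hypothetical helper: delete two combining marks from a name, byte-wise.
strip_marks() (
    diaeresis=$(printf '\314\210')   # bytes CC 88 = U+0308 combining diaeresis
    ring=$(printf '\314\212')        # bytes CC 8A = U+030A combining ring above
    # LC_ALL=C makes sed treat the input as plain bytes.
    printf '%s\n' "$1" | LC_ALL=C sed "s/$diaeresis//g; s/$ring//g"
)

# Hypothetical helper: rename each entry of a directory whose name changes.
rename_decomposed() (
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -e "$f" ] || continue
        base=${f##*/}
        new=$(strip_marks "$base")
        # Skip unchanged names and never overwrite an existing file.
        if [ "$new" != "$base" ] && [ ! -e "$dir/$new" ]; then
            mv -- "$f" "$dir/$new"
        fi
    done
)
```

Calling rename_decomposed with a folder path renames its entries in place; a name that would collide with an existing file is left untouched, so nothing is silently lost.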

      

With GNU sed, you can usually use



expr=""
for char in {A..Z}; do
    # [[=X=]] is an equivalence class: X and its accented variants
    expr+="s/[[=$char=]]/$char/g; "
done

for f in *; do
    mv "$f" "$(sed -e "$expr" <<< "$f")" 2> /dev/null
done

      

to replace all A-like accented characters with ASCII A, and likewise for every other letter of the alphabet. I can't give any guarantees for OS X sed. Beware that this has the side effect of capitalizing all filenames.
