How do I create a Persian .txt file and then explode it?

I have a lot of Persian text that I want to explode. I save the text in file.txt (so I have file.txt containing Persian text). Now my problem is encoding. When I save the text to file.txt, it gives me this warning:

This file contains Unicode characters that will be lost if you save this file as an ANSI-encoded text file. To keep the information in Unicode, click Undo below and then select one of the Unicode options from the Encoding drop-down list. Proceed?

I continue. Now when I open file.txt all the characters look fine, but after the text is exploded all the characters are garbled.

Note: when I put the text directly into a PHP variable everything is fine, so my problem is really with file.txt.

What should I do?

My code (for explode):

$text = file_get_contents('file.txt');
$var = explode("\n", $text);
foreach ($var as $sentence) {
    echo $sentence . '<br>';  // or save into the database
}

      



1 answer


Be sure to save the text file in UTF-8 encoding. (Use UTF-8 for HTML output and database connection to match.)

If you save the file in the encoding that Microsoft misleadingly calls "Unicode", you end up with UTF-16LE, an encoding built from two-byte code units that is not ASCII-compatible, which is generally a bad idea.

PHP's basic string operations such as explode work on a byte basis, so if you split UTF-16 on the single byte \n, you end up splitting a two-byte character in the middle and throwing off the byte alignment of the next line (and of every alternate line after it).
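To make the misalignment concrete, here is a minimal sketch. It assumes the mbstring extension is available and that the script itself is saved as UTF-8; the sample strings and variable names are only illustrative and are not from the original question.

```php
<?php
// Build a two-line Persian sample as UTF-8, then re-encode it as UTF-16LE
// to imitate a file saved with Notepad's "Unicode" option (BOM left out).
$utf8  = "سلام\nدنیا";
$utf16 = mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8');

// Byte-wise split: "\n" is the single byte 0x0A, but a UTF-16LE newline is
// the two bytes 0x0A 0x00, so the 0x00 is left at the start of line two.
$broken = explode("\n", $utf16);
// $broken[1] now begins with a stray 0x00 and has an odd number of bytes,
// so every two-byte unit in it is shifted by one byte.

// Safer: convert to UTF-8 first, then split on "\n".
$lines = explode("\n", mb_convert_encoding($utf16, 'UTF-8', 'UTF-16LE'));
```

Converting after the fact works, but saving the file as UTF-8 in the first place avoids the extra step.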



Use a decent text editor that lets you save as UTF-8 without a BOM. Notepad puts a UTF-8 faux-BOM at the beginning of the file, which means that when you read it into PHP your first line (but none of the other lines) will have a U+FEFF byte order mark at its start, which will result in odd output.
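If the file cannot be re-saved, the BOM can also be dropped in PHP before splitting. The following is only a sketch: read_utf8_lines is a hypothetical helper name, not something from the original answer, and the line-ending normalization is an extra assumption for files written on Windows.

```php
<?php
// Hypothetical helper: read a UTF-8 text file and return its lines,
// dropping a leading BOM if one is present.
function read_utf8_lines(string $path): array
{
    $text = file_get_contents($path);
    if ($text === false) {
        throw new RuntimeException("Cannot read $path");
    }
    // U+FEFF encoded as UTF-8 is the three bytes EF BB BF.
    if (strncmp($text, "\xEF\xBB\xBF", 3) === 0) {
        $text = substr($text, 3);
    }
    // Normalize Windows line endings so no "\r" is left on each line.
    $text = str_replace("\r\n", "\n", $text);
    return explode("\n", $text);
}
```

With this in place, the loop from the question could iterate over read_utf8_lines('file.txt') instead of calling explode directly.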

Prefer a text editor that saves as BOM-free UTF-8 by default. Notepad's options of ANSI, UTF-16LE and UTF-8 with a faux-BOM make it a pretty terrible choice for editing files for the web.
