PHP, MSSQL2005 and code pages

I have a PHP script that accesses an MSSQL2005 database, reads some data from it, and mails the results.

There are special characters in both the column names and the fields themselves.

When I access the script through my browser (IIS web server), the query runs correctly and the mail content is encoded correctly (for my audience). However, when I run PHP from the console, the query fails (because of the special characters in the column names). If I replace the special characters in the query with chr() calls using their Latin-1 character codes, the query executes correctly, but the results are then also encoded in Latin-1 and therefore do not display correctly in the mail. Why does PHP / the MSSQL driver / ... use a different encoding in the two scenarios? Is there a way around this?

In case you're wondering, I need the console because I want to schedule the script using SQL Agent (or Task Scheduler or whatever).

+1




3 answers


Depending on the type of characters you have in your database, it might be a console limitation. If you type chcp in the console, you will see the active code page, which might be something like CP437, also known as Extended ASCII. If you have characters outside this code page, for example in UTF-8, you might have problems. You can change the currently active code page by typing chcp 65001 to switch to UTF-8.

You might also need to change the default Raster font to Lucida Console, depending on the characters you want, since not all fonts support extended characters (right-click the title bar of the command prompt window, then Properties, then Font).



As said, PHP's Unicode support is not ideal, but you can manage in PHP5 with a few well-placed calls to the utf8_decode function. The secret to character encoding is knowing exactly which encoding each tool in the chain currently uses: the database, the database connection, the bytes currently in your PHP variable, the console output, the email body encoding, your email client, and so on.

For anything that has special characters, something like UTF-8 is usually recommended these days. Make sure everything along the path is set to UTF-8 and convert only where needed.
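As a minimal sketch of "convert only where needed", the helper below normalizes a string to UTF-8 at the boundary where it leaves the database. The function name toUtf8 and the assumption that the driver hands back ISO-8859-1 (Latin-1) bytes are hypothetical; check what your connection really returns. It relies on the mbstring extension rather than utf8_decode/utf8_encode, which are deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: normalize a value read from the database to UTF-8.
// ASSUMPTION: anything that is not already valid UTF-8 is Latin-1.
function toUtf8(string $bytes, string $assumedSource = 'ISO-8859-1'): string
{
    // Already valid UTF-8 (this includes plain ASCII): leave it alone.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    // Otherwise reinterpret the bytes as the assumed source encoding.
    return mb_convert_encoding($bytes, 'UTF-8', $assumedSource);
}

$fromDatabase = "caf\xE9";          // "café" as Latin-1 bytes
$normalized   = toUtf8($fromDatabase);
echo bin2hex($normalized), PHP_EOL; // hex dump of the UTF-8 result
```

The validity check is only a heuristic; if you know the connection encoding for certain, converting unconditionally is the safer choice.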

+2




PHP's poor support for the non-English world is well known. I've never used a database with characters outside the basic ASCII range, but you obviously already have a workaround, and it seems you just have to live with it.

If you want to take it a step further, you can:

1. Write an array containing all the special characters and their chr() equivalents.
2. Loop over the array and str_replace them in the query.
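The two steps above could be sketched roughly as follows. The map contents, the function name toLatin1Query, and the column and table names are all made up for illustration, and the sketch assumes the script file itself is saved as UTF-8 while the driver expects Latin-1 bytes.

```php
<?php
// Hypothetical map: UTF-8 character => its single Latin-1 byte via chr().
$latin1Map = [
    "\u{00E9}" => chr(0xE9), // e with acute accent
    "\u{00F8}" => chr(0xF8), // o with stroke
    "\u{00FC}" => chr(0xFC), // u with diaeresis
];

// Replace every mapped character in the query with its Latin-1 byte.
function toLatin1Query(string $query, array $map): string
{
    return str_replace(array_keys($map), array_values($map), $query);
}

// Column and table names are invented for this example.
$query   = "SELECT [Pr\u{00E9}nom] FROM [Employ\u{00E9}s]";
$encoded = toLatin1Query($query, $latin1Map);
echo bin2hex($encoded), PHP_EOL; // inspect the bytes that would be sent
```

If the mbstring extension is available, a single mb_convert_encoding($query, 'ISO-8859-1', 'UTF-8') call covers the whole Latin-1 range without a hand-written map.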



But if the query is hardcoded, then I think you are fine. Also, make sure you are using the latest PHP, at least 4.4.x; bugs are fixed all the time, although I skimmed the 4.x release notes and didn't see anything related to your problem.

+1




The thing to remember about PHP strings is that they are streams of bytes. If you want to get the data in the correct character set (for what you are doing), you have to do it explicitly through some function or filter. This is all pretty low-level.

Depending on your setup, you may need to know the internal character set of the database, but at the very least you need to know which character set the database is sending to PHP (because, remember, to PHP it is just a stream of bytes).

Then you need to know the target character set (and possibly specify it, which you really should do anyway). For example, let's say that you are receiving UTF-8 from the database, but want to send Latin-1 (and therefore base64 or quoted-printable encoded, as declared in "Content-Transfer-Encoding"):

$send_string = base64_encode(utf8_decode($database_string));

      

Of course, in this case you would need to know that all of your UTF-8 characters exist in the Latin-1 character set, and you probably would not want base64 (PHP unfortunately had no good quoted-printable encoder for a long time, though curiously it did have one for decoding; quoted_printable_encode() only arrived in PHP 5.3). If you're not dealing with UTF-8 <=> Latin-1, you will want to use the mbstring functions instead.
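For the quoted-printable route, here is a rough sketch that assumes PHP 5.3 or later (for quoted_printable_encode) and the mbstring extension. The function name buildLatin1Mail and the sample text are hypothetical, and the sketch only builds the headers and body rather than calling mail().

```php
<?php
// Hypothetical helper: turn UTF-8 text into a Latin-1, quoted-printable mail body.
function buildLatin1Mail(string $utf8Text): array
{
    // Characters with no Latin-1 equivalent are replaced by mbstring's
    // substitute character ('?' unless configured otherwise).
    $latin1 = mb_convert_encoding($utf8Text, 'ISO-8859-1', 'UTF-8');

    return [
        // The headers must declare the same charset and transfer encoding as the body.
        'headers' => "MIME-Version: 1.0\r\n"
            . "Content-Type: text/plain; charset=ISO-8859-1\r\n"
            . "Content-Transfer-Encoding: quoted-printable",
        'body' => quoted_printable_encode($latin1),
    ];
}

$mail = buildLatin1Mail("Caf\u{00E9} ouvert");
echo $mail['headers'], "\r\n\r\n", $mail['body'], PHP_EOL;
```

The important part is that the declared charset matches the bytes actually sent; sending UTF-8 end to end with charset=UTF-8 would work just as well if your audience's mail clients handle it.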

As far as the console is concerned, you need to know what PHP does when you enter special characters from the console, which probably depends on your shell and/or your PHP settings. But remember that PHP only understands strings as sequences of bytes, and you should be able to work it out.
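One way to work it out is to dump the raw bytes of a suspect string under both the web server and the console and compare. The helper name describeBytes is hypothetical; this is a diagnostic sketch, not a fix.

```php
<?php
// Hypothetical diagnostic: show the length, hex bytes, and UTF-8 validity of a string.
function describeBytes(string $value): string
{
    $guess = mb_check_encoding($value, 'UTF-8') ? 'valid UTF-8' : 'not valid UTF-8';
    return sprintf('%d bytes [%s] %s', strlen($value), bin2hex($value), $guess);
}

// PHP_SAPI tells you which environment is running the script (for example "cli").
echo PHP_SAPI, ': ', describeBytes("\u{00E9}"), PHP_EOL; // UTF-8 encoded character
echo PHP_SAPI, ': ', describeBytes("\xE9"), PHP_EOL;     // the same character as one Latin-1 byte
```

Running the same dump on a column name or a fetched value in each environment shows whether the bytes themselves differ or only the way they are displayed.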

+1

