LAMP stack / user input and character encoding
Is there a one-stop solution to all character encoding problems? I always run into trouble somewhere along the line between user input (HTML forms), database storage, and data retrieval. I want all my data and web pages to be UTF-8 encoded, but I always seem to end up with invalid UTF-8 characters.
I am not very good with character encodings, and ever since I started working with French characters I keep running into problems. One of the other developers urlencodes everything before storing it in the database and urldecodes everything afterwards, which makes me shudder.
As I understand it, an HTML form will accept any characters depending on the user's environment, and it is up to the server side to try to convert them to UTF-8 or whatever?
Any additional information would be greatly appreciated!
Using UTF-8 throughout is the only solution. Unfortunately, that requires understanding the problems that arise in practice. If you have a specific issue, please ask a specific question on SO.
Regarding HTML forms: no, it doesn't really depend on the user's environment. The browser will (or at least really should) send the data in the same encoding as the page that contained the form. Make sure every HTML page you serve has a charset= parameter in the HTTP Content-Type header; for good measure, also put the http-equiv meta tag in the HTML file itself (which helps if the user caches or saves the page). So when the HTML page is in UTF-8, the data sent by the browser is also in UTF-8.
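A minimal sketch of the advice above, written in Python (one of the "P" languages in LAMP; the same idea carries over to PHP or Perl). The helper names `build_form_response` and `decode_form_body` are hypothetical and not part of any framework. The first declares UTF-8 in both the Content-Type header and a meta tag; the second decodes a submitted form body strictly as UTF-8, so invalid bytes are rejected instead of being stored.

```python
from urllib.parse import parse_qs


def build_form_response():
    """Return (headers, body_bytes) for a form page declared as UTF-8.

    Hypothetical helper: a real application would hand these to its
    web server or framework.
    """
    html = (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        # In-document declaration, useful when the page is saved or cached.
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "</head><body>\n"
        '<form method="post">\n'
        '<input type="text" name="comment">\n'
        "</form></body></html>\n"
    )
    body = html.encode("utf-8")
    headers = [
        # HTTP-level declaration of the page encoding.
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    return headers, body


def decode_form_body(raw: bytes) -> dict:
    """Decode an application/x-www-form-urlencoded body as UTF-8.

    errors="strict" raises UnicodeDecodeError on byte sequences that
    are not valid UTF-8 instead of silently replacing them.
    """
    return parse_qs(raw.decode("ascii"), encoding="utf-8", errors="strict")
```

With this sketch, `decode_form_body(b"comment=caf%C3%A9")` returns `{"comment": ["café"]}`, while a Latin-1 encoded body such as `b"comment=caf%E9"` raises `UnicodeDecodeError`, which makes a mismatched page encoding visible at the point of input.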
In my projects, the first query sent to the database is
SET NAMES 'utf8';
right after establishing the MySQL connection.
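To illustrate why the connection charset matters, here is a small Python sketch that involves no database at all. It only simulates one side sending UTF-8 bytes while the other side reads them as latin1. That latin1 is the assumed connection charset is an assumption for illustration (it was a common default on older MySQL installations; check your own server's settings).

```python
text = "café"

# What a UTF-8 page or form hands to the application: UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# Assumption: the receiving side treats the connection as latin1,
# so every byte is read as one separate character.
misread = utf8_bytes.decode("latin-1")

# With matching charsets on both sides, the text round-trips cleanly.
correct = utf8_bytes.decode("utf-8")

print(repr(misread))  # 'cafÃ©'
print(repr(correct))  # 'café'
```

The two-character "Ã©" in place of "é" is the typical symptom of a UTF-8 / latin1 mismatch somewhere between the form and the database.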
The same goes for data dumps: when I dump the database to a .sql file, I put the above query at the top.
It has worked for me for several years now without issue, across many hosting companies and dedicated servers.
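For the dump case, a rough sketch of the same habit in Python. The function names `with_set_names` and `write_dump` are hypothetical, and the sketch assumes the dump is already available as a string of SQL; it is not how any particular dump tool works.

```python
SET_NAMES = "SET NAMES 'utf8';"


def with_set_names(dump_sql: str) -> str:
    """Return the dump with the SET NAMES statement as its first line.

    If the dump already starts with that statement, it is returned
    unchanged, so applying the function twice is harmless.
    """
    if dump_sql.lstrip().startswith(SET_NAMES):
        return dump_sql
    return SET_NAMES + "\n" + dump_sql


def write_dump(path: str, dump_sql: str) -> None:
    """Write the dump to disk as UTF-8, with SET NAMES at the top."""
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        fh.write(with_set_names(dump_sql))
```

Writing the file explicitly as UTF-8 keeps the bytes on disk consistent with what the `SET NAMES` line announces when the dump is later imported.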