How to work with characters like "" in Ruby
I would like to convert "HELLO" to "HELLO", removing all special characters that cause problems when inserting into the database. They don't seem to be part of UTF-8.
I'm trying to figure out Iconv, but I'm kind of stuck here:
str = "A string with "  # should become "A string with "
some_format = "I have no clue what format this is"
Iconv.conv(some_format, 'UTF-8//IGNORE', str)
Running this:
Iconv.conv('UTF-16', 'UTF-8//IGNORE', str)
... returns ...
\376\377\000H\000E\000L\000L\000O?G?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?`?????\342
I don't want to convert to anything other than UTF-8, because I have to handle Arabic, Chinese, Japanese, Korean, and other characters.
Any help or pointers would be appreciated. I am using Ruby 1.8.7, but I need to upgrade to 1.9.3 very soon. A solution that works on both would be best, but one that works only on 1.9.3 is fine too.
Here is a way to remove characters that do not exist in a particular encoding, by converting the string to that encoding (Ruby 1.9+):
# -*- coding: utf-8 -*-
a = "⚒og"
p a  # => "⚒og"
# Characters with no ISO-8859-1 equivalent are replaced with an empty string.
p a.encode('iso-8859-1', :undef => :replace, :replace => '')  # => "og"
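If the string has to stay in UTF-8, the same `encode` options can be used in a round trip through another Unicode encoding. The following is only a sketch, assuming the problem characters are byte sequences that are not valid UTF-8; the helper name `strip_invalid_utf8` is made up for illustration. The detour through UTF-16LE is there because, as far as I know, encoding from UTF-8 to UTF-8 on 1.9.3 is a no-op and leaves the invalid bytes in place.

```ruby
# -*- coding: utf-8 -*-

# Hypothetical helper: drops byte sequences that are not valid UTF-8
# while keeping every valid character (Arabic, CJK, and so on).
def strip_invalid_utf8(str)
  str.encode('UTF-16LE', :invalid => :replace, :undef => :replace, :replace => '').
      encode('UTF-8')
end

# "\xE2" on its own is an incomplete UTF-8 sequence, so this string is invalid.
dirty = "HELLO\xE2 日本語"
cleaned = strip_invalid_utf8(dirty)

p dirty.valid_encoding?    # false
p cleaned.valid_encoding?  # true
p cleaned == "HELLO 日本語" # true
```

On Ruby 2.1 and later, `String#scrub` does the same job directly (`dirty.scrub('')`). This approach needs `String#encode`, so it does not apply to 1.8.7.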
However, your problem may be a different one, because it is very unlikely that these problem characters are not part of UTF-8. Other possible causes:
- Perhaps the font you are using simply doesn't know how to display those characters. Very few fonts cover the full Unicode character range. I don't know how you are trying to display these strings, but make sure to use a font with good character coverage, for example DejaVu, http://dejavu-fonts.org/wiki/Main_Page
- Are you sure your database is configured correctly to use UTF-8?
- Also be careful: your string might be fine but still not display correctly in your terminal or database application because of incomplete UTF-8 support (this has happened to me before). So it can be tricky to debug when the display you are debugging with is itself unreliable.
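One way to take the terminal and the font out of the picture is to look at the string's encoding tag and raw bytes instead of its rendered form. A minimal sketch for Ruby 1.9+; the helper name `describe_string` is made up for illustration:

```ruby
# -*- coding: utf-8 -*-

# Hypothetical helper: reports facts about a string that do not depend
# on how the terminal or font renders it.
def describe_string(str)
  {
    :encoding => str.encoding.name,   # the encoding Ruby thinks the string has
    :valid    => str.valid_encoding?, # false if the bytes don't fit that encoding
    :bytes    => str.bytes.to_a.map { |b| '%02X' % b }.join(' ')
  }
end

info = describe_string("⚒og")
p info[:valid]  # true
p info[:bytes]  # "E2 9A 92 6F 67"
```

If `:valid` is true and the bytes are what you expect, the string itself is fine and the problem lies in the display or the database configuration.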