9 votes

Problem with the "encoding": Characters correctly stored in MySQL appear "rare" in Java

Dear all, I have a small problem with a Java application that I am writing.

It is an application that connects via JDBC to MySQL. When you save the record, the characters are saved correctly (if I type "Mexico" in a box JText and send it to the database, it is saved as "Mexico"); but when you read them back using the method ResultSet.getString() the characters "special" (with accents and "ñ") appear ill ("Mexico" is read as "M��mexico").

I think that is something that has to do with the "encoding" of the character, but I do not know specifically what it is.. The database MySQL uses the encoding utf8_spanish_ciand the function Charset.defaultCharset() returns UTF-8.

My specific question is, then: How do you get the strings read from MySQL that contain characters special (that were correctly stored) is displayed properly in the Java application?


Update (partial solution):

After searching a while more, I found this question and your answer that helped me. Specifically, what that says is that the moment you open the connection you need to specify the set of characters that will be used; in my case:

DriverManager.getConnection(
           "jdbc:mysql://" + host + "/" + dbName 
           + "?useUnicode=true&characterEncoding=utf8_spanish_ci", 
    user, pass);

However, only solves partially the problem:

The read data fields VARCHAR the character special are read correctly. However, the reading of fields JSON, values that have special characters are still displayed as "rare".


Update (final):

The problem has to do with the encoding that MySQL uses to store the JSON data; The response below illustrates the procedure that I followed to fix it.

6voto

Barranka Points 1174

After you scratch the touch a bit more the problem, I found this reference in the user manual of MySQL:

MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation. Strings in other character set are converted to utf8mb4 as necessary.

So, regardless of the character encoding I use, MySQL automatically converts the JSON string to utf8mb4... which is not a problem when saving, but when reading back :(

My solution (and I suspect that is not the best), was the following: Writing, into the query, the conversion to the encoding required. Something like this:

strSQL = "select convert(cast(a.json_data as char) using 'utf8') as json_data "
       + "from mi_tabla "
       + "where id = ?";
try(PreparedStatement ps = conn.prepareStatement(strSQL)) {
    /*
       Mi código para leer la tabla
     */
}

With this "adjustment", the data is read perfectly (with all the correct accented characters).

I guess it can be a easiest way to troubleshoot this problem, but until now this solution has served me.

HolaDevs.com

HolaDevs is an online community of programmers and software lovers.
You can check other people responses or create a new question if you don't find a solution

Powered by:

X