Ritesh Gurung

Converting Latin1 charset tables with UTF8 data set

Mar 15, 2011

Objective

To migrate TYPO3 records [pages, tt_news and tt_content] to Drupal. The TYPO3 site was multilingual: English, Danish and Greenlandic. The TYPO3 DB had tables declared with the Latin1 charset that actually held UTF8-encoded data, which needed to be converted to UTF8 for a Drupal database.

Initial Approach

Change the DB and table charset to UTF8, which should convert the Latin1 data to UTF8, with commands like:

  1. ALTER TABLE {tablename} MODIFY {table column} CHAR(20) CHARACTER SET utf8
  2. ALTER TABLE {tablename} DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci

Problem

Which is the correct one?

When we fetched the records through phpMyAdmin in the browser, the text showed junk characters.

Results shown in Browser

But checking the same record through the mysql client in a terminal displayed the correct results.

Results shown in mysql client

Final Solution

The data was already in UTF8, so converting the tables or columns from Latin1 to UTF8 re-encoded it and produced junk characters. The strange thing was that the MySQL client in the terminal displayed it correctly.
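The effect can be reproduced outside the database. The following is a minimal Python sketch, not what MySQL runs internally: the sample word "Grønland" is an assumed example, and Python's "latin-1" codec stands in for MySQL's latin1 charset (the two agree for the bytes used here). It shows how UTF8 bytes that are read as Latin1 and encoded to UTF8 again turn into junk, while the same bytes simply relabelled as UTF8 stay intact.

```python
# Assumed sample value; "ø" occurs in Danish and is two bytes in UTF-8.
original = "Grønland"

# What the application actually wrote into the Latin1-declared column.
stored_bytes = original.encode("utf-8")

# Direct charset conversion: each stored byte is treated as one Latin1
# character and then encoded to UTF-8 again (double encoding).
misread_as_latin1 = stored_bytes.decode("latin-1")
double_encoded = misread_as_latin1.encode("utf-8")
shown_in_browser = double_encoded.decode("utf-8")

# Binary route: the bytes are left untouched and only relabelled as UTF-8.
relabelled = stored_bytes.decode("utf-8")

print(shown_in_browser)
print(relabelled)
```

One likely explanation for the terminal result is that the mysql client connection was also using Latin1, so the server passed the stored bytes through unchanged and the UTF8 terminal rendered them correctly.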

After searching on Google we came upon this particular post, http://bit.ly/1RAqTO, which gave us the breakthrough. The solution was to convert each field to its BLOB (binary) equivalent and then convert the BLOB field back to a text type with the UTF8 charset. Following this pattern, we solved our problems. The Drupal site is going to be live at www.knr.gl shortly.

Text type     Binary type
CHAR          BINARY
VARCHAR       VARBINARY
TINYTEXT      TINYBLOB
TEXT          BLOB
MEDIUMTEXT    MEDIUMBLOB
LONGTEXT      LONGBLOB
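
The two-step pattern can be generated per column from the type pairs listed above. This is a rough sketch under some assumptions: the helper name conversion_statements and the example table and column are made up for illustration, and the generated MODIFY clauses carry only the type and charset, so attributes such as NOT NULL or DEFAULT would have to be added back by hand. Take a backup before running anything like this against real data.

```python
import re

# Text type -> binary type, as in the list above.
BINARY_TYPE = {
    "CHAR": "BINARY",
    "VARCHAR": "VARBINARY",
    "TINYTEXT": "TINYBLOB",
    "TEXT": "BLOB",
    "MEDIUMTEXT": "MEDIUMBLOB",
    "LONGTEXT": "LONGBLOB",
}


def conversion_statements(table, column, column_type):
    """Return the two ALTER statements for one column (hypothetical helper).

    Step 1 switches the column to its binary type so the bytes are kept as-is.
    Step 2 switches it back to the text type, now labelled as utf8.
    """
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*(\(\d+\))?\s*", column_type)
    if match is None:
        raise ValueError(f"unrecognised column type: {column_type!r}")
    base = match.group(1).upper()
    length = match.group(2) or ""  # e.g. "(255)" for VARCHAR(255)
    if base not in BINARY_TYPE:
        raise ValueError(f"no binary equivalent listed for {base}")
    return [
        f"ALTER TABLE `{table}` MODIFY `{column}` {BINARY_TYPE[base]}{length};",
        f"ALTER TABLE `{table}` MODIFY `{column}` {base}{length} CHARACTER SET utf8;",
    ]


# Example with an assumed table, column and type.
for statement in conversion_statements("tt_content", "header", "varchar(255)"):
    print(statement)
```

Going through the binary type matters because MySQL does not re-encode data when a column changes between a binary type and a text type; it only re-encodes when converting from one text charset to another.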