Home utf8mb4 character set for Rails application
Reply: 1

utf8mb4 character set for Rails application

Donato
1#
Donato Published in 2017-12-07 21:13:20Z

I am using MySQL version 5.7.12; I am trying to add emoji support in a text field. I found this article. It states to alter the table and column to enable utf8mb4, which supports true unicode, including 4 byte unicode characters. I have a table comments and text field content, so I perform the following operation:

class ConvertCommentsToUtf8mb4 < ActiveRecord::Migration
  def change
    # for each table that will store unicode execute:
    execute "ALTER TABLE comments CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin"
    # for each string/text column with unicode content execute:
    execute "ALTER TABLE comments CHANGE content TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin"
  end
end

When I try to run a migration, I get the following error:

Mysql2::Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SET utf8mb4 COLLATE utf8mb4_bin' at line 1: ALTER TABLE comments CHANGE content TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin/Users/myuser/.rvm/gems/ruby-2.1.2@core/gems/mysql2-0.3.21/lib/mysql2/client.rb:80:in `_query'

What might I be doing wrong?

Bill Karwin
2#
Bill Karwin Reply to 2017-12-07 22:23:35Z

When you use ALTER TABLE...CHANGE you need to give the column name twice, because CHANGE allows you to change the column name.

ALTER TABLE comments CHANGE content TEXT CHARACTER SET ...

In this case, it thought you were changing the column name to TEXT, with data type CHARACTER (which is a legit data type, it's a synonym for CHAR(1)), and then it got confused when it found the word SET.

So you can fix it in either of two ways:

ALTER TABLE comments CHANGE content content TEXT CHARACTER SET ...

ALTER TABLE comments MODIFY content TEXT CHARACTER SET ...

MODIFY is just like CHANGE but can't change the name of the column, so you don't need to spell out the column name twice.

That's the explanation for the syntax error, but FWIW you don't need to modify your text column after using CONVERT TO CHARACTER SET for the whole table. MySQL automatically changes the character set and collation for all string columns in the table.


Demo:

mysql> create table comments ( content text ) character set utf8;
Query OK, 0 rows affected (0.03 sec)

mysql> alter table comments convert to character set utf8mb4;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> insert into comments set content = 0xF09F92A9;
Query OK, 1 row affected (0.01 sec)

mysql> select * from comments;
+---------+
| content |
+---------+
| 💩        |
+---------+
You need to login account before you can post.

About| Privacy statement| Terms of Service| Advertising| Contact us| Help| Sitemap|
Processed in 0.356321 second(s) , Gzip On .

© 2016 Powered by mzan.com design MATCHINFO