
The Difference Between MySQL utf8 and utf8mb4

Date: 2018-05-26 20:24:24


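UTF-8 is a variable-length encoding: storing one code point takes 1 to 4 bytes.

However, MySQL's utf8 stores at most 3 bytes per code point.

So the utf8 character set cannot store every Unicode code point.

It can only store the range 0x0000 to 0xFFFF, known as the Basic Multilingual Plane (BMP).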
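The byte counts above can be checked with a small Python sketch. It uses Python's standard UTF-8 codec, not MySQL itself, and the helper names `utf8_byte_length` and `is_bmp` are made up for this illustration.

```python
def utf8_byte_length(ch: str) -> int:
    """Return how many bytes a single character occupies in standard UTF-8."""
    return len(ch.encode("utf-8"))


def is_bmp(ch: str) -> bool:
    """Return True if the code point lies in the Basic Multilingual Plane."""
    return ord(ch) <= 0xFFFF


# One sample character for each UTF-8 length, from 1 byte to 4 bytes.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X} bytes={utf8_byte_length(ch)} bmp={is_bmp(ch)}")
```

Only the last sample (an emoji, U+1F600) needs four bytes, and it is also the only one outside the BMP, which is why a three-byte limit excludes it.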

 

The character set named utf8 uses a maximum of three bytes per character and contains only BMP characters. As of MySQL 5.5.3, the utf8mb4 character set uses a maximum of four bytes per character and supports supplementary characters:

  • For a BMP character, utf8 and utf8mb4 have identical storage characteristics: same code values, same encoding, same length.

  • For a supplementary character, utf8 cannot store the character at all, while utf8mb4 requires four bytes to store it. Since utf8 cannot store the character at all, you do not have any supplementary characters in utf8 columns and you need not worry about converting characters or losing data when upgrading utf8 data from older versions of MySQL.
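As a rough illustration of the two cases above, the following Python sketch lists the supplementary characters in a string, that is, the ones a utf8 column could not hold. The function names are hypothetical, and the check runs on the client side; it does not query MySQL.

```python
from typing import List


def supplementary_chars(text: str) -> List[str]:
    """Return the characters in text whose code points lie outside the BMP."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


def fits_in_mysql_utf8(text: str) -> bool:
    """Return True if every character is a BMP character (3 bytes or fewer)."""
    return not supplementary_chars(text)


print(fits_in_mysql_utf8("\u4e2d\u6587"))        # BMP-only text
print(fits_in_mysql_utf8("hi \U0001F600"))       # contains an emoji
print(supplementary_chars("hi \U0001F600"))
```

For BMP-only text the encoded bytes are the same under either character set, so a check like this only matters for data that may contain emoji or other supplementary characters.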

So if you want your column to support characters outside the BMP, such as emoji (and you usually do), use "utf8mb4". See also "What are the most common non-BMP Unicode characters in actual use?"
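One way to switch an existing table is MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET` statement. The sketch below only builds the statement text; the table name `posts`, the helper name `convert_table_sql`, and the choice of the `utf8mb4_unicode_ci` collation are assumptions for illustration, not something the original article specifies.

```python
def convert_table_sql(table: str, collation: str = "utf8mb4_unicode_ci") -> str:
    """Build an ALTER TABLE statement that converts a table to utf8mb4.

    The table name is quoted with backticks, and any backtick inside
    the name is doubled, as MySQL identifier quoting requires.
    """
    quoted = "`" + table.replace("`", "``") + "`"
    return (
        f"ALTER TABLE {quoted} "
        f"CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
    )


# "posts" is a hypothetical table name.
print(convert_table_sql("posts"))
```

The connection character set generally needs to be utf8mb4 as well, otherwise four-byte characters may still be rejected or mangled before they reach the column. Running a conversion on real data is best tried on a backup first.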


Original article: https://www.cnblogs.com/zienzir/p/9094092.html
