Change database charset and collation


What are charset and collation?

The charset (character set) defines what characters the database can store (for example, UTF-8 allows characters from multiple languages). The collation defines how those characters are sorted and compared.

It's important to use the correct charset and collation to avoid problems with special characters, accents, and characters from other languages.


Most common charsets and collations

  • utf8mb4: Supports all Unicode characters, including emojis (recommended)
  • utf8: Old version, doesn't support all Unicode characters
  • latin1: Only basic Latin characters

Common collations for utf8mb4:

  • utf8mb4_unicode_ci: Unicode sorting (recommended)
  • utf8mb4_general_ci: Faster sorting but less precise
  • utf8mb4_spanish_ci: Specific sorting for Spanish

Check current charset and collation

To see the charset and collation of a database:

mysql -u root -p
USE database_name;
SHOW CREATE DATABASE database_name;

Or query directly:

SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME 
FROM information_schema.SCHEMATA 
WHERE SCHEMA_NAME = 'database_name';

Change database charset and collation

Method 1: ALTER DATABASE (recommended for new or empty database)

ALTER DATABASE database_name 
CHARACTER SET utf8mb4 
COLLATE utf8mb4_unicode_ci;

Method 2: Create new database with correct charset

If the database already has data, it's better to create a new one and migrate:

CREATE DATABASE new_database_name 
CHARACTER SET utf8mb4 
COLLATE utf8mb4_unicode_ci;

Change charset and collation of existing tables

If you need to change tables that already have data:

ALTER TABLE table_name 
CONVERT TO CHARACTER SET utf8mb4 
COLLATE utf8mb4_unicode_ci;

To change all tables in a database:

SELECT CONCAT('ALTER TABLE ', table_name, ' CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;') 
AS sql_statement 
FROM information_schema.TABLES 
WHERE TABLE_SCHEMA = 'database_name';

Copy and execute each generated command.


Change charset and collation of specific columns

To change only one column:

ALTER TABLE table_name 
MODIFY column_name VARCHAR(255) 
CHARACTER SET utf8mb4 
COLLATE utf8mb4_unicode_ci;

Complete process for database with existing data

1. Make full backup

mysqldump -u root -p database_name > backup.sql

2. Export data

mysqldump -u root -p --default-character-set=utf8mb4 database_name > data.sql

3. Create new database with correct charset

CREATE DATABASE new_database_name 
CHARACTER SET utf8mb4 
COLLATE utf8mb4_unicode_ci;

4. Import data

mysql -u root -p --default-character-set=utf8mb4 new_database_name < data.sql

5. Verify and rename

-- Verify everything is correct
DROP DATABASE database_name;
RENAME DATABASE new_database_name TO database_name;

Note: In recent MySQL versions, RENAME DATABASE may not be available. In that case, rename manually or recreate the original database.


Change default charset for new databases

For all new databases to use utf8mb4 by default, edit the configuration file:

sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf

Add in the [mysqld] section:

[mysqld]
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci

Restart MySQL:

sudo systemctl restart mysql

Verify changes

After making changes, verify:

-- Database
SHOW CREATE DATABASE database_name;

-- Tables
SHOW CREATE TABLE table_name;

-- Columns
SHOW FULL COLUMNS FROM table_name;

Common problems

Error: "Incorrect string value"

  • Occurs when trying to insert characters that are not compatible with the current charset.
  • Solution: Change charset to utf8mb4.

Strange characters or "?" instead of accents

  • Generally a problem with incorrect charset in the connection or database.
  • Verify that the connection uses utf8mb4 and that the database also uses it.

Sorting problems

  • If ORDER BY queries don't sort correctly, it may be a collation problem.
  • Change to a more appropriate collation for your language.

Best practices

  • Use utf8mb4: It's the current standard and supports all Unicode characters.
  • Use utf8mb4_unicode_ci: For more precise and internationally compatible sorting.
  • Configure when creating: Always specify charset and collation when creating databases and tables.
  • Make backups: Before changing charset in databases with data, make a full backup.
  • Test first: Test changes on a copy of the database before applying them in production.

Need help?

If you're having trouble changing the charset or collation of your database, open a ticket from the billing.baires.host panel or contact us for support.

You can also reach us through our social media:

Was this answer helpful? 0 Users Found This Useful (0 Votes)