HTML - HTML charsets
HTML charsets are the character encodings used in HTML documents. An encoding defines how text is represented as bytes so that browsers display the correct characters, including letters, symbols, and emojis. Here's a detailed overview:
1. Declaring Charset in HTML
The most common way to declare a character encoding is in the <head> section of your HTML:
<meta charset="UTF-8">
- UTF-8 is the most widely used charset today.
- It supports virtually all characters from all languages.
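To see how UTF-8 covers so many scripts, it helps to look at how many bytes it spends per character. The following is a minimal Python sketch; the sample characters and the helper name `utf8_length` are chosen for illustration and are not part of the original article.

```python
# Illustrative sample characters (an assumption of this sketch):
# Latin letter, accented letter, euro sign, CJK ideograph, emoji.
samples = ["A", "\u00e9", "\u20ac", "\u65e5", "\U0001F600"]


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    # Show the code point, the character itself, and its UTF-8 byte count.
    print(f"U+{ord(ch):04X} {ch!r}: {utf8_length(ch)} byte(s)")
```

ASCII characters stay at one byte each, which is why plain English text is the same in UTF-8 as in ASCII, while other characters take two to four bytes.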
Older syntax (HTML4/XHTML):
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
- ISO-8859-1 (Latin-1) was common in Western Europe and the U.S.
- Modern HTML prefers the <meta charset> syntax.
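Both declaration styles carry the same information, so a tool can read either one. The sketch below uses Python's standard `html.parser` module to pull the declared charset out of a document. The class and function names (`MetaCharsetParser`, `declared_charset`) are hypothetical, and this is a simplification, not the encoding-sniffing algorithm browsers actually run.

```python
from html.parser import HTMLParser


class MetaCharsetParser(HTMLParser):
    """Records the first charset declared by a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        if attr.get("charset"):
            # Modern syntax: <meta charset="UTF-8">
            self.charset = attr["charset"].strip()
        elif (attr.get("http-equiv") or "").lower() == "content-type":
            # Older syntax: look for "charset=..." inside the content attribute.
            content = attr.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip()


def declared_charset(html: str):
    """Return the charset named in the markup, or None if there is none."""
    parser = MetaCharsetParser()
    parser.feed(html)
    return parser.charset


print(declared_charset('<meta charset="UTF-8">'))
print(declared_charset(
    '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
))
```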
2. Common HTML Charsets
Charset | Description |
---|---|
UTF-8 | Unicode; supports all languages; preferred for modern web. |
ISO-8859-1 | Latin alphabet; Western European languages. |
UTF-16 | Unicode encoded in 16-bit code units (one or two per character); less common in HTML. |
Windows-1252 | Similar to ISO-8859-1 with extra symbols. |
Shift_JIS | Japanese encoding. |
GB2312 / GB18030 | Simplified Chinese. |
EUC-KR | Korean. |
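The differences between these charsets become concrete when the same character is encoded with each of them. The sketch below assumes that the Python codec names in `CODECS` correspond to the charsets in the table (`utf-16-le` stands in for UTF-16 without a byte order mark); the helper name `encode_everywhere` is hypothetical.

```python
# Assumed mapping from the table's charset labels to Python codec names.
CODECS = {
    "UTF-8": "utf-8",
    "ISO-8859-1": "latin-1",
    "UTF-16": "utf-16-le",
    "Windows-1252": "cp1252",
    "Shift_JIS": "shift_jis",
    "GB18030": "gb18030",
    "EUC-KR": "euc_kr",
}


def encode_everywhere(text: str) -> dict:
    """Map each charset label to the hex bytes for text, or None if the
    charset cannot represent it."""
    result = {}
    for label, codec in CODECS.items():
        try:
            result[label] = text.encode(codec).hex()
        except UnicodeEncodeError:
            result[label] = None
    return result


# The euro sign is one of the "extra symbols" Windows-1252 adds to ISO-8859-1.
for label, encoded in encode_everywhere("\u20ac").items():
    print(f"{label:13} {encoded}")
```

A `None` entry means the charset has no byte sequence for that character, which is the practical limit of the legacy encodings compared with UTF-8.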
3. Why Charset Matters
- Correct display of text: Without a proper charset declaration, characters may appear as � (replacement character) or gibberish.
- SEO & indexing: Search engines can read and index your content reliably only if they decode it with the right charset.
- Form submission: Ensures that user input is encoded and interpreted correctly.
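The failure modes listed above can be reproduced in a few lines. The following Python sketch decodes the bytes of one word with the wrong charset and shows how percent-encoded form data differs by charset. The sample word and variable names are illustrative assumptions.

```python
from urllib.parse import quote

text = "caf\u00e9"  # "café", with a precomposed e-acute

utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'

# UTF-8 bytes misread as Windows-1252 turn into gibberish (mojibake).
mojibake = utf8_bytes.decode("cp1252")

# Latin-1 bytes misread as UTF-8 are invalid; errors="replace" substitutes
# U+FFFD, the replacement character.
replaced = latin1_bytes.decode("utf-8", errors="replace")

# Percent-encoding of form data depends on the charset in use.
form_utf8 = quote(text, encoding="utf-8")
form_latin1 = quote(text, encoding="latin-1")

print(mojibake)     # cafÃ©
print(replaced)     # caf�
print(form_utf8)    # caf%C3%A9
print(form_latin1)  # caf%E9
```

A server that expects `caf%C3%A9` but receives `caf%E9` has to guess the charset, which is the kind of ambiguity a consistent UTF-8 declaration avoids.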
4. Best Practice
- Use UTF-8 for all modern HTML documents.
- Declare it as the first element inside <head> for best browser support; the declaration must appear within the first 1024 bytes of the document:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>My Website</title>
</head>
<body>
<p>Hello, world!</p>
</body>
</html>
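A declaration only helps if the file is really saved as UTF-8 and the `<meta>` tag appears early enough to be found. The sketch below is a rough check under those assumptions: it encodes the page above as UTF-8 and looks for the declaration within the first 1024 bytes. The function name `declares_utf8_early` and the regular expression are hypothetical simplifications, not the browser's own prescan.

```python
import re

PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>My Website</title>
</head>
<body>
<p>Hello, world!</p>
</body>
</html>
"""


def declares_utf8_early(data: bytes, limit: int = 1024) -> bool:
    """Return True if a UTF-8 <meta charset> declaration starts and ends
    within the first `limit` bytes of the document."""
    # The declaration itself is ASCII, so an ASCII decode is enough here.
    head = data[:limit].decode("ascii", errors="replace").lower()
    return re.search(r"<meta\s+charset\s*=\s*[\"']?utf-8", head) is not None


# Encode the text as UTF-8 so the bytes match what the page declares.
page_bytes = PAGE.encode("utf-8")

print(declares_utf8_early(page_bytes))
# Pushing the declaration past the limit makes the check fail.
print(declares_utf8_early(b" " * 2000 + page_bytes))
```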