Hex to UTF-8 Converter

Decode hexadecimal values to UTF-8 text instantly with our free online tool. Enter hex bytes and get immediate Unicode output—no signup needed. Perfect for debugging, data recovery, and international text analysis.

HEXADECIMAL

Enter hexadecimal values to convert

UTF-8

Outputs human-readable text characters

Loading converter...

How to Decode Hex to UTF-8

Paste Unicode Hex

Enter multi-byte sequences from databases or API logs

Decode Variable-Length

Each character reconstructed from 1-4 hex bytes automatically

Verify Character Rendering

Copy international text and check for mojibake or � symbols

Test Across Languages

Supports CJK, Arabic, emoji validation in real-time

Hex to UTF-8 Conversion Examples

Hexadecimal Input	UTF-8 Output	Description
CE 9A CE B1 CE BB CE B7 CE BC CE AD CF 81 CE B1	Καλημέρα	Greek text "Good morning" (2 bytes each)
E0 A4 A8 E0 A4 AE E0 A4 B8 E0 A5 8D E0 A4 A4 E0 A5 87	नमस्ते	Hindi "Namaste" (3 bytes with combining marks)
D7 A9 D7 9C D7 95 D7 9D	שלום	Hebrew "Shalom" (2 bytes, RTL script)
E0 B8 AA E0 B8 A7 E0 B8 B1 E0 B8 AA E0 B8 94 E0 B8 B5	สวัสดี	Thai "Hello" (3 bytes with tone marks)
F0 9F 98 80 F0 9F 8E 89	😀🎉	Emoji characters (4 bytes each)
C2 A9 20 32 30 32 34	© 2024	Copyright symbol + ASCII year

What is Hex to UTF-8 Conversion?

Hex to UTF-8 decoding solves internationalization mysteries by reconstructing multilingual text from hexadecimal byte sequences. Database administrators use this when debugging charset issues—finding that corrupted text like "Ã©" actually contains valid UTF-8 hex bytes C3 A9 (é) that were misinterpreted. The decoder validates proper byte sequence patterns: single bytes 00-7F for ASCII compatibility, two-byte sequences starting with C2-DF for Latin extensions, three-byte sequences (E0-EF) for most international scripts including Chinese/Japanese/Korean, and four-byte sequences (F0-F7) for emoji and rare symbols.

UTF-8's self-synchronizing design allows the decoder to detect invalid sequences—continuation bytes must follow the pattern 10xxxxxx (hex 80-BF), otherwise the data is corrupted or uses a different encoding. Web developers troubleshoot API responses by examining hex bytes: E4 B8 AD E6 96 87 correctly decodes to "中文" (Chinese), while malformed sequences reveal encoding conversion errors between systems. This is critical for applications serving international users where character corruption can break functionality. Testing tools validate that database BLOB fields, JSON payloads, and log files preserve UTF-8 integrity across system boundaries. Encode UTF-8 text to hex using our UTF-8 to Hex converter.

UTF-8 Byte Sequences: Decoding Guide

Byte Length by Range

00-7F→ 1 byte (ASCII)

C2-DF + 80-BF→ 2 bytes

E0-EF + 80-BF...→ 3 bytes

F0-F7 + 80-BF...→ 4 bytes

Common Decoding Examples

41→ A

C3 A9→ é

E2 82 AC→ €

E4 B8 AD→ 中

💡 Pro Tip: When debugging mojibake (garbled text), check if bytes starting with C2-F4 are being decoded correctly—invalid UTF-8 sequences often indicate double-encoding or wrong charset assumptions in database migrations.

Common Use Cases for Hex to UTF-8 Decoding

🌐

Internationalization & i18n

• Decode multilingual database content
• Debug international character issues
• Recover UTF-8 text from hex dumps
• Validate Unicode character encoding

🔍

Web API & JSON Debugging

• GraphQL response hex fields
• REST API Unicode payloads
• WebSocket message inspection
• JSON string encoding errors

💾

Data Recovery & Forensics

• Recover corrupted text files
• Extract readable data from hex dumps
• Analyze binary file contents
• Reverse engineer data formats

🌐 Unicode Decoder Features

✓

Byte Validator

Detects invalid patterns

✓

Multi-Language

All Unicode scripts

✓

Database Tool

Hex-stored fields

✓

Error Finder

Charset corruption

✓

Text Recovery

Multilingual content

✓

API Decoder

JSON with UTF-8

✓

Mojibake Fix

Repair garbled text

Understanding UTF-8 Decoding from Hexadecimal

Validating UTF-8 byte sequences requires checking continuation byte patterns—a 3-byte character like E2 82 AC (€) must have E2 (starts 1110xxxx), then 82 and AC (both start 10xxxxxx). Invalid sequences like C2 C3 (C3 doesn't start with 10) indicate corruption. Database charset migrations often produce these malformed sequences when UTF-8 data gets double-encoded or decoded with wrong charset assumptions.

Mojibake appears when UTF-8 bytes are decoded as Latin-1 or Windows-1252: café becomes cafÃ© because C3 A9 (UTF-8 é) displays as two characters in single-byte encodings. Conversely, E9 is é in Latin-1 but invalid UTF-8 (E9 requires continuation bytes that aren't present). Converting suspect text to hex immediately reveals whether bytes follow UTF-8 patterns. The ASCII to Hex table shows the 00-7F subset valid in both encodings.

Byte Order Marks like EF BB BF at file starts indicate UTF-8 encoding but cause parsing issues in many systems—stripping them fixes hidden bugs. Overlong encodings like C0 80 for NULL (correct: 00) are security vulnerabilities used to bypass validation filters. Standards-compliant decoders reject these. For reverse encoding, use our UTF-8 to Hex converter. For simpler ASCII, try Hex to ASCII.

Unicode Decoding Answers

This is double-encoding. The UTF-8 bytes C3 A9 (é) were stored correctly but decoded as Latin-1. Don't re-save! Convert displayed text back to bytes, then decode as UTF-8 to recover.

Why do Chinese characters sometimes show as ??? or boxes?

Missing fonts or wrong encoding. Convert to hex: valid Chinese is E4-E9 range (3-byte sequences). If hex shows E4-E9 but displays wrong, install CJK font. If hex is 3F 3F 3F, data is corrupted.

What does �� (replacement character) indicate in decoded output?

Invalid UTF-8 byte sequence. Common causes: wrong encoding assumption, truncated multi-byte char, or binary data. Check if leading bytes (C2-F7) have correct continuation bytes (80-BF).

How do I validate API responses that claim to be UTF-8?

Copy suspect characters to hex. Verify byte patterns: 2-byte (C2-DF), 3-byte (E0-EF), 4-byte (F0-F7) with proper continuations (80-BF). Invalid sequences indicate server encoding bugs.

Can I recover original text from a corrupted JSON file?

Extract hex sections around corrupted areas. Look for valid UTF-8 patterns—consecutive E-range bytes (CJK), C-range (Latin extended). Piece together readable fragments, then reconstruct JSON structure.

What's the overlong encoding vulnerability in UTF-8?

Encoding ASCII chars with 2+ bytes (C0 80 for NULL instead of 00) bypasses security filters. Decoders must reject overlong sequences. Our converter follows specs and rejects these malformed patterns.