Hex to UTF-8 Converter
Decode hexadecimal values to UTF-8 text instantly with our free online tool. Enter hex bytes and get immediate Unicode output—no signup needed. Perfect for debugging, data recovery, and international text analysis.
How to Decode Hex to UTF-8
Paste Unicode Hex
Enter multi-byte sequences from databases or API logs
Decode Variable-Length
Each character reconstructed from 1-4 hex bytes automatically
Verify Character Rendering
Copy international text and check for mojibake or � symbols
Test Across Languages
Supports CJK, Arabic, emoji validation in real-time
Hex to UTF-8 Conversion Examples
Hexadecimal Input | UTF-8 Output | Description |
---|---|---|
CE 9A CE B1 CE BB CE B7 CE BC CE AD CF 81 CE B1 | Καλημέρα | Greek text "Good morning" (2 bytes each) |
E0 A4 A8 E0 A4 AE E0 A4 B8 E0 A5 8D E0 A4 A4 E0 A5 87 | नमस्ते | Hindi "Namaste" (3 bytes with combining marks) |
D7 A9 D7 9C D7 95 D7 9D | שלום | Hebrew "Shalom" (2 bytes, RTL script) |
E0 B8 AA E0 B8 A7 E0 B8 B1 E0 B8 AA E0 B8 94 E0 B8 B5 | สวัสดี | Thai "Hello" (3 bytes with tone marks) |
F0 9F 98 80 F0 9F 8E 89 | 😀🎉 | Emoji characters (4 bytes each) |
C2 A9 20 32 30 32 34 | © 2024 | Copyright symbol + ASCII year |
What is Hex to UTF-8 Conversion?
Hex to UTF-8 decoding solves internationalization mysteries by reconstructing multilingual text from hexadecimal byte sequences. Database administrators use this when debugging charset issues—finding that corrupted text like "é" actually contains valid UTF-8 hex bytes C3 A9 (é) that were misinterpreted. The decoder validates proper byte sequence patterns: single bytes 00-7F for ASCII compatibility, two-byte sequences starting with C2-DF for Latin extensions, three-byte sequences (E0-EF) for most international scripts including Chinese/Japanese/Korean, and four-byte sequences (F0-F7) for emoji and rare symbols.
UTF-8's self-synchronizing design allows the decoder to detect invalid sequences—continuation bytes must follow the pattern 10xxxxxx (hex 80-BF), otherwise the data is corrupted or uses a different encoding. Web developers troubleshoot API responses by examining hex bytes: E4 B8 AD E6 96 87 correctly decodes to "中文" (Chinese), while malformed sequences reveal encoding conversion errors between systems. This is critical for applications serving international users where character corruption can break functionality. Testing tools validate that database BLOB fields, JSON payloads, and log files preserve UTF-8 integrity across system boundaries. Encode UTF-8 text to hex using our UTF-8 to Hex converter.
UTF-8 Byte Sequences: Decoding Guide
Byte Length by Range
Common Decoding Examples
Common Use Cases for Hex to UTF-8 Decoding
Internationalization & i18n
- • Decode multilingual database content
- • Debug international character issues
- • Recover UTF-8 text from hex dumps
- • Validate Unicode character encoding
Web API & JSON Debugging
- • GraphQL response hex fields
- • REST API Unicode payloads
- • WebSocket message inspection
- • JSON string encoding errors
Data Recovery & Forensics
- • Recover corrupted text files
- • Extract readable data from hex dumps
- • Analyze binary file contents
- • Reverse engineer data formats
🌐 Unicode Decoder Features
Byte Validator
Detects invalid patterns
Multi-Language
All Unicode scripts
Database Tool
Hex-stored fields
Error Finder
Charset corruption
Text Recovery
Multilingual content
API Decoder
JSON with UTF-8
Mojibake Fix
Repair garbled text
Understanding UTF-8 Decoding from Hexadecimal
Validating UTF-8 byte sequences requires checking continuation byte patterns—a 3-byte character like E2 82 AC
(€) must have E2 (starts 1110xxxx), then 82 and AC (both start 10xxxxxx). Invalid sequences like C2 C3
(C3 doesn't start with 10) indicate corruption. Database charset migrations often produce these malformed sequences when UTF-8 data gets double-encoded or decoded with wrong charset assumptions.
Mojibake appears when UTF-8 bytes are decoded as Latin-1 or Windows-1252: café becomes café because C3 A9
(UTF-8 é) displays as two characters in single-byte encodings. Conversely, E9
is é in Latin-1 but invalid UTF-8 (E9 requires continuation bytes that aren't present). Converting suspect text to hex immediately reveals whether bytes follow UTF-8 patterns. The ASCII to Hex table shows the 00-7F subset valid in both encodings.
Byte Order Marks like EF BB BF
at file starts indicate UTF-8 encoding but cause parsing issues in many systems—stripping them fixes hidden bugs. Overlong encodings like C0 80
for NULL (correct: 00
) are security vulnerabilities used to bypass validation filters. Standards-compliant decoders reject these. For reverse encoding, use our UTF-8 to Hex converter. For simpler ASCII, try Hex to ASCII.
Unicode Decoding Answers
How do I fix café displaying as café in my database?
This is double-encoding. The UTF-8 bytes C3 A9 (é) were stored correctly but decoded as Latin-1. Don't re-save! Convert displayed text back to bytes, then decode as UTF-8 to recover.
Why do Chinese characters sometimes show as ??? or boxes?
Missing fonts or wrong encoding. Convert to hex: valid Chinese is E4-E9 range (3-byte sequences). If hex shows E4-E9 but displays wrong, install CJK font. If hex is 3F 3F 3F, data is corrupted.
What does �� (replacement character) indicate in decoded output?
Invalid UTF-8 byte sequence. Common causes: wrong encoding assumption, truncated multi-byte char, or binary data. Check if leading bytes (C2-F7) have correct continuation bytes (80-BF).
How do I validate API responses that claim to be UTF-8?
Copy suspect characters to hex. Verify byte patterns: 2-byte (C2-DF), 3-byte (E0-EF), 4-byte (F0-F7) with proper continuations (80-BF). Invalid sequences indicate server encoding bugs.
Can I recover original text from a corrupted JSON file?
Extract hex sections around corrupted areas. Look for valid UTF-8 patterns—consecutive E-range bytes (CJK), C-range (Latin extended). Piece together readable fragments, then reconstruct JSON structure.
What's the overlong encoding vulnerability in UTF-8?
Encoding ASCII chars with 2+ bytes (C0 80 for NULL instead of 00) bypasses security filters. Decoders must reject overlong sequences. Our converter follows specs and rejects these malformed patterns.
Convert UTF8 characters to hexadecimal numbers.
Convert hexadecimal numbers to text characters.
Convert hexadecimal numbers to ASCII characters.
Convert hexadecimal numbers to a string of characters.
ASCII to hex conversion table.