1
0
mirror of https://github.com/swisskyrepo/PayloadsAllTheThings.git synced 2025-08-21 01:44:39 +02:00
PayloadsAllTheThings/Encoding Transformations
2025-08-13 12:07:35 +02:00
..
README.md SQL injection - Fix linting errors 2025-08-13 12:07:35 +02:00

Encoding and Transformations

Encoding and Transformations are techniques that change how data is represented or transferred without altering its core meaning. Common examples include URL encoding, Base64, HTML entity encoding, and Unicode transformations. Attackers use these methods as gadgets to bypass input filters, evade web application firewalls, or break out of sanitization routines.

Summary

Unicode

Unicode is a universal character encoding standard used to represent text from virtually every writing system in the world. Each character (letters, numbers, symbols, emojis) is assigned a unique code point (for example, U+0041 for "A"). Unicode encoding formats like UTF-8 and UTF-16 specify how these code points are stored as bytes.

Unicode Normalization

Unicode normalization is the process of converting Unicode text into a standardized, consistent form so that equivalent characters are represented the same way in memory.

Unicode Normalization reference table

  • NFC (Normalization Form Canonical Composition): Combines decomposed sequences into precomposed characters where possible.
  • NFD (Normalization Form Canonical Decomposition): Breaks characters into their decomposed forms (base + combining marks).
  • NFKC (Normalization Form Compatibility Composition): Like NFC, but also replaces characters with compatibility equivalents (may change appearance/format).
  • NFKD (Normalization Form Compatibility Decomposition): Like NFD, but also decomposes compatibility characters.
Character Payload After Normalization
(U+2025) ‥/‥/‥/etc/passwd ../../../etc/passwd
(U+FE30) ///etc/passwd ../../../etc/passwd
(U+FF07) or 1=1 ' or '1'='1
(U+FF02) or 1=1 " or "1"="1
(U+FE63) admin'﹣﹣ admin'--
(U+3002) domain。com domain.com
(U+FF0F) domain.com //domain.com
(U+FF1C) img src=a <img src=a/>
(U+FE5B) ﹛﹛3+3﹜﹜ {{3+3}}
(U+FF3B) 5+5 [[5+5]]
(U+FF06) whoami &&whoami
(U+FF50) shell.pʰp shell.php
ʰ (U+02B0) shell.pʰp shell.php
ª (U+00AA) ªdmin admin
import unicodedata
string = "ᴾᵃʸˡᵒᵃᵈˢ𝓐𝓵𝓵𝕋𝕙𝕖𝒯𝒽𝒾𝓃ℊ𝓈"
print ('NFC: ' + unicodedata.normalize('NFC', string))
print ('NFD: ' + unicodedata.normalize('NFD', string))
print ('NFKC: ' + unicodedata.normalize('NFKC', string))
print ('NFKD: ' + unicodedata.normalize('NFKD', string))

Punycode

Punycode is a way to represent Unicode characters (including non-ASCII letters, symbols, and scripts) using only the limited set of ASCII characters (letters, digits, and hyphens).

It's mainly used in the Domain Name System (DNS), which traditionally supports only ASCII. Punycode allows internationalized domain names (IDNs), so that domain names can include characters from many languages by converting them into a safe ASCII form.

Visible in Browser (IDN support) Actual ASCII (Punycode)
раypal.com xn--ypal-43d9g.com
paypal.com paypal.com

In MySQL, similar character are treated as equal. This behavior can be abused in Password Reset, Forgot Password, and OAuth Provider sections.

SELECT 'a' = 'ᵃ';
+-------------+
| 'a' = 'ᵃ'   |
+-------------+
|           1 |
+-------------+

This trick works the SQL query uses COLLATE utf8mb4_0900_as_cs.

SELECT 'a' = 'ᵃ' COLLATE utf8mb4_0900_as_cs;
+----------------------------------------+
| 'a' = 'ᵃ' COLLATE utf8mb4_0900_as_cs   |
+----------------------------------------+
|                                      0 |
+----------------------------------------+

Base64

Base64 encoding is a method for converting binary data (like images or files) or text with special characters into a readable string that uses only ASCII characters (A-Z, a-z, 0-9, +, and /). Every 3 bytes of input are divided into 4 groups of 6 bits and mapped to 4 Base64 characters. If the input isn't a multiple of 3 bytes, the output is padded with = characters.

echo -n admin | base64                            
YWRtaW4=

echo -n YWRtaW4= | base64 -d
admin

Labs

References