What is URL Encoding? — Percent-Encoding Explained
Definition
URL encoding, also called percent-encoding, is a mechanism for representing characters that have special meaning in URLs or that are not allowed by the URI specification. Each such character is replaced, byte by byte, with a % symbol followed by the two-digit hexadecimal value of the byte (for ASCII characters, this is the ASCII code). For example, a space becomes %20, and an ampersand becomes %26. This keeps URLs valid and unambiguous when transmitted.
How It Works
The URL encoding process identifies characters that are not in the "unreserved" set (letters, digits, -, _, ., ~). For each such character, the encoder takes each byte of its UTF-8 representation, writes it as two hexadecimal digits, and prepends a % sign. Reserved characters such as /, ?, and & are left as they are only when they act as URL delimiters; when they appear as data inside a component, they are encoded like any other character. Multi-byte UTF-8 characters produce multiple %XX sequences. For example, the euro sign € is encoded as %E2%82%AC because its UTF-8 representation is 3 bytes: 0xE2, 0x82, 0xAC. Decoding reverses this by replacing each %XX sequence with the corresponding byte and then decoding the resulting bytes as UTF-8.
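The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names `UNRESERVED`, `percent_encode`, and `percent_decode` are hypothetical, every character outside the unreserved set is encoded (as you would for a single component value), and the decoder assumes well-formed input.

```python
import string

# The "unreserved" set described above: letters, digits, -, _, ., ~
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")

def percent_encode(text: str) -> str:
    """Encode every character outside the unreserved set as %XX per UTF-8 byte."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # A multi-byte character yields one %XX sequence per byte.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)

def percent_decode(encoded: str) -> str:
    """Replace each %XX with its byte, then decode the bytes as UTF-8.

    Assumes well-formed input: every % is followed by two hex digits.
    """
    raw = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == "%":
            raw.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        else:
            raw.extend(encoded[i].encode("utf-8"))
            i += 1
    return raw.decode("utf-8")

print(percent_encode("€"))          # %E2%82%AC
print(percent_decode("%E2%82%AC"))  # €
```

In practice a standard-library function (in Python, `urllib.parse.quote` and `urllib.parse.unquote`) is the safer choice, since it also handles malformed input.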
Common Use Cases
- Passing query string parameters that contain special characters like &, =, or spaces
- Encoding form data submitted with application/x-www-form-urlencoded content type
- Safely embedding user-generated text in URLs for search or filtering
- Constructing API requests where path or query parameters contain arbitrary text
- Encoding file names with spaces or Unicode in download URLs
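Several of these cases come down to encoding each path segment and each query parameter separately and then joining them. The sketch below shows one way to do that with Python's standard `urllib.parse` module. The helper `build_url`, the host `https://example.com`, and the sample values are illustrative assumptions, not part of any real API.

```python
from urllib.parse import quote, urlencode

def build_url(base: str, path_segments: list, params: dict) -> str:
    """Hypothetical helper: encode each component, then join with delimiters."""
    # safe="" so that a "/" inside a segment is encoded instead of
    # being read as a path separator.
    path = "/".join(quote(segment, safe="") for segment in path_segments)
    # quote_via=quote encodes spaces as %20; the default (quote_plus)
    # would produce + for spaces, as HTML forms do.
    query = urlencode(params, quote_via=quote)
    return f"{base}/{path}?{query}"

url = build_url(
    "https://example.com",
    ["files", "annual report.pdf"],
    {"q": "R&D budget", "tag": "a=b"},
)
print(url)
# https://example.com/files/annual%20report.pdf?q=R%26D%20budget&tag=a%3Db
```

The structural characters (:, /, ?, &, =) are added only when joining, so they never pass through the encoder.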
Example
Input: "hello world & foo=bar"
Encoded: "hello%20world%20%26%20foo%3Dbar"

Input: "café"
Encoded: "caf%C3%A9" (é is UTF-8 bytes 0xC3 0xA9)
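These two examples can be reproduced with Python's standard library, shown here as a quick check rather than as the only way to encode. `urllib.parse.quote` treats `/` as safe by default, so the sketch passes `safe=""` to encode everything outside the unreserved set. The variable name `encoded_examples` is just for illustration.

```python
from urllib.parse import quote, unquote

inputs = ["hello world & foo=bar", "café"]

# Map each input to its percent-encoded form.
encoded_examples = {text: quote(text, safe="") for text in inputs}

for text, encoded in encoded_examples.items():
    print(f"{text!r} -> {encoded!r}")
    # Decoding should give back the original text.
    assert unquote(encoded) == text
```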
Related Tools
Encode text for safe use in URLs using percent-encoding.
Decode percent-encoded URL strings back to readable text.
Break down a URL into its individual components using the browser URL API.
Parse a query string into a key-value table with decoded values.
FAQ
- What is the difference between %20 and + for spaces?
- In query strings submitted from HTML forms, + represents a space (application/x-www-form-urlencoded). In path components and modern API usage, %20 is the correct encoding. A + in a path is read as a literal plus sign, not as a space.
- Should I encode the entire URL?
- No. Encode only the individual components (path segments, query parameter names and values), not the full URL. Encoding structural characters like :, /, and ? would break the URL structure.
- Why does the same character sometimes have different encodings?
- Some characters are encoded differently depending on context. For example, + means space in form data, so a literal plus sign must be written as %2B there, while in a path segment a + is simply a literal plus sign. Always use the correct encoding function for the URL component you are building.
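The difference between the two space encodings can be seen by comparing Python's `quote` (percent-encoding, suitable for path segments) with `quote_plus` (form-style encoding). This is a small sketch using the standard `urllib.parse` module; the sample string and variable names are arbitrary.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "1 + 1"

path_style = quote(text, safe="")  # spaces become %20
form_style = quote_plus(text)      # spaces become +, literal + becomes %2B

print(path_style)  # 1%20%2B%201
print(form_style)  # 1+%2B+1

# Decoding must match the encoding style: unquote leaves + untouched,
# while unquote_plus turns it back into a space.
print(unquote("a+b"))       # a+b
print(unquote_plus("a+b"))  # a b
```

Decoding form data with `unquote`, or a path with `unquote_plus`, silently changes the text, which is why the decoder should be chosen per component as well.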