
What is URL Encoding? — Percent-Encoding Explained

Definition

URL encoding, also called percent-encoding, is a mechanism for representing characters that have special meaning in URLs or that the URI specification does not allow. Each such character is replaced by a % symbol followed by two hexadecimal digits for every byte of its UTF-8 encoding; for an ASCII character that is a single byte equal to its ASCII code. For example, a space becomes %20, and an ampersand becomes %26. This keeps URLs valid and unambiguous when transmitted.

How It Works

When encoding a component value, the process identifies characters that are not in the "unreserved" set (letters, digits, -, _, ., ~). For each such character, the encoder takes each byte of its UTF-8 encoding, writes that byte as two hexadecimal digits, and prepends a % sign. Multi-byte UTF-8 characters therefore produce multiple %XX sequences. For example, the euro sign € is encoded as %E2%82%AC because its UTF-8 representation is 3 bytes: 0xE2, 0x82, 0xAC. Decoding reverses this by replacing each %XX sequence with the corresponding byte and then decoding the resulting byte sequence as UTF-8.
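The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production encoder: the names percent_encode and percent_decode are hypothetical, every character outside the unreserved set is encoded, and malformed input such as a stray % is not handled.

```python
# Unreserved characters pass through unchanged.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)


def percent_encode(text: str) -> str:
    """Percent-encode every character outside the unreserved set."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Each UTF-8 byte becomes one %XX triplet (uppercase hex).
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


def percent_decode(encoded: str) -> str:
    """Replace each %XX with its byte, then decode the bytes as UTF-8."""
    raw = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == "%":
            # The two characters after % are the hex digits of one byte.
            raw.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        else:
            raw.extend(encoded[i].encode("utf-8"))
            i += 1
    return raw.decode("utf-8")


print(percent_encode("\u20ac"))         # euro sign -> %E2%82%AC
print(percent_decode("caf%C3%A9"))      # -> café
```

In real code, the standard library's urllib.parse.quote and urllib.parse.unquote do the same job and also cope with edge cases this sketch ignores.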

Common Use Cases

  • Passing query string parameters that contain special characters like &, =, or spaces
  • Encoding form data submitted with application/x-www-form-urlencoded content type
  • Safely embedding user-generated text in URLs for search or filtering
  • Constructing API requests where path or query parameters contain arbitrary text
  • Encoding file names with spaces or unicode in download URLs
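Two of these cases, form-style parameters and file names in download URLs, might look like the following with Python's urllib.parse. The parameter names, the file name, and the example.com host are made up for illustration.

```python
from urllib.parse import quote, urlencode

# Form-style data: urlencode uses quote_plus by default,
# so spaces become "+" and "&" becomes %26.
form_body = urlencode({"q": "salt & pepper", "page": 2})
print(form_body)  # q=salt+%26+pepper&page=2

# File name used as a single path segment: safe="" makes quote
# encode "/" as well, so the name cannot introduce extra segments.
file_name = quote("r\u00e9sum\u00e9 final.pdf", safe="")
print(file_name)  # r%C3%A9sum%C3%A9%20final.pdf

download_url = "https://example.com/files/" + file_name
```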

Example

Input: "hello world & foo=bar"
Encoded: "hello%20world%20%26%20foo%3Dbar"

Input: "café"
Encoded: "caf%C3%A9"  (é is UTF-8 bytes 0xC3 0xA9)

FAQ

What is the difference between %20 and + for spaces?
In application/x-www-form-urlencoded data, which is what HTML forms submit in query strings and request bodies, + represents a space. In path components and modern API usage, %20 is the correct encoding. A + in a path is a literal plus sign, not a space.
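As a small illustration of the two conventions, Python's urllib.parse offers a separate function pair for each. The sample strings are arbitrary.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "a b+c"

# Percent-encoding for paths: space -> %20, literal plus -> %2B.
path_style = quote(text, safe="")
print(path_style)  # a%20b%2Bc

# Form encoding: space -> +, literal plus -> %2B.
form_style = quote_plus(text)
print(form_style)  # a+b%2Bc

# Decoding must match the convention that was used to encode.
print(unquote("a+b"))       # a+b  (plus stays a plus)
print(unquote_plus("a+b"))  # a b  (plus becomes a space)
```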
Should I encode the entire URL?
No. Encode only the individual components (path segments, query parameter names and values), not the full URL. Encoding structural characters such as :, /, and ? would break the URL structure.
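One way to follow this advice, sketched with Python's urllib.parse: encode each component on its own, then join them with the structural characters. The helper name build_url, the host, and the sample values are hypothetical.

```python
from urllib.parse import quote, urlencode, urlunsplit


def build_url(host, segments, params):
    """Encode each path segment and query parameter separately."""
    # safe="" encodes "/" inside a segment so it stays one segment.
    path = "/" + "/".join(quote(s, safe="") for s in segments)
    # quote_via=quote gives %20 for spaces instead of "+".
    query = urlencode(params, quote_via=quote)
    return urlunsplit(("https", host, path, query, ""))


url = build_url("example.com", ["docs", "a/b c"], {"q": "x&y=z"})
print(url)  # https://example.com/docs/a%2Fb%20c?q=x%26y%3Dz

# Encoding the whole URL at once destroys its structure.
broken = quote("https://example.com/a b?q=1", safe="")
print(broken)  # https%3A%2F%2Fexample.com%2Fa%20b%3Fq%3D1
```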
Why does the same character sometimes have different encodings?
Some characters have context-dependent encoding. For example, + means space in form data, so a literal plus sign must be sent as %2B there, whereas in a path segment + is simply a literal plus sign. Always use the encoding function that matches the URL component you are building.
