HTTP Encoding¶
URL encoding, also known as percent-encoding, is a mechanism used to convert characters into a format that can be transmitted over the Internet. URLs (Uniform Resource Locators) can only contain a limited set of characters, consisting mainly of alphanumeric characters along with a few special characters such as hyphens, underscores, periods, and tilde (~).
When a URL contains characters outside this set, such as spaces or non-alphanumeric characters like ampersands, question marks, or slashes, URL encoding is used to represent those characters in a safe and compatible format.
URL encoding works by replacing each non-alphanumeric character with a percent sign (%) followed by two hexadecimal digits that represent the ASCII (or Unicode) code of the character. For example:
Space (’ ‘) is encoded as %20.
Ampersand (’&’) is encoded as %26.
Question mark (‘?’) is encoded as %3F.
Slash (‘/’) is encoded as %2F.
URL encoding ensures that URLs remain valid and functional across different systems and protocols. It is commonly used in web browsers, HTTP requests, and other internet-related technologies to transmit data safely and reliably. Most programming languages provide built-in functions or libraries to perform URL encoding and decoding operations.
PHP provides urlencode() and urldecode() to handle coding and decoding this format.
<?php
$text = "This is the Euro symbol '€'.";
// builds a valid URL
$url = 'https://www.somesite.com/'.urlencode($text);
// https://www.somesite.com/This+is+the+Euro+symbol+%27%E2%82%AC%27.
?>
See also urlencode(), urldecode()
Related : Text Encoding