HTTP Encoding

URL encoding, also known as percent-encoding, is a mechanism for representing arbitrary characters in a URL (Uniform Resource Locator) using only characters that are safe to transmit. URLs can only contain a limited set of ASCII characters. The ones that never need encoding are the alphanumeric characters plus hyphen (-), underscore (_), period (.), and tilde (~).

When a URL has to carry characters outside this set, such as spaces or non-ASCII characters, or characters with a reserved meaning in URLs, like ampersands, question marks, or slashes used as data rather than as delimiters, URL encoding is used to represent those characters in a safe and compatible format.

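URL encoding works by replacing each character that cannot appear literally with a percent sign (%) followed by two hexadecimal digits that give the value of one byte. An ASCII character produces a single such group. Other characters are normally converted to UTF-8 first and produce one group per byte, so the Euro sign (€) becomes %E2%82%AC. For example: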

  • Space (‘ ’) is encoded as %20.

  • Ampersand (‘&’) is encoded as %26.

  • Question mark (‘?’) is encoded as %3F.

  • Slash (‘/’) is encoded as %2F.
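The replacement rule above can be sketched in a few lines of PHP. This is only an illustration of the mechanism, assuming UTF-8 input; percentEncode() is a hypothetical helper name, and real code would normally call the built-in rawurlencode(), which produces the same result for these inputs.

```php
<?php
// Hypothetical helper, written only to illustrate the mechanism.
// In real code, prefer PHP's built-in rawurlencode().
function percentEncode(string $text): string
{
    // Characters that are always safe to leave as-is.
    $unreserved = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
        . 'abcdefghijklmnopqrstuvwxyz'
        . '0123456789-._~';

    $encoded = '';
    // Walk the string byte by byte: a multi-byte UTF-8 character
    // therefore yields several %XX groups.
    for ($i = 0, $length = strlen($text); $i < $length; $i++) {
        $byte = $text[$i];
        if (strpos($unreserved, $byte) !== false) {
            $encoded .= $byte;
        } else {
            // '%' followed by the byte value as two uppercase hex digits.
            $encoded .= sprintf('%%%02X', ord($byte));
        }
    }

    return $encoded;
}

echo percentEncode('a b&c?d/e'), PHP_EOL; // a%20b%26c%3Fd%2Fe
echo percentEncode('€'), PHP_EOL;         // %E2%82%AC
```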

URL encoding ensures that URLs remain valid and functional across different systems and protocols. It is commonly used in web browsers, HTTP requests, and other internet-related technologies to transmit data safely and reliably. Most programming languages provide built-in functions or libraries to perform URL encoding and decoding operations.
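In PHP, for instance, the query string of an HTTP request can be assembled with http_build_query(), which encodes each value so that an ampersand inside the data is not mistaken for a parameter separator. The parameter names and values below are made up for the illustration.

```php
<?php
// Hypothetical parameters, chosen because the value contains '&' and spaces.
$params = ['q' => 'fish & chips', 'lang' => 'en'];

// Default encoding (PHP_QUERY_RFC1738): spaces become '+'.
$query = http_build_query($params, '', '&');
echo $query, PHP_EOL; // q=fish+%26+chips&lang=en

// PHP_QUERY_RFC3986: spaces become '%20'.
$strict = http_build_query($params, '', '&', PHP_QUERY_RFC3986);
echo $strict, PHP_EOL; // q=fish%20%26%20chips&lang=en

// parse_str() decodes the query string back into the original values.
parse_str($query, $decoded);
var_dump($decoded === $params); // bool(true)
```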

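PHP provides urlencode() and urldecode() to handle encoding and decoding this format. Note that urlencode() follows the HTML form convention and writes a space as a plus sign (+) rather than %20; rawurlencode() and rawurldecode() use %20 instead.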

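<?php

$text = "This is the Euro symbol '€'.";

// urlencode() makes the text safe to embed in a URL:
// spaces become '+', the Euro sign becomes its three UTF-8 bytes
$url = 'https://www.somesite.com/'.urlencode($text);

echo $url;
// https://www.somesite.com/This+is+the+Euro+symbol+%27%E2%82%AC%27.

?>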
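As a complement to the example above, the following sketch compares urlencode() with rawurlencode() on the same text and decodes both results again. It assumes the source file is saved as UTF-8; the variable names are arbitrary.

```php
<?php
$text = "This is the Euro symbol '€'.";

// urlencode(): form-style encoding, spaces become '+'.
$form = urlencode($text);
// rawurlencode(): RFC 3986 style, spaces become '%20'.
$raw = rawurlencode($text);

echo $form, PHP_EOL; // This+is+the+Euro+symbol+%27%E2%82%AC%27.
echo $raw, PHP_EOL;  // This%20is%20the%20Euro%20symbol%20%27%E2%82%AC%27.

// Each decoder reverses its matching encoder.
var_dump(urldecode($form) === $text);   // bool(true)
var_dump(rawurldecode($raw) === $text); // bool(true)

// rawurldecode() leaves '+' untouched, so mixing the pairs loses the spaces.
echo rawurldecode($form), PHP_EOL; // This+is+the+Euro+symbol+'€'.
```

Pairing each encoder with its own decoder avoids the mismatch shown in the last line.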

Documentation

See also urlencode(), urldecode()

Related: Text Encoding