Why do we encode URLs?

URLs (Uniform Resource Locators) can contain only a very limited set of characters from the US-ASCII charset. The characters that can always appear unencoded are uppercase and lowercase letters (A-Z, a-z), digits (0-9), and the special characters - _ ~ and . (hyphen, underscore, tilde, and period).

Some ASCII characters, such as ?, &, =, and /, have special meaning within URLs. Other ASCII characters, such as backspace and newline, are unprintable. All of these ASCII characters, along with any non-ASCII characters, must be encoded so that they can be safely placed inside URLs as data.
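
As a rough illustration of the encoding described above, the sketch below uses Python's standard urllib.parse.quote function. The helper name percent_encode is only an assumption made for this example. Passing safe="" asks quote to encode / as well, which it would otherwise leave untouched.

```python
from urllib.parse import quote


def percent_encode(text: str) -> str:
    """Percent-encode everything except A-Z, a-z, 0-9 and - _ ~ . (hypothetical helper)."""
    # safe="" overrides quote()'s default of leaving "/" unencoded.
    return quote(text, safe="")


if __name__ == "__main__":
    # Characters with special meaning (&, =, /, ?) and the space are encoded.
    print(percent_encode("a b&c=d/e?"))   # a%20b%26c%3Dd%2Fe%3F
    # An unprintable character (newline) is encoded.
    print(percent_encode("line\nbreak"))  # line%0Abreak
    # A non-ASCII character is encoded as its UTF-8 bytes.
    print(percent_encode("café"))         # caf%C3%A9
```

A value encoded this way is meant to be used as a single piece of data, for example one query parameter value. Encoding a complete URL with safe="" would also encode the delimiters that give the URL its structure.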

Which characters are not allowed in a URL?

The following classes of characters are not allowed within URLs unless they are encoded:

  • Reserved characters: Some characters like : / ? # [ ] @ ! $ & ' ( ) * + , ; = are reserved for special purposes in URLs. For example, the character ? marks the start of the query parameters, and the character & separates two query parameters. These characters must be encoded whenever they are used as ordinary data rather than for their reserved purpose.
  • Unprintable characters: ASCII characters in the range 0-31 and 127 are unprintable. These are also called control characters. These characters are not allowed in URLs.
  • Unsafe characters: Other ASCII characters like space < > { } | ` ^ \ are considered unsafe and are not allowed in URLs.
  • Non-ASCII characters: Any character outside the US-ASCII charset is not allowed in URLs.
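
The four classes above can be sketched as a small lookup. This is only an illustration: the names RESERVED, UNSAFE, UNRESERVED, and classify are assumptions made for this example, and the character sets are copied from the list above rather than from any library.

```python
import string

# Character sets taken directly from the classes listed above.
RESERVED = set(":/?#[]@!$&'()*+,;=")
UNSAFE = set(" <>{}|`^\\")
UNRESERVED = set(string.ascii_letters + string.digits + "-_~.")


def classify(ch: str) -> str:
    """Return the class of a single character (hypothetical helper)."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    code = ord(ch)
    if code > 127:
        return "non-ascii"
    if code < 32 or code == 127:
        return "unprintable"
    if ch in RESERVED:
        return "reserved"
    if ch in UNSAFE:
        return "unsafe"
    if ch in UNRESERVED:
        return "unreserved"
    # Anything else (for example % or ") is not covered by the list above.
    return "other"


if __name__ == "__main__":
    for ch in ["?", "\n", " ", "é", "a"]:
        print(repr(ch), "->", classify(ch))
```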

What is %20 in a URL?

%20 is the percent encoding of the space character.

What is %2f in a URL?

%2f is the percent encoding of the forward slash (/) character.

What is %3f in a URL?

%3f is the percent encoding of the question mark (?) character.
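
The codes %20, %2f, and %3f all follow the same pattern: a percent sign followed by the character's ASCII value written as two hexadecimal digits. The sketch below shows this in Python. The helper percent_code is a hypothetical name used only for this example, while unquote is the standard decoder from urllib.parse. The hexadecimal digits are case-insensitive, so %2f and %2F decode to the same character.

```python
from urllib.parse import unquote


def percent_code(ch: str) -> str:
    """Return the percent encoding of a single ASCII character (hypothetical helper)."""
    if len(ch) != 1 or ord(ch) > 127:
        raise ValueError("expected a single ASCII character")
    # Two uppercase hex digits of the character's ASCII value.
    return "%" + format(ord(ch), "02X")


if __name__ == "__main__":
    print(percent_code(" "))     # %20
    print(percent_code("/"))     # %2F
    print(percent_code("?"))     # %3F
    # Decoding goes the other way and accepts either letter case.
    print(repr(unquote("%20")))  # ' '
    print(repr(unquote("%2f")))  # '/'
    print(repr(unquote("%3F")))  # '?'
```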