URL encoding — formally percent-encoding — is the mechanism the web uses to ship special characters safely inside URLs. Every time a user enters a search term with spaces, every time an API receives a query string containing an ampersand, every time a redirect path preserves a non-ASCII filename, percent-encoding is what makes it work. This guide covers the rules, the two JavaScript APIs, and the mistakes that will bite you.

Reserved vs unreserved characters

RFC 3986 splits URL characters into two buckets. Unreserved characters — A–Z, a–z, 0–9, and - _ . ~ — are always safe. Reserved characters — : / ? # [ ] @ ! $ & ' ( ) * + , ; = — have structural meaning (separating scheme from path, key from value, component from component) and must be encoded when they appear as data rather than structure. Everything else (spaces, Unicode, binary bytes) must always be encoded.
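As a rough illustration of the two buckets, the sketch below sorts a single character into "unreserved", "reserved", or "other". The helper name classifyChar and the constant names are hypothetical, chosen for this example only, and the function assumes it is handed exactly one character.

```javascript
// Character sets as listed above (RFC 3986 unreserved and reserved).
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;
const RESERVED = ":/?#[]@!$&'()*+,;=";

// classifyChar is a hypothetical helper, not a built-in.
// It assumes `ch` is exactly one character.
function classifyChar(ch) {
  if (UNRESERVED.test(ch)) return "unreserved"; // never needs encoding
  if (RESERVED.includes(ch)) return "reserved"; // encode when used as data
  return "other"; // spaces, non-ASCII text, control bytes: always encode
}

console.log(classifyChar("~")); // → unreserved
console.log(classifyChar("&")); // → reserved
console.log(classifyChar(" ")); // → other
```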

The encoding rule

Each byte that needs escaping becomes a percent sign followed by two hexadecimal digits. Non-ASCII characters are first converted to UTF-8 bytes, and each byte is escaped separately. A space is %20 and an ampersand is %26; an em dash is three bytes in UTF-8 and therefore three percent sequences: %E2%80%94. Decoders reverse the process byte by byte.
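To make the byte-level rule concrete, here is a minimal sketch that encodes a string by hand. It assumes UTF-8 and escapes everything outside the unreserved set; percentEncode is a hypothetical helper written for illustration, not a replacement for the built-in functions.

```javascript
// percentEncode is a hypothetical helper: it escapes every byte that is
// not an RFC 3986 unreserved character.
function percentEncode(str) {
  const bytes = new TextEncoder().encode(str); // string → UTF-8 bytes
  let out = "";
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    if (/[A-Za-z0-9\-_.~]/.test(ch)) {
      out += ch; // unreserved: copy through unchanged
    } else {
      // percent sign plus two uppercase hex digits
      out += "%" + b.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode(" "));      // → %20
console.log(percentEncode("&"));      // → %26
console.log(percentEncode("\u2014")); // em dash → %E2%80%94
```

The loop runs over bytes rather than characters, which is why a single em dash produces three percent sequences.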

encodeURI vs encodeURIComponent

JavaScript ships two built-in functions that trip up everyone at least once. encodeURI(str) assumes str is a complete URL and leaves the characters that give a URL its structure alone, so slashes, colons, question marks, ampersands, and hash signs survive (square brackets are the exception: it escapes them). encodeURIComponent(str) assumes str is a single component, such as a path segment or query value, and escapes every reserved character except ! ' ( ) *. The practical rule: when you're building a query string value to drop into ?key=value, always use encodeURIComponent. When you have a full URL and only want to escape the characters that can't appear in it raw, such as spaces and non-ASCII text, use encodeURI.
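The difference is easiest to see side by side. The example below runs both functions on sample inputs; the example.com URL is a placeholder. It also shows two details worth knowing: encodeURIComponent leaves ! ' ( ) * untouched, and encodeURI escapes square brackets. The strictEncodeComponent helper is a hypothetical name for a common workaround when those five characters must be escaped too.

```javascript
const fullUrl = "https://example.com/my docs/report.pdf?q=a b";

// encodeURI keeps the URL's structure and escapes only what cannot appear raw.
console.log(encodeURI(fullUrl));
// → https://example.com/my%20docs/report.pdf?q=a%20b

// encodeURIComponent also escapes the structural characters.
console.log(encodeURIComponent("a/b?c=d&e"));
// → a%2Fb%3Fc%3Dd%26e

// Five reserved characters pass through encodeURIComponent unchanged.
console.log(encodeURIComponent("!'()*")); // → !'()*

// Square brackets are escaped even by encodeURI.
console.log(encodeURI("[]")); // → %5B%5D

// Hypothetical helper: also escape ! ' ( ) * after encodeURIComponent.
function strictEncodeComponent(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(strictEncodeComponent("it's (fine)!"));
// → it%27s%20%28fine%29%21
```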

Common mistakes

When to reach for each

Reach for encodeURIComponent every time you compose a query string value dynamically. Reach for encodeURI when you have a user-supplied URL (maybe with Unicode in the path) whose spaces and non-ASCII characters need escaping without breaking its structure; because it also escapes literal percent signs, apply it only to a URL that hasn't been encoded already. Decode with decodeURIComponent unless you're deliberately preserving reserved characters, in which case you probably shouldn't have encoded them in the first place.
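Put together, those choices might look like the sketch below. The buildSearchUrl helper, the parameter names, and the example.com URLs are all hypothetical, and the code assumes every key and value is a string.

```javascript
// buildSearchUrl is a hypothetical helper: it encodes each key and value
// as a component, then joins the pairs with the structural & and =.
function buildSearchUrl(base, params) {
  const query = Object.entries(params)
    .map(([k, v]) => encodeURIComponent(k) + "=" + encodeURIComponent(v))
    .join("&");
  return query ? base + "?" + query : base;
}

const searchUrl = buildSearchUrl("https://example.com/search", {
  q: "cats & dogs",
  lang: "en",
});
console.log(searchUrl);
// → https://example.com/search?q=cats%20%26%20dogs&lang=en

// Decoding one value restores the original text.
console.log(decodeURIComponent("cats%20%26%20dogs")); // → cats & dogs

// A full URL with Unicode in the path: encodeURI keeps the structure.
console.log(encodeURI("https://example.com/files/r\u00E9sum\u00E9.pdf"));
// → https://example.com/files/r%C3%A9sum%C3%A9.pdf

// Running encodeURI on an already-encoded URL double-encodes the percent sign.
console.log(encodeURI("https://example.com/a%20b"));
// → https://example.com/a%2520b
```

Encoding each key and value separately is what keeps an ampersand inside the data from being read as a separator between parameters.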