🔗 URL Parser & Query Builder

Last updated: December 4, 2025

Paste any URL to decompose it, edit parts, and rebuild with correct encoding

Anatomy of a URL: Why Every Character Matters in API Testing

Every HTTP request you make — whether it is a browser navigation, a curl command, or a Postman call — begins with a URL. Most developers treat URLs as monolithic strings, pasting them wholesale without examining their internal structure. That approach works until it does not: a misencoded space breaks an OAuth callback, a duplicated query key confuses a REST endpoint, a missing port specification fails silently in a staging environment. Understanding how URLs are constructed at a mechanical level is not academic pedantry — it is the difference between debugging a malformed request in thirty seconds versus three hours.

The RFC 3986 Anatomy

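The authoritative specification for Uniform Resource Identifiers is RFC 3986, published in 2005 and still the definitive reference. It defines five top-level components in a fixed order: scheme, authority, path, query, and fragment. The authority is itself subdivided into optional userinfo, a host, and an optional port, so an http or https URL expands to the following shape: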

scheme "://" [userinfo "@"] host [":" port] path ["?" query] ["#" fragment]

Each component has its own character rules. The scheme (what most people call the protocol) is case-insensitive, must begin with a letter, and may otherwise contain only ASCII letters, digits, plus signs, hyphens, and periods: https, ftp, ws, and mailto are all valid schemes. The authority section combines optional user credentials, the hostname, and an optional port. The path is the slash-separated sequence that identifies a resource; RFC 3986 allows it to be empty, but for http and https URLs an empty path is normalized to a single /. The query is everything after the ? and before the #. The fragment, after the #, is never sent to the server: it is client-side only, used by browsers for in-page navigation and by single-page applications for routing.

Percent-Encoding: The Source of Most URL Bugs

RFC 3986 divides characters into "reserved" and "unreserved." Unreserved characters (letters, digits, hyphens, underscores, tildes, and periods) are safe in any URL position and should not be percent-encoded. Reserved characters such as /, ?, #, &, and = carry syntactic meaning in specific positions. To use a reserved character as literal data rather than as a structural delimiter, you must percent-encode it: replace each of its bytes (UTF-8 bytes, for non-ASCII text) with a % followed by two hexadecimal digits, uppercase by convention. A space becomes %20. An ampersand becomes %26. A plus sign becomes %2B.
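As a small illustration of these rules, the sketch below uses JavaScript's built-in encodeURIComponent on a few sample strings. The sample values are made up for the example.

```javascript
// encodeURIComponent percent-encodes characters that would otherwise be
// read as delimiters, so they survive as literal data.
const encodedSpace = encodeURIComponent('hello world'); // 'hello%20world'
const encodedAmp = encodeURIComponent('a&b');           // 'a%26b'
const encodedPlus = encodeURIComponent('1+1');          // '1%2B1'

// Non-ASCII characters are encoded as UTF-8, one %XX triplet per byte.
const encodedAccent = encodeURIComponent('é');          // '%C3%A9'

// Unreserved characters pass through unchanged.
const unreserved = encodeURIComponent('a-b_c.d~e');     // 'a-b_c.d~e'

console.log(encodedSpace, encodedAmp, encodedPlus, encodedAccent, unreserved);
```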

Here is where a persistent confusion lives: the application/x-www-form-urlencoded content type uses a slightly different encoding that predates RFC 3986. In that encoding, a space becomes + rather than %20. Most browser form submissions and many legacy APIs use this format. When you see q=hello+world in a query string, the server decodes the + as a space. But if you percent-encode the same string yourself using encodeURIComponent in JavaScript, you get q=hello%20world, which is also correct. The two forms decode to the same value on any server that parses the query string as form data, which covers most web frameworks. The problem arises when someone decodes hello+world with a plain percent-decoder such as decodeURIComponent, which does not treat + as a space: they get a literal plus sign and wonder why the search results are wrong.
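The mismatch is easy to reproduce. The following sketch compares decodeURIComponent with URLSearchParams on the same input; the variable names are illustrative only.

```javascript
const raw = 'hello+world';

// A plain percent-decoder leaves the plus sign alone.
const percentOnly = decodeURIComponent(raw); // 'hello+world'

// URLSearchParams applies form decoding, so + becomes a space.
const formDecoded = new URLSearchParams('q=hello+world').get('q');  // 'hello world'
const fromPercent = new URLSearchParams('q=hello%20world').get('q'); // 'hello world'

// A literal plus sign therefore has to travel as %2B.
const literalPlus = new URLSearchParams('q=1%2B1').get('q'); // '1+1'

// Common manual workaround when only decodeURIComponent is available:
// turn + into a space first, then percent-decode.
const manual = decodeURIComponent(raw.replace(/\+/g, ' ')); // 'hello world'

console.log({ percentOnly, formDecoded, fromPercent, literalPlus, manual });
```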

How the Browser URL API Works

Modern JavaScript exposes URL parsing through the URL constructor, which implements the WHATWG URL Standard — a living specification that is slightly more permissive than RFC 3986 and designed to match what browsers actually do in practice. When you call new URL('https://api.example.com:8080/search?q=test#top'), you get an object with named properties for every component: .protocol returns 'https:' (including the colon), .hostname returns the bare host, .port returns the port as a string (empty string if it is the default port for that protocol), .pathname gives the path, .hash includes the leading #.
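A quick sketch of the properties described above, using the same example URL. The apiUrl variable name is arbitrary.

```javascript
const apiUrl = new URL('https://api.example.com:8080/search?q=test#top');

console.log(apiUrl.protocol); // 'https:'  (colon included)
console.log(apiUrl.hostname); // 'api.example.com'
console.log(apiUrl.port);     // '8080'    (always a string)
console.log(apiUrl.host);     // 'api.example.com:8080'
console.log(apiUrl.pathname); // '/search'
console.log(apiUrl.search);   // '?q=test' (leading ? included)
console.log(apiUrl.hash);     // '#top'    (leading # included)
```

Note that the constructor throws a TypeError when the input cannot be parsed as an absolute URL, so untrusted input is usually wrapped in try/catch.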

The searchParams property is especially powerful: it exposes a URLSearchParams object that understands the application/x-www-form-urlencoded format. You can call .get('key'), .getAll('key') (for repeated parameters), .set(), .append(), and .delete() on it. When you read searchParams.get('q'), it returns the decoded value automatically — no manual percent-decoding required. This is the correct, canonical way to interact with query strings in JavaScript.
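For instance, a minimal sketch with a made-up search URL shows the automatic decoding and how edits flow back into the URL:

```javascript
const searchUrl = new URL('https://api.example.com/search?q=hello+world&tag=a%26b');
const params = searchUrl.searchParams;

// Values come back already decoded.
const q = params.get('q');             // 'hello world'
const tag = params.get('tag');         // 'a&b'
const missing = params.get('missing'); // null when the key is absent

// Mutations are reflected in the parent URL object.
params.delete('tag');
const remaining = searchUrl.search;    // '?q=hello+world'

console.log({ q, tag, missing, remaining });
```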

Repeated Parameters and Ordering

A query string allows the same key to appear multiple times, either as bare repeats (ids=1&ids=2&ids=3) or with the bracket convention (ids[]=1&ids[]=2&ids[]=3). To a URL parser these are different keys, ids and ids[], and how the server interprets either form is entirely framework-dependent. PHP uses bracket notation (ids[]) to build arrays. Express.js collects repeated bare keys into an array, and Django exposes them as a list through QueryDict.getlist(). The WHATWG URL Standard preserves all repetitions and exposes them via searchParams.getAll(). If you use searchParams.set(), it replaces all existing values for that key with a single value. To add a value without replacing, use searchParams.append().
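The difference between set() and append() is easiest to see side by side. This is a small sketch with illustrative values:

```javascript
const repeated = new URLSearchParams('ids=1&ids=2&ids=3');

const all = repeated.getAll('ids'); // ['1', '2', '3']
const first = repeated.get('ids');  // '1' (get returns only the first value)

// append() adds another value and keeps the existing ones.
repeated.append('ids', '4');
const afterAppend = repeated.toString(); // 'ids=1&ids=2&ids=3&ids=4'

// set() collapses every occurrence into a single value.
repeated.set('ids', '9');
const afterSet = repeated.toString();    // 'ids=9'

// Bracket notation is just a different key name to URLSearchParams.
const bracketed = new URLSearchParams('ids[]=1&ids[]=2');
const bracketValues = bracketed.getAll('ids[]'); // ['1', '2']
const bareValues = bracketed.getAll('ids');      // []

console.log({ all, first, afterAppend, afterSet, bracketValues, bareValues });
```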

Parameter ordering has no meaning defined by the URI or HTTP specifications: a well-behaved server should not care whether it receives page=2&q=test or q=test&page=2. In practice, some badly written servers or caching layers do depend on ordering. More commonly, a consistent order helps with cache key consistency in CDNs and with readability in logs and debugging sessions. Sorting parameters alphabetically before making requests can improve cache hit rates if your CDN uses the full URL as the cache key.
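One way to get a consistent order is URLSearchParams.sort(), which sorts by key name and keeps the relative order of repeated keys. The canonicalize helper below is a hypothetical name for this sketch, not part of any library.

```javascript
// Hypothetical helper: return the URL with its query parameters sorted by key.
function canonicalize(input) {
  const url = new URL(input);
  url.searchParams.sort(); // stable sort by key name
  return url.href;
}

const a = canonicalize('https://example.com/items?q=test&page=2');
const b = canonicalize('https://example.com/items?page=2&q=test');

console.log(a);       // 'https://example.com/items?page=2&q=test'
console.log(a === b); // true
```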

Default Ports and When They Are Elided

Every URL scheme has a default port associated with it: HTTP defaults to 80, HTTPS to 443, FTP to 21, WebSocket (ws://) to 80, Secure WebSocket (wss://) to 443. The WHATWG standard specifies that default ports must be omitted from the serialized URL. If you create new URL('https://example.com:443/path') and then read .href, you get back https://example.com/path — the port is stripped automatically because it is redundant. However, non-standard ports are always preserved. This matters for API testing: if you have a development server running on port 3000 or a staging environment on port 8080, those ports must appear explicitly in every request.
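The elision can be observed directly. In the sketch below, DEFAULT_PORTS and effectivePort are hypothetical names introduced for the example; the table simply restates the defaults listed above.

```javascript
const withDefault = new URL('https://example.com:443/path');
console.log(withDefault.port); // ''  (default port is dropped)
console.log(withDefault.href); // 'https://example.com/path'

const devServer = new URL('http://localhost:3000/health');
console.log(devServer.port);   // '3000' (non-default port is kept)
console.log(devServer.host);   // 'localhost:3000'

// Hypothetical helper: recover the port actually used for the connection.
const DEFAULT_PORTS = { 'http:': 80, 'https:': 443, 'ftp:': 21, 'ws:': 80, 'wss:': 443 };

function effectivePort(url) {
  return url.port !== '' ? Number(url.port) : DEFAULT_PORTS[url.protocol];
}

console.log(effectivePort(withDefault)); // 443
console.log(effectivePort(devServer));   // 3000
```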

The Fragment Is Client-Only — and Often Confuses API Testers

One of the most common mistakes when constructing API requests manually is including a fragment identifier in the URL sent to the server. The fragment (everything from # onward) is stripped by the browser before the request is made. The server never sees it. If you are building a URL to test a REST endpoint and you accidentally include #section at the end, your HTTP client will normally drop it without warning; curl, for example, accepts the fragment but never sends it over the wire. Fragments are meaningful only for client-side routing in single-page applications or for identifying anchor positions in HTML documents. They have no place in API endpoint URLs.
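To make the request URL explicit before testing, the fragment can be cleared with the URL API. The stripFragment helper is a hypothetical name used only in this sketch.

```javascript
// Hypothetical helper: return the URL as the server would see it.
function stripFragment(input) {
  const url = new URL(input);
  url.hash = ''; // assigning an empty string removes the fragment entirely
  return url.href;
}

const cleaned = stripFragment('https://api.example.com/users?page=2#section');
console.log(cleaned); // 'https://api.example.com/users?page=2'
```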

Practical Workflow: Parse, Mutate, Rebuild

A structured URL editor enables a workflow that raw string manipulation cannot match. When you receive a URL from a colleague or extract one from a log file, parsing it first gives you a clean inventory: what the base endpoint is, which parameters are already set, and whether any credentials are embedded in the authority section. From that inventory, you can systematically mutate specific components (bump the page number, swap the environment hostname from staging to production, add a new filter parameter) without risk of accidentally breaking the encoding of untouched parameters. Rebuilding from structured components ensures that every value is correctly percent-encoded, that the parameter separator is consistently &, and that no accidental double-encoding (%2520, which appears when an already-encoded %20 is encoded a second time) corrupts the values.
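A minimal sketch of the cycle in JavaScript, assuming a made-up staging URL and hostnames, might look like this:

```javascript
// Parse: start from a URL whose filter value contains an encoded ampersand.
const original = 'https://staging.example.com/v1/items?filter=a%26b&page=1';
const target = new URL(original);

// Mutate: swap the host, bump the page, add a parameter with a space in it.
target.hostname = 'api.example.com';
const nextPage = Number(target.searchParams.get('page')) + 1;
target.searchParams.set('page', String(nextPage));
target.searchParams.append('sort', 'name asc');

// Rebuild: href re-serializes every component with consistent encoding.
const rebuilt = target.href;
console.log(rebuilt);
// 'https://api.example.com/v1/items?filter=a%26b&page=2&sort=name+asc'

// For contrast, encoding an already-encoded value by hand double-encodes it.
const doubleEncoded = encodeURIComponent(encodeURIComponent('a b'));
console.log(doubleEncoded); // 'a%2520b'
```

Because values are always set in decoded form and encoded once at serialization time, the untouched filter parameter keeps its original meaning.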

For API developers, this parse-mutate-rebuild cycle is the foundation of reproducible testing. Keeping a structured record of the parameters you tested, in what combinations, with what values, eliminates the ambiguity that plagues informal testing with raw URL strings. Combined with knowledge of encoding semantics and the browser's own URL API, it turns URL construction from an error-prone manual task into a systematic, verifiable process.

FAQ

What is the difference between encodeURIComponent and encodeURI in JavaScript?
encodeURI is designed to encode a complete URL and leaves structural characters like '/', '?', '&', '=', and '#' untouched, because they are needed as delimiters. encodeURIComponent is designed to encode a single component, such as a key or a value, and encodes those delimiters as well; it leaves only letters, digits, and the characters - _ . ! ~ * ' ( ) unescaped. For query parameter values, always use encodeURIComponent so that a value containing '&' or '=' does not break the query string structure.
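A short comparison on an illustrative value containing delimiters:

```javascript
const value = 'rock&roll=fun?';

// encodeURI treats delimiters as structure and leaves them alone.
const viaEncodeURI = encodeURI(value);           // 'rock&roll=fun?'

// encodeURIComponent escapes them so the value stays one piece of data.
const viaComponent = encodeURIComponent(value);  // 'rock%26roll%3Dfun%3F'

// The component-encoded value round-trips through a query string intact.
const roundTrip = new URLSearchParams('q=' + viaComponent).get('q');

// encodeURI is still useful for a whole URL that contains a stray space.
const wholeUrl = encodeURI('https://example.com/a b?x=1&y=2#top');
// 'https://example.com/a%20b?x=1&y=2#top'

console.log({ viaEncodeURI, viaComponent, roundTrip, wholeUrl });
```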
Why do some URLs use + for spaces and others use %20?
Two different encoding conventions exist. The application/x-www-form-urlencoded format (used by HTML form submissions) encodes spaces as '+'. RFC 3986 percent-encoding uses '%20'. Both are widely accepted in query strings. The WHATWG URLSearchParams API outputs '+' for spaces when you call .toString(). Pure percent-encoding via encodeURIComponent produces '%20'. Servers built on modern frameworks handle both, but some legacy systems only handle one or the other, so use the decoded parameter table to verify what your server actually receives.
Can a URL have the same query parameter key more than once?
Yes. The query string format allows repeated keys: for example, 'tags=js&tags=css&tags=html'. How the server interprets this depends on the framework. PHP requires bracket notation (tags[]) to collect them into an array, and Rails follows the same bracket convention, while Express.js and Django support bare repeated keys. The WHATWG URLSearchParams.getAll('tags') method retrieves all values for a repeated key. When building URLs in this tool, duplicate keys are preserved; use the 'Remove duplicates' button only when you genuinely want to collapse them.
Is the fragment (#hash) part of the URL ever sent to the server?
No. The fragment identifier — everything from '#' onward — is processed exclusively by the client. The browser strips it before sending the HTTP request. This means the server never receives it and cannot use it for routing or logic. Fragments are used client-side for in-page anchor navigation and by single-page application routers. If you are testing an API endpoint, including a fragment in the URL has no effect on the server response.
What happens if I put credentials (user:password) in a URL?
RFC 3986 allows a userinfo subcomponent in the authority section, as in 'https://user:password@host/path', although it deprecates the 'user:password' form. Many HTTP clients, curl among them, send these as HTTP Basic Authentication credentials; browser handling varies, and the Fetch API rejects URLs that contain credentials. This approach is strongly discouraged for production use because the credentials can end up in plain text in logs, browser history, shell history, and any intermediate proxy that inspects URLs. Modern APIs use Authorization headers or OAuth tokens instead. The URL parser extracts credentials if they are present so you can audit them, but avoid embedding them in URLs you share or store.
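As a rough sketch of such an audit with the URL API, the example below detects and removes embedded credentials. The URL and the credential values are invented for illustration.

```javascript
const withCreds = new URL('https://user:p%40ss@example.com/path');

// username and password are exposed in their percent-encoded form.
const user = withCreds.username; // 'user'
const pass = withCreds.password; // 'p%40ss'
const hasCredentials = user !== '' || pass !== '';

// Clearing both properties yields a URL that is safer to log or share.
withCreds.username = '';
withCreds.password = '';
const sanitized = withCreds.href; // 'https://example.com/path'

console.log({ user, pass, hasCredentials, sanitized });
```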
Why would I want to sort query parameters alphabetically?
Parameter order has no defined semantic effect on the HTTP request or server response: RFC 3986 does not assign meaning to ordering. However, consistent ordering has practical benefits. CDNs and caching proxies that use the full URL as a cache key will treat 'page=2&q=test' and 'q=test&page=2' as different cache entries, which lowers the effective cache hit rate. Alphabetical sorting ensures the same logical request always generates the same cache key. It also improves readability in logs and diffs, making it easier to spot when a parameter value changes between requests.