🔗 URL Parser & Query Builder
Paste any URL to decompose it, edit parts, and rebuild with correct encoding
Anatomy of a URL: Why Every Character Matters in API Testing
Every HTTP request you make — whether it is a browser navigation, a curl command, or a Postman call — begins with a URL. Most developers treat URLs as monolithic strings, pasting them wholesale without examining their internal structure. That approach works until it does not: a misencoded space breaks an OAuth callback, a duplicated query key confuses a REST endpoint, a missing port specification fails silently in a staging environment. Understanding how URLs are constructed at a mechanical level is not academic pedantry — it is the difference between debugging a malformed request in thirty seconds versus three hours.
The RFC 3986 Anatomy
The authoritative specification for Uniform Resource Identifiers is RFC 3986, published in 2005 and still the definitive reference. It defines a URI as five components in a strict hierarchical order (scheme, authority, path, query, and fragment), with the authority further divided into optional userinfo, a host, and an optional port:
scheme "://" [userinfo "@"] host [":" port] path ["?" query] ["#" fragment]
Each component has its own character encoding rules. The scheme (what most people call the protocol) is case-insensitive, must begin with a letter, and may contain only ASCII letters, digits, plus signs, hyphens, and periods; https, ftp, ws, and mailto are all valid schemes. The authority section combines optional user credentials, the hostname, and an optional port. The path is the slash-separated sequence that identifies a resource; it is always present, although it may be empty, and browsers normalize an empty path to a single / for schemes such as http and https. The query is everything after the ? and before the #. The fragment, after the #, is never sent to the server: it is client-side only, used by browsers for in-page navigation and by single-page applications for routing.
Percent-Encoding: The Source of Most URL Bugs
RFC 3986 divides characters into "reserved" and "unreserved." Unreserved characters (letters, digits, hyphens, underscores, tildes, and dots) are safe in any URL position and should not be encoded; if they are, the encoded and literal forms are equivalent. Reserved characters, such as /, ?, #, &, and =, carry syntactic meaning in specific positions. If you want to use a reserved character as literal data rather than a structural delimiter, you must percent-encode it: replace it with a % followed by the two hex digits of its byte value, uppercase by convention. A space becomes %20. An ampersand becomes %26. A plus sign becomes %2B.
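The rule above can be sketched with JavaScript's built-in encodeURIComponent, which percent-encodes the delimiters listed here when they appear inside a value. The sample value is an illustrative assumption, not taken from any real API.

```javascript
// Reserved characters used as literal data are percent-encoded.
const encodedSpace = encodeURIComponent(' '); // '%20'
const encodedAmp = encodeURIComponent('&');   // '%26'
const encodedPlus = encodeURIComponent('+');  // '%2B'

// Unreserved characters pass through unchanged.
const unreserved = encodeURIComponent('Az09-_.~'); // 'Az09-_.~'

// A value full of delimiters survives an encode/decode round trip.
const rawValue = 'a&b=c/d?e#f'; // hypothetical parameter value
const encodedValue = encodeURIComponent(rawValue); // 'a%26b%3Dc%2Fd%3Fe%23f'
const decodedValue = decodeURIComponent(encodedValue);

console.log(encodedSpace, encodedAmp, encodedPlus, unreserved);
console.log(encodedValue, decodedValue === rawValue);
```

Note that encodeURIComponent leaves a few characters such as ! * ' ( ) unescaped, so it is close to, but not identical with, the RFC 3986 unreserved set.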
Here is where a persistent confusion lives: the application/x-www-form-urlencoded content type uses a slightly different encoding that predates RFC 3986. In that encoding, a space becomes + rather than %20. Most browser form submissions and many legacy APIs use this format. When you see q=hello+world in a query string, a server that applies form decoding reads the + as a space. If you percent-encode the same string yourself using encodeURIComponent in JavaScript, you get q=hello%20world, which is also correct. The two forms are equivalent wherever the query string is decoded as form data, which is the common case. The problem arises when someone decodes hello+world with a plain percent-decoder that does not handle plus signs, such as decodeURIComponent: they get a literal plus sign and wonder why the search results are wrong. The reverse also holds: a literal plus sign in a value must be sent as %2B, or form decoding will turn it into a space.
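A short JavaScript sketch of the difference between the two decoders, using only built-in functions. The parameter names q and expr are illustrative assumptions.

```javascript
// The same query value in both encodings.
const formEncoded = 'hello+world';
const percentEncoded = 'hello%20world';

// decodeURIComponent is a plain percent-decoder: it leaves "+" untouched.
const plainA = decodeURIComponent(formEncoded);    // 'hello+world'
const plainB = decodeURIComponent(percentEncoded); // 'hello world'

// URLSearchParams follows form-urlencoded rules: "+" decodes to a space.
const formA = new URLSearchParams('q=' + formEncoded).get('q');    // 'hello world'
const formB = new URLSearchParams('q=' + percentEncoded).get('q'); // 'hello world'

// A literal plus sign has to travel as %2B to survive form decoding.
const literalPlus = new URLSearchParams('expr=1%2B1').get('expr'); // '1+1'

// When serializing, URLSearchParams writes spaces as "+" and "+" as %2B.
const serialized = new URLSearchParams({ q: 'hello world', expr: '1+1' }).toString();
// 'q=hello+world&expr=1%2B1'

console.log({ plainA, plainB, formA, formB, literalPlus, serialized });
```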
How the Browser URL API Works
Modern JavaScript exposes URL parsing through the URL constructor, which implements the WHATWG URL Standard, a living specification that is slightly more permissive than RFC 3986 and designed to match what browsers actually do in practice. When you call new URL('https://api.example.com:8080/search?q=test#top'), you get an object with named properties for every component: .protocol returns 'https:' (including the colon), .hostname returns the bare host, .port returns the port as a string (empty string if it is the default port for that protocol), .pathname gives the path, .search includes the leading ?, and .hash includes the leading #.
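As a quick illustration, the example URL from the paragraph above can be decomposed like this. The tryParse helper is a hypothetical convenience wrapper, not part of the URL API.

```javascript
const apiUrl = new URL('https://api.example.com:8080/search?q=test#top');

const parts = {
  protocol: apiUrl.protocol, // 'https:' (colon included)
  hostname: apiUrl.hostname, // 'api.example.com'
  port: apiUrl.port,         // '8080' (always a string)
  host: apiUrl.host,         // 'api.example.com:8080' (hostname plus port)
  pathname: apiUrl.pathname, // '/search'
  search: apiUrl.search,     // '?q=test' (leading ? included)
  hash: apiUrl.hash,         // '#top' (leading # included)
};
console.log(parts);

// The constructor throws a TypeError on input it cannot parse,
// so untrusted strings are usually wrapped in try/catch.
// Hypothetical helper: returns a URL object, or null when parsing fails.
function tryParse(input) {
  try {
    return new URL(input);
  } catch {
    return null;
  }
}

console.log(tryParse('not a url')); // null: no scheme and no base URL
```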
The searchParams property is especially powerful: it exposes a URLSearchParams object that understands the application/x-www-form-urlencoded format. You can call .get('key'), .getAll('key') (for repeated parameters), .set(), .append(), and .delete() on it. When you read searchParams.get('q'), it returns the decoded value automatically — no manual percent-decoding required. This is the correct, canonical way to interact with query strings in JavaScript.
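A minimal sketch of reading and removing parameters through searchParams. The URL and its q and tag parameters are made up for illustration.

```javascript
// Hypothetical URL: q uses "+" for spaces and UTF-8 percent-encoding,
// tag appears twice and its first value contains an encoded ampersand.
const searchUrl = new URL(
  'https://api.example.com/search?q=caf%C3%A9+au+lait&tag=a%26b&tag=c'
);
const params = searchUrl.searchParams;

const q = params.get('q');             // 'café au lait' (decoded automatically)
const tags = params.getAll('tag');     // ['a&b', 'c']
const missing = params.get('missing'); // null when the key is absent
const hasTag = params.has('tag');      // true

// Mutations write through to the parent URL object.
params.delete('tag');
const remaining = searchUrl.search;    // '?q=caf%C3%A9+au+lait'

console.log({ q, tags, missing, hasTag, remaining });
```

Because searchParams is live, any change is reflected in url.search and url.href without a separate rebuild step.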
Repeated Parameters and Ordering
A query string allows the same key to appear multiple times, written either as ids[]=1&ids[]=2&ids[]=3 or as ids=1&ids=2&ids=3. How the server interprets this is entirely framework-dependent. PHP uses bracket notation (ids[]) to build arrays. Express.js, with its default query parser, collects repeated bare keys into an array. Django keeps every value but returns only the last one unless you call request.GET.getlist(). The WHATWG URL Standard preserves all repetitions and exposes them via searchParams.getAll(). If you use searchParams.set(), it will replace all existing values for that key with a single value. If you want to add an additional value without replacing, use searchParams.append().
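The difference between append() and set() on a repeated key can be sketched as follows. The ids and page parameters are illustrative.

```javascript
const repeated = new URLSearchParams('ids=1&ids=2&ids=3&page=1');

const allIds = repeated.getAll('ids'); // ['1', '2', '3']
const firstId = repeated.get('ids');   // '1' (get returns only the first value)

// append() adds another pair at the end and keeps the existing ones.
repeated.append('ids', '4');
const afterAppend = repeated.toString(); // 'ids=1&ids=2&ids=3&page=1&ids=4'

// set() overwrites the first pair and removes every other pair with that key.
repeated.set('ids', '9');
const afterSet = repeated.toString(); // 'ids=9&page=1'

// With bracket notation the brackets are simply part of the key name.
const bracketed = new URLSearchParams('ids[]=1&ids[]=2');
const bracketIds = bracketed.getAll('ids[]'); // ['1', '2']
const bareIds = bracketed.getAll('ids');      // [] (a different key)

console.log({ allIds, firstId, afterAppend, afterSet, bracketIds, bareIds });
```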
Parameter ordering has no meaning defined by HTTP itself: a well-behaved server should treat page=2&q=test and q=test&page=2 identically. In practice, some badly written servers or caching layers do depend on ordering. More commonly, a consistent order helps with cache key consistency in CDNs and with readability in logs and debugging sessions. Sorting parameters alphabetically before making requests can improve cache hit rates if your CDN uses the full URL as the cache key.
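One way to get a consistent order is the built-in URLSearchParams.sort(). The canonicalize function below is a hypothetical helper sketched for illustration, and whether sorting is safe depends on how your server treats parameter order.

```javascript
// Hypothetical helper: returns the URL with its query parameters sorted by key.
function canonicalize(rawUrl) {
  const url = new URL(rawUrl);
  url.searchParams.sort(); // stable sort by parameter name
  return url.href;
}

const a = canonicalize('https://api.example.com/search?q=test&page=2');
const b = canonicalize('https://api.example.com/search?page=2&q=test');
// Both yield 'https://api.example.com/search?page=2&q=test'

// The sort is stable, so repeated keys keep their relative order.
const c = canonicalize('https://api.example.com/items?ids=3&a=1&ids=1');
// 'https://api.example.com/items?a=1&ids=3&ids=1'

console.log(a === b, c);
```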
Default Ports and When They Are Elided
Every URL scheme has a default port associated with it: HTTP defaults to 80, HTTPS to 443, FTP to 21, WebSocket (ws://) to 80, Secure WebSocket (wss://) to 443. The WHATWG standard specifies that default ports must be omitted from the serialized URL. If you create new URL('https://example.com:443/path') and then read .href, you get back https://example.com/path — the port is stripped automatically because it is redundant. However, non-standard ports are always preserved. This matters for API testing: if you have a development server running on port 3000 or a staging environment on port 8080, those ports must appear explicitly in every request.
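The port elision described above can be observed directly with the URL constructor. The hosts and paths here are placeholders.

```javascript
// A default port is dropped from the serialized URL.
const httpsDefault = new URL('https://example.com:443/path');
const defaultHref = httpsDefault.href; // 'https://example.com/path'
const defaultPort = httpsDefault.port; // '' (empty string)

// Non-default ports are preserved.
const devServer = new URL('http://localhost:3000/health');
const devHref = devServer.href; // 'http://localhost:3000/health'
const devPort = devServer.port; // '3000'

// The same rule applies to WebSocket schemes.
const wsHref = new URL('ws://example.com:80/socket').href; // 'ws://example.com/socket'

// A port is only "default" for its own scheme: 80 is kept under https.
const mismatchHref = new URL('https://example.com:80/').href; // 'https://example.com:80/'

console.log({ defaultHref, defaultPort, devHref, devPort, wsHref, mismatchHref });
```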
The Fragment Is Client-Only — and Often Confuses API Testers
One of the most common mistakes when constructing API requests manually is including a fragment identifier in the URL and expecting the server to act on it. The fragment, everything from # onward, is stripped by the browser before the request is made, so the server never sees it. If you are building a URL to test a REST endpoint and you accidentally include #section at the end, most HTTP clients, curl included, will silently drop it before sending; less common clients may behave differently, so it is safer to remove it yourself. Fragments are meaningful only for client-side routing in single-page applications or for identifying anchor positions in HTML documents. They have no place in API endpoint URLs.
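A small sketch of removing a fragment before handing a URL to an HTTP client. The endpoint is hypothetical, and requestTarget only approximates what appears on the HTTP request line.

```javascript
const withFragment = new URL('https://api.example.com/v1/items?page=2#section');

// What the server can see: the path and the query, never the fragment.
const requestTarget = withFragment.pathname + withFragment.search; // '/v1/items?page=2'

// Assigning an empty string to .hash removes the fragment entirely.
withFragment.hash = '';
const cleaned = withFragment.href; // 'https://api.example.com/v1/items?page=2'

console.log(requestTarget, cleaned);
```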
Practical Workflow: Parse, Mutate, Rebuild
A structured URL editor enables a workflow that raw string manipulation cannot match. When you receive a URL from a colleague or extract one from a log file, parsing it first gives you a clean inventory: the base endpoint, the parameters already set, and any credentials embedded in the authority section. From that inventory, you can systematically mutate specific components (bump the page number, swap the environment hostname from staging to production, add a new filter parameter) without risk of accidentally breaking the encoding of untouched parameters. Rebuilding from structured components guarantees that every value is correctly percent-encoded, that the parameter separator is consistently &, and that no accidental double-encoding (%2520, produced when an already-encoded %20 is encoded a second time) corrupts the values.
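The parse, mutate, rebuild cycle might look like the following sketch with the built-in URL API. The staging and production hostnames, the port, and the filter parameter are all assumptions made for illustration.

```javascript
// Hypothetical staging URL as it might appear in a log file.
const target = new URL('https://staging.example.com:8080/v1/search?q=hello%20world&page=1');

// Mutate individual components instead of editing the string.
target.hostname = 'api.example.com'; // swap staging for production
target.port = '';                    // empty string removes the explicit port
target.searchParams.set('page', '2'); // bump the page number
target.searchParams.append('filter', 'price<100 & in stock'); // raw, unencoded value

// Rebuild: the query is re-serialized in form encoding, so the untouched
// q value now shows "+" for its space but decodes to the same string.
const rebuilt = target.href;
// 'https://api.example.com/v1/search?q=hello+world&page=2&filter=price%3C100+%26+in+stock'

const roundTrip = target.searchParams.get('filter'); // 'price<100 & in stock'

// Double encoding, for contrast: encoding an already-encoded value.
const doubleEncoded = encodeURIComponent('hello%20world'); // 'hello%2520world'

console.log(rebuilt, roundTrip, doubleEncoded);
```

Passing raw values to set() and append() and letting the API encode them once is what keeps double-encoding out of the rebuilt URL.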
For API developers, this parse-mutate-rebuild cycle is the foundation of reproducible testing. Keeping a structured record of the parameters you tested, in what combinations, with what values, eliminates the ambiguity that plagues informal testing with raw URL strings. Combined with knowledge of encoding semantics and the browser's own URL API, it turns URL construction from an error-prone manual task into a systematic, verifiable process.