After a bit more research, I decided to go with the URI#merge option, with an extra substitution to deal with the percent-encoding issue. Something along the lines of:
normalized_url = params[:url].gsub(/%2E/i, '.')
merged_url = URI(root_url).merge(normalized_url).to_s
According to section 2.3 of RFC 3986, the period is an "unreserved character," and "URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent." So I felt reasonably certain about replacing %2e with periods.
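To see the effect outside of a controller, here is a minimal standalone sketch of the same two steps. The helper name `merge_with_root`, the example root URL, and the sample path are all made up for illustration; in the real code the input comes from `params[:url]` and `root_url`.

```ruby
require 'uri'

# Hypothetical helper wrapping the two lines above.
# root_url and url are plain strings here; in the app they would come from
# the route helper and the request params.
def merge_with_root(root_url, url)
  # Replace percent-encoded periods (%2e or %2E) with literal periods so that
  # encoded dot-segments are resolved by URI#merge like ordinary ones.
  normalized_url = url.gsub(/%2E/i, '.')
  URI(root_url).merge(normalized_url).to_s
end

# Assumed example values, not taken from the original application.
merged = merge_with_root('https://example.com/app/', 'docs/%2e%2E/index.html')
puts merged
```

With the substitution in place, the encoded `..` segment is collapsed during the merge instead of being passed through as an opaque path segment.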
Section 6 of the same RFC mentions normalizing generic URIs by:
- normalizing case: "the hexadecimal digits within a percent-encoding triplet" should be treated as case-insensitive
- decoding percent-encoded versions of unreserved characters
.. as necessary
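For comparison, the first two normalization steps could be sketched more generally as below. This is only an illustration of the idea, not what I actually deployed: the method name `normalize_percent_encoding` and the `UNRESERVED` constant are hypothetical, and the sketch handles only single-byte triplets in the ASCII range.

```ruby
# Unreserved characters per RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = /\A[A-Za-z0-9\-._~]\z/

# Hypothetical helper: decodes percent-encoded unreserved characters and
# uppercases the hex digits of every other percent-encoding triplet.
def normalize_percent_encoding(str)
  str.gsub(/%\h{2}/) do |triplet|
    code = triplet[1, 2].hex
    # Only ASCII bytes can be unreserved characters; leave the rest encoded.
    char = code < 0x80 ? code.chr : nil
    char&.match?(UNRESERVED) ? char : triplet.upcase
  end
end

puts normalize_percent_encoding('a%2eb%7e%2f%c3%a9')
```

Reserved characters such as `/` stay encoded here, since decoding them would change the meaning of the URI; only their hex digits are uppercased.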
That covers the case for a generic URI, and RFC 7230 doesn't seem to add anything relevant to this situation specifically for HTTP and HTTPS URIs. So I felt like I was covering the most likely issues, though I can't be sure there isn't an edge case I've missed for some clients or servers.