URLs.fileToUrl does not percent-encode non-ASCII characters

Description

Good grief. The Java standard library never fails to disappoint. After the File.toURL() debacle, you would think the maintainers of the standard library would have learned, but no: they recommend File.toURI().toURL() as a workaround, even though URI.toURL() is implemented with URI.toString() rather than URI.toASCIIString(). As a result, non-ASCII characters (i.e. those above 0x7f) are not percent-encoded.

(new File("file café")).toURI().toURL().toString() -> String ends with file%20café

Note the presence of "é", which is not permitted in a URI. Compare with:

(new File("file café")).toURI().toASCIIString() -> String ends with file%20caf%C3%A9

Note that "é" has been correctly percent-encoded as the UTF-8 octets "%C3%A9".

The impact on GeoTools is that DataUtilities.fileToURL, which uses File.toURI().toURL(), does not percent-encode non-ASCII characters. The solution is to build the URL from File.toURI().toASCIIString() instead.
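
A minimal sketch of the proposed fix, for illustration only. The class FileUrlSketch and the helper fileToUrlAscii are hypothetical names and are not the GeoTools implementation. The sketch assumes it is acceptable to re-parse the ASCII form with URI.create before calling toURL(), relying on URI.toString() returning the string the URI was created from.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class FileUrlSketch {

    // Hypothetical helper (not the GeoTools code): converts a File to a URL
    // whose string form contains only ASCII characters.
    static URL fileToUrlAscii(File file) throws MalformedURLException {
        // toASCIIString() percent-encodes non-ASCII characters as UTF-8 octets.
        String ascii = file.toURI().toASCIIString();
        // A URI created from a string keeps that string as its toString() value,
        // so toURL() now sees the already-encoded form.
        return URI.create(ascii).toURL();
    }

    public static void main(String[] args) throws MalformedURLException {
        // "\u00e9" is "é", written as an escape so the source file stays ASCII.
        File file = new File("file caf\u00e9");

        // Recommended workaround: the non-ASCII character is left as-is.
        System.out.println(file.toURI().toURL());

        // Sketch of the fix: the string should end with file%20caf%C3%A9.
        System.out.println(fileToUrlAscii(file));
    }
}
```

Both lines print an absolute file: URL for the current working directory, so only the final path segment is of interest when comparing them.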

Environment

None

Assignee

Ben Caradoc-Davies

Reporter

Ben Caradoc-Davies

Triage

None

Components

Fix versions

Affects versions

Priority

Medium