Java URL class getPath(), getQuery() and getFile() inconsistent with RFC 3986 URI syntax

I am writing a utility class that partially wraps Java's URL class, and I wrote a bunch of test cases to verify the methods I wrapped with a custom implementation. I don't understand the output of some of the Java getters for certain URL strings.

According to RFC 3986, the path component is defined as follows:

The path is terminated by the first question mark ("?") or number sign   
("#") character, or by the end of the URI.

      

The query component is defined as follows:

The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.

      

I have some test cases that Java treats as valid URLs, but the getters for path, file and query do not return the values I expected:

URL url = new URL("https://www.somesite.com/?param1=val1");

System.out.print(url.getPath());
System.out.println(url.getFile());
System.out.println(url.getQuery());

      

The above produces the following output:

//?param1=val1
param1=val1
<empty string>

      

My other test case:

URL url = new URL("https://www.somesite.com?param1=val1");

System.out.print(url.getPath());
System.out.println(url.getFile());
System.out.println(url.getQuery());

      

The above produces the following output:

?param1=val1
param1=val1
<empty string>

      

According to the documentation for the Java URL class:

public String getFile()

Gets the file name of this URL. The returned file portion will be the  
same as getPath(), plus the concatenation of the value of getQuery(), if 
any. If there is no query portion, this method and getPath() will return 
identical results.

Returns:
    the file name of this URL, or an empty string if one does not exist

      

Since my test cases return an empty string when getQuery() is called, I expected getFile() to return the same value as getPath(). That is not the case.

I was expecting the following output for both test cases:

<empty string>
?param1=val1
param1=val1

      

Perhaps my interpretation of RFC 3986 is wrong, but the output I am seeing also does not match the documentation for the URL class. Can anyone explain what I am seeing?



1 answer


Here is some executable code based on your snippets:

import java.net.MalformedURLException;
import java.net.URL;

public class URLExample {
  public static void main(String[] args) throws MalformedURLException {
    printURLInformation(new URL("https://www.somesite.com/?param1=val1"));
    printURLInformation(new URL("https://www.somesite.com?param1=val1"));
  }

  private static void printURLInformation(URL url) {
    System.out.println(url);
    System.out.println("Path:\t" + url.getPath());
    System.out.println("File:\t" + url.getFile());
    System.out.println("Query:\t" + url.getQuery() + "\n");
  }

}

      



It works fine and produces the output below, which is probably what you expected. The only difference from your snippets is that you called System.out.print once and then System.out.println, so the path and the file were printed on the same line.

https://www.somesite.com/?param1=val1
Path:   /
File:   /?param1=val1
Query:  param1=val1

https://www.somesite.com?param1=val1
Path:   
File:   ?param1=val1
Query:  param1=val1
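
If you want a check that does not depend on how the output is laid out, something like the following sketch may help. It is not taken from your code: the class name UrlComponentsCheck and the helpers urlComponents and fileFromParts are made up for illustration. It compares getFile() against the path and query, and also prints what java.net.URI reports for the same strings. Note that java.net.URI is documented against RFC 2396 rather than RFC 3986, so treat it only as a rough cross-check.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class UrlComponentsCheck {

  // Returns {path, file, query} as reported by java.net.URL for the given string.
  // (Hypothetical helper, named here only for illustration.)
  public static String[] urlComponents(String spec) {
    try {
      URL url = URI.create(spec).toURL();
      return new String[] {url.getPath(), url.getFile(), url.getQuery()};
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException(e);
    }
  }

  // Rebuilds the value documented for getFile(): the path, plus "?" and the
  // query when a query is present.
  public static String fileFromParts(String path, String query) {
    return query == null ? path : path + "?" + query;
  }

  public static void main(String[] args) {
    String[] samples = {
      "https://www.somesite.com/?param1=val1",
      "https://www.somesite.com?param1=val1"
    };
    for (String spec : samples) {
      String[] c = urlComponents(spec);
      URI uri = URI.create(spec);
      System.out.println(spec);
      // Brackets make an empty path visible in the output.
      System.out.println("  URL path/query: [" + c[0] + "] [" + c[2] + "]");
      System.out.println("  URI path/query: [" + uri.getPath() + "] [" + uri.getQuery() + "]");
      System.out.println("  file matches path + query: "
          + c[1].equals(fileFromParts(c[0], c[2])));
    }
  }
}
```

For both of your test strings the last line should print true, which agrees with the getFile() documentation you quoted.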

      
