
Re: https by default



In [emacs-w3m : No.12753]
On Tue, 18 Jul 2017 01:04:05 -0400, Boruch Baum wrote:
>>> I also see that in w3m-util.el, line 806, a constant
>>> `w3m-url-fallback-base' is defined and set to "http:///", not "https:///".
[...]
>> Well, could you tell me what three slashes are used for, and when?

> Looking at the specification[1][2], it seems that they are intended to
> indicate a URI (not to be confused with a URL) that doesn't have a
> 'global authority', i.e. one for a local resource. An example would be a
> local file.

Thanks.  I found that RFC 3986, section 3.2.2, says the following:

,----
| For example, the "file" URI scheme is defined so that no authority,
| an empty host, and "localhost" all mean the end-user's machine,
| whereas the "http" scheme considers a missing authority or empty
| host invalid.
`----

I sometimes use "file:///path", which means "file://localhost/path".
However, if I understand it correctly, we cannot use "http:///" to
mean "http://localhost/".
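
To make the distinction concrete, here is a minimal sketch that pulls the
authority component out of a URI string, loosely following the generic
regexp in RFC 3986 Appendix B.  The function name `my-uri-authority' is
hypothetical and is not part of emacs-w3m; it only illustrates that
"file:///path" and "http:///site" both carry an empty authority, which the
"file" scheme accepts and the "http" scheme treats as invalid.

```elisp
;; Hypothetical helper, not part of emacs-w3m.
(defun my-uri-authority (uri)
  "Return the authority of URI: \"\" if it is empty, nil if absent."
  ;; Group 1 is the scheme, group 2 the authority that follows "//".
  (when (string-match
         "\\`\\(?:\\([^:/?#]+\\):\\)?\\(?://\\([^/?#]*\\)\\)?" uri)
    (match-string 2 uri)))

(my-uri-authority "file:///path")       ; => ""
(my-uri-authority "http://localhost/")  ; => "localhost"
(my-uri-authority "http:///site")       ; => ""
(my-uri-authority "mailto:user")        ; => nil
```

An empty string here means "authority present but empty", which is the case
RFC 3986 allows for "file" but not for "http".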

> Thus, without having studied the code, it seems that the
> original intent of the variable `w3m-url-fallback-base' was that if a
> user entered a URI path?query string, emacs-w3m would correctly fail to
> parse it as a URL because it had no 'authority' element[1], and would
> fall back to treating it as a local request for http access to the
> path?query string. In other words, local html files.

> I tried it in firefox, and it works for scheme "file:", but when I try
> it for scheme "http:", firefox auto-converts it to two slashes
> instead of three and fails to find the local file.

I verified it with Firefox, Chrome, and MS IE, too (but MS Edge
launches the Bing search for "http:///host").

> Without access to documentation or the perfect memory of the author,
> it's hard to know whether this is a bug, and whether it should be either
> "file:///" or "http://"

Ok, I'll ask the original emacs-w3m authors about it later.

>> Some browsers visit https://site when given "https:///site",
>> but (w3m "https:///site") doesn't.

> Strictly speaking, w3m performs correctly in that way. Whether that's
> desirable or not, I have no opinion.

>> In addition, is it better to change `w3m-url-invalid-regexp' to
>> something like "\\`https?:///" ?  The `w3m-url-valid' function
>> uses it.  Though I don't know why a given url is considered valid
>> by only checking if it does not match `w3m-url-invalid-regexp'.

> Yeah. Also, I can't figure out why the leading back-tick is there.

"\\`http" is similar to "^http" meaning a regexp that matches a
string beginning with "http".
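
The difference from "^" shows up with multi-line strings: in Emacs regexps
"^" matches at the beginning of any line, while "\\`" matches only at the
beginning of the string.  The sketch below shows that, and also tries the
regexp proposed above.  The variable name `my-url-invalid-regexp' is
hypothetical; I have not changed `w3m-url-invalid-regexp' itself.

```elisp
;; "^" also matches after a newline; "\\`" does not.
(let ((s "see\nhttp:///site"))
  (list (string-match-p "^http" s)      ; => 4
        (string-match-p "\\`http" s)))  ; => nil

;; Hypothetical variable holding the proposed value.
(defvar my-url-invalid-regexp "\\`https?:///"
  "Regexp matching http/https URLs that have an empty authority.")

(mapcar (lambda (url)
          (and (string-match-p my-url-invalid-regexp url) t))
        '("http:///site" "https:///site" "http://site/" "file:///path"))
;; => (t t nil nil)
```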

> Footnotes:
> [1]  https://tools.ietf.org/html/rfc3986#section-3
> [2]  https://tools.ietf.org/html/rfc3986#section-3.3