
Re: https by default



On 2017-07-18 11:24, Katsumi Yamaoka wrote:
> In [emacs-w3m : No.12750]
> On Sun, 16 Jul 2017 07:56:55 -0400, Boruch Baum wrote:
> > I also see that w3m-util.el, line 806, there is defined a constant
> > `w3m-url-fallback-base', set to "http:///", not "https:///".
>
> (w3m-expand-url "example.com") => "http:///example.com"
>
> Well, please let me know how/what/when three slashes are used for?

Looking at the specification[1][2], it seems that three slashes are
intended to indicate a URI (not to be confused with a URL) that doesn't
have a 'global authority', i.e. one that names a local resource, such as
a local file. Thus, without having studied the code, it seems that the
original intent of the variable `w3m-url-fallback-base' was that if a
user entered a URI path?query string, emacs-w3m would correctly fail to
parse it as a URL because it had no 'authority' element[1], and would
fall back to treating it as a local request for http access to the
path?query string. In other words, local html files.
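
To make the 'empty authority' reading concrete, here is a minimal
sketch that splits a URI with the generic regexp from RFC 3986,
Appendix B, transcribed to Emacs regexp syntax. The names `my-uri-re'
and `my-uri-parts' are hypothetical and are not part of emacs-w3m; this
only illustrates the specification, not how emacs-w3m parses URLs.

```elisp
;; Hypothetical helper, not emacs-w3m code.
;; RFC 3986, Appendix B regexp, transcribed to Emacs regexp syntax.
;; Group 2 = scheme, group 4 = authority, group 5 = path.
(defconst my-uri-re
  (concat "\\`\\(\\([^:/?#]+\\):\\)?"   ; scheme
          "\\(//\\([^/?#]*\\)\\)?"      ; authority
          "\\([^?#]*\\)"                ; path
          "\\(\\?\\([^#]*\\)\\)?"       ; query
          "\\(#\\(.*\\)\\)?"))          ; fragment

(defun my-uri-parts (uri)
  "Return (SCHEME AUTHORITY PATH) for URI; an absent part is nil."
  (when (string-match my-uri-re uri)
    (list (match-string 2 uri)
          (match-string 4 uri)
          (match-string 5 uri))))

(my-uri-parts "http://example.com/a")  ; => ("http" "example.com" "/a")
(my-uri-parts "http:///example.com")   ; => ("http" "" "/example.com")
(my-uri-parts "file:///etc/hosts")     ; => ("file" "" "/etc/hosts")
```

With three slashes the authority is present but empty, and what looks
like a host name ends up as the first segment of the path.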

I tried it in Firefox, and it works for scheme "file:", but when I try
it for scheme "http:", Firefox auto-converts it to two slashes instead
of three and fails to find the local file.

Without access to documentation or the perfect memory of the author,
it's hard to know whether this is a bug, and whether it should be either
"file:///" or "http://".

> Some browsers visit https://site if "https:///site" is instructed
> to, but (w3m "https:///site") doesn't.

Strictly speaking, w3m performs correctly in that way. Whether that's
desirable or not, I have no opinion.

> In addition, is it better to change `w3m-url-invalid-regexp' to
> something like "\\`https?:///" ?  The `w3m-url-valid' function
> uses it.  Though I don't know why a given url is considered valid
> by only checking if it does not match `w3m-url-invalid-regexp'.

Yeah. Also, I can't figure out why the leading back-tick is there. Try
these urls, and notice the difference in how emacs-w3m responds:
`http://google.com, `http:///google.com, http:///google.com. For me, the
first two seem to have emacs-w3m accepting the url as valid and
attempting to retrieve it (resulting in an error message in the body of
the w3m buffer), while the third responds with a mini-buffer message
that the url is invalid (and no change to the prior w3m buffer contents).
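
For reference, in Emacs regexp syntax the sequence \` (written "\\`" in
a Lisp string) matches the empty string at the start of the string being
matched, so it acts as an anchor rather than as a literal back-tick. A
minimal sketch using only the built-in `string-match' and the regexp
proposed in the quoted message; the variable name `my-invalid-re' is
hypothetical, and I have not checked what value `w3m-url-invalid-regexp'
currently holds:

```elisp
;; Hypothetical name; holds the regexp proposed in the quoted message.
(defvar my-invalid-re "\\`https?:///")

;; "\\`" anchors the match at the very start of the string.
(string-match my-invalid-re "http:///google.com")    ; => 0
(string-match my-invalid-re "https:///google.com")   ; => 0
(string-match my-invalid-re "http://google.com")     ; => nil
;; A literal back-tick in front of the URL defeats the anchor.
(string-match my-invalid-re "`http:///google.com")   ; => nil
```

If validity is decided only by a string not matching such a regexp,
that would be consistent with the back-tick-prefixed urls above being
accepted while the plain three-slash url is rejected.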

So it looks like buggy code, but are we bold enough to dare change it?

> > On 2017-07-16 07:44, Boruch Baum wrote:
> >> I noticed just now that in w3m.el, function `w3m-canonicalize-url',
> >> line 4574, that http is being used instead of https. Would it be much
> >> of a problem to default to https instead?
>
> I agree with making it default to https.
>
> Thanks.

Footnotes:
[1]  https://tools.ietf.org/html/rfc3986#section-3

[2]  https://tools.ietf.org/html/rfc3986#section-3.3

-- 
hkp://keys.gnupg.net
CA45 09B5 5351 7C11 A9D1  7286 0036 9E45 1595 8BC0