Re: Bug#457909: Changes + to space in URLs

On Mon, Jan 07, 2008 at 06:12:53PM +0900, Katsumi Yamaoka wrote:
>> I regularly use y (w3m-print-current-url) to display the URL at
>> point in the echo area.  I then use Screen's C-a [ (copy) to
>> copy-and-paste the URL into a non-Emacs window running, say, wget.
>> I noticed that for URLs like
>> https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.15/+bug/54419
>> ...the +'s are replaced with spaces, resulting in a *wrong* URL.
> (setq w3m-show-decoded-url nil)

Ah, thanks.  I did not know about this setting.

> The behavior of the function in question (w3m-url-decode-string)
> has never changed basically since it was introduced in August,
> 2001.  I'm not sure whether replacing +'s with spaces is wrong

I still think that converting hex escapes to normal characters is
correct and that converting a + (%2b) to a space (%20) is a Bad Thing,
but it is not worth fighting over, since I rarely use non-ASCII URLs.
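
To illustrate the distinction for later readers, here is a minimal
sketch in Python.  It is not emacs-w3m's code; the standard
urllib.parse functions merely stand in for the two decoding styles
(plain percent-decoding versus form-style decoding that also maps +
to space), and the variable names are my own.

```python
from urllib.parse import unquote, unquote_plus

url = "https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.15/+bug/54419"

# Plain percent-decoding: a literal + in the path is left alone.
path_decoded = unquote(url)

# Form-style decoding: every + becomes a space, which corrupts this URL.
form_decoded = unquote_plus(url)

# Hex escapes are decoded either way: %2B becomes + and %20 becomes a space.
escaped = unquote("a%2Bb%20c")

print(path_decoded)
print(form_decoded)
print(escaped)
```

The first result is identical to the original URL, while the second
has spaces where the +'s were, which is the behaviour reported here.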


Closing the Debian bug.  Further remarks are for posterity: people
reading this bug report in later years.

> but I believe it is done for a user to view with the eyes, not
> for copying a url.

Instead of using screen's C-a [ directly, I should be customizing
interprogram-paste-function and interprogram-cut-function to use
send-string-to-terminal to somehow use screen copy/paste indirectly.
That way simply typing y (w3m-print-current-url) would copy the
non-decoded URL to the screen clipboard.  Then I could leave
w3m-show-decoded-url at t, and the conversion of + to space wouldn't
really bother me.

> The function does not only replacing +'s with spaces but also
> decoding characters that are represented in the hexadecimal form.
> In the later case, it might fail in copying a url displayed in the
> echo area to the other client because of non-ASCII characters.

For the record, as of version 4.0.3, GNU Screen's copy/paste feature
works fine in a UTF-8 environment.  OTOH, the screenshot feature does

