Re: Bug#674744: Wikipedia broke w3m-search-escape-query-string (+ vs. %20)



Hi emacs-w3m developers,

Forwarded from Debian bug#674744 <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=674744>:

On May 27, 2012 at 6:46PM +1000, trentbuck (at gmail.com) wrote:
> Package: w3m-el-snapshot
> Version: 1.4.478+0.20120501-1
> Severity: normal
> 
> I use wikipedia as my default search engine:
> 
>     (setq
>      w3m-search-engine-alist
>      '(("google" "http://encrypted.google.com/search?q=%s")
>        ;; ("google" "http://google.com.au/search?q=%s")
>        ("wikipedia" "https://en.wikipedia.org/wiki/Special:Search/%s")
>        ;; ("wikipedia" "http://en.wikipedia.org/wiki/Special:Search/%s")
>        ("duckduckgo" "https://duckduckgo.com/lite?q=%s"))
>      w3m-search-default-engine "wikipedia")
> 
> Around 15 July 2011, this stopped working properly.  It turned out to
> be because Wikipedia started treating these links differently:
> 
>     https://en.wikipedia.org/wiki/Special:Search/foo+bar
>     https://en.wikipedia.org/wiki/Special:Search/foo%20bar
> 
> At that time, I deployed a workaround in my .emacs:
> 
>     ;;; Guerilla patch -- redefine this to cat using "%20" instead of "+", since
>     ;;; apparently Wikipedia doesn't like the latter anymore.
>     (eval-after-load "w3m-search"
>       '(defun w3m-search-escape-query-string (str &optional coding)
>          (mapconcat
>           (lambda (s)
>             (w3m-url-encode-string s (or coding w3m-default-coding-system)))
>           (split-string str)
>           "%20")))
> 
> ...however this should probably be dealt with upstream.  I do not know
> if simply changing + to %20 is the right way to address this, but I
> haven't had any problems so far.

On July 20, 2013 at 2:19PM +1000, trentbuck (at gmail.com) wrote:
> Although now I look at it, this is simpler:
> 
>     (eval-after-load "w3m-search"
>       '(defun w3m-search-escape-query-string (str &optional coding)
>          (w3m-url-encode-string str (or coding w3m-default-coding-system))))
> 
> Wikipedia *does* give different results for .../foo+bar and .../foo bar
> (encoded as %20 or not).  The former has no exact match, so it goes to
> a results page; the latter goes to a specific article.
> 
> Do any search engines need "+" instead of " "?
> If not, maybe this function should not bother to split and rejoin.
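
For reference, the two encodings under discussion can be sketched as
follows.  This is only an illustration: it uses Emacs's built-in
`url-hexify-string' as a stand-in for `w3m-url-encode-string' (an
assumption, so that it runs without emacs-w3m loaded) and ignores the
coding-system argument.

    ;; Sketch only; `url-hexify-string' stands in for
    ;; `w3m-url-encode-string'.
    (require 'url-util)

    ;; Split on whitespace, encode each word, rejoin with "+".
    (mapconcat #'url-hexify-string (split-string "foo bar") "+")
    ;; => "foo+bar"

    ;; Encode the whole string at once; the space becomes "%20".
    (url-hexify-string "foo bar")
    ;; => "foo%20bar"

Note that the split-and-rejoin form also collapses runs of whitespace
into a single separator, whereas encoding the whole string keeps every
space.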

Thanks,
-- 
Tatsuya Kinoshita