Re: Garbled characters in Location

From: Katsumi Yamaoka <yamaoka@xxxxxxx>
Date: Mon, 22 May 2017 18:47:43 +0900
X-ml-name: emacs-w3m
X-mail-count: 12647
References: <20170219.204047.1851720903885178807.shirai@meadowy.org> <b4mlgt14u3k.fsf@jpl.org> <20170220.233557.1065049133520034814.shirai@meadowy.org>

In [emacs-w3m : No.12630]
On Sun, 19 Feb 2017 20:40:47 +0900, Shirai-san wrote:
> In Emacs 26, the documentation says

>>> (string-make-unibyte STRING)

>>> This function is obsolete since 26.1;
>>> use `encode-coding-string'.

> So that is how it is now...

Stefan Monnier, who presumably wrote that message, made the
following change to rfc2047.el today:
<http://lists.gnu.org/archive/html/emacs-diffs/2017-05/msg00395.html>

-      (string-to-multibyte string))))
+      (if (multibyte-string-p string)
+          string
+        (decode-coding-string string 'us-ascii)))))

I have not verified it carefully, but that is probably the right way
to replace string-(as|make|to)-multibyte.
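
The pattern in that diff can be wrapped in a small helper. This is
only a sketch: the name my-ensure-multibyte is hypothetical and does
not appear in rfc2047.el, and it assumes that a unibyte argument holds
nothing but ASCII.

```elisp
;; Hypothetical helper; the name is an assumption for illustration
;; and is not part of rfc2047.el.
(defun my-ensure-multibyte (string)
  "Return STRING as is if it is multibyte, else decode it as us-ascii."
  (if (multibyte-string-p string)
      ;; Already multibyte: nothing to convert.
      string
    ;; Unibyte: decode explicitly instead of calling the obsolete
    ;; string-to-multibyte family.
    (decode-coding-string string 'us-ascii)))

(my-ensure-multibyte "abc")
```

A multibyte argument is returned untouched, so the helper is cheap to
call on strings that are already in the right representation.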

Inspired by this, I made another attempt at replacing the one
remaining use of string-make-unibyte in emacs-w3m. While doing so I
happened to find the following in the doc string of insert:

If the current buffer is multibyte, unibyte strings are converted
to multibyte for insertion (see ‘string-make-multibyte’).
If the current buffer is unibyte, multibyte strings are converted
to unibyte for insertion (see ‘string-make-unibyte’).

And then:

> My environment these days is Emacs 25.1, and I happened to notice
> that the Location: line at the top of *w3m* was garbled. For example, in

> http://www.google.co.jp/search?q=テスト&lr=lang_ja&hl=ja&oe=UTF-8&tbs=lr:lang_1ja
> https://ja.wikipedia.org/wiki/テスト

> it is the "テスト" part that is garbled.

I found that the root of this garbling lies in
w3m-url-decode-string, the function that decodes
"%E3%83%86%E3%82%B9%E3%83%88" into the string "テスト".

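Until now it decoded each byte such as "%E3" individually and
concatenated the results in a multibyte environment, so only part of
the 3-byte sequence "%E3%83%86" that corresponds to "テ" ended up in
multibyte representation. string-make-unibyte was being used so that
such a string could still be passed through decode-coding correctly.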
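
To illustrate the idea of collecting the raw bytes first and decoding
only once at the end, here is a minimal sketch. It is not the actual
w3m-url-decode-string; the name my-url-decode-string and the default
of utf-8 are assumptions made for illustration.

```elisp
;; Hypothetical sketch, not the emacs-w3m implementation.
(defun my-url-decode-string (url &optional coding)
  "Decode %XX escapes in URL and return the text decoded with CODING.
CODING defaults to utf-8 (an assumption of this sketch)."
  (let* ((coding (or coding 'utf-8))
         ;; Work on bytes so that literal non-ASCII characters in URL
         ;; are handled in the same way as escaped ones.
         (src (encode-coding-string url coding))
         (len (length src))
         (i 0)
         (bytes nil))
    (while (< i len)
      (let ((c (aref src i)))
        (if (and (eq c ?%)
                 (<= (+ i 3) len)
                 (string-match-p "\\`[0-9A-Fa-f][0-9A-Fa-f]\\'"
                                 (substring src (1+ i) (+ i 3))))
            (progn
              ;; "%E3" -> the single byte #xE3.
              (push (string-to-number (substring src (1+ i) (+ i 3)) 16)
                    bytes)
              (setq i (+ i 3)))
          ;; Any other byte is copied through unchanged.
          (push c bytes)
          (setq i (1+ i)))))
    ;; Build one unibyte string and decode it in a single step, so no
    ;; multibyte sequence is ever split across a concat.
    (decode-coding-string (apply #'unibyte-string (nreverse bytes))
                          coding)))

(my-url-decode-string "%E3%83%86%E3%82%B9%E3%83%88") ; => "テスト"
```

Because the bytes stay in one unibyte string until the final
decode-coding-string call, no conversion back to unibyte is needed.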

Again, I have not verified this carefully, but it seems to me that

(string-(as|make|to)-unibyte STRING)

can be replaced with

(with-temp-buffer
  (set-buffer-multibyte nil)
  (insert STRING)
  (buffer-string))

(I am still skeptical of the advice to use `encode-coding-string').
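
Wrapped up as a function, the temp-buffer approach might look like the
following. The name my-string-to-unibyte is made up for this sketch,
and like the form above it has not been checked against every case
that the obsolete functions handle.

```elisp
;; Hypothetical wrapper; the name is an assumption for illustration.
(defun my-string-to-unibyte (string)
  "Return a unibyte copy of STRING made via a unibyte temp buffer."
  (with-temp-buffer
    ;; In a unibyte buffer, insert converts multibyte strings to
    ;; unibyte, as the doc string of insert describes.
    (set-buffer-multibyte nil)
    (insert string)
    (buffer-string)))

;; Raw bytes that were promoted to multibyte come back as plain
;; bytes and can then be decoded normally.
(decode-coding-string
 (my-string-to-unibyte (string-to-multibyte "\343\203\206"))
 'utf-8)
```

The result is a unibyte string, so multibyte-string-p returns nil for
it.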

That aside, in emacs-w3m I tried solving it in a smarter way (CVS).
-- 
Yamaoka
