
Re: shimbun.el: ISO-2022-JP as default Content-type charset?



>>>>> In [emacs-w3m : No.09576] Eugene Oleinik wrote:

> I think I've found a bug or at least something that should be
> made customizable.

> I use Shimbun's `rss-hash' backend to read articles with
> Cyrillic and occasionally CJK characters, almost always
> encoded in utf-8 nowadays.

> With the default setup I get

>   Content-Type: text/html; charset=ISO-2022-JP

> in the header,

This seems to be a vestige of the days when Shimbun supported
only Japanese newspapers.  The default time zone, which is used
when generating a Date header, is another example.

> and thus an unreadable, garbled body, filled with something like

>   ^[$,1(>(b(Z(`(k([(j^[(B ^[$,1(T([(o^[(B ^[$,1(a(U(Q(o^[(B
>   ^[$,1(g(`(U(W(R(k(g(P(Y(](^^[

> When the default charset is changed to utf-8, articles are
> displayed correctly.

This is the right fix, since there currently seems to be no way
to use the charset of the original web page.  I've applied your
patch to the CVS trunk.  Thanks for the contribution.

As a consequence, Emacs 21 and XEmacs 21.4 users who read CJK
pages now have to load the Mule-UCS package.
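
For reference, a minimal sketch of loading it from an init file.
This assumes that Mule-UCS is already installed and on the
`load-path'; `un-define' is the feature name I believe its
documentation gives, so adjust it if your installation differs.

```emacs-lisp
;; In ~/.emacs (Emacs 21) or the XEmacs 21.4 init file.
;; Assumption: the Mule-UCS package is installed and on `load-path'.
;; Loading `un-define' sets up Mule-UCS's Unicode coding systems,
;; including the UTF-8 support for CJK characters.
(require 'un-define)
```

This should come before Shimbun fetches any articles, so that the
UTF-8 coding system is already available when the body is decoded.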

> -    (setq charset "ISO-2022-JP"))
> +    (setq charset "utf-8"))

I used "UTF-8" instead, so that it follows FLIM's convention.