
Re: Coding systems



In [emacs-w3m:12914]
On Sun, 18 Feb 2018 06:42:22 -0500, Boruch Baum wrote:
> Is there a reason at this point in history that the default coding
> systems for emacs-w3m are not utf-8? I see six related variables for
> emacs-w3m, which seems like a lot, but there may be others that I've
> missed.

I think it would be good to make all the ^w3m.+coding-system variables
utf-8, except for the ones that should inherit the system's locale.
When we started emacs-w3m, a coding system such as utf-8 that can
encode/decode any charset was not yet in general use in the Emacs
world, so we used iso-2022-7bit and the like instead.

But we should be very careful if we change those default values.
For instance, even the bleeding-edge Emacs still uses the traditional
japanese-shift-jis as the default coding system for the Japanese
language environment on windows-nt and cygwin, whereas Windows
changed at least the file name coding system to utf-8 long ago.
I mean there are probably still people who use shift_jis on
Windows, and similarly people who still use euc-jp on Unix/Linux.
This is one of the difficulties with the Japanese coding systems.
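
To illustrate the kind of mismatch meant here, the following is a
minimal sketch (the sample string is only an example, not something
taken from emacs-w3m) that encodes text as euc-jp and then decodes the
same bytes once with the matching coding system and once with
shift_jis. Evaluating it in Emacs should show that only the first
result round-trips:

```elisp
;; Sketch: the same byte sequence read with two different Japanese
;; coding systems.  The sample text is an arbitrary illustration.
(let ((bytes (encode-coding-string "日本語" 'euc-jp)))
  (list
   ;; Decoded with the coding system it was written in.
   (decode-coding-string bytes 'euc-jp)
   ;; Decoded with a mismatched default, as a shift_jis user would see it.
   (decode-coding-string bytes 'shift_jis)))
```

A default that is right for one group of users therefore garbles
data written by the other group.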

Here are the settings in my cygwin environment:

$ locale
LANG=C
LC_CTYPE="ja_JP.UTF-8"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

w3m-bookmark-file-coding-system
 euc-japan
w3m-coding-system
 iso-2022-7-bit
w3m-default-coding-system
 shift_jis
w3m-file-coding-system
 iso-2022-7-bit
w3m-form-textarea-file-coding-system
 iso-2022-7-bit-ss2
w3m-input-coding-system
 utf-8
w3m-output-coding-system
 utf-8
w3m-terminal-coding-system
 euc-japan
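
A listing like the one above can be produced with something along
these lines. This is only a sketch: it assumes emacs-w3m is installed
and on `load-path', and the output format is merely an imitation of
the list above:

```elisp
;; Sketch: print every bound variable whose name matches
;; "^w3m.+coding-system", followed by its current value.
(require 'w3m)  ; assumption: emacs-w3m is available on `load-path'
(dolist (sym (sort (apropos-internal "^w3m.+coding-system" #'boundp)
                   #'string-lessp))
  ;; One line for the variable name, one indented line for the value.
  (princ (format "%s\n %s\n" sym (symbol-value sym))))
```

The values it prints depend on the locale and language environment
in effect when emacs-w3m was loaded.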

I don't customize those w3m-* variables, except for the ones whose
values are utf-8, but this causes no problems, at least for the
features I usually use.