
[emacs-w3m/emacs-w3m] Refactor codebase for utf-8 (#9)

The code uses character encodings inconsistently. Perhaps because
parts of the code were written long ago, before utf-8 became a de
facto worldwide standard, other character encodings were set for
various files. The purpose of this commit is to try to make the
project consistently utf-8 throughout.

This commit was originally applied four months ago in the git
repository which I was using for development before the project had
its own official git repository. During the intervening four months it
was publicly available for testing and was the version which I was using.
I received no complaints, and observed nothing suspicious; however, I
don't use Japanese, and I don't know whether anyone else bothered to

In practice, much of the work should be easy to test just by using the
menu system in Japanese, and by using emacs-w3m for the various

The character sets that had been in use included those which the w3
consortium say are to be especially avoided
Most files that declare an explicit encoding are set to iso-2022-7bit,
and the file w3m-bug.el is encoded as 'euc-japan'.
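
As a rough sketch of how the declared encodings could be surveyed (this
is not taken from the commit; the function name declared_coding is
hypothetical, and it assumes the encoding is declared as a "coding:"
cookie on the first line of each file, ignoring any Local Variables
block at the end):

#+BEGIN_SRC sh
# Hypothetical helper: print the coding system named by a "coding:"
# cookie on the first line of a file, or nothing if there is none.
declared_coding() {
    head -n 1 "$1" | LC_ALL=C sed -n 's/.*coding: *\([^ ;]*\).*/\1/p'
}

# List every *.el file in the current directory with its declared coding.
for file in *.el; do
    [ -f "$file" ] || continue
    printf '%s\t%s\n' "$file" "$(declared_coding "$file")"
done
#+END_SRC

Files that print an empty second column declare no encoding on their
first line and would need to be checked by other means.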

Since this was a huge and mind-numbing task, I automated it.

Step 1 was a few sed commands to change the first line of the *.el files.
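
The sed commands themselves are not shown here. A minimal sketch of
what such a command might look like, assuming the encoding is declared
as a "coding:" cookie on the first line (the function name
set_cookie_utf8 and the ".cookie" suffix are hypothetical):

#+BEGIN_SRC sh
# Hypothetical helper: rewrite a "coding:" cookie on line 1 so that it
# declares utf-8, and print the result to standard output. LC_ALL=C
# makes sed treat the input as plain bytes, since the files are not
# yet valid utf-8 at this stage.
set_cookie_utf8() {
    LC_ALL=C sed '1s/coding: *[^ ;]*/coding: utf-8/' "$1"
}

# Write the rewritten text next to each original; nothing is overwritten.
for file in *.el; do
    [ -f "$file" ] || continue
    set_cookie_utf8 "$file" > "$file.cookie"
done
#+END_SRC

Only the first line is touched, so a "coding:" string appearing later
in a file is left alone.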

Step 2 was to run iconv on the files.

#+BEGIN_SRC sh
# -c drops any characters that cannot be converted to the target encoding.
for file in *.el ;
do echo "$file" ;
iconv -c -f iso-2022-jp -t utf-8 "$file" > "$file".new ;
done
#+END_SRC
