
[emacs-w3m/emacs-w3m] Refactor codebase for utf-8 (#9)



The code uses character encodings inconsistently. Perhaps because
parts of the code were written long ago, before utf-8 became a de
facto world-wide standard, other character encodings were set for
various files. The purpose of this commit is to try to make the
project consistently utf-8 throughout.

This commit was originally applied four months ago in the git
repository which I was using for development before the project had its
own official git repository. During the intervening four months it was
publicly available for testing and was the version which I was using.
I received no complaints, and observed nothing suspicious; however, I
don't use Japanese, and I don't know whether anyone else bothered to
test.

In practice, much of the work should be easy to test just by using the
menu system in Japanese, and by using emacs-w3m for the various
shimbuns.

The character sets that had been in use included some which the W3C
says should be especially avoided:
https://www.w3.org/International/questions/qa-choosing-encodings#avoid.
Most files that declare an explicit encoding are set to iso-2022-7bit,
and the file w3m-bug.el is encoded as 'euc-japan'.

Since this was a huge and mind-numbing task, I automated it.

Step 1 was a few sed commands to change the first line of the *.el files.
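
The exact sed commands are not reproduced here, so the following is only a sketch of what such a step could look like. It assumes that the first line of each file carries a cookie of the form "coding: NAME" and rewrites just that line; the helper name set_utf8_cookie is hypothetical.

```sh
# Hypothetical helper: rewrite the coding cookie on line 1 of one file.
# Assumes the cookie looks like "coding: NAME" (e.g. inside -*- ... -*-).
set_utf8_cookie() {
    # Only line 1 is touched; later mentions of "coding:" are left alone.
    sed '1s/coding: *[A-Za-z0-9_-]*/coding: utf-8/' "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

for file in *.el; do
    [ -f "$file" ] || continue   # skip when no *.el files match
    set_utf8_cookie "$file"
done
```

Writing to a temporary file and renaming it avoids relying on "sed -i", whose syntax differs between GNU and BSD sed.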

Step 2 was to run iconv on the files.

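#+BEGIN_SRC sh
# Convert each file to utf-8, writing the result next to it as FILE.new.
# Note: -c makes iconv silently drop characters it cannot convert.
for file in *.el; do
    echo "$file"
    iconv -c -f iso-2022-jp -t utf-8 "$file" > "$file.new"
done
#+END_SRC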
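
Because iconv's -c option discards unconvertible characters without complaint, it may be worth a dry run without -c to see which files would lose data. This is a sketch of such a check, not part of the commit; the helper name converts_cleanly is hypothetical. A file that is not really iso-2022-jp (w3m-bug.el, being euc-japan, would presumably be one) should be reported here and converted with a different -f value.

```sh
# Hypothetical check: succeed only if the whole file converts from
# iso-2022-jp to utf-8 without any invalid or unconvertible input.
converts_cleanly() {
    iconv -f iso-2022-jp -t utf-8 "$1" > /dev/null 2>&1
}

for file in *.el; do
    [ -f "$file" ] || continue   # skip when no *.el files match
    converts_cleanly "$file" || echo "needs attention: $file"
done
```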


You can view, comment on, or merge this pull request online at:

  https://github.com/emacs-w3m/emacs-w3m/pull/9

