Re: Help: using chinese-gbk

Quoting Katsumi Yamaoka <yamaoka@xxxxxxx>:


In [emacs-w3m : No.09362] Jielei Fan wrote:

I browse the Internet using emacs-w3m, which you wrote, but I have
run into a problem. I have installed the mule-gbk package on my
Emacs 22 (on Windows XP), and it works well. However, it does not
work in w3m-mode, where iso-8859-1-dos, gb2312-dos, or some other
coding system is invoked automatically. But if I write this code

;; (setq w3m-bookmark-file-coding-system 'chinese-gbk)
;; (setq w3m-coding-system 'chinese-gbk)
;; (setq w3m-default-coding-system 'chinese-gbk)
;; (setq w3m-file-coding-system 'chinese-gbk)
;; (setq w3m-file-name-coding-system 'chinese-gbk)
;; (setq w3m-terminal-coding-system 'chinese-gbk)
;; (setq w3m-input-coding-system 'chinese-gbk)
;; (setq w3m-output-coding-system 'chinese-gbk)

in my .emacs, web pages are displayed as garbled text.

First of all, you should never need to modify `w3m-input-coding-system'
or `w3m-output-coding-system', at least.  The values of those variables
must be coding systems that the external w3m command supports, and
`utf-8' is a good choice.  If I understand correctly, GBK is a superset
of GB2312, and all of its characters can be expressed in Unicode.

Emacs-w3m fetches an html page as binary data, decodes it
according to the charset that the page specifies[1], encodes it
with a certain coding system[2], and passes it to the external
w3m command.  The external w3m then processes it, encodes the
result with a certain coding system[3], and returns it to
emacs-w3m, which finally decodes it with that same coding system[3].

[1] The charset is specified in the page header or in the meta
tag which looks like:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

The `=' command shows the page header, and the `\' command shows
raw (but charset-decoded) html contents.

[2] The value of `w3m-input-coding-system'.

[3] The value of `w3m-output-coding-system'.
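
The flow described above can be pictured with a rough Emacs Lisp
sketch.  This is only an illustration, not emacs-w3m's actual code:
`my-w3m-flow-sketch' is a hypothetical name, the external w3m command
is replaced by a stand-in that hands the text back unchanged, and
`utf-8' is assumed for both [2] and [3].

```elisp
;; Illustration only; not the real emacs-w3m implementation.
;; `my-w3m-flow-sketch' is a hypothetical name.
(defun my-w3m-flow-sketch (raw-bytes page-coding)
  "Walk RAW-BYTES, a unibyte page body, through the steps above.
PAGE-CODING is the coding system chosen for the page's charset [1]."
  (let* ((in-coding 'utf-8)    ; stands for `w3m-input-coding-system' [2]
         (out-coding 'utf-8)   ; stands for `w3m-output-coding-system' [3]
         ;; [1] Decode the fetched bytes according to the page's charset.
         (text (decode-coding-string raw-bytes page-coding))
         ;; [2] Encode the text before handing it to the external w3m.
         (to-w3m (encode-coding-string text in-coding))
         ;; Stand-in for the external w3m: decode its input, then encode
         ;; its output with [3].  The real command also renders the page.
         (from-w3m (encode-coding-string
                    (decode-coding-string to-w3m in-coding) out-coding)))
    ;; [3] Decode what w3m returned for display in the buffer.
    (decode-coding-string from-w3m out-coding)))
```

If step [1] picks the wrong coding system for a GBK page, the text is
already garbled before the external w3m ever sees it, whatever [2] and
[3] are set to.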

I think the cause of your problem is that emacs-w3m doesn't know
how to find a suitable coding system for the GBK charset, and it
might be solved by adding a proper rule to
`w3m-charset-coding-system-alist'.  Could you point me to a
typical web page that uses the GBK charset?
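
Such a rule might look like the sketch below.  It assumes that
`w3m-charset-coding-system-alist' holds (CHARSET . CODING-SYSTEM)
pairs of symbols with the charset name in lower case, and that a
coding system named `chinese-gbk' is available (for example, from
mule-gbk).  Neither assumption has been verified here.

```elisp
;; A sketch, not a tested fix.  Assumes the alist maps lower-case
;; charset symbols to coding systems, and that `chinese-gbk' exists.
(eval-after-load "w3m"
  '(when (coding-system-p 'chinese-gbk)
     (add-to-list 'w3m-charset-coding-system-alist
                  '(gbk . chinese-gbk))))
```

The `coding-system-p' guard keeps the rule from being added when no
GBK coding system is installed, so the alist never points at a name
Emacs cannot resolve.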

Another problem is that when I use the w3m-search command (with the
Google engine) and enter Chinese characters, the query shows up as
??????? on the Google site. Could you please tell me how to solve it?

Well, it will probably be solved if all GBK pages are displayed correctly.


I think I should set `w3m-charset-coding-system-alist' to support Chinese
and then set the variable `w3m-terminal-coding-system', but I don't know
how to do that.  I would like to be able to search in Chinese in search
engines.

Best regards,