Czech characters displayed incorrectly
- From: niels.giesen@xxxxxxxxx
- Date: Sat, 26 Jan 2008 09:27:51 +0100
- X-ml-name: emacs-w3m
- X-mail-count: 09975
This bug report will be sent to the emacs-w3m development team,
not to your local site managers!!
Please write in simple English, because the emacs-w3m developers
aren't good at English reading. ;-)
Please describe as succinctly as possible:
- What happened.
- What you thought should have happened.
- Precisely what you were doing at the time.
Please also include any Lisp back-traces that you may have.
================================================================
Dear Bug Team!
There is a problem with Czech characters in emacs-w3m. For instance, the sequence říň
(in case it is garbled, this means r with a háček, long i, and n with a háček) is
displayed incorrectly. When I retrieve the page with wget and then visit it as a file,
it is displayed correctly. So this is the expected behaviour:
character: ř (331897, #o1210171, #x51079, U+0159)
charset: mule-unicode-0100-24ff (Unicode characters of the range U+0100..U+24FF.)
code point: #x20 #x79
syntax: w which means: word
category: l:Latin
buffer code: #x9C #xF4 #xA0 #xF9
file code: #xC5 #x99 (encoded by coding system mule-utf-8)
display: by this font (glyph code)
-Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO10646-1 (#x159)
character: í (2285, #o4355, #x8ed, U+00ED)
charset: latin-iso8859-1 (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100.)
code point: #x6D
syntax: w which means: word
category: l:Latin
buffer code: #x81 #xED
file code: #xC3 #xAD (encoded by coding system mule-utf-8)
display: by this font (glyph code)
-Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO8859-1 (#xED)
character: ň (331880, #o1210150, #x51068, U+0148)
charset: mule-unicode-0100-24ff (Unicode characters of the range U+0100..U+24FF.)
code point: #x20 #x68
syntax: w which means: word
category: l:Latin
buffer code: #x9C #xF4 #xA0 #xE8
file code: #xC5 #x88 (encoded by coding system mule-utf-8)
display: by this font (glyph code)
-Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO10646-1 (#x148)
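As a cross-check of the three dumps above, the following minimal Python sketch compares each character's UTF-8 encoding with the "file code" bytes listed for it. The table name `expected` and the helper `utf8_matches` are made up for this illustration; they are not part of Emacs or emacs-w3m.

```python
# Characters from the dumps above, paired with the "file code" bytes
# that each dump lists for the mule-utf-8 coding system.
expected = {
    "ř": bytes([0xC5, 0x99]),  # U+0159
    "í": bytes([0xC3, 0xAD]),  # U+00ED
    "ň": bytes([0xC5, 0x88]),  # U+0148
}


def utf8_matches(table):
    """Return True if every character encodes to its listed UTF-8 bytes."""
    return all(ch.encode("utf-8") == raw for ch, raw in table.items())


if __name__ == "__main__":
    for ch, raw in expected.items():
        encoded = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
        listed = " ".join(f"{b:02X}" for b in raw)
        print(f"U+{ord(ch):04X}  encoded: {encoded}  listed: {listed}")
    print("all match:", utf8_matches(expected))
```

If the bytes agree, the file fetched with wget holds plain, correctly encoded UTF-8, which would suggest the problem arises somewhere between w3m's output and the Emacs buffer rather than in the page itself.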
In emacs-w3m I got instead the following:
character: SPC (32, #o40, #x20, U+0020)
charset: ascii (ASCII (ISO646 IRV))
code point: #x20
syntax: which means: whitespace
category:
a:ASCII graphic characters 32-126 (ISO646 IRV:1983[4/0])
l:Latin
Properties: jisx0208: 53409;
buffer code: #x20
file code: #x20 (encoded by coding system utf-8)
display: by this font (glyph code)
-Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO8859-1 (#x20)
character: ⊭ (343277, #o1236355, #x53ced, U+22AD)
charset: mule-unicode-0100-24ff
(Unicode characters of the range U+0100..U+24FF.)
code point: #x79 #x6D
syntax: w which means: word
buffer code: #x9C #xF4 #xF9 #xED
file code: #xE2 #x8A #xAD (encoded by coding system utf-8)
display: by this font (glyph code)
-Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO10646-1 (#x22AD)
character: SPC (32, #o40, #x20, U+0020)
charset: ascii (ASCII (ISO646 IRV))
code point: #x20
syntax: which means: whitespace
category:
a:ASCII graphic characters 32-126 (ISO646 IRV:1983[4/0])
l:Latin
Properties: jisx0208: 53409;
buffer code: #x20
file code: #x20 (encoded by coding system utf-8)
display: by this font (glyph code)
-Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO8859-1 (#x20)
and
character: (232, #o350, #xe8)
charset: eight-bit-graphic (8-bit graphic char (0xA0..0xFF))
code point: #xE8
syntax: which means: whitespace
buffer code: #xE8
file code: #xE8 (encoded by coding system utf-8)
display: by this font (glyph code)
-Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO8859-1 (#xE8)
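One possible reading of the four wrong characters, offered only as a hypothesis: the code points from the correct dumps (#x20 #x79, #x6D, #x20 #x68) appear to have been grouped one byte off. The Python sketch below does the arithmetic. The function `mule_unicode_0100_24ff` and its formula are an assumption, chosen because they reproduce the code point to Unicode pairs quoted in this report; they are not taken from the Emacs sources.

```python
def mule_unicode_0100_24ff(c1, c2):
    """Map a two-byte code point of a 96x96 charset that starts at U+0100.

    Assumed layout: (c1 - 32) * 96 + (c2 - 32) + 0x100. It reproduces the
    mule-unicode-0100-24ff dumps quoted in this report.
    """
    return (c1 - 32) * 96 + (c2 - 32) + 0x100


# Seven-bit code points in the order they appear in the correct dumps.
stream = [0x20, 0x79, 0x6D, 0x20, 0x68]

# Correct grouping: (0x20 0x79), 0x6D as a Latin-1 right-half byte, (0x20 0x68).
correct = [
    mule_unicode_0100_24ff(stream[0], stream[1]),
    stream[2] | 0x80,
    mule_unicode_0100_24ff(stream[3], stream[4]),
]

# Shifted grouping: 0x20 alone, (0x79 0x6D) as a pair, 0x20 alone,
# and 0x68 with the high bit set.
shifted = [
    stream[0],
    mule_unicode_0100_24ff(stream[1], stream[2]),
    stream[3],
    stream[4] | 0x80,
]

if __name__ == "__main__":
    print("correct:", [hex(v) for v in correct])
    print("shifted:", [hex(v) for v in shifted])
```

Under this assumption the correct grouping gives U+0159, #xED and U+0148, while the shifted grouping gives SPC, U+22AD, SPC and #xE8, which are the four characters shown in the emacs-w3m buffer. This only shows that the output is consistent with a misaligned charset designation; it does not identify where the misalignment happens.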
I checked the same page in stand-alone w3m, which displayed everything correctly. Please
note that these are not the only Czech characters displayed incorrectly. If you would
like me to provide the whole alphabet, I shall. One strange thing I saw is that some
other characters seemed to be displayed as Thai characters. Please let me know if you
want more info.
Regards,
Niels Giesen
niels.giesen@xxxxxxxxx
================================================================
System Info to help track down your bug:
---------------------------------------
emacs-w3m-version
=> "1.4.4"
emacs-version
=> "GNU Emacs 22.1.1 (i486-pc-linux-gnu, X toolkit, Xaw3d scroll bars)\n of 2007-11-06 on terranova, modified by Ubuntu"
mule-version
=> "5.0 (SAKAKI)"
system-type
=> gnu/linux
w3m-version
=> "w3m/0.3.2+mee-p24-19+moe-1.5.0"
w3m-type
=> w3mmee
w3m-compile-options
=> ("lang=many" "kanji-symbols" "image" "color" "ansi-color" "mouse" "menu" "cookie" "ssl" "ssl-verify" "w3mmailer" "nntp" "gopher" "ipv6" "mark" "romaji")
w3m-language
=> nil
w3m-command-arguments
=> ("-o" "concurrent=0" "-F")
w3m-command-arguments-alist
=> nil
w3m-command-environment
=> (("W3MLANG" . "ja_JP.kterm"))
w3m-input-coding-system
=> ctext
w3m-output-coding-system
=> ctext
w3m-use-mule-ucs
=> nil