
w3m-detect-meta-charset considering html comments



Dear Bug Team!

While debugging an encoding issue on a page, I found the following:

When no content-charset is defined by the headers, (w3m-decode-buffer)
calls (w3m-detect-meta-charset). That function, however, also matches
meta tags that are commented out, such as:
<!--<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />-->

As a result, the wrong charset is set on the content.

I worked around the issue with:
(advice-add #'w3m-detect-meta-charset :before #'w3m-remove-comments)

Please add a call to (w3m-remove-comments) wherever you see fit (for
example in w3m-detect-meta-charset or w3m-decode-buffer).
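
In case it helps, here is a minimal sketch of the idea that leaves the
buffer untouched: copy the text, strip the comments from the copy, and
only then look for a charset declaration. The function name
my/meta-charset-ignoring-comments is hypothetical and not part of
emacs-w3m, and the regexp is a simplification I made up for
illustration, not the one emacs-w3m actually uses.

```elisp
;; Hypothetical helper for illustration only; not part of emacs-w3m.
(defun my/meta-charset-ignoring-comments ()
  "Return the charset named in the first uncommented <meta> tag, or nil.
The current buffer is left unmodified."
  (let ((html (buffer-string)))
    (with-temp-buffer
      (insert html)
      (let ((case-fold-search t))
        ;; Drop every <!-- ... --> block from the copy first.
        (goto-char (point-min))
        (while (re-search-forward "<!--" nil t)
          (let ((start (match-beginning 0)))
            (delete-region start
                           ;; An unterminated comment runs to end of buffer.
                           (if (search-forward "-->" nil t)
                               (point)
                             (point-max)))))
        ;; Search what is left for a charset declaration.
        (goto-char (point-min))
        (when (re-search-forward
               "<meta[^>]*charset=[\"']?\\([^\"' \t\n/>;]+\\)" nil t)
          (match-string 1))))))
```

In a buffer that contains the commented iso-8859-1 tag above followed
by a real <meta charset="utf-8"> tag, this should pick up the
uncommented one. Working on a copy is slower than my advice, but it
avoids changing the buffer before decoding.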

Thank you very much for such a great package, and keep up the good
work!


-- 
Att.,
Renato Ferreira



================================================================

System Info to help track down your bug:
---------------------------------------
emacs-w3m-version
 => "1.4.606"
emacs-version
 => "GNU Emacs 25.3.1 (x86_64-pc-linux-gnu, GTK+ Version 3.22.19)\n of 2017-09-16"
mule-version
 => "6.0 (HANACHIRUSATO)"
system-type
 => gnu/linux
(featurep 'gtk)
 => t
w3m-version
 => "w3m/0.5.3+git20170827"
w3m-type
 => w3m-m17n
w3m-compile-options
 => ("lang=en" "m17n" "image" "color" "ansi-color" "mouse" "gpm" "menu" "cookie" "ssl" "ssl-verify" "external-uri-loader" "nntp" "ipv6" "alarm" "mark")
w3m-language
 => nil
w3m-command-arguments
 => nil
w3m-command-arguments-alist
 => nil
w3m-command-environment
 => (("LC_ALL" . "C"))
w3m-input-coding-system
 => utf-8
w3m-output-coding-system
 => utf-8
w3m-use-mule-ucs
 => nil