Re: Google broken
- From: Katsumi Yamaoka <yamaoka@xxxxxxx>
- Date: Mon, 28 May 2012 08:41:04 +0900
- X-ml-name: emacs-w3m
- X-mail-count: 11831
- References: <CAJcAo8smRS-FrmZXP5UmYQ9ym+kmsz_aCH1usEsSmVOSRYt0Zg@xxxxxxxxxxxxxx> <b4mobwm9n1g.fsf@xxxxxxx> <CAJcAo8vmSana6SSqEpwWuO8A_KTNTe_JpOMs3qyOZ_Zgtyx8OA@xxxxxxxxxxxxxx> <b4m1um83ket.fsf@xxxxxxx> <CAJcAo8vAAVuHDj5CJPUtkMJKTPNf_VfPtFojh9Pt80_QcK6HYA@xxxxxxxxxxxxxx>
In [emacs-w3m : No.11830] Samuel Wales wrote:
> On 5/24/12, Katsumi Yamaoka <yamaoka@xxxxxxx> wrote:
>> Could you let me know a search word that reproduces the problem?
> I think it is every search.
>> Or the url of the search result page? Since Google appears to
> http://www.google.com/search%3Fq%3Dkatsumi%2Bjpl%26btnG%3DSearch%26oe%3Dutf-8
> With your filter and (setf w3m-fill-column 50):
Ah, thanks. I realized that having the filter remove <br>s and
trailing whitespace was overkill. The new filter preserves a space
at the line-break point. In addition, it makes the text easier to
read by separating ASCII and non-ASCII words with a space and by
inserting a space after a comma.
--8<---------------cut here---------------start------------->8---
(setq w3m-use-filter t)
(require 'w3m-filter)
(when (rassoc '(w3m-filter-google) w3m-filter-rules)
  (setcdr (rassoc '(w3m-filter-google) w3m-filter-rules)
          '(w3m-filter-google-2)))
(defun w3m-filter-google-2 (url)
  "Align table columns vertically to shrink the table width."
  (let ((case-fold-search t)
        last)
    (goto-char (point-min))
    (while (re-search-forward "<tr[\t\n\r >]" nil t)
      (when (w3m-end-of-tag "tr")
        (save-restriction
          (narrow-to-region (goto-char (match-beginning 0))
                            (match-end 0))
          (setq last nil)
          (while (re-search-forward "<td[\t\n\r >]" nil t)
            (when (w3m-end-of-tag "td")
              (setq last (match-end 0))
              (replace-match "<tr>\\&</tr>")))
          (when last
            (goto-char (+ 4 last))
            (delete-char 4))
          (goto-char (point-max)))))
    ;; Remove width spec and <br>s.
    (goto-char (point-min))
    (while (re-search-forward "<table[\t\n\r >]" nil t)
      (when (w3m-end-of-tag "table")
        (save-restriction
          (narrow-to-region (goto-char (match-beginning 0))
                            (match-end 0))
          (while (re-search-forward
                  "[\t\n\r ]*\\(?:width=\"[^\"]+\"\\|<br>\\)[\t\n\r ]*"
                  nil t)
            ;; Preserve a space at the line-break point.
            (replace-match " "))
          ;; Insert a space between ASCII and non-ASCII characters
          ;; and after a comma.
          (goto-char (point-min))
          (while (re-search-forward "\
\\([!-~]\\)\\([^ -~]\\)\\|\\([^ -~]\\)\\([!-~]\\)\\|\\(,\\)\\([^ ]\\)"
                                    nil t)
            (replace-match (cond ((match-beginning 1)
                                  "\\1 \\2")
                                 ((match-beginning 3)
                                  "\\3 \\4")
                                 (t
                                  "\\5 \\6"))))
          (goto-char (point-max)))))))
--8<---------------cut here---------------end--------------->8---