[UPDATE] Re: Filters for stackexchange and youtube [SNIPPET]
New version of the snippet attached, but see my notes first.
On 2018-06-07 16:03, Katsumi Yamaoka wrote:
> In [emacs-w3m:12999]
> On Fri, 01 Jun 2018 08:58:55 -0400, Boruch Baum wrote:
> > I have filter functions for stackexchange and for youtube, and since
> > those are very commonly used site, I thought to share it for
> > consideration to be added to w3m-filter.el
>
> Because
>
> (byte-compile (lambda () (replace-regexp "REGEXP" "TO-STRING")))
> => Warning: `replace-regexp' is for interactive use only;
> use `re-search-forward' and `replace-match' instead.
>
> , I've temporarily replaced it with a function that uses
> `perform-replace' in `w3m-filter-stackexchange' locally. Anyway,
> it's better to try ``make clean lisp'' before posting a patch.
>
1] My experience today is that your proposed solution broke the filter:
function `perform-replace' is itself a part of `replace-regexp', and it
was prompting the user repeatedly for responses, something that wasn't
happening to me with `replace-regexp'.
2] The solution re-coded per the recommendation in the compiler
warning was even worse, due to some side-effect of how emacs-w3m is
structured. For some reason, a subset of the search `while' loops
would generate an "error in process sentinel: while: Search failed:".
3] The attached snippet thus includes a comment asking myself or someone else
to track down whatever process sentinel is responsible, and fix the
error. For the interim, I found myself doing what I was alarmed at
seeing so often elsewhere in the code-base, and one of the things that
I started removing in the refactoring: I wrapped the code in a
`condition-case' statement to suppress the error!
3.1] For all I know, this may be exactly how many / most / all those
other `condition-case' statements leaked into the code!
4] It would have been simpler to just live with the compiler warnings,
since they are just warnings, and keep the original version, but it is
better to be strict / formal and not generate any warnings at all, so
attached here is the version with the `condition-case', byte-compiled
with no errors.
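
For reference, here is a minimal sketch of the loop that the compiler
warning recommends. The name `demo-replace-all' is hypothetical and not
part of emacs-w3m. One thing that may be worth checking before hunting
for a process sentinel: when the third argument (NOERROR) of
`re-search-forward' is nil, a failed search signals a `search-failed'
error, so a `while' loop written that way ends with an error once the
last match has been replaced. Passing a non-nil NOERROR makes the search
return nil instead. The sketch also keeps the end of the region in a
marker, on the assumption that replacements change the buffer length.

```elisp
;; Hypothetical helper for illustration only; not part of emacs-w3m.
(defun demo-replace-all (regexp to-string &optional start end)
  "Replace every match of REGEXP with TO-STRING between START and END.
START defaults to the beginning of the buffer and END to its end.
Return the number of replacements made.
REGEXP is assumed never to match the empty string."
  (let ((count 0)
        ;; A marker follows the text as replacements shrink or grow it;
        ;; a plain integer position would go stale.
        (bound (and end (copy-marker end))))
    (save-excursion
      (goto-char (or start (point-min)))
      ;; Non-nil NOERROR: a failed search returns nil and ends the
      ;; loop quietly instead of signalling `search-failed'.
      (while (re-search-forward regexp bound t)
        (replace-match to-string)
        (setq count (1+ count))))
    ;; Detach the marker so it no longer slows down buffer edits.
    (when bound (set-marker bound nil))
    count))
```

Under these assumptions, a call such as
(demo-replace-all "</?div[^>]*>" "" p1 p2) would stand in for one of the
`goto-char' / `while' pairs in the attached snippet, and a search that
finds nothing simply returns 0.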
--
hkp://keys.gnupg.net
CA45 09B5 5351 7C11 A9D1 7286 0036 9E45 1595 8BC0
(defun w3m-filter-stackexchange (url)
  "Filter top and bottom cruft for stackexchange.com."
  (w3m-filter-delete-regions url
                             "<body.*>" "<h1.*>" t t t nil nil 1)
  (w3m-filter-delete-regions url
                             "<h2 class=\"space\">Your Answer</h2>" "<h4 id=\"h-related\">Related</h4>"
                             nil t nil nil nil 1)
  (w3m-filter-delete-regions url
                             "<div id=\"hot-network-questions\" class=\"module tex2jax_ignore\">" "</body>"
                             nil t nil nil nil 1)
  ;; (when (search-forward "<table>" nil t)
  ;;   (replace-match ""))
  (goto-char (point-min))
  (w3m-filter-delete-regions url
                             "<a class=\"vote-[ud]"
                             "</a>" nil nil t (point))
  (goto-char (point-min))
  (w3m-filter-delete-regions url
                             "<a class=\"star-off"
                             "</td>")
  (w3m-filter-replace-regexp url
                             "<span itemprop=\"upvoteCount[^>]+>"
                             "Votes: ")
  (w3m-filter-replace-regexp url
                             "<div class=\"post-text[^>]+>"
                             "<blockquote>")
  (w3m-filter-replace-regexp url
                             "<div class=\"post-taglist[^>]+>"
                             "</blockquote>")
  (w3m-filter-delete-regions url
                             "<a name='new-answer'>"
                             "</form>" nil nil nil nil nil 1)
  (w3m-filter-replace-regexp url
                             "<div class=\"spacer\">[^>]+>[^>]+>+?\\([0-9]+\\)</div></a>"
                             "\\1 ")
  (w3m-filter-delete-regions url
                             "<td class=\"vt\">"
                             "</td>")
  (goto-char (point-min))
  ;; TODO: FIXME: The following condition-case is a kludge because when
  ;; the `re-search-forward' statements were not finding anything, we
  ;; were getting "error in process sentinel: while: Search failed:". The
  ;; proper solution is likely in the process sentinel (whichever one
  ;; that turns out to be), not here.
  (condition-case nil
      (while (search-forward "<div class=\"user-info \">" nil t)
        (let ((p1 (match-end 0))
              (p2 (if (search-forward "<li" nil t)
                      (match-beginning 0)
                    (point-max))))
          (w3m-filter-delete-regions url
                                     "<div class=\"user-details\">" "</a>" nil nil nil p1 p2)
          (goto-char p1)
          (while (re-search-forward "</?div[^>]*>" p2 nil)
            (replace-match ""))
          (goto-char p1)
          (while (re-search-forward "<span class=\"reputation-score[^>]*>" p2 nil)
            (replace-match "[rep:"))
          (goto-char p1)
          (while (re-search-forward "<span class=\"badge1\">" p2 nil)
            (replace-match "] [gold:"))
          (goto-char p1)
          (while (re-search-forward "<span class=\"badge2\">" p2 nil)
            (replace-match "] [silver:"))
          (goto-char p1)
          (while (re-search-forward "<span class=\"badge3\">" p2 nil)
            (replace-match "] [bronze:"))
          (goto-char p1)
          (while (re-search-forward "</?span[^>]*>" p2 nil)
            (replace-match ""))))
    (error))
  (w3m-filter-replace-regexp url
                             "<td" "<td valign=top")
  (w3m-filter-delete-regions url
                             "<div id=\"tabs\">"
                             "<a name" nil t)
  (goto-char (point-min))
  (while (search-forward "<div id=\"answer-" nil t)
    (replace-match "</ul><hr>\\&"))
  (w3m-filter-delete-regions url
                             "<div id=\"comments-link"
                             "</div>")
  (goto-char (point-min))
  (when (search-forward "<h4 id=\"h-linked\">Linked</h4>" nil t)
    (replace-match "<p><b>Linked</b><br>")
    (let ((p1 (match-end 0))
          (p2 (progn
                (search-forward "<h4" nil t)
                (match-beginning 0))))
      (goto-char p1)
      (while (re-search-forward "^\t</a>" p2 nil)
        (replace-match ""))
      (goto-char p1)
      (while (re-search-forward "</a>" p2 nil)
        (replace-match "</a><br>"))
      (goto-char p1)
      (while (re-search-forward "</div>" p2 nil)
        (replace-match " "))
      (w3m-filter-delete-regions url
                                 "<div class=\"spacer\">"
                                 "<div class=\"answer-votes answered-accepted [^>]+>"
                                 nil nil t)))
  (goto-char (point-min))
  (when (search-forward "<table id=\"qinfo\">" nil t)
    (replace-match "")
    (let ((p1 (match-end 0))
          (p2 (progn
                (search-forward "</table>" nil t)
                (replace-match "")
                (match-end 0))))
      (w3m-filter-replace-regexp url "<tr>" "" p1 p2)
      (w3m-filter-replace-regexp url "</tr>" "<br>" p1 p2)
      (w3m-filter-replace-regexp url "</?td[^>]*>" "" p1 p2)
      (w3m-filter-replace-regexp url "<b>" "" p1 p2)
      (w3m-filter-replace-regexp url "<a[^>]+>" "" p1 p2)
      (w3m-filter-replace-regexp url "</?p[^>]*>" "" p1 p2))))