[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: sb-atom and sb-rss fetching already deleted articles
- From: David Engster <deng@xxxxxxxxxxxxxxx>
- Date: Thu, 25 Feb 2010 22:00:34 +0100
- X-ml-name: emacs-w3m
- X-mail-count: 11162
- References: <m2bpfnlx4l.fsf@xxxxxxxxxxxxxxxxxx> <b4m1vgb9chu.fsf@xxxxxxx> <87ocjen8di.fsf@xxxxxxxxxxxxxxx> <b4m1vga9i9r.fsf@xxxxxxx>
Katsumi Yamaoka <yamaoka@xxxxxxx> writes:
>>>>>> In [emacs-w3m : No.11159] David Engster wrote:
>> That's perfectly fine for me. Another solution might be to compare the
>> date headers of the entries: get all entries of the feed which are not
>> present in the group, but also check their date header if they are
>> actually newer than the ones which are already there. If an entry is
>> older, we can assume it was already expired?
>
> I'm not familiar with wl/elmo code nowadays, but AFAICT, wl/elmo
> doesn't seem to have the one like nov that keeps date information.
> So, I've made changes in sb-rss.el and sb-rss-blogs.el in the way
> I wrote last. sb-atom.el is unchanged.
I've implemented the approach I quoted above and attached a patch
against current CVS. The advantage of this approach is that it works
more generally, without the need for an extra parameter. So far it has
worked for me, but I obviously couldn't test some of the special cases
for 'hankaku' which are implemented there.
Maybe you could take a look at the patch and see if it works for you. I
could implement the same for sb-atom.el if you should decide to apply
it.
Regards,
David
? sb-slashdot-patch.diff
Index: sb-rss-blogs.el
===================================================================
RCS file: /storage/cvsroot/emacs-w3m/shimbun/sb-rss-blogs.el,v
retrieving revision 1.3
diff -u -r1.3 sb-rss-blogs.el
--- sb-rss-blogs.el 25 Feb 2010 02:11:10 -0000 1.3
+++ sb-rss-blogs.el 25 Feb 2010 20:53:09 -0000
@@ -80,7 +80,7 @@
shimbun-rss-blogs)
&optional range)
(let ((group (shimbun-current-group-internal shimbun))
- (headers (shimbun-rss-get-headers shimbun range t nil t))
+ (headers (shimbun-rss-get-headers shimbun range t))
(type (sb-rss-blogs-guess-type-from-rss))
from)
(cond
Index: sb-rss.el
===================================================================
RCS file: /storage/cvsroot/emacs-w3m/shimbun/sb-rss.el,v
retrieving revision 1.50
diff -u -r1.50 sb-rss.el
--- sb-rss.el 25 Feb 2010 02:11:10 -0000 1.50
+++ sb-rss.el 25 Feb 2010 20:53:09 -0000
@@ -173,8 +173,7 @@
(shimbun-rss-get-headers shimbun range t))
(defun shimbun-rss-get-headers (shimbun &optional range
- need-descriptions need-all-items
- quit-immediately)
+ need-descriptions need-all-items)
(let ((xml (condition-case err
(xml-parse-region (point-min) (point-max))
(error
@@ -182,14 +181,55 @@
(shimbun-index-url shimbun)
(error-message-string err))
nil)))
- (ignored-subject (luna-slot-value shimbun 'ignored-subject))
- dc-ns rss-ns author hankaku headers)
- (when xml
- (setq dc-ns (shimbun-rss-get-namespace-prefix
- xml "http://purl.org/dc/elements/1.1/")
- rss-ns (shimbun-rss-get-namespace-prefix
- xml "http://purl.org/rss/1.0/")
- author
+ headers newheaders newestdate)
+ (setq headers
+ (shimbun-rss-get-headers-1 xml shimbun need-descriptions))
+ (when headers
+ (if need-all-items
+ headers
+ ;; Search for the newest item we already have in the shimbun.
+ (dolist (header headers newestdate)
+ (let* ((date (shimbun-header-date header))
+ (id (shimbun-header-id header))
+ datesecs)
+ (when (and (stringp date)
+ (> (length date) 0))
+ (setq datesecs (float-time (date-to-time date))))
+ (if (and datesecs
+ (shimbun-search-id shimbun id)
+ (or (null newestdate)
+ (< newestdate datesecs)))
+ (setq newestdate datesecs)
+ (push header newheaders))))
+ (if (null newestdate)
+ ;; Feed does not have any date information, so we return everything.
+ newheaders
+ ;; Return only those items which are newer.
+ (delq nil
+ (mapcar
+ (lambda (header)
+ (let ((date (shimbun-header-date header))
+ datesecs)
+ (when (and (stringp date)
+ (> (length date) 0))
+ (setq datesecs (float-time (date-to-time date))))
+ (when (or (null datesecs)
+ (<= newestdate datesecs))
+ header)))
+ newheaders)))))))
+
+
+(defun shimbun-rss-get-headers-1 (xml shimbun need-descriptions)
+ "Retrieve all items found in XML for SHIMBUN and return headers.
+If NEED-DESCRIPTIONS, include node text as description."
+ (when xml
+ (let ((dc-ns (shimbun-rss-get-namespace-prefix
+ xml "http://purl.org/dc/elements/1.1/"))
+ (rss-ns (shimbun-rss-get-namespace-prefix
+ xml "http://purl.org/rss/1.0/"))
+ (ignored-subject (luna-slot-value shimbun 'ignored-subject))
+ author hankaku headers)
+ (setq author
(catch 'found-author
(dolist (channel
(shimbun-rss-find-el (intern (concat rss-ns "channel"))
@@ -203,29 +243,21 @@
'(body nil))
(generate-new-buffer " *temp*")))
(unwind-protect
- (catch 'done
- (dolist (item (shimbun-rss-find-el (intern (concat rss-ns "item"))
- xml)
- headers)
- (let ((url (and (listp item)
- (eq (intern (concat rss-ns "item")) (car item))
- (shimbun-rss-node-text rss-ns 'link (cddr item)))))
- (when url
- (let* ((date (or (shimbun-rss-get-date shimbun url)
- (shimbun-rss-node-text dc-ns 'date item)
- (shimbun-rss-node-text rss-ns 'pubDate item)))
- (id (shimbun-rss-build-message-id shimbun url date))
- (subject (shimbun-rss-node-text rss-ns 'title item)))
- (when (and id
- (or need-all-items
- (if (shimbun-search-id shimbun id)
- (if quit-immediately
- (throw 'done headers)
- nil)
- t))
- (if (and ignored-subject subject)
- (not (string-match ignored-subject subject))
- t))
+ (dolist (item (shimbun-rss-find-el (intern (concat rss-ns "item"))
+ xml)
+ headers)
+ (let ((url (and (listp item)
+ (eq (intern (concat rss-ns "item")) (car item))
+ (shimbun-rss-node-text rss-ns 'link (cddr item)))))
+ (when url
+ (let* ((date (or (shimbun-rss-get-date shimbun url)
+ (shimbun-rss-node-text dc-ns 'date item)
+ (shimbun-rss-node-text rss-ns 'pubDate item)))
+ (id (shimbun-rss-build-message-id shimbun url date))
+ (subject (shimbun-rss-node-text rss-ns 'title item)))
+ (when id
+ (unless (and ignored-subject subject
+ (string-match ignored-subject subject))
(push
(shimbun-create-header
0