
Re: Proposal to change shimbun-rss-build-message-id



Katsumi Yamaoka <yamaoka@xxxxxxx> writes:
>>>>>> In [emacs-w3m : No.10876] David Engster wrote:
> [...]
>> Maybe we are talking about different things here.
>
>> I thought shimbun-rss-build-message-id takes URLs which are published
>> through RSS feeds. Those do not contain session IDs.
>
> Tsuyoshi CHO wrote in [emacs-w3m:10063] that URLs are likely to
> contain not only session IDs but also meaningless queries or
> tracking ones such as ?rss.  Those produce different IDs for a
> single article.  So he agreed that the default behavior of
> `shimbun-rss-build-message-id' should be that of sb-rss.el.

Thank you for the explanation.

>> I have the problem that I read RSS feeds through rss-hash which publish
>> URLs like
>
>> index.php?id=256&tx_ttnews%5Btt_news%5D=214&cHash=717f4649d4
>> index.php?id=256&tx_ttnews%5Btt_news%5D=213&cHash=498f45accb
>
>> and so on. This is a feed generated by tt_news, a widely used plugin for
>> Typo3. With the current default implementation of
>> shimbun-rss-build-message-id, the generated MIDs are not unique.
>
>>> So, I implemented the version of `shimbun-rss-build-message-id'
>>> that strips nothing in sb-mainichi.el, and in sb-nytimes.el
>>> afterward.
>
>> OK. Then I would suggest overriding shimbun-rss-build-message-id for
>> rss-hash (and probably also atom-hash) with the version that doesn't
>> strip everything after '?'.
>
> Though I'm not an expert on that, I believe it's harmless.

If that is the case and nobody objects, maybe the attached patch could
be applied.

(sb-atom-hash is not affected by this, since it already contains its own
shimbun-atom-build-message-id.)
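
To make the collision concrete, here is a minimal standalone sketch. It
assumes, as discussed above, that the default strips the URL at the first
'?'; the exact default regexp is not reproduced here. The helper names
my-mid-strip-at and my-mid-build and the group name "news" are hypothetical
and not part of shimbun; only built-in string functions and `md5' are used.

```elisp
;; Hypothetical helpers for illustration only; not part of shimbun.
(defun my-mid-strip-at (regexp url)
  "Return URL truncated at the first match of REGEXP, or URL unchanged."
  (if (string-match regexp url)
      (substring url 0 (match-beginning 0))
    url))

(defun my-mid-build (url strip-regexp group)
  "Build a Message-ID for GROUP from URL truncated at STRIP-REGEXP."
  (concat "<" (md5 (my-mid-strip-at strip-regexp url))
          "." group "@rss-blogs>"))

(let ((a "index.php?id=256&tx_ttnews%5Btt_news%5D=214&cHash=717f4649d4")
      (b "index.php?id=256&tx_ttnews%5Btt_news%5D=213&cHash=498f45accb"))
  (list
   ;; Truncating at "?" or "#" reduces both URLs to "index.php",
   ;; so both articles get the same ID.
   (equal (my-mid-build a "[?#]" "news") (my-mid-build b "[?#]" "news"))
   ;; Truncating only at "#" keeps the query string, so the IDs differ.
   (equal (my-mid-build a "#" "news") (my-mid-build b "#" "news"))))
;; => (t nil)
```

The trade-off is the one mentioned earlier in the thread: keeping the query
string makes the tt_news articles distinguishable, but a feed that appends
session or tracking parameters would then produce several IDs for one
article.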

Thank you,
David
Index: ChangeLog
===================================================================
RCS file: /storage/cvsroot/emacs-w3m/shimbun/ChangeLog,v
retrieving revision 1.192
diff -u -r1.192 ChangeLog
--- ChangeLog	11 May 2009 10:53:03 -0000	1.192
+++ ChangeLog	13 May 2009 12:24:23 -0000
@@ -1,3 +1,8 @@
+2009-05-13  David Engster  <dengste@xxxxxx>
+
+	* sb-rss-hash.el (shimbun-rss-build-message-id): New override so that
+	URL is not stripped at question mark.
+
 2009-05-11  Katsumi Yamaoka  <yamaoka@xxxxxxx>
 
 	* sb-yahoo.el (shimbun-yahoo-content-end): Update.
Index: sb-rss-hash.el
===================================================================
RCS file: /storage/cvsroot/emacs-w3m/shimbun/sb-rss-hash.el,v
retrieving revision 1.5
diff -u -r1.5 sb-rss-hash.el
--- sb-rss-hash.el	5 Apr 2009 11:56:05 -0000	1.5
+++ sb-rss-hash.el	13 May 2009 12:24:23 -0000
@@ -144,6 +144,17 @@
   (content-hash-shimbun-article (luna-slot-value shimbun 'content)
 				shimbun header outbuf))
 
+(luna-define-method shimbun-rss-build-message-id ((shimbun
+						   shimbun-rss-hash)
+						  url &optional date)
+  (let* ((group (shimbun-current-group-internal shimbun)))
+    (when (string-match "#" url)
+      (setq url (substring url 0 (match-beginning 0))))
+    (when (stringp date)
+      (setq url (concat url date)))
+    (concat "<" (md5 (concat url)) "." group "@rss-blogs>")))
+
+
 (provide 'sb-rss-hash)
 
 ;;; sb-rss-hash.el ends here