Re: Proposal to change shimbun-rss-build-message-id

Katsumi Yamaoka <yamaoka@xxxxxxx> writes:
>>>>>> In [emacs-w3m : No.10873] David Engster wrote:
>> The function shimbun-rss-build-message-id, which should build a unique
>> message-id from an URL and optionally a date, fails for some of my RSS
>> groups. This is due to two things:
>> * Everything after a '?' in the URL is ignored, which is problematic
>>   since some CMS (e.g. Typo3) can generate RSS feeds with URLs like
>>   index.php?id=100&item=3124
>> * The optional date is not used at all.

> TSUCHIYA-san wrote in [emacs-w3m:10061] as follows:
> (cf. http://news.gmane.org/group/gmane.emacs.w3m/thread=7421)


> T> The point is how we treat things that follow "?".  As you know
> T> it's the CGI's query part, which often contains a session ID.
> T> If it's just a session ID, it will be as follows when having
> T> fetched the index page for the first time:
> T> <a href="0001?sid=0001">Article 1</a>
> T> But it will be the following when fetching the same index page
> T> for the second time:
> T> <a href="0001?sid=0002">Article 1</a>

Maybe we are talking about different things here.

I thought shimbun-rss-build-message-id takes URLs which are published
through RSS feeds. Those do not contain session IDs.

My problem is that I read RSS feeds through rss-hash, and these feeds
publish URLs like


and so on. This is a feed generated by tt_news, a widely used plugin for
Typo3. With the current default implementation of
shimbun-rss-build-message-id, the generated MIDs are not unique, since
these URLs differ only in the part after the '?'.
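
To illustrate the collision, here is a minimal sketch in Python. It is
not the actual shimbun code: the function build_message_id, the MD5
hashing, the host example.org and the second item number are all
assumptions made up for the example. Only the URL shape
index.php?id=100&item=3124 comes from the earlier mail.

```python
import hashlib
from urllib.parse import urlsplit


def build_message_id(url, strip_query=True):
    """Hypothetical stand-in for a message-id builder (not shimbun's)."""
    parts = urlsplit(url)
    key = parts.path
    # Keep the query part only when asked to; the default drops
    # everything after '?', which is the behaviour under discussion.
    if not strip_query and parts.query:
        key += "?" + parts.query
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return "<%s@%s>" % (digest, parts.hostname)


# Two distinct articles whose URLs differ only in the query part.
urls = [
    "http://example.org/index.php?id=100&item=3124",
    "http://example.org/index.php?id=100&item=3125",  # made-up second item
]

stripped = {build_message_id(u) for u in urls}
kept = {build_message_id(u, strip_query=False) for u in urls}

print(len(stripped))  # both articles collapse to one ID
print(len(kept))      # one ID per article
```

With the query stripped, both articles hash the same path and end up
with the same ID; with the query kept, each article gets its own.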

> So, I implemented the version of `shimbun-rss-build-message-id'
> that strips nothing in sb-mainichi.el, and in sb-nytimes.el
> afterward.

OK. Then I would suggest overriding shimbun-rss-build-message-id for
rss-hash (and probably also atom-hash) with the version that doesn't
strip everything after '?'.
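
The shape of such an override could look roughly like the following
Python sketch. The class names RssShimbun and RssHashShimbun, the
helper _make_id and the MD5 hashing are hypothetical and only mirror
the idea; the real change would be a method definition in Emacs Lisp
for the rss-hash (and atom-hash) classes.

```python
import hashlib
from urllib.parse import urlsplit


class RssShimbun:
    """Hypothetical base class: strips everything after '?'."""

    @staticmethod
    def _make_id(key, host):
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return "<%s@%s>" % (digest, host)

    def build_message_id(self, url, date=None):
        # The date argument is accepted but ignored in this sketch.
        parts = urlsplit(url)
        return self._make_id(parts.path, parts.hostname)


class RssHashShimbun(RssShimbun):
    """Hypothetical subclass: keeps the query part of the URL."""

    def build_message_id(self, url, date=None):
        parts = urlsplit(url)
        key = parts.path
        if parts.query:
            key += "?" + parts.query
        return self._make_id(key, parts.hostname)


# Assumed example URLs; example.org and item 3125 are made up.
a = "http://example.org/index.php?id=100&item=3124"
b = "http://example.org/index.php?id=100&item=3125"

base, hashed = RssShimbun(), RssHashShimbun()
print(base.build_message_id(a) == base.build_message_id(b))
print(hashed.build_message_id(a) == hashed.build_message_id(b))
```

Only the hash-based backends change behaviour this way; backends that
rely on the default, for sites where the query carries a session ID,
are left alone.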