Re: w3m download pdf causes emacs to run out of memory



In [emacs-w3m:13036] On Wed, 26 Sep 2018 21:48:26 -0400, Ryan wrote:
> going to this url 'https://libgen.pw/item/adv/5a1f047f3a044650f5fdb74a',
> and trying to download the pdf M-d on the 'Get' link
> 'https://libgen.pw/download/book/5a1f047f3a044650f5fdb74a' causes emacs
> to use an extreme amount of memory (~1.6G or 20% on htop) and prompt
> with a out of memory error

For the moment I have no idea how to make it work properly.  Sorry.
What is happening then is:

・emacs-w3m launches the w3m executable with the -dump_extra
  option, the URL, etc.[1]
・w3m issues a lot of progress messages[2] while downloading, just
  as it does when a user runs it by hand as a normal command.
・Emacs receives all those progress messages, followed by the pdf
  data, in a single working buffer.
・Those messages seem to be issued at a fixed interval, so the
  longer the download takes, the more memory they gobble up.

Possible solutions would be:

・Make emacs-w3m ignore those progress messages.  But how do we
  do that?  Wouldn't it slow emacs-w3m down?
・Add a new command line option to w3m that makes it silent.

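As a rough illustration of the first item, here is a minimal sketch of
a streaming filter that throws away the progress lines as each chunk
arrives, so that they never pile up in the buffer.  It is written in
Python only to keep it short and self-contained; it is not emacs-w3m
code.  The names ProgressFilter, feed and flush are hypothetical, and
it assumes that every progress message sits on its own newline-
terminated line beginning with "W3m-in-progress:", which I have not
verified.

```python
PREFIX = b"W3m-in-progress:"  # marker taken from footnote [2]


class ProgressFilter:
    """Drop progress lines from a byte stream delivered in chunks.

    Assumption: each progress message is a whole line that starts
    with PREFIX and ends with a newline.
    """

    def __init__(self):
        self._pending = b""      # start of a line that may still be a message
        self._mid_line = False   # True while inside a line known to be data

    def feed(self, chunk):
        """Return the part of CHUNK (plus held bytes) that is real data."""
        out = bytearray()
        data = self._pending + chunk
        self._pending = b""
        pos = 0
        while pos < len(data):
            nl = data.find(b"\n", pos)
            end = len(data) if nl == -1 else nl + 1
            piece = data[pos:end]
            if self._mid_line:
                # Rest of a data line: pass it through untouched.
                out += piece
                self._mid_line = nl == -1
            elif nl != -1:
                # A complete line: keep it unless it is a progress message.
                if not piece.startswith(PREFIX):
                    out += piece
            elif PREFIX.startswith(piece) or piece.startswith(PREFIX):
                # Incomplete line that could still be a message: hold it.
                self._pending = piece
            else:
                # Incomplete line that cannot be a message: emit it now.
                out += piece
                self._mid_line = True
            pos = end
        return bytes(out)

    def flush(self):
        """Call at end of stream; return held bytes that are data."""
        tail, self._pending = self._pending, b""
        return b"" if tail.startswith(PREFIX) else tail
```

Only a partial line that might still turn out to be a progress message
is held back, so the extra memory stays within one short line however
long the download takes.  A data line that happens to begin with the
marker would be lost; the sketch ignores that case.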
Another item that should be improved is:

・Make emacs-w3m parse the Content-Disposition header and offer
  its `filename' parameter as the default file name when prompting
  the user where to save.  The url in question has this header:

Content-Disposition: attachment; filename="Mastering-Regular-Expressions.pdf"
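
To show what such parsing might look like, here is a minimal sketch in
Python using the standard email package (again, not emacs-w3m code).
The function name suggested_filename, the fallback to the last segment
of the url path, and the final "download" default are my own
assumptions.  Directory parts are stripped so that a hostile header
cannot point outside the download directory.

```python
import os
from email.message import Message
from urllib.parse import unquote, urlsplit


def suggested_filename(content_disposition, url):
    """Return a candidate file name for saving the response body.

    CONTENT_DISPOSITION is the raw header value, or None if the
    server did not send one.
    """
    msg = Message()
    if content_disposition:
        msg["Content-Disposition"] = content_disposition
    name = msg.get_filename()  # None when there is no filename parameter
    if name:
        # Keep only the last path component, whichever separator is used.
        name = os.path.basename(name.replace("\\", "/"))
    if not name:
        # Assumed fallback: the last segment of the url path.
        name = os.path.basename(unquote(urlsplit(url).path))
    return name or "download"
```

With the header above the function returns
"Mastering-Regular-Expressions.pdf"; without any header it falls back
to "5a1f047f3a044650f5fdb74a" for the url in question.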


[1] w3m -no-cookie -o follow_redirection=0 -o user_agent='Emacs-w3m/1.4.631 w3m/0.5.3+git20180520' -o accept_language=ja,en -dump_extra https://libgen.pw/download/book/5a1f047f3a044650f5fdb74a
[2] "W3m-in-progress: 0/5.79Mb" .. "W3m-in-progress: 5.79/5.79Mb"