
Another approach to improve w3m-download


I tried the following two download commands several times and
found that w3m is neither notably slower nor faster than wget.
The main factor affecting the speed seems to be how congested
the network is at the time.

$ time w3m -no-cookie -o follow_redirection=0 -o user_agent='Emacs-w3m/1.4.632 w3m/0.5.3+git20190105' -o accept_language=ja,en -dump_extra https://ftp.gnu.org/pub/gnu/emacs/emacs-26.1.tar.xz| sed '1,/^$/d'> emacs-26.1.tar.xz

$ time wget https://ftp.gnu.org/pub/gnu/emacs/emacs-26.1.tar.xz

However, as you all know, w3m-download (not Boruch Baum's wget
version) is very slow, especially when downloading a big file.
IIUC, the cause is that it takes time to fetch the data, which
w3m has already buffered, into an Emacs buffer little by little.
By contrast, visiting a file of similar size in a buffer is
fast.  While the data are still being read into the buffer, the
download has already completed in w3m, and the progress
indicator that w3m issues (because of the -dump_extra option)[1]
already says so, so the indicator is of no use at all.

One way to make it fast would be to download the data directly
to a file, not a buffer.  There would be neither a cache nor a
progress indicator, but who would mind?  Even if a download
fails, there should be nothing to do about it (unless it is due
to an emacs-w3m bug) other than retrying.  The attached patch
may still be somewhat incomplete, but it works.
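
As a rough illustration of the "directly to a file" idea, the
header-stripping step of the w3m command above can be isolated in a
small shell function.  This is only a sketch: the function name
strip_dump_extra_header is made up for illustration, and it assumes
that the -dump_extra header block always ends at the first empty
line, which is what the sed expression in the command above relies on.

```shell
# Hypothetical helper, not part of emacs-w3m: read `w3m -dump_extra'
# output on stdin and write only the body to stdout.
strip_dump_extra_header() {
    # Delete everything from line 1 through the first empty line,
    # i.e. the header block; later empty lines in the body are kept.
    sed '1,/^$/d'
}
```

It would be used in place of the inline sed call, e.g.
`w3m -dump_extra URL | strip_dump_extra_header > FILE', with URL and
FILE standing for whatever is being downloaded.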

[1] Though it is not the main cause of the w3m-download slowness,
`w3m -dump_extra' issues far too many progress indicator lines,
like this:

W3m-in-progress: 0/42.2Mb
[... 344,980 lines ...]
W3m-in-progress: 42.2/42.2Mb
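
To check how many progress lines a given download produces, something
like the following could be used.  It is a sketch under an assumption:
that the W3m-in-progress lines all appear in the header block before
the first empty line (which seems to be the case, since stripping that
block yields an intact file).  The function name count_progress_lines
is made up for illustration.

```shell
# Hypothetical helper: read `w3m -dump_extra' output on stdin and
# print the number of progress lines found in the header block.
count_progress_lines() {
    # sed stops reading at the first empty line (end of the header),
    # so the downloaded body is never scanned; grep -c then counts
    # the lines that begin with the progress marker.
    sed '/^$/q' | grep -c '^W3m-in-progress:'
}
```

Usage would be along the lines of
`w3m -dump_extra URL | count_progress_lines'.
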
--- w3m.el~	2019-03-01 01:33:04.939158800 +0000
+++ w3m.el	2019-03-22 08:12:39.291856300 +0000
@@ -6075,49 +6075,51 @@
       (message "A url is required")
       (sit-for 1)))
-  (unless filename
+  (if filename
+      (when (file-exists-p filename)
+	(if (file-directory-p filename)
+	    (error "File(%s) is a directory" filename)
+	  (delete-file filename)))
     (let ((basename (file-name-nondirectory (w3m-url-strip-query url))))
       (when (string-match "^[\t ]*$" basename)
 	(when (string-match "^[\t ]*$"
 			    (setq basename (file-name-nondirectory url)))
 	  (setq basename "index.html")))
-      (setq filename
-	    (w3m-read-file-name (format "Download %s to: " url)
-				w3m-default-save-directory basename))))
+      (while (not filename)
+	(setq filename
+	      (w3m-read-file-name (format "Download %s to: " url)
+				  w3m-default-save-directory basename))
+	(when (file-exists-p filename)
+	  (if (file-directory-p filename)
+	      (message "File(%s) is a directory" (prog1 filename
+						   (sit-for 1)
+						   (setq filename nil)))
+	    (if (y-or-n-p (format "File(%s) already exists. Overwrite? "
+				  filename))
+		(delete-file filename)
+	      (setq filename nil))
+	    (message nil))))))
   (if (and w3m-use-ange-ftp (string-match "\\`ftp://" url))
       (w3m-goto-ftp-url url filename)
     (lexical-let ((url url)
 		  (filename filename)
 		  (page-buffer (current-buffer)))
-      (w3m-process-do-with-temp-buffer
-	  (type (progn
-		  (w3m-clear-local-variables)
-		  (setq w3m-current-url url)
-		  (w3m-retrieve url t no-cache post-data nil handler)))
-	(if type
-	    (let ((buffer-file-coding-system 'binary)
-		  (coding-system-for-write 'binary)
-		  jka-compr-compression-info-list
-		  format-alist)
-	      (when (or (not (file-exists-p filename))
-			(prog1 (y-or-n-p
-				(format "File(%s) already exists. Overwrite? "
-					filename))
-			  (message nil)))
-		(write-region (point-min) (point-max) filename)
-		(w3m-touch-file filename (w3m-last-modified url))
-		t))
-	  (ding)
-	  (with-current-buffer page-buffer
-	    (message "Cannot retrieve URL: %s%s" url
-		     (cond ((and w3m-process-exit-status
-				 (not (equal w3m-process-exit-status 0)))
-			    (format " (exit status: %s)"
-				    w3m-process-exit-status))
-			   (w3m-http-status
-			    (format " (http status: %s)" w3m-http-status))
-			   (t ""))))
-	  nil)))))
+      (w3m-process-do
+	  (_ret (shell-command
+		 (mapconcat
+		  (lambda (x)
+		    (replace-regexp-in-string "\\([\t ]\\)" "\\\\\\1" x))
+		  `(,w3m-command ,@w3m-command-arguments
+				 ,@(w3m-w3m-expand-arguments
+				    w3m-dump-head-source-command-arguments)
+				 ,url
+				 "|" "sed" "'1,/^$/d'" ">" ,filename)
+		  " ")))
+	(w3m-touch-file filename (w3m-last-modified url))
+	(with-current-buffer page-buffer
+	  (message "File(%s) has been downloaded" filename)
+	  (sit-for 1)
+	  t)))))
 ;;; Retrieve data: