w3m-expand-url



This is Aota.

I have made a number of changes to w3m-expand-url.

First, when the base URI ends with the host name, the result was wrong,
as in
(w3m-expand-url "index.html" "http://example.com")
 => "http://example.com/home/hoge/.w3m/index.html"
so I fixed that.
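
A minimal sketch of the idea behind this fix, assuming the cause is that
file-name-directory returns nil when the base path is empty. The helper
name my-base-directory is hypothetical; the patch below inlines an
equivalent `or' form instead.

```elisp
(defun my-base-directory (base-path)
  "Return the directory part of BASE-PATH, or \"/\" if it has none.
Hypothetical helper, for illustration only."
  ;; Bind `file-name-handler-alist' to nil so that file-name-* functions
  ;; do not apply special file-name handlers to a URL path.
  (let (file-name-handler-alist)
    (or (file-name-directory base-path) "/")))

(my-base-directory "")          ; => "/"
(my-base-directory "/b/c/d;p")  ; => "/b/c/"
```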

Next, the way my patch in emacs-w3m:09333 was implemented appeared to be
causing another bug, so I fixed that as well.

While I was at it, I tested the results of w3m-expand-url against
sections 5.4.1 and 5.4.2 of RFC 3986. The following eight cases differed
from the RFC.

BaseURI = http://a/b/c/d;p?q
  url        RFC 3986 result     w3m-expand-url result
1 ?y         http://a/b/c/d;p?y  http://a/b/c/?y
2 /../g      http://a/g          http://a/../g
3 ../../../g http://a/g          http://a/../g
4 http:g     http:g              http://a/b/c/g
5 ./g/.      http://a/b/c/g/     http://a/b/c/g
6 /./g       http://a/g          http://a/./g
7 ..         http://a/b/         http://a/b
8 .          http://a/b/c/       http://a/b/c

I fixed 1 through 3, because they open a page other than the intended
one or produce URLs that Apache does not understand.
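
For reference, the dot-segment removal described in RFC 3986 section
5.2.4 can be sketched as below. This is only an illustration of what the
RFC expects for cases 2 and 3; it is not the code in the patch, and the
name my-remove-dot-segments is hypothetical. The input is assumed to be
the already merged path, e.g. "/b/c/../../../g" for case 3.

```elisp
(defun my-remove-dot-segments (path)
  "Return PATH with \".\" and \"..\" segments removed (RFC 3986, 5.2.4).
Hypothetical helper, for illustration only."
  (let ((in path)
        (out ""))
    (while (> (length in) 0)
      (cond
       ;; A: drop a leading "../" or "./".
       ((string-match "\\`\\.\\.?/" in)
        (setq in (substring in (match-end 0))))
       ;; B: a leading "/./", or a final "/.", collapses to "/".
       ((string-match "\\`/\\.\\(/\\|\\'\\)" in)
        (setq in (concat "/" (substring in (match-end 0)))))
       ;; C: a leading "/../", or a final "/..", collapses to "/" and
       ;; removes the last segment already moved to OUT.
       ((string-match "\\`/\\.\\.\\(/\\|\\'\\)" in)
        (setq in (concat "/" (substring in (match-end 0)))
              out (if (string-match "/?[^/]*\\'" out)
                      (substring out 0 (match-beginning 0))
                    out)))
       ;; D: a lone "." or ".." disappears.
       ((member in '("." ".."))
        (setq in ""))
       ;; E: move the first segment, with its leading "/", to OUT.
       (t
        (string-match "\\`/?[^/]*" in)
        (setq out (concat out (match-string 0 in))
              in (substring in (match-end 0))))))
    out))

(my-remove-dot-segments "/../g")            ; => "/g"
(my-remove-dot-segments "/b/c/../../../g")  ; => "/g"
```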

4 is, strictly speaking, wrong, but it seems harmless from a
compatibility standpoint, so I have left it alone.

For 5 through 8, I suspect that appending "/" when
(file-name-nondirectory (substring url 0 path-end)) is "." or ".."
would fix them, but the server can presumably handle these anyway, so I
have not fixed them.
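
The idea for 5 through 8 could look roughly like this. It is an untested
sketch, not part of the patch: my-dot-segment-with-slash is a
hypothetical name, and whether the result keeps its trailing "/" depends
on how the later path expansion treats it.

```elisp
(defun my-dot-segment-with-slash (path)
  "Append \"/\" to PATH when its last segment is \".\" or \"..\".
Hypothetical helper, for illustration only."
  ;; Bind `file-name-handler-alist' to nil so that file-name-* functions
  ;; do not apply special file-name handlers to a URL path.
  (let (file-name-handler-alist)
    (if (member (file-name-nondirectory path) '("." ".."))
        (concat path "/")
      path)))

(my-dot-segment-with-slash "./g/.")  ; => "./g/./"
(my-dot-segment-with-slash "..")     ; => "../"
(my-dot-segment-with-slash "g")      ; => "g"
```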

By the way, on a page whose URL is http://example.com/?q=foo, take this
form:

<form method="GET" action="">
<input type="hidden" value="bar" name="q">
<input type="submit" value="SUBMIT">
</form>

Submitting this form fetches http://example.com/?q=foo?q=bar. How should
this be handled? Overwriting the query would (probably) match the page
author's intent, but...
(Plain w3m appears to overwrite it.)
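
If overwriting turns out to be the right behavior, it could be sketched
as follows. This only illustrates the "replace the query" option; it is
not what the patch below does, and my-replace-query is a hypothetical
name. It assumes any fragment on the base URL should be dropped as well.

```elisp
(defun my-replace-query (url query)
  "Return URL with its query and fragment replaced by QUERY.
Hypothetical helper, for illustration only."
  (concat (if (string-match "[?#]" url)
              ;; Cut URL at the first "?" or "#".
              (substring url 0 (match-beginning 0))
            url)
          "?" query))

(my-replace-query "http://example.com/?q=foo" "q=bar")
;; => "http://example.com/?q=bar"
```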

Index: ChangeLog
===================================================================
RCS file: /storage/cvsroot/emacs-w3m/ChangeLog,v
retrieving revision 1.3043
diff -u -r1.3043 ChangeLog
--- ChangeLog	7 Sep 2007 01:12:17 -0000	1.3043
+++ ChangeLog	9 Sep 2007 15:47:37 -0000
@@ -1,3 +1,9 @@
+2007-09-09  Naohiro Aota  <nao.aota@xxxxxxxxx>
+
+	* w3m.el (w3m-expand-url): Use "/" as path when it of base-uri is not
+	defined; Clear query of base-uri when empty query exist; Changes to
+	follow RFC3986.
+
 2007-09-07  Katsumi Yamaoka  <yamaoka@xxxxxxx>
 
 	* w3m-ems.el (w3m-euc-japan-encoder, w3m-iso-latin-1-encoder): Use
Index: w3m.el
===================================================================
RCS file: /storage/cvsroot/emacs-w3m/w3m.el,v
retrieving revision 1.1302
diff -u -r1.1302 w3m.el
--- w3m.el	7 Sep 2007 01:12:16 -0000	1.1302
+++ w3m.el	9 Sep 2007 15:48:11 -0000
@@ -6013,10 +6013,6 @@
 	       (= (match-beginning 9) (length url)))
       (setq url (substring url 0 (match-beginning 8)))
       (w3m-string-match-url-components url))
-    (when (and (not (zerop (length url)))
-	       (eq ?? (aref url 0)))
-      (setq url (concat "./" url))
-      (w3m-string-match-url-components url))
     ;; Remove an empty query part.
     (when (and (match-beginning 6)
 	       (= (match-beginning 7) (or (match-beginning 8)
@@ -6024,7 +6020,9 @@
       (setq url (concat (substring url 0 (match-beginning 6))
 			(if (match-beginning 8)
 			    (substring url (match-beginning 8))
-			  "")))
+			  ""))
+	    base (progn (w3m-string-match-url-components base)
+			(substring base 0 (match-beginning 6))))
       (w3m-string-match-url-components url))
     (cond
      ((match-beginning 1)
@@ -6048,34 +6046,32 @@
       (w3m-string-match-url-components base)
       (concat (substring base 0 (match-end 1)) url))
      ((> (match-end 5) (match-beginning 5))
-      ;; URL has a hierarchical part.
-      (if (eq ?/ (aref url (match-beginning 5)))
-	  ;; Its first character is the slash "/". => The hierarchical
-	  ;; part of URL has an absolute spec.
-	  (progn
-	    (w3m-string-match-url-components base)
-	    (concat (substring base 0 (or (match-end 3) (match-end 1)))
-		    url))
-	;; The hierarchical part of URL has a relative spec.
-	(let ((path-end (match-end 5))
-	      ;; See the following thread about a problem related to
-	      ;; the use of file-name-* functions for url string:
-	      ;; http://news.gmane.org/group/gmane.emacs.w3m/thread=4210
-	      file-name-handler-alist)
-	  (w3m-string-match-url-components base)
-	  (concat
-	   (substring base 0 (match-beginning 5))
-	   (if (member (match-string 2 base) w3m-url-hierarchical-schemes)
-	       (w3m-expand-path-name
-		(substring url 0 path-end)
-		(file-name-directory (match-string 5 base)))
-	     (substring url 0 path-end))
-	   (substring url path-end)))))
+      (let ((path-end (match-end 5))
+	    expanded-path
+	    ;; See the following thread about a problem related to
+	    ;; the use of file-name-* functions for url string:
+	    ;; http://news.gmane.org/group/gmane.emacs.w3m/thread=4210
+	    file-name-handler-alist)
+	(w3m-string-match-url-components base)
+	(setq expanded-path
+	      (w3m-expand-path-name
+	       (substring url 0 path-end)
+	       (or (file-name-directory (match-string 5 base))
+		   "/")))
+	(save-match-data
+	  (when (string-match "^/\\.\\./?" expanded-path)
+	    (setq expanded-path 
+		  (concat "/" (substring expanded-path (match-end 0))))))
+	(concat
+	 (substring base 0 (match-beginning 5))
+	 (if (member (match-string 2 base) w3m-url-hierarchical-schemes)
+	     expanded-path
+	   (substring url 0 path-end))
+	 (substring url path-end))))
      ((match-beginning 6)
       ;; URL has a query part.
       (w3m-string-match-url-components base)
-      (concat (file-name-directory (substring base 0 (match-end 5)))
-	      url))
+      (concat (substring base 0 (match-end 5)) url))