w3m-expand-url
- From: Naohiro Aota <nao.aota@xxxxxxxxx>
- Date: Mon, 10 Sep 2007 01:42:44 +0900 (JST)
- X-ml-name: emacs-w3m
- X-mail-count: 09608
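This is Aota.

I have made a number of changes to w3m-expand-url.

First, when the base URI ends with the host name, as in
(w3m-expand-url "index.html" "http://example.com")
=> "http://example.com/home/hoge/.w3m/index.html"
the result is wrong, so I fixed that.

Next, the way I implemented my patch in emacs-w3m:09333 appeared to be
causing another bug, so I fixed that as well.

While I was at it, I tested the results of w3m-expand-url against sections
5.4.1 and 5.4.2 of RFC3986. The following 8 cases differed from the RFC.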

BaseURI = http://a/b/c/d;p?q

   url         Defined by RFC        Result of w3m-expand-url
1  ?y          http://a/b/c/d;p?y    http://a/b/c/?y
2  /../g       http://a/g            http://a/../g
3  ../../../g  http://a/g            http://a/../g
4  http:g      http:g                http://a/b/c/g
5  ./g/.       http://a/b/c/g/       http://a/b/c/g
6  /./g        http://a/g            http://a/./g
7  ..          http://a/b/           http://a/b
8  .           http://a/b/c/         http://a/b/c

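Cases 1 to 3 either open a page other than the intended one or produce a
URL that Apache does not understand, so I fixed them.

Case 4 is wrong strictly speaking, but for the sake of compatibility it does
not seem to cause any real problem, so I have left it alone.

Cases 5 to 8 could probably be fixed by appending "/" when
(file-name-nondirectory (substring url 0 path-end)) is "." or "..", but the
server side should be able to handle them anyway, so I have not fixed them.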
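
For reference, the behaviour the RFC expects in cases 2, 3 and 5 to 8 can be sketched as below. This is only a minimal illustration of the merge and dot-segment removal steps of RFC3986 (5.2.3 and 5.2.4), not the code in the patch; my-merge-paths and my-remove-dot-segments are hypothetical names that do not exist in emacs-w3m, and both assume they are handed only the path component of the URLs.

```elisp
;; Hypothetical helpers for illustration only; not part of emacs-w3m.

(defun my-merge-paths (base-path rel-path)
  "Merge REL-PATH onto BASE-PATH roughly as in RFC3986, 5.2.3.
An empty BASE-PATH (a base URI that ends with the host name) is
treated as \"/\"."
  (if (string-prefix-p "/" rel-path)
      rel-path
    (concat (if (string-match "\\`.*/" base-path)
                ;; Keep everything up to and including the last slash.
                (match-string 0 base-path)
              "/")
            rel-path)))

(defun my-remove-dot-segments (path)
  "Resolve \".\" and \"..\" segments in PATH as in RFC3986, 5.2.4.
PATH is assumed to be absolute, i.e. to start with a slash."
  (let ((stack nil)
        (trailing-slash nil))
    ;; The first element of the split is the empty string before the
    ;; leading slash, so skip it.
    (dolist (seg (cdr (split-string path "/")))
      (cond ((string= seg ".")
             (setq trailing-slash t))
            ((string= seg "..")
             ;; Popping an empty stack does nothing, so "/../g" cannot
             ;; climb above the root.
             (pop stack)
             (setq trailing-slash t))
            (t
             (push seg stack)
             (setq trailing-slash nil))))
    (concat "/"
            (mapconcat #'identity (reverse stack) "/")
            ;; A path that ended in "." or ".." names a directory.
            (if (and trailing-slash stack) "/" ""))))

(my-remove-dot-segments (my-merge-paths "/b/c/d;p" "../../../g"))
;; => "/g"
(my-remove-dot-segments (my-merge-paths "/b/c/d;p" "./g/."))
;; => "/b/c/g/"
(my-remove-dot-segments (my-merge-paths "/b/c/d;p" ".."))
;; => "/b/"
(my-remove-dot-segments (my-merge-paths "" "index.html"))
;; => "/index.html"
```

The trailing-slash flag is what distinguishes cases 5, 7 and 8 from an ordinary file name: the path ends in a dot segment, so the result keeps a final "/".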

By the way, on a page whose URL is http://example.com/?q=foo, submitting a
form like this:
<form method="GET" action="">
<input type="hidden" value="bar" name="q">
<input type="submit" value="SUBMIT">
</form>
makes emacs-w3m fetch http://example.com/?q=foo?q=bar. How should this be
handled? Overwriting the query would (probably) match the page author's
intent, but I am not sure.
(Plain w3m seems to overwrite it.)

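
The "overwrite" option could look roughly like the sketch below. It is only an illustration of that one choice, not a statement of what emacs-w3m does or should do; my-replace-query is a hypothetical name, and the query string is assumed to be already encoded. It mirrors case 1 in the table above, where a reference consisting only of a query keeps the base path and replaces the base query.

```elisp
;; Hypothetical helper for illustration only; not part of emacs-w3m.
(defun my-replace-query (base query)
  "Return BASE with its query and fragment replaced by QUERY.
QUERY is assumed to be an already encoded string such as \"q=bar\"."
  (concat (if (string-match "[?#]" base)
              ;; Drop the old query (and fragment), keep the rest.
              (substring base 0 (match-beginning 0))
            base)
          "?" query))

(my-replace-query "http://example.com/?q=foo" "q=bar")
;; => "http://example.com/?q=bar"
```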
Index: ChangeLog
===================================================================
RCS file: /storage/cvsroot/emacs-w3m/ChangeLog,v
retrieving revision 1.3043
diff -u -r1.3043 ChangeLog
--- ChangeLog 7 Sep 2007 01:12:17 -0000 1.3043
+++ ChangeLog 9 Sep 2007 15:47:37 -0000
@@ -1,3 +1,9 @@
+2007-09-09 Naohiro Aota <nao.aota@xxxxxxxxx>
+
+ * w3m.el (w3m-expand-url): Use "/" as path when it of base-uri is not
+ defined; Clear query of base-uri when empty query exist; Changes to
+ follow RFC3986.
+
2007-09-07 Katsumi Yamaoka <yamaoka@xxxxxxx>
* w3m-ems.el (w3m-euc-japan-encoder, w3m-iso-latin-1-encoder): Use
Index: w3m.el
===================================================================
RCS file: /storage/cvsroot/emacs-w3m/w3m.el,v
retrieving revision 1.1302
diff -u -r1.1302 w3m.el
--- w3m.el 7 Sep 2007 01:12:16 -0000 1.1302
+++ w3m.el 9 Sep 2007 15:48:11 -0000
@@ -6013,10 +6013,6 @@
(= (match-beginning 9) (length url)))
(setq url (substring url 0 (match-beginning 8)))
(w3m-string-match-url-components url))
- (when (and (not (zerop (length url)))
- (eq ?? (aref url 0)))
- (setq url (concat "./" url))
- (w3m-string-match-url-components url))
;; Remove an empty query part.
(when (and (match-beginning 6)
(= (match-beginning 7) (or (match-beginning 8)
@@ -6024,7 +6020,9 @@
(setq url (concat (substring url 0 (match-beginning 6))
(if (match-beginning 8)
(substring url (match-beginning 8))
- "")))
+ ""))
+ base (progn (w3m-string-match-url-components base)
+ (substring base 0 (match-beginning 6))))
(w3m-string-match-url-components url))
(cond
((match-beginning 1)
@@ -6048,34 +6046,32 @@
(w3m-string-match-url-components base)
(concat (substring base 0 (match-end 1)) url))
((> (match-end 5) (match-beginning 5))
- ;; URL has a hierarchical part.
- (if (eq ?/ (aref url (match-beginning 5)))
- ;; Its first character is the slash "/". => The hierarchical
- ;; part of URL has an absolute spec.
- (progn
- (w3m-string-match-url-components base)
- (concat (substring base 0 (or (match-end 3) (match-end 1)))
- url))
- ;; The hierarchical part of URL has a relative spec.
- (let ((path-end (match-end 5))
- ;; See the following thread about a problem related to
- ;; the use of file-name-* functions for url string:
- ;; http://news.gmane.org/group/gmane.emacs.w3m/thread=4210
- file-name-handler-alist)
- (w3m-string-match-url-components base)
- (concat
- (substring base 0 (match-beginning 5))
- (if (member (match-string 2 base) w3m-url-hierarchical-schemes)
- (w3m-expand-path-name
- (substring url 0 path-end)
- (file-name-directory (match-string 5 base)))
- (substring url 0 path-end))
- (substring url path-end)))))
+ (let ((path-end (match-end 5))
+ expanded-path
+ ;; See the following thread about a problem related to
+ ;; the use of file-name-* functions for url string:
+ ;; http://news.gmane.org/group/gmane.emacs.w3m/thread=4210
+ file-name-handler-alist)
+ (w3m-string-match-url-components base)
+ (setq expanded-path
+ (w3m-expand-path-name
+ (substring url 0 path-end)
+ (or (file-name-directory (match-string 5 base))
+ "/")))
+ (save-match-data
+ (when (string-match "^/\\.\\./?" expanded-path)
+ (setq expanded-path
+ (concat "/" (substring expanded-path (match-end 0))))))
+ (concat
+ (substring base 0 (match-beginning 5))
+ (if (member (match-string 2 base) w3m-url-hierarchical-schemes)
+ expanded-path
+ (substring url 0 path-end))
+ (substring url path-end))))
((match-beginning 6)
;; URL has a query part.
(w3m-string-match-url-components base)
- (concat (file-name-directory (substring base 0 (match-end 5)))
- url))
+ (concat (substring base 0 (match-end 5)) url))