regex - Convert a relative URL to an absolute URL

Question

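Input:

Base URL: www.example.com/1/2/index.php
Relative URL: ../../index.php

Output:

Absolute URL: www.example.com/index.php

It would be perfect if this could be done with sed.

As I understand it, the regex should remove one somefolder/ from the URL for each ../ in the relative URL.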

score 1 · Accepted Answer

realpath is a quick but slightly hacky way to do what you want.
(Actually, I'm surprised that it doesn't deal properly with URLs; it treats them as plain old filesystem paths.)
~$ realpath -m http://www.example.com/1/2/../../index.php
/home/username/http:/www.example.com/index.php
The -m (for "missing") says to resolve the path even if components of it don't actually exist on the filesystem.
So you'll still have to strip off the actual filesystem part of that (which will just be $(pwd)). Note also that the slash-slash after the protocol was canonicalized to a single slash, so you might be better off leaving the "http://" off your input and prepending it to your output instead.
See man 1 realpath for the full story. Or info coreutils 'realpath invocation' for a more verbose full story, if you have the info system installed.
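The advice above (leave the protocol off, resolve, then put it back) can be sketched as a small bash function. This is only a sketch under assumptions: resolve_url is a made-up name, GNU realpath is assumed to be installed, and query strings or fragments are not handled. Resolving against "/" instead of the current directory means there is no $(pwd) prefix to strip afterwards.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer): resolve ".." in a URL via realpath -m.
resolve_url() {
  local url=$1 scheme="" resolved
  # Split off a scheme such as "http://" so realpath cannot collapse its "//".
  if [[ $url == *://* ]]; then
    scheme="${url%%://*}://"
    url="${url#*://}"
  fi
  # Prefix "/" so the path is absolute and nothing from the
  # current directory ends up in the result.
  resolved=$(realpath -m -- "/$url")
  # Drop the leading "/" again and put the scheme back.
  printf '%s%s\n' "$scheme" "${resolved#/}"
}

resolve_url 'http://www.example.com/1/2/../../index.php'
```

The same function also accepts input without a protocol, such as www.example.com/1/2/../../index.php.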

score 1 · Accepted Answer

Using sed inside bash:

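#!/bin/bash

base_url='www.example.com/1/2/index.php'
rel_url='../../index.php'

# Join both URLs, using ";" as a temporary separator.
str="${base_url};${rel_url}"
# Drop the file name of the base URL (everything between the last "/" and ";").
str=$(echo "$str" | sed -E 's#/[^/]*;#/#')
# Remove one "folder/../" pair per pass until none is left.
# [^/]+ matches a whole path segment, including names containing "-" or ".".
while echo "$str" | grep -q '\.\./'
do
  new=$(echo "$str" | sed -E 's#[^/]+/\.\./##')
  [ "$new" = "$str" ] && break   # nothing left to collapse, avoid an endless loop
  str=$new
done
abs_url=$str

echo "$abs_url"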

Output:

www.example.com/index.php

score 1 · Accepted Answer

If your only requirement is that .. means "go up one level", then here is a possible solution. It doesn't use regular expressions, sed, or a JVM ;)

#!/bin/bash

domain="www.example.com"
origin="1/2/3/4/index.php"
rel="../../index.php"

awk -v rel="$rel" -v origin="$origin" -v file="$(basename "$rel")" -v dom="$domain" '
BEGIN {
    # Count the ".." segments in the relative URL.
    n = split(rel, a, "/")
    for (i = 1; i <= n; ++i) {
        if (a[i] == "..") ++c
    }
    # Keep the directories of origin, minus one per "..".
    # (b[m] is the file name, so the loop stops before m - c.)
    abs = dom
    m = split(origin, b, "/")
    for (i = 1; i < m - c; ++i) {
        abs = abs "/" b[i]
    }
    print abs "/" file
}'

Another approach that avoids awk, thanks to Edward for mentioning realpath -m:

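#!/bin/bash

rel="../../index.php"
origin="www.example.com/1/2/index.php"

# Directory part of the base URL, e.g. www.example.com/1/2
directory=$(dirname "$origin")
# -m resolves ".." even though the path does not exist on disk.
fullpath=$(realpath -m "$directory/$rel")
# Strip the current directory that realpath prepended. pwd -P gives the
# physical path, which matches what realpath returns when symlinks are involved.
echo "${fullpath#"$(pwd -P)"/}"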

score -3 · Accepted Answer

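You can't use a single regular expression for this, because regular expressions can't count.

You should use a real programming language instead. Even Java can do this easily.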
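For comparison, the counting can be done without any regex by walking the path segments and keeping them on a stack, which is roughly what a URL library does internally. The following is a minimal bash sketch, not a full resolver: join_url is a made-up name, and it assumes the base URL has no protocol prefix (as in the question), so the first segment is the host and is never removed.

```shell
#!/bin/bash
# Hypothetical helper: join a base URL and a relative URL segment by segment.
join_url() {
  local base=$1 rel=$2 seg
  local -a stack parts
  # Everything before the last "/" of the base is the starting directory.
  IFS=/ read -r -a stack <<< "${base%/*}"
  IFS=/ read -r -a parts <<< "$rel"
  for seg in "${parts[@]}"; do
    case $seg in
      ''|.) ;;                       # empty or "." segment: stay where we are
      ..)
        # Go up one level, but never pop the host (element 0).
        if (( ${#stack[@]} > 1 )); then
          unset 'stack[${#stack[@]}-1]'
        fi ;;
      *) stack+=("$seg") ;;          # ordinary segment: descend into it
    esac
  done
  # Join the remaining segments with "/".
  local IFS=/
  printf '%s\n' "${stack[*]}"
}

join_url 'www.example.com/1/2/index.php' '../../index.php'
```

Unlike the awk version above, this keeps directories that follow the ../ segments in the relative URL, for example ../a/b.php.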
