Input:
- Base URL:
www.example.com/1/2/index.php
- Relative URL:
../../index.php
Output:
- Absolute URL:
www.example.com/index.php
Ideally this would be done with sed.
As far as I understand, the regular expression should remove one somefolder/ from the URL for each ../ in it.
realpath is a quick but slightly hacky way to do what you want.
(Actually, I'm surprised that it doesn't deal properly with URLs; it treats them as plain old filesystem paths.)
~$ realpath -m http://www.example.com/1/2/../../index.php
=>
/home/username/http:/www.example.com/index.php
The -m (for "missing") says to resolve the path even if components of it don't actually exist on the filesystem.
So you'll still have to strip off the actual filesystem part of that (which will just be $(pwd)). And note that the slash-slash for the protocol was also canonicalized to a single slash. So you might be better off leaving the "http://" off of your input and just prepending it to your output instead.
See man 1 realpath for the full story. Or info coreutils 'realpath invocation' for a more verbose full story, if you have the info system installed.
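The two caveats above (the filesystem prefix and the collapsed "//") can be worked around roughly as follows. This is only a sketch: resolve_with_realpath is a made-up helper name, and it assumes GNU coreutils realpath, whose --relative-to option prints the result relative to a directory so that the $(pwd) prefix never has to be stripped by hand.

```shell
#!/bin/bash
# Sketch: resolve a relative URL with GNU realpath while keeping the scheme.
# resolve_with_realpath is a hypothetical helper name, not part of coreutils.
resolve_with_realpath() {
    local base=$1 rel=$2 scheme="" path
    # Split off a leading "scheme://" so realpath cannot collapse the "//".
    case $base in
        *://*) scheme="${base%%://*}://"; base="${base#*://}" ;;
    esac
    # -m: do not require the path components to exist on disk.
    # --relative-to=.: print the result relative to the current directory.
    path=$(realpath -m --relative-to=. -- "$(dirname -- "$base")/$rel") || return 1
    # Put the scheme back in front of the canonicalized path.
    printf '%s%s\n' "$scheme" "$path"
}

resolve_with_realpath 'http://www.example.com/1/2/index.php' '../../index.php'
# prints http://www.example.com/index.php
```

Keep in mind that this still treats the URL as a filesystem path, so a symlink in the current directory that happens to match a URL component would be followed.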
Using sed inside bash:
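#!/bin/bash
base_url='www.example.com/1/2/index.php'
rel_url='../../index.php'
# Join the two with ';' as a marker, then drop the base file name.
str="${base_url};${rel_url}"
str=$(printf '%s\n' "$str" | sed -E 's#/[^/]*;#/#')
# Remove one "dir/../" pair per pass until no such pair is left.
# [^/]+ (rather than \w+) so names like my-dir or v1.2 are removed whole.
while printf '%s\n' "$str" | grep -q '[^/]/\.\./'
do
str=$(printf '%s\n' "$str" | sed -E 's#[^/]+/\.\./##')
done
abs_url=$str
echo "$abs_url"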
Output:
www.example.com/index.php
If your only requirement is that .. means "go up one level", here is a possible solution. It uses no regular expressions, no sed, and no JVM ;)
#!/bin/bash
domain="www.example.com"
origin="1/2/3/4/index.php"
rel="../../index.php"
awk -v rel="$rel" -v origin="$origin" -v file="$(basename "$rel")" -v dom="$domain" '
BEGIN {
n = split(rel, a, "/")
for(i = 1; i <= n; ++i) {
if(a[i] == "..") ++c
}
abs = dom
m=split(origin, b, "/")
for(i = 1; i < m - c; ++i) {
abs=abs"/"b[i]
}
print abs"/"file
}'
An alternative to the awk approach, thanks to Edward for mentioning realpath -m:
#!/bin/bash
rel="../../index.php"
origin="www.example.com/1/2/index.php"
directory=$(dirname "$origin")
fullpath=$(realpath -m "$directory/$rel")
echo "${fullpath#"$(pwd)"/}"
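You can't use a single regular expression for this, because regular expressions can't count.
You should use a real programming language instead. Even Java can do this easily.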
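To illustrate the "counting" point without leaving the shell: the bookkeeping a regex cannot do is easy with a loop and a stack of path segments. The following is a minimal sketch, not a full URL resolver. resolve_url is a hypothetical name, and it assumes a base URL of the form host/path with no scheme, query string, or fragment, as in the question.

```shell
#!/bin/bash
# Sketch: resolve "." and ".." by walking the segments and keeping a stack.
# resolve_url is a hypothetical helper; base is assumed to be "host/path".
resolve_url() {
    local base=$1 rel=$2 host path="" dir="" seg out
    local -a parts=() stack=()
    host=${base%%/*}                          # e.g. "www.example.com"
    [[ $base == */* ]] && path=${base#*/}     # e.g. "1/2/index.php"
    [[ $path == */* ]] && dir=${path%/*}      # e.g. "1/2" (file name dropped)
    [[ $rel == /* ]] && dir=""                # a leading "/" restarts at the host

    IFS=/ read -r -a parts <<< "$dir/$rel"
    for seg in "${parts[@]}"; do
        case $seg in
            ''|.) ;;                          # skip empty and "." segments
            ..)                               # go up one level, never above the host
                if (( ${#stack[@]} > 0 )); then
                    unset 'stack[${#stack[@]}-1]'
                fi ;;
            *) stack+=("$seg") ;;             # ordinary segment: push it
        esac
    done

    out=$host
    for seg in "${stack[@]}"; do out+="/$seg"; done
    printf '%s\n' "$out"
}

resolve_url 'www.example.com/1/2/index.php' '../../index.php'
# prints www.example.com/index.php
```

Extra ../ segments beyond the top level are silently ignored here, which is a design choice for this sketch rather than something the question specifies.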