php - PHP: Comparing URIs that differ in percent-encoding

Question

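In PHP, I want to compare two relative URLs for equality. The catch: the URLs may differ in their percent-encoding, e.g.

/dir/file+file vs. /dir/file%20file
/dir/file(file) vs. /dir/file%28file%29
/dir/file%5bfile vs. /dir/file%5Bfile

According to RFC 3986, servers should treat these URIs identically. But if I compare them with ==, I end up with a mismatch.

So I'm looking for a PHP function that accepts two strings and returns TRUE if they represent the same URI (treating encoded and decoded variants of the same character, uppercase and lowercase hex digits, and + versus %20 for spaces as equal), and FALSE if they differ.

I know in advance that these strings contain only ASCII characters, no Unicode.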

score 4 · Accepted Answer

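function uriMatches($uri1, $uri2)
{
    // urldecode() decodes %XX sequences (either hex case) and turns "+" into a space.
    // Strict comparison avoids PHP's numeric-string juggling with ==.
    return urldecode($uri1) === urldecode($uri2);
}

var_dump(uriMatches('/dir/file+file', '/dir/file%20file'));      // bool(true)
var_dump(uriMatches('/dir/file(file)', '/dir/file%28file%29'));  // bool(true)
var_dump(uriMatches('/dir/file%5bfile', '/dir/file%5Bfile'));    // bool(true)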

See urldecode().
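A small sketch of why urldecode() fits the question's requirements, compared with its sibling rawurldecode(). Both are standard PHP functions; the variable names are only for illustration.

```php
<?php
// urldecode() follows form-encoding rules: "+" becomes a space.
// rawurldecode() only decodes %XX sequences and leaves "+" alone.
$a = '/dir/file+file';
$b = '/dir/file%20file';

var_dump(urldecode($a) === urldecode($b));       // bool(true)
var_dump(rawurldecode($a) === rawurldecode($b)); // bool(false)

// The case of the hex digits does not matter to either function.
var_dump(rawurldecode('/dir/file%5bfile') === rawurldecode('/dir/file%5Bfile')); // bool(true)
```

If "+" should not count as a space, swapping in rawurldecode() would be the likely adjustment.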

score 0 · Accepted Answer

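Edit: See @webbiedave's answer. His is much better (I didn't even know PHP had a function for this... you learn something new every day).

You would have to parse the strings for anything matching %## to find the percent-encoded sequences. Then take the two hex digits from each match, convert them to a number, and pass that to chr() to get the character the sequence stands for. Rebuild the strings, and then you should be able to compare them.

I'm not sure this is the most efficient approach, but since URLs are usually not that long, it shouldn't hurt performance much.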
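The manual approach described here could be sketched roughly as follows. The helper name manualPercentDecode is hypothetical, and the sketch ignores the "+" case from the question; in practice the built-in rawurldecode() does the same job.

```php
<?php
// Hypothetical helper: decode %XX sequences by hand using chr().
function manualPercentDecode(string $s): string
{
    return preg_replace_callback(
        '/%([0-9A-Fa-f]{2})/',           // a percent sign followed by two hex digits
        function (array $m): string {
            return chr(hexdec($m[1]));   // hex digits -> byte value -> character
        },
        $s
    );
}

var_dump(manualPercentDecode('/dir/file%28file%29') === '/dir/file(file)'); // bool(true)
var_dump(manualPercentDecode('/dir/file%5bfile') === manualPercentDecode('/dir/file%5Bfile')); // bool(true)
```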

score 0 · Accepted Answer

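I know this question seems to have been settled by webbiedave's answer, but I have a few issues of my own with it.

First issue: the hex digits in an encoded character are case-insensitive. So %C3 and %c3 are exactly the same character, even though the URIs differ as strings. Both URIs point to the same location.

Second issue: folder%20(2) and folder%20%282%29 are both valid URL-encoded URIs that point to the same location, even though they are different strings.

Third issue: if I simply decode the whole string, two different locations end up with the same URI, e.g. bla%2Fblubb and bla/blubb.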
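The third issue can be seen in a few lines. This is only an illustration; the variable names are made up.

```php
<?php
// %2F is an encoded slash inside ONE path segment; "/" separates TWO segments.
$encodedSlash = 'bla%2Fblubb';
$realSlash    = 'bla/blubb';

// Decoding the whole string erases the distinction.
var_dump(urldecode($encodedSlash) === urldecode($realSlash)); // bool(true)

// Splitting on "/" before decoding preserves it.
var_dump(count(explode('/', $encodedSlash))); // int(1)
var_dump(count(explode('/', $realSlash)));    // int(2)
```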

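So what to do? To compare two URIs, I need to normalize them: split each one into its components, urldecode every path segment and query part exactly once, rawurlencode them again, and glue everything back together. Then I can compare the results.

This could be a function to normalize a URI: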

function normalizeURI($uri) {
    $components = parse_url($uri);
    if ($components === false) {
        return $uri; // parse_url() gave up on a seriously malformed URI; compare it as-is
    }
    $normalized = "";
    // parse_url() omits missing components, so test with !empty() to avoid undefined-index warnings
    if (!empty($components['scheme'])) {
        $normalized .= $components['scheme'] . ":";
    }
    if (!empty($components['host'])) {
        $normalized .= "//";
        if (!empty($components['user'])) { // user info is rare in URIs, but handle it anyway
            $normalized .= rawurlencode(urldecode($components['user']));
            if (!empty($components['pass'])) {
                $normalized .= ":".rawurlencode(urldecode($components['pass']));
            }
            $normalized .= "@";
        }
        $normalized .= $components['host'];
        if (!empty($components['port'])) {
            $normalized .= ":".$components['port'];
        }
    }
    if (!empty($components['path'])) {
        // Split on "/" BEFORE decoding so an encoded slash (%2F) stays distinct from a separator.
        // The path already carries its own leading "/", so no extra separator is added here.
        $path = explode("/", $components['path']);
        $path = array_map("urldecode", $path);
        $path = array_map("rawurlencode", $path);
        $normalized .= implode("/", $path);
    }
    if (!empty($components['query'])) {
        // Normalize each key and value separately so "&" and "=" keep their meaning
        $query = explode("&", $components['query']);
        foreach ($query as $i => $c) {
            $c = explode("=", $c);
            $c = array_map("urldecode", $c);
            $c = array_map("rawurlencode", $c);
            $c = implode("=", $c);
            $query[$i] = $c;
        }
        $normalized .= "?".implode("&", $query);
    }
    return $normalized; // note: any fragment (#...) is dropped
}

Now you can change webbiedave's function to:

function uriMatches($uri1, $uri2) {
    return normalizeURI($uri1) === normalizeURI($uri2);
}

That should do it. And yes, it turned out far more complicated than I would have liked.
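For the path-only relative URLs in the question, the same idea can be sketched more compactly. The helper name normalizeRelativePath is hypothetical, and the sketch assumes the input has no scheme, host, or query string (and PHP 7.4+ for the arrow function).

```php
<?php
// Hypothetical helper for path-only relative URLs (no scheme, host, or query).
function normalizeRelativePath(string $path): string
{
    $segments = explode('/', $path);
    // Decode each segment once (urldecode() also maps "+" to a space),
    // then re-encode so every segment ends up in one canonical form.
    $segments = array_map(
        fn(string $s): string => rawurlencode(urldecode($s)),
        $segments
    );
    return implode('/', $segments);
}

var_dump(normalizeRelativePath('/dir/file+file') === normalizeRelativePath('/dir/file%20file'));     // bool(true)
var_dump(normalizeRelativePath('/dir/file(file)') === normalizeRelativePath('/dir/file%28file%29')); // bool(true)
var_dump(normalizeRelativePath('bla%2Fblubb') === normalizeRelativePath('bla/blubb'));               // bool(false)
```

Unlike a plain urldecode() comparison, this keeps an encoded slash distinct from a path separator.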
