
I have the following .htaccess rules. I need to add some rules to this block, and I don't want to lose my existing ones.

<FilesMatch "\.(htaccess|htpasswd|ini|phps|fla|psd|log|sh)$">
Order allow,Deny
Deny from all
</FilesMatch>

<IfModule mod_rewrite.c>
    RewriteEngine On

    RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
    RewriteRule ^(.*)$ http://%1/$1 [R=301,L]

    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.*)$ index.php [QSA,L]
</IfModule>

The rules I want look like this:

- if HTTP_USER_AGENT includes BotOne
- or HTTP_USER_AGENT includes OtherBot
- or HTTP_COOKIE user_id != 1

    - if REQUEST_URI is "/" main directory
    - or REQUEST_FILENAME includes "utm_source"
    - or REQUEST_FILENAME includes "utm_medium"
    - or REQUEST_FILENAME includes "utm_campaign" and "utm_content"

        - if REQUEST_FILENAME doesn't include "/blog/"
        - or REQUEST_FILENAME doesn't include "gif"
        - or REQUEST_FILENAME doesn't include "jpg"

            - then RewriteRule all files to index.html

I tried the following, but it didn't help. How should I write these rules?

<IfModule mod_rewrite.c>
    RewriteEngine On

    RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
    RewriteRule ^(.*)$ http://%1/$1 [R=301,L]

    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.*)$ index.php [QSA,L]

    RewriteCond %{HTTP_USER_AGENT} "BotOne|OtherBot" [NC,OR]
    RewriteCond %{HTTP_COOKIE} !^.*user_id=1   [NC]
    #
    RewriteCond %{REQUEST_URI} \/  [NC,OR]
    RewriteCond %{REQUEST_FILENAME} ^utm_source.*  [NC,OR]
    RewriteCond %{REQUEST_FILENAME} ^utm_medium.*  [NC,OR]
    RewriteCond %{REQUEST_FILENAME} ^utm_campaign.*  [NC,OR]
    RewriteCond %{REQUEST_FILENAME} ^utm_content.*  [NC]
    #
    RewriteCond %{REQUEST_FILENAME} !\/blog\/.*  [NC,OR]
    RewriteCond %{REQUEST_FILENAME} !gif.*  [NC,OR]
    RewriteCond %{REQUEST_FILENAME} !jpg.*  [NC]
    RewriteRule ^.*? index.html [R=301,L]
</IfModule>

The main URLs I want to redirect look like this:
* http://example.com => http://example.com/index.html
* http://example.com/ => http://example.com/index.html
* http://example.com/?utm_source=michael => http://example.com/index.html
* http://example.com/?utm_medium=twitter => http://example.com/index.html
* http://example.com/?utm_campaign=camp2&utm_content=somewhere => http://example.com/index.html
* http://example.com/blog/* => no redirect
* http://example.com/myfile.jpg => no redirect
* http://example.com/myfile.gif => no redirect

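The redirect should be triggered if (the user agent is "BotOne") or (the user agent is "OtherBot") or (the visitor's user_id cookie is not 1).

Any query parameters should be removed.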

1 Answer

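Because of the way rules are processed in .htaccess, there is simply no way to express this with nested constructs or parentheses, as you would in a programming language. I had a similar problem in the past and found it very hard to get a complete answer, so when I finally worked it out I wrote it down so I could find it again later. Here is the note I wrote for myself: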

## After quite a bit of puzzlement and seemingly maddeningly
##  vague documentation, I finally figured out exactly how mod_rewrite's
##  [OR] flag really works: In mod_rewrite there's not really any
##  "precedence"; RewriteCond's are simply processed sequentially.
##  Without any modification, the default is to AND _everything_.
##  Including the [OR] modifier on some RewriteCond's creates a
##  two-level expression with only ANDs at the outer/upper level and
##  only ORs at the inner/lower level. Thus
##  RewriteCond a [OR]
##  RewriteCond b
##  RewriteCond c [OR]
##  RewriteCond d
##  RewriteCond e [OR]
##  RewriteCond f [OR]
##  RewriteCond g
##  is equivalent to the boolean expression
##  ((a OR b) AND (c OR d) AND (e OR f OR g))
## There's _no_ way to have ANDs at the _lower/inner_ level and ORs
##  at the _upper/outer_ level; such constructs can only be implemented with
##  either multiple rulesets (and unavoidable duplication), or the
##  introduction of intermediate environment variables.
## Thus the only advantages of [OR] over a | in an RE are increased
##  clarity/maintainability, and the possibility of checking against
##  unrelated variables. REs with lots of |, on the other hand, are
##  assumed to be much faster.
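The "intermediate environment variables" approach mentioned in the note can be sketched roughly as follows. This is only an illustration under assumptions: the variable name WANT_LANDING is made up, and the four conditions are placeholders standing in for A, B, C and D in an expression of the form (A AND B) OR (C AND D).

```apache
RewriteEngine On

# Ruleset 1: set a flag when A AND B both hold.
# "-" means "do not change the URL"; the rule only sets the variable.
RewriteCond %{HTTP_USER_AGENT} BotOne
RewriteCond %{QUERY_STRING} utm_source
RewriteRule ^ - [E=WANT_LANDING:1]

# Ruleset 2: set the same flag when C AND D both hold.
RewriteCond %{HTTP_USER_AGENT} OtherBot
RewriteCond %{QUERY_STRING} utm_medium
RewriteRule ^ - [E=WANT_LANDING:1]

# Final rule: act if either ruleset set the flag, i.e. (A AND B) OR (C AND D).
RewriteCond %{ENV:WANT_LANDING} =1
RewriteRule ^ index.html [L]
```

Each flag-setting ruleset contributes one ANDed group, and the final rule fires if any of them matched, which gives the OR at the outer level that plain [OR] flags cannot express.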

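If I understand your requirements correctly, the whole thing can be treated as one big condition in which the blocks are joined by AND rather than by subordinate "if" clauses, like this: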

IF

((- HTTP_USER_AGENT includes BotOne
- or HTTP_USER_AGENT includes OtherBot
- or HTTP_COOKIE user_id != 1)
AND
(- REQUEST_URI is "/" main directory
- or REQUEST_FILENAME includes "utm_source"
- or REQUEST_FILENAME includes "utm_medium"
- or REQUEST_FILENAME includes "utm_campaign" and "utm_content")
AND
(- REQUEST_FILENAME doesn't include "/blog/"
- or REQUEST_FILENAME doesn't include "gif"
- or REQUEST_FILENAME doesn't include "jpg"))

THEN

- RewriteRule all files to index.html

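The biggest complication I see is the rule about "utm_campaign" and "utm_content", because as far as I know regular expressions (even sophisticated Perl-style ones like those in .htaccess) don't handle unspecified order well at all. If you know the strings will in fact always appear in the same order, you can write an RE like "utm_campaign.*utm_content". If the order really is unspecified, then to meet your spec exactly you will need two RewriteConds, one for each possible order, like this: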

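# A RewriteCond needs a TestString before the pattern. %{QUERY_STRING} is
# assumed here because the utm_* values in the question's URLs are query parameters.
RewriteCond %{QUERY_STRING} utm_campaign.*utm_content [OR]
RewriteCond %{QUERY_STRING} utm_content.*utm_campaign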

It also looks to me as though some of your REs don't say quite the same thing as your pseudo-rules. For example:

REQUEST_FILENAME includes "utm_source"

should become

RewriteCond %{REQUEST_FILENAME} utm_source

because

RewriteCond %{REQUEST_FILENAME} ^utm_source

actually implements

REQUEST_FILENAME **startswith** utm_source

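Also, I allow for odd browsers that don't send the root slash at all, as shown below. (Note too that '/' has no separate upper-case and lower-case versions, so [NC] just costs you a slight performance hit for no good reason.) You need both the start-of-string ('^') and end-of-string ('$') anchors; otherwise you will also match things like "/xxx/yyy/zzz", since they contain a slash:

RewriteCond %{REQUEST_URI} ^/?$ [OR]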

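Finally, match only the part of the string you care about; there is no need to match the rest of it (in fact, trying to match the rest often leads to odd, unnecessary bugs). In other words, a leading or trailing ".*" in an .htaccess RE is usually a sign of some needless oddity that at best costs performance and at worst masks a bug. Rather than "utm_source.*", just say "utm_source".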

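At first glance, your logic with multiple conditions looks right to me (which is fortunate, since there are many ways to get such complex conditions muddled). So if it isn't working, I would suspect other problems with the rules (especially the regular expressions) rather than a logic/precedence error. (My guess is also that there are several separate causes rather than one common root cause, so fixing one problem is unlikely to fix all the others.)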
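Putting the points above together, one possible shape for the ruleset is sketched below. It is untested and rests on several assumptions. The utm_* values are matched against %{QUERY_STRING}, since in the question's example URLs they are query parameters. The three "doesn't include" conditions are ANDed rather than ORed, because "not blog OR not gif OR not jpg" is true for almost every request. R=302 is used so that browsers don't cache the redirect while testing. The block is assumed to sit before the catch-all index.php rule, whose [L] flag would otherwise end processing first.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Group 1: a listed bot, OR no "user_id=1" cookie.
    RewriteCond %{HTTP_USER_AGENT} BotOne|OtherBot [NC,OR]
    RewriteCond %{HTTP_COOKIE} !(^|;\s*)user_id=1(;|$)

    # Group 2: the site root, OR one of the utm_* query parameters.
    RewriteCond %{REQUEST_URI} ^/?$ [OR]
    RewriteCond %{QUERY_STRING} utm_source|utm_medium [OR]
    RewriteCond %{QUERY_STRING} utm_campaign.*utm_content|utm_content.*utm_campaign

    # Group 3: not under /blog/ AND not a gif/jpg (negations are ANDed).
    RewriteCond %{REQUEST_URI} !/blog/
    RewriteCond %{REQUEST_URI} !\.(gif|jpe?g)$ [NC]

    # Redirect to index.html; the trailing "?" drops the query string.
    RewriteRule ^ /index.html? [R=302,L]
</IfModule>
```

The redirected request (/index.html with no query string) fails group 2, so it is not redirected again. Once the behaviour is confirmed, R=302 can be changed to R=301 if a permanent redirect is wanted.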

Can you give us a concrete example of an input string, what you expected to happen, and what actually happened?

Answered 2013-03-09T04:01:40.930