I am trying to port this function to C#:

http://www.phpsnaps.com/snaps/view/clean-url/

I am having trouble converting the PHP pattern "~[^-a-z0-9_]+~" into C# regular expression syntax.

<?php

function cleanURL($string)
{
    $url = str_replace("'", '', $string);
    $url = str_replace('%20', ' ', $url);
    // (PROBLEM) substitutes anything but letters, numbers and '_' with separator
    $url = preg_replace('~[^\pL0-9_]+~u', '-', $url);
    $url = trim($url, "-");
    // you may opt for your own custom character map for encoding.
    $url = iconv("utf-8", "us-ascii//TRANSLIT", $url); 
    $url = strtolower($url);
    // (PROBLEM)
    $url = preg_replace('~[^-a-z0-9_]+~', '', $url); // keep only letters, numbers, '_' and separator
    return $url;
} // echo cleanURL("Shelly's%20Greatest%20Poem%20(2008)");  // shellys-greatest-poem-2008
?>

Here is the C# function:

static String cleanURL(String url)
{
    url = url.Replace("'", "");
    url = url.Replace("%20", " ");            
    url = System.Text.RegularExpressions.Regex.Replace(url, "~[^\pL0-9_]+~u", "-");           
    url = url.Trim(new char[1]{'-'});         

    Encoding ascii = Encoding.ASCII;           
    Encoding utf8 = Encoding.UTF8;           
    byte[] utf8bytes = utf8.GetBytes(url);           
    byte[] asciiBytes = Encoding.Convert(utf8, ascii, utf8bytes);            
    char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];           
    ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);           

    url = new string(asciiChars);           
    url = url.ToLower();                    
    url = System.Text.RegularExpressions.Regex.Replace(url, "~[^-a-z0-9_]+~", "");
    return url;           
}    

Thanks. Can anyone help me?

1 Answer

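The ~ at the start and end are just PHP's pattern delimiters. A C# regex string does not use them, so leave them out.

So ~[^-a-z0-9_]+~ should be [^-a-z0-9_]+

The u at the end of the first pattern makes PHP treat the pattern as UTF-8. .NET strings are already Unicode, so you do not need it.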
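To make this concrete, here is a minimal sketch of the question's function with both patterns adjusted as described. The class name UrlCleaner and the Main method are only there for illustration. It also writes the letter class as \p{L}, since .NET expects braces around the category name, and uses a verbatim string so the backslash needs no doubling. Note that Encoding.ASCII replaces non-ASCII characters with '?' rather than transliterating them the way iconv's //TRANSLIT does, so accented input will not come out the same as in the PHP version.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class UrlCleaner
{
    public static string CleanUrl(string url)
    {
        url = url.Replace("'", "");
        url = url.Replace("%20", " ");

        // No ~ delimiters and no trailing u flag.
        // .NET writes the Unicode letter category as \p{L}.
        url = Regex.Replace(url, @"[^\p{L}0-9_]+", "-");
        url = url.Trim('-');

        // Rough stand-in for iconv: non-ASCII characters become '?'
        // here instead of being transliterated.
        url = Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(url));
        url = url.ToLowerInvariant();

        // Keep only letters, digits, '_' and the separator.
        url = Regex.Replace(url, "[^-a-z0-9_]+", "");
        return url;
    }

    static void Main()
    {
        // prints "shellys-greatest-poem-2008"
        Console.WriteLine(CleanUrl("Shelly's%20Greatest%20Poem%20(2008)"));
    }
}
```

Any '?' produced by the ASCII step is removed by the final Regex.Replace, so such characters are dropped from the result.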

answered 2011-07-12T15:42:10.530