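I need to remove everything between product/ and /i from http://example.com/media/catalog/product/cache/1/thumbnail/56x/9df78eab33525d08d6e5fb8d27136e95/i/m/images_3.jpg so that I am left with http://example.com/media/catalog/product/i/m/images_3.jpg, using either a regular expression or C#. Those are the options available in my crawler application. Please help.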
Viewed 210 times
2 Answers
1
var input = "http://example.com/media/catalog/product/cache/1/thumbnail/56x/9df78eab33525d08d6e5fb8d27136e95/i/m/images_3.jpg";
var re = new Regex("^(.+/product)/.+(/i/.+)$");
var m = re.Match(input);
if (!m.Success) throw new Exception("does not match");
var result = m.Groups[1].Value + m.Groups[2].Value;
//result = "http://example.com/media/catalog/product/i/m/images_3.jpg"
answered 2013-10-25T14:41:55.620
0
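string str = "http://example.com/media/catalog/product/cache/1/thumbnail/56x/9df78eab33525d08d6e5fb8d27136e95/i/m/images_3.jpg";
int prodIndex = str.IndexOf("/product/");
// Look for "/i/" only from "/product/" onward, so an earlier "/i/" cannot match.
int iIndex = prodIndex < 0 ? -1 : str.IndexOf("/i/", prodIndex);
if (iIndex < 0) throw new FormatException("URL does not contain /product/ followed by /i/");
// Keep everything up to and including "/product/", then everything from "i/" onward.
string newStr = str.Substring(0, prodIndex + "/product/".Length)
+ str.Substring(iIndex + 1);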
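Here is a more general example using a regular expression. It simply looks for the part after the 32-character hash rather than assuming that part is /i/: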
string str = "http://example.com/media/catalog/product/cache/1/thumbnail/56x/9df78eab33525d08d6e5fb8d27136e95/i/m/images_3.jpg";
var match = Regex.Match(str, @"(.*/product/).*/.{32}/(.*)");
var newStr = match.Groups[1].Value + match.Groups[2].Value;
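Note that .{32} accepts any 32 characters, slashes included, so the pattern above relies on backtracking to land on the hash. If the hash is always 32 hexadecimal digits (an assumption based on the one sample URL), a stricter variant could be sketched as below. The names ThumbnailUrl and StripCache are made up for illustration; they are not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

static class ThumbnailUrl
{
    // Assumption: the segment to drop starts right after "/product" and ends
    // with a path segment made of exactly 32 hexadecimal digits.
    private static readonly Regex CachedPart =
        new Regex(@"(?<=/product)/.*?/[0-9a-fA-F]{32}(?=/)");

    // Removes the first matching segment; a URL without one is returned unchanged.
    public static string StripCache(string url)
    {
        return CachedPart.Replace(url, string.Empty, 1);
    }
}
```

Calling ThumbnailUrl.StripCache(str) on the URL from the question should give the same newStr as above, and a URL that has no 32-digit hash segment comes back untouched instead of producing an empty string from a failed match.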
answered 2013-10-25T14:22:14.500