
I want a regex that matches only bare .com domains, without any subfolders or anything else after the .com.

For example: on a webpage with a list of URLs, I want to scrape http://www.google.com and http://www.yahoo.com, but not http://www.google.com/hello.html or http://www.yahoo.com/news/


1 Answer


Try this:

(https?:\/\/)?www\.[a-zA-Z0-9-]+\.[^/\s]*
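Note that the pattern above accepts any top-level domain, and it still matches the http://www.google.com prefix of http://www.google.com/hello.html. A stricter variant might look like the following Python sketch. It is not part of the original answer: the function name `find_bare_com_urls` is made up for illustration, and it assumes that URLs in the page text are delimited by whitespace, quotes, or angle brackets.

```python
import re

# Hypothetical stricter pattern (an assumption, not the pattern from the answer):
# - (?<![^\s"'<>])  the match must start at the beginning of the text or
#                   right after whitespace, a quote, or an angle bracket
# - (?:https?://)?  optional scheme
# - (?:[a-zA-Z0-9-]+\.)+com  one or more dot-separated labels ending in "com"
# - (?![^\s"'<>])   nothing but a delimiter (or end of text) may follow "com",
#                   so "/hello.html", "/news/" and ".com.au" are rejected
BARE_COM_URL = re.compile(
    r"""(?<![^\s"'<>])(?:https?://)?(?:[a-zA-Z0-9-]+\.)+com(?![^\s"'<>])"""
)


def find_bare_com_urls(text):
    """Return every .com URL in text that has no path or other suffix."""
    return BARE_COM_URL.findall(text)


if __name__ == "__main__":
    sample = (
        "http://www.google.com http://www.yahoo.com "
        "http://www.google.com/hello.html http://www.yahoo.com/news/"
    )
    print(find_bare_com_urls(sample))
```

Because of the trailing lookahead, a URL followed directly by punctuation (for example a period at the end of a sentence) is also skipped; relax the delimiter set if that matters for your pages.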

Answered 2013-01-24T20:50:44.787