I'm new to Python. I have two different URL formats:

url_format_1 = 'https://www.facebook.com/facebook/posts/10151927580276729'
# and
url_format_2 = 'https://www.facebook.com/photo.php?fbid=10151496277356729&set=a.10150629589136729.412063.20531316728&type=1'

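What I want is to get the id. In the first format that is 10151927580276729, and in the second format it is 10151496277356729.

I want to detect whether the first or the second format is being used, and then extract the id accordingly.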

1 Answer

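For the first format, a simple split() on '/' is enough, since the id is the last path segment. For the second format, I'd recommend a regular expression that captures the digits after fbid=.

To detect which format you are dealing with, try the regular expression first. If it does not match, re.search() returns None, so calling .group() on the result raises an AttributeError. Catch that and fall back to split().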

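import re

urls = [
    'https://www.facebook.com/facebook/posts/10151927580276729',
    'https://www.facebook.com/photo.php?fbid=10151496277356729&set=a.10150629589136729.412063.20531316728&type=1',
]

for u in urls:
    try:
        # Second format: capture the digits that follow 'fbid='.
        print(re.search(r'fbid=([0-9]+)', u).group(1))
    except AttributeError:
        # No match (re.search returned None): first format,
        # so the id is the last path segment.
        print(u.split('/')[-1])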

Output:

10151927580276729
10151496277356729
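
As an alternative to the regex-then-split approach above, here is a minimal sketch that parses the URL with the standard library's urllib.parse (Python 3) instead of matching on the raw string. The function name extract_facebook_id is hypothetical, and the sketch assumes that an id is always purely numeric and that the first format always ends with the id as its last path segment.

```python
from urllib.parse import urlparse, parse_qs


def extract_facebook_id(url):
    """Return the numeric id from either URL format, or None if none is found.

    Hypothetical helper: assumes ids consist of digits only.
    """
    parsed = urlparse(url)

    # Second format: the id is the value of the 'fbid' query parameter.
    fbid_values = parse_qs(parsed.query).get('fbid')
    if fbid_values:
        return fbid_values[0]

    # First format: the id is the last path segment
    # (a trailing slash, if present, is ignored).
    last_segment = parsed.path.rstrip('/').split('/')[-1]
    return last_segment if last_segment.isdigit() else None


urls = [
    'https://www.facebook.com/facebook/posts/10151927580276729',
    'https://www.facebook.com/photo.php?fbid=10151496277356729&set=a.10150629589136729.412063.20531316728&type=1',
]

for u in urls:
    print(extract_facebook_id(u))
```

This avoids using an exception for control flow, and it returns None instead of an arbitrary string when a URL matches neither format.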
Answered 2013-06-07T12:24:41.707