I tried using regex at first, but after reading around I got pointed to Beautiful Soup.
I've more or less figured out how to get URLs out of HTML tags with Beautiful Soup, but how would I grab URLs from both the tag attributes (href=*) and the body text of the page?
Also, for the ones in tags, how do I specify that I only want URLs starting with http:// or https://?
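
To make it concrete, here is a rough sketch of the kind of thing I'm picturing. The function name `extract_urls` and the `URL_RE` pattern are just my own guesses, and the pattern is deliberately simple (it will probably pick up trailing punctuation on some pages), so treat it as an illustration rather than a working solution:

```python
import re
from bs4 import BeautifulSoup

# Assumption: a deliberately loose pattern for URLs in plain text.
# It stops at whitespace, angle brackets, and quotes.
URL_RE = re.compile(r"https?://[^\s<>\"']+")

def extract_urls(html):
    """Return (urls_from_href, urls_from_text) for an HTML string."""
    soup = BeautifulSoup(html, "html.parser")

    # Only <a> tags whose href starts with http:// or https://
    tag_urls = [
        a["href"]
        for a in soup.find_all("a", href=re.compile(r"^https?://"))
    ]

    # URLs that appear in the text of the page (not inside attributes)
    text_urls = URL_RE.findall(soup.get_text(" "))

    return tag_urls, text_urls

if __name__ == "__main__":
    sample = (
        '<p>See http://example.com/a and '
        '<a href="https://example.org/b">this link</a> or '
        '<a href="/relative">this one</a></p>'
    )
    print(extract_urls(sample))
```

Is this roughly the right approach, or is there a cleaner way to do both in one pass?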
Thanks in advance!