I am trying to parse a website for
blahblahblah
<a href="THIS IS WHAT I WANT" title="NOT THIS">I DONT CARE ABOUT THIS EITHER</a>
blahblahblah
(there are many of these, and I want all of them in some tokenized form). Unfortunately the HTML is very large and a little complicated, so trying to crawl down the tree might take me some time to just sort out the nested elements. Is there an easy way to just retrieve this?
Thanks!