My problem is that the same content contains iframes, img tags, and so on. Each of those already has a regex that converts it into the correct format.
The last thing left is plain URLs. I need a regex that finds all links that are just bare links and not inside an iframe, img, or any other tag. The tags in this case are regular HTML tags, not BBCode.
Currently I have this code as the last pass of the content rendering. But it also matches the output of the earlier passes (the rendered iframes and imgs), so it swaps out the URLs there as well.
$output = preg_replace(array(
    '%\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s'
), array(
    'test'
), $output);
And my content looks something like this:
# don't want these to be touched
<iframe width="640" height="360" src="http://somedomain.com/but-still-its-a-link-to-somewhere/" frameborder="0"></iframe>
<img src="http://someotherdomain.com/here-is-a-img-url.jpg" border="0" />
# and only these converted
http://google.com
http://www.google.com
https://www2.google.com<br />
www.google.com
As you can see, there might also be something right after the link. After a full day of trying to get regexes to work, that trailing <br /> has been a nightmare for me.
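
To make the desired behaviour concrete, here is a minimal sketch of one possible approach, not a definitive solution. It assumes PCRE's (*SKIP)(*FAIL) backtracking verbs are available to preg_replace_callback, that no attribute value contains a literal ">", and that bare links should become anchor tags. The function name linkify_bare_urls and the anchor markup are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper: wraps bare URLs in <a> tags and leaves URLs inside HTML tags alone.
function linkify_bare_urls(string $html): string
{
    // First alternative: match a whole HTML tag, then discard it with (*SKIP)(*FAIL),
    // so nothing inside <...> (such as a src attribute) can be matched as a URL.
    // Second alternative: a bare URL starting with http://, https:// or www.
    // It stops at whitespace, quotes, or "<", and the lookbehind drops trailing punctuation.
    $pattern = '~<[^>]*>(*SKIP)(*FAIL)|\b(?:https?://|www\.)[^\s<>"\']+(?<![.,;:!?])~i';

    return preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Assumption: scheme-less "www." links should point to http://
        $href = stripos($url, 'www.') === 0 ? 'http://' . $url : $url;
        return '<a href="' . htmlspecialchars($href) . '">' . htmlspecialchars($url) . '</a>';
    }, $html);
}

// Usage: the iframe and img tags pass through unchanged, the bare links get wrapped.
$output = linkify_bare_urls("<img src=\"http://someotherdomain.com/here-is-a-img-url.jpg\" border=\"0\" />\nhttps://www2.google.com<br />");
```

Because the tag alternative is tried first at every "<", the trailing <br /> is consumed as a tag and never becomes part of the link. Text between an existing <a> and </a> would still be matched, so that case would need extra handling.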