I need to rewrite the URLs in a page that I fetch from another site with cURL. My PHP cURL code is:
<?php
$ch = curl_init ("http://www.externalwebsite.com/index.php");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$page = curl_exec($ch);
// Grab the contents of the headline div.
preg_match('#<div class="headline"[^>]*>(.+?)</div>#is', $page, $matches);
// Fall back to an empty string if the div was not found.
$html = $matches[1] ?? '';
$html = preg_replace('~a href="([a-z,.\-]*)~si', '"', $html); //NEED TO CHANGE THIS
echo $html;
?>
This code works fine until the URL contains a digit other than the id. This is what the HTML looks like without any preg_replace call:
<div class="swiper-slide red-slide">
<div class="title"><a href="http://www.externalwebsite.com/title-of-the-3-page-192345.htm" class="image">
<img src="http://www.externalwebsite.com/d/news/94406.jpg"/></a></div></div>
If I use the preg_replace call above, the HTML looks like this:
<div class="swiper-slide red-slide">
<div class="title"><a href="http://www.mywebsite.com/read_curl.php?id=3-page-192345" class="image">
<img src="http://www.externalwebsite.com/d/news/94406.jpg"/></a></div></div>
But it must look like this:
<div class="swiper-slide red-slide">
<div class="title"><a href="http://www.mywebsite.com/read_curl.php?id=192345" class="image">
<img src="http://www.externalwebsite.com/d/news/94406.jpg"/></a></div></div>
Only the id must remain; everything else in the path must be removed. Can anybody help me, please?
UPDATE: The page titles change dynamically. The last 6 digits are the id, and that is the only thing that must remain in the URL.
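One possible approach, as a minimal sketch: match the whole href, capture only the trailing digits, and rebuild the link around them. It assumes every article link on the external site ends in `-<digits>.htm`, as in the sample markup above. The function name `rewrite_links` is hypothetical, and `\d+` could be tightened to `\d{6}` if the id is always six digits.

```php
<?php
// Hypothetical helper: rewrite external article links so that only the
// trailing numeric id survives, pointing at the local read_curl.php script.
function rewrite_links(string $html): string
{
    // Assumption: article URLs look like .../any-dynamic-title-<id>.htm
    // [^"]*? lazily skips the title part; (\d+) captures the digits
    // that sit directly before ".htm".
    $pattern = '~href="http://www\.externalwebsite\.com/[^"]*?-(\d+)\.htm"~i';
    $replacement = 'href="http://www.mywebsite.com/read_curl.php?id=$1"';

    return preg_replace($pattern, $replacement, $html);
}

$sample = '<div class="title"><a href="http://www.externalwebsite.com/title-of-the-3-page-192345.htm" class="image">'
        . '<img src="http://www.externalwebsite.com/d/news/94406.jpg"/></a></div>';

echo rewrite_links($sample), "\n";
```

For the sample link this should leave `href="http://www.mywebsite.com/read_curl.php?id=192345"`. The `<img src>` is untouched because the pattern only matches `href` attributes. Anchoring the digits to `.htm` is what keeps the `3` in the title from being picked up.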