php - Replacing links in text with PHP

Question

I'm using the following RegEx to replace links in the text with clickable links:

preg_replace('/(http)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>',$message);

I need a new one that recognizes links starting with www only as well as those starting with http. Here's a list of the required URL types:

I've tried to do it myself, but I'm not very good with regular expressions. I'd appreciate any help.

Thank you!

P.S. Stack Overflow also does not recognize URLs starting with www only.

score 0 · Accepted Answer

In your regex, the http prefix, the colon, and the two slashes are all mandatory, so a link that starts with www alone can never match.

This line should remedy that:

preg_replace('/(http|https)?(:)?(\/\/)?((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>',$domains);

For a better answer, see the question "Regular expression pattern to match url with or without http://www".
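To make the point about the mandatory prefix concrete, here is a small check of the pattern from the question against two inputs. The sample URLs are made up for illustration and do not come from the question.

```php
<?php
// The pattern from the question: "http", an optional "s", then a mandatory "://".
$original = '/(http)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i';

// Hypothetical sample inputs, not taken from the question.
$samples = array('http://example.com/page', 'www.example.com/page');

foreach ($samples as $sample) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    $matched = preg_match($original, $sample) === 1;
    echo $sample, ' => ', $matched ? 'match' : 'no match', "\n";
}
```

The first input matches and the second does not, because nothing in the pattern allows the text to begin with www.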

score 0 · Accepted Answer

Using Claus Witt's link and modifying it just a little did the job. The preg_replace he gave did not work though. Here's what I did:

$regex = "(((https?|ftp)\:\/\/)|(www))";//Scheme
$regex .= "([a-z0-9-.]*)\.([a-z]{2,4})";//Host or IP
$regex .= "(\:[0-9]{2,5})?";//Port
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";//Path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?";//GET Query
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?";//Anchor
return str_replace
(
    array('href="','http://http://','http://https://','http:///'),
    array('href="http://','http://','https://','/'),
    preg_replace('/'.$regex.'/i','<a href="\0" target="_blank" class="lgray">\0</a>',$message)
);

In my modification I made either the scheme or the www prefix required, removed some unnecessary checks, and extended the maximum length of the domain extension from 3 to 4 characters (.info is a valid extension too).
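As an alternative to patching the href with str_replace afterwards, the scheme can be added while replacing. The following is only a sketch under assumptions: linkify is a hypothetical helper name, the pattern is a deliberately simplified stand-in for the regex above (it takes everything up to the next whitespace, quote, or angle bracket), and $message is assumed to be plain text rather than HTML.

```php
<?php
// Hypothetical helper; the simplified pattern is an assumption, not the regex above.
function linkify($message)
{
    // A scheme (http, https, ftp) or a leading "www.", then everything up to
    // the next whitespace, double quote, or angle bracket.
    $pattern = '~\b(?:(?:https?|ftp)://|www\.)[^\s<>"]+~i';

    return preg_replace_callback($pattern, function ($m) {
        $text = $m[0];
        // Prepend http:// only when the match has no scheme of its own.
        $href = preg_match('~^[a-z]+://~i', $text) ? $text : 'http://' . $text;

        return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8')
            . '" target="_blank" class="lgray">'
            . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</a>';
    }, $message);
}

echo linkify('See www.example.com and https://example.org/a?b=1 today'), "\n";
```

Because the callback sees each match, there is no need to undo doubled prefixes such as http://http://. One known limitation of this simplified pattern is that trailing punctuation (for example a full stop right after the URL) ends up inside the link.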

score 0 · Accepted Answer

Disclaimer: These patterns are very basic. As written they only match .com domains, and they do not check for valid TLDs or file extensions. Use at your own risk.

Assuming you don't need to account for directories or files and only want to match base URLs (with or without subdomains), you can use the following regex:

(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com\/?(?=$|[\n\s])

#DESCRIPTION::
#  (?<=^|[\n\s])           Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
#  (?:https?:\/\/)?        Matches http(s) if it is there
#  (?:www\.)?              Matches www. if it is there
#  [a-zA-Z0-9-.]+          Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
#  \.com\/?                Matches .com(/)
#  (?=$|[\n\s])            Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.

If you need to also match directories and files, the end of the regex needs to be modified and added to slightly:

(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])

#DESCRIPTION::
#  (?<=^|[\n\s])           Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
#  (?:https?:\/\/)?        Matches http(s) if it is there
#  (?:www\.)?              Matches www. if it is there
#  [a-zA-Z0-9-.]+          Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
#  \.com                   Matches .com
#  (?:                     Start of a group
#     (?:\/[\w]+)+         Attempts to find subdirectories by matching /, then word characters
#  )?                      Ends the previous group. This group can be skipped, if there are no subdirectories
#  (?:\/|\.[\w]+)?         Matches a file extension if it is there, or a / if it is there.
#  (?=$|[\n\s])            Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.
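For use from PHP, the second pattern can be wrapped in delimiters and passed to preg_match_all. This is a minimal sketch: findComUrls is a hypothetical helper name and the sample text is made up.

```php
<?php
// Hypothetical helper that returns every URL the second pattern finds in $text.
function findComUrls($text)
{
    // The pattern from above, unchanged, wrapped in "/" delimiters.
    $pattern = '/(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com'
        . '(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])/';

    preg_match_all($pattern, $text, $matches);

    // $matches[0] holds the full matches in the order they were found.
    return $matches[0];
}

// Made-up sample text for illustration.
$text = 'Visit www.example.com/docs/index.html or http://example.com/ today, not example.org';
print_r(findComUrls($text));
```

With this sample, the two .com URLs are returned and example.org is skipped. Since the lookarounds require whitespace or a string boundary on both sides, a URL followed directly by punctuation would not be matched either.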

score -1 · Accepted Answer

Try this one:

$pattern = preg_replace("/((https:\/\/|http:\/\/||http:\/\/www\.|https:\/\/www\.|www\.)+([\w\/])+(\.com\/|\.com))/i","<a target=\"_blank\" href=\"$1\">$1</a>",$url);
