Java code to get URL from a string

This little function extracts URLs from a string in Java. I found the basic regex for doing it here, and wrapped it in a Java function.

I expanded the basic regex a bit with the part “|www[.]” in order to catch links that do not start with “http://”.

Enough talk (it is cheap), here’s the code:
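A minimal sketch of such a function follows; it is not the original snippet. The class name `UrlExtractor`, the method name `extractUrls`, and the exact pattern are assumptions made for illustration. The only parts taken from the text above are the “http://” prefix and the “www[.]” alternative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlExtractor {
    // Assumed pattern: a prefix of "http://" or "www." followed by any run of
    // characters that are not whitespace, angle brackets or double quotes.
    private static final Pattern URL_PATTERN =
            Pattern.compile("\\b(?:http://|www[.])[^\\s<>\"]+", Pattern.CASE_INSENSITIVE);

    // Returns every substring of the input that matches the pattern, in order.
    public static List<String> extractUrls(String input) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL_PATTERN.matcher(input);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "See http://example.com/a?b=1 and www.example.org for details";
        System.out.println(extractUrls(text));
        // prints [http://example.com/a?b=1, www.example.org]
    }
}
```

A pattern this loose will also pick up trailing punctuation (for example the full stop in “www.example.org.”), so the results may need trimming depending on the input.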

9 thoughts on “Java code to get URL from a string”

  1. scb says:

    Nice article.

    As I’m new to regex, please help me find obfuscated URLs such as the following. Thanks in advance.

    www(dot)example(dot)com

    • Houen says:

      You’ll want something like this:
      www[(]dot[)][a-zA-Z][a-zA-Z0-9]+[(]dot[)][a-zA-Z]+

      • scb says:

        Thanks Houen, for your prompt reply.

        My earlier post was probably not clear. We are filtering URLs like “www.xyz.com”, but some intelligent users :) are using obfuscated URLs like the following:

        www(dot)xyz(dot)com
        www (dot) xyz (dot) com
        www[dot]xyz[dot]com
        www [dot] xyz [dot] com
        www{dot}xyz{dot}com
        www {dot} xyz {dot} com

        So my question is how to find the above patterns in a string. Thanks in advance.

      • scb says:

        Hi Houen

        I resolved this issue as follows. Please let me know your comments. Thanks.

        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        public class Replacement {
            public static void main(String[] args) {

                // Create a pattern to match the obfuscated dots: (dot), [dot] or {dot}
                Pattern p = Pattern.compile("\\(dot\\)|\\[dot\\]|\\{dot\\}");

                // Create a matcher with an input string
                Matcher m = p.matcher("www(dot)example(dot)com www[dot]example[dot]com www{dot}example{dot}com");

                // Loop through the matches and build a new String with the replacements
                StringBuffer sb = new StringBuffer();
                while (m.find()) {
                    m.appendReplacement(sb, ".");
                }

                // Add the last segment of input to the new String
                m.appendTail(sb);
                System.out.println(sb.toString());
            }
        }
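The code above handles the compact forms but not the spaced variants from the list (“www (dot) xyz (dot) com” and so on). A possible extension, sketched under the assumption that whitespace directly around a bracketed “dot” can safely be dropped, is shown below. The names `DotDeobfuscator` and `normalize` are made up for this example.

```java
import java.util.regex.Pattern;

final class DotDeobfuscator {
    // Matches "(dot)", "[dot]" or "{dot}", together with any whitespace
    // immediately before or after the brackets.
    private static final Pattern OBFUSCATED_DOT = Pattern.compile(
            "\\s*(?:\\(dot\\)|\\[dot\\]|\\{dot\\})\\s*", Pattern.CASE_INSENSITIVE);

    // Replaces every obfuscated dot with a literal "."
    public static String normalize(String input) {
        return OBFUSCATED_DOT.matcher(input).replaceAll(".");
    }

    public static void main(String[] args) {
        System.out.println(normalize("www (dot) xyz (dot) com")); // prints www.xyz.com
        System.out.println(normalize("www[dot]xyz{dot}com"));     // prints www.xyz.com
    }
}
```

Once the text is normalized this way, the usual URL filter can run on the result. Note that the sketch also rewrites a bracketed “dot” in ordinary prose, so it may produce false positives on text that is not a URL.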

  2. John Ortiz says:

    Thanks for this. It is working correctly. See you later.

  3. lmn4971 says:

    Another way of doing this is to split the string on the delimiters that surround the URL:

    String extractedurl = in.readLine().split("='|'")[1];

    where the URL appears in the input as ='URL'
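As a rough illustration of the split approach, the sketch below applies the same delimiter regex to a fixed string instead of a reader. The class name `SplitExample`, the method name `extractQuoted`, and the sample input are assumptions. The length check is an addition, since indexing `[1]` directly throws an exception when the line contains no quoted value.

```java
final class SplitExample {
    // Splits on either "='" or a lone "'" and returns the piece in between,
    // or null when the line contains no such delimiters.
    public static String extractQuoted(String line) {
        String[] parts = line.split("='|'");
        return parts.length > 1 ? parts[1] : null;
    }

    public static void main(String[] args) {
        System.out.println(extractQuoted("<a href='http://example.com'>link</a>"));
        // prints http://example.com
    }
}
```

This only finds the first single-quoted value on the line, so it suits input with a known, fixed shape rather than free text.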

  4. voji says:

    Great article. I tweaked the first part of the regexp to:

    (https?://|www[.]|ftp://)

    Now it accepts https and ftp too.

  5. lucas says:

    great script, thanks!

  6. sumon says:

    thanks
