URL catching regexp
Not exactly a brilliant piece of engineering, but a useful dirty hack. If you want to strip URLs from a string, you can match them with one of these regexps:
For the http:// kind of URLs:
text.replaceAll(/https?:\/\/[a-zA-Z0-9\-\._~:\/\?#\[\]@!\u0024&'\(\)\*\+,;=]+/, "")
For the www. kind:
text.replaceAll(/www\.[a-zA-Z0-9\-\._~:\/\?#\[\]@!\u0024&'\(\)\*\+,;=]+/, "")
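The main point here is that the characters in the class above are the only ones allowed in URLs (apart from % for percent-encoding, which the class leaves out), so once the http://, https:// or www. prefix is found, the match runs up to the first character that cannot be part of a URL, typically whitespace. It doesn't work for stuff like io.com, which has neither prefix.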
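To see the behaviour on a few inputs, here is a minimal sketch of the same hack in Python, which is used here only for illustration. The names URL_CHARS, URL_PATTERN and strip_urls are made up for this example, and merging both prefixes into a single pattern is a convenience, not something the snippets above do.

```python
import re

# Same character class as in the post. In Python "$" needs no special
# escaping inside a character class, so it is written literally.
URL_CHARS = r"[a-zA-Z0-9\-._~:/?#\[\]@!$&'()*+,;=]+"

# Assumption: one pattern for both prefixes (http://, https:// and www.).
URL_PATTERN = re.compile(r"(?:https?://|www\.)" + URL_CHARS)


def strip_urls(text: str) -> str:
    """Remove every substring that looks like a URL (hypothetical helper)."""
    return URL_PATTERN.sub("", text)


# Both kinds of URL are removed; the surrounding spaces stay behind.
print(strip_urls("see http://example.com/a?b=1 and www.example.org now"))

# No scheme and no www. prefix, so the text is returned unchanged.
print(strip_urls("bare domains like io.com are left alone"))

# "." is in the character class, so trailing punctuation is removed too.
print(strip_urls("go to https://example.com."))
```

Because the match is greedy and punctuation such as "." and "," is in the class, a URL at the end of a sentence takes the closing punctuation with it. For a dirty hack that is usually acceptable, but it is worth knowing before running this over prose.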
This post is licensed under CC BY 4.0 by the author.