ruby slow regex match

salimane at Salimanes-MacBook-Pro in ~ ⚛ ruby --version
ruby 2.1.0p0 (2013-12-25 revision 44422) [x86_64-darwin13.0]

salimane at Salimanes-MacBook-Pro in ~ ⚛ cat regex.rb
regex = %r{^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$}
link = "https://www.facebook.com/DUSA.ve?ref=stream"
puts link =~ regex

salimane at Salimanes-MacBook-Pro in ~ ⚛ time ruby regex.rb
ruby regex.rb  114.95s user 0.11s system 99% cpu 1:55.07 total

salimane at Salimanes-MacBook-Pro in ~ ⚛ php -v
PHP 5.4.17 (cli) (built: Aug 25 2013 02:03:38)
Copyright (c) 1997-2013 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2013 Zend Technologies

salimane at Salimanes-MacBook-Pro in ~ ⚛ cat regex.php
<?php
$regex = '/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/';
$link = "https://www.facebook.com/DUSA.ve?ref=stream";
echo preg_match($regex, link);

salimane at Salimanes-MacBook-Pro in ~ ⚛ time php regex.php
0php regex.php  0.07s user 0.01s system 96% cpu 0.079 total

salimane at Salimanes-MacBook-Pro in ~ ⚛ java -version
java version "1.7.0_21"
Java(TM) SE Runtime Environment (build 1.7.0_21-b12)
Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

salimane at Salimanes-MacBook-Pro in ~ ⚛ cat Regex.java
public class Regex {
    public static void main(String[] args) {
        String regex = "^(https?://)?([\\da-z.-]+).([a-z.]{2,6})([/\\w .-]*)*/?$";
        String link = "https://www.facebook.com/DUSA.ve?ref=stream";
        System.out.println(link.matches(regex));
    }
}

salimane at Salimanes-MacBook-Pro in ~ ⚛ javac Regex.java

salimane at Salimanes-MacBook-Pro in ~ ⚛ time java Regex
false
java Regex  0.75s user 0.04s system 136% cpu 0.584 total

salimane at Salimanes-MacBook-Pro in ~ ⚛ python --version
Python 2.7.3

salimane at Salimanes-MacBook-Pro in ~ ⚛ cat regex.py
import re
regex = r'^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$'
link = "https://www.facebook.com/DUSA.ve?ref=stream"
print re.match(regex, link)

salimane at Salimanes-MacBook-Pro in ~ ⚛ time python regex.py
Traceback (most recent call last):
  File "regex.py", line 4, in <module>
    print re.match(regex, link)
  File "/opt/boxen/homebrew/Cellar/python/2.7.3-boxen2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
  File "/opt/boxen/homebrew/Cellar/python/2.7.3-boxen2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat
python regex.py  0.02s user 0.01s system 43% cpu 0.079 total
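
A likely cause of the Ruby timing is catastrophic backtracking from the nested quantifier `([\/\w \.-]*)*`: when the final `$` fails on `?ref=stream`, the engine can retry a very large number of ways to split the path between the inner and outer `*`. The sketch below is one possible workaround, not a confirmed fix for the session above: it drops the outer `*`, which accepts the same strings but may capture a different value in the last group. The names `SAFE_REGEX` and `match_index` are hypothetical and used only for illustration.

```ruby
# Assumption: the only change from regex.rb is removing the outer `*`
# after the last group, so the path is matched by a single quantifier.
SAFE_REGEX = %r{^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)\/?$}

# Hypothetical helper: returns the match offset, or nil when there is no match.
def match_index(link)
  link =~ SAFE_REGEX
end

# The link from regex.rb still does not match, because `?` and `=`
# are not in the final character class, but the failure is immediate.
puts match_index("https://www.facebook.com/DUSA.ve?ref=stream").inspect  # prints nil

# Without the query string the pattern matches at offset 0.
puts match_index("https://www.facebook.com/DUSA.ve").inspect             # prints 0
```

If the query string should be accepted as well, the pattern itself would need extending; this sketch only removes the nested repetition.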