Improve performance of HSTSPreloadLookup::isPreloaded on linking external urls
Closed, Resolved, Public

Description

The SecureLinkFixer extension checks every non-HTTPS external link URL against the HSTS preload list. The lookup starts with the full host name, including any subdomain, and, if that is not found, continues with each higher level of the same domain.
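A minimal sketch of that walk, to make the lookup order concrete. The function name `isListed`, the `$preloaded` array, and its entries are hypothetical stand-ins; the real list format and any per-entry rules it carries are not shown in this task and are ignored here.

```php
<?php
// Hypothetical stand-in for the preload list, keyed by domain for O(1) lookups.
$preloaded = [ 'dev' => true, 'wikipedia.org' => true ];

// Check the full host first, then drop the leftmost label and try again
// until a match is found or no labels are left.
function isListed( string $host, array $list ): bool {
	$labels = explode( '.', $host );
	while ( $labels ) {
		if ( isset( $list[ implode( '.', $labels ) ] ) ) {
			return true;
		}
		array_shift( $labels ); // move one level up
	}
	return false;
}

var_dump( isListed( 'en.wikipedia.org', $preloaded ) );
var_dump( isListed( 'not-preloaded.com', $preloaded ) );
```

A host with many labels that is not listed is the worst case, because every level has to be tried before the lookup gives up.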

To strip one subdomain level from the host name, the code uses preg_replace with a non-greedy regex, which can be slow because of the extra work non-greedy matching requires internally.

Replacing the regex '/(.*?)\./' with a strpos lookup for the first dot, followed by substr to take the remainder, is faster.
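The two approaches can be compared side by side in a small sketch. The helper names are hypothetical and not taken from the SecureLinkFixer source; the replacement limit of 1 in the regex variant is an assumption, so that only one level is stripped per call.

```php
<?php
// Regex variant: remove the first label together with its trailing dot.
// Limit 1 (assumed) keeps the regex from stripping every label at once.
function stripLevelRegex( string $host ): string {
	return (string)preg_replace( '/(.*?)\./', '', $host, 1 );
}

// String-function variant: find the first dot and keep what follows it.
// A host without a dot is returned unchanged, matching the regex variant.
function stripLevelSubstr( string $host ): string {
	$pos = strpos( $host, '.' );
	return $pos === false ? $host : substr( $host, $pos + 1 );
}

foreach ( [ 'en.wikipedia.org', 'wikipedia.org', 'org' ] as $host ) {
	printf( "%s -> %s | %s\n", $host, stripLevelRegex( $host ), stripLevelSubstr( $host ) );
}
```

Both helpers should return the same string for ordinary host names; the second one avoids invoking the regex engine at all.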

Event Timeline

Change 826918 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/extensions/SecureLinkFixer@master] Improve performance of string operation on domain lookup

https://gerrit.wikimedia.org/r/826918

The benchmarker report for the new code:

Running PHP version 8.0.22 (AMD64) on Windows NT 10.0 build 19044 (Windows 10)

MediaWiki\SecureLinkFixer\HSTSPreloadLookup::isPreloaded('foobar.dev')
   count: 100
    rate: 2759410.5/s
   total:     0.04ms
    mean:     0.00ms
     max:     0.00ms
  stddev:     0.00ms
Current memory usage: 44.00 MiB
   Peak memory usage: 62.00 MiB

MediaWiki\SecureLinkFixer\HSTSPreloadLookup::isPreloaded('wikipedia.org')
   count: 100
    rate: 5115004.9/s
   total:     0.02ms
    mean:     0.00ms
     max:     0.00ms
  stddev:     0.00ms
Current memory usage: 44.00 MiB
   Peak memory usage: 62.00 MiB

MediaWiki\SecureLinkFixer\HSTSPreloadLookup::isPreloaded('not-preloaded.com')
   count: 100
    rate: 2242943.3/s
   total:     0.04ms
    mean:     0.00ms
     max:     0.00ms
  stddev:     0.00ms
Current memory usage: 44.00 MiB
   Peak memory usage: 62.00 MiB

MediaWiki\SecureLinkFixer\HSTSPreloadLookup::isPreloaded('pathological.case.that.is.not.preloaded.org')
   count: 100
    rate: 979977.6/s
   total:     0.10ms
    mean:     0.00ms
     max:     0.00ms
  stddev:     0.00ms
Current memory usage: 44.00 MiB
   Peak memory usage: 62.00 MiB

and for the old code:

Running PHP version 8.0.22 (AMD64) on Windows NT 10.0 build 19044 (Windows 10)

MediaWiki\SecureLinkFixer\HSTSPreloadLookup::isPreloaded('foobar.dev')
   count: 100
    rate: 2162012.4/s
   total:     0.05ms
    mean:     0.00ms
     max:     0.00ms
  stddev:     0.00ms
Current memory usage: 44.00 MiB
   Peak memory usage: 62.00 MiB

MediaWiki\SecureLinkFixer\HSTSPreloadLookup::isPreloaded('wikipedia.org')
   count: 100
    rate: 3711773.5/s
   total:     0.03ms
    mean:     0.00ms
     max:     0.00ms
  stddev:     0.00ms
Current memory usage: 44.00 MiB
   Peak memory usage: 62.00 MiB

MediaWiki\SecureLinkFixer\HSTSPreloadLookup::isPreloaded('not-preloaded.com')
   count: 100
    rate: 2139951.0/s
   total:     0.05ms
    mean:     0.00ms
     max:     0.00ms
  stddev:     0.00ms
Current memory usage: 44.00 MiB
   Peak memory usage: 62.00 MiB

MediaWiki\SecureLinkFixer\HSTSPreloadLookup::isPreloaded('pathological.case.that.is.not.preloaded.org')
   count: 100
    rate: 782519.4/s
   total:     0.13ms
    mean:     0.00ms
     max:     0.00ms
  stddev:     0.00ms
Current memory usage: 44.00 MiB
   Peak memory usage: 62.00 MiB
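
To compare the two reports, the rates can be divided case by case. This is only a rough sketch: the `speedups` helper is hypothetical, the rates are copied from the reports above, and with 100 iterations per case the numbers are likely to be noisy.

```php
<?php
// Rates in calls per second, copied from the two benchmarker reports.
$rates = [
	'foobar.dev' => [ 'old' => 2162012.4, 'new' => 2759410.5 ],
	'wikipedia.org' => [ 'old' => 3711773.5, 'new' => 5115004.9 ],
	'not-preloaded.com' => [ 'old' => 2139951.0, 'new' => 2242943.3 ],
	'pathological.case.that.is.not.preloaded.org' => [ 'old' => 782519.4, 'new' => 979977.6 ],
];

// Speedup factor per case: new rate divided by old rate.
function speedups( array $rates ): array {
	$out = [];
	foreach ( $rates as $host => $r ) {
		$out[$host] = $r['new'] / $r['old'];
	}
	return $out;
}

foreach ( speedups( $rates ) as $host => $factor ) {
	printf( "%-45s %.2fx\n", $host, $factor );
}
```

On these figures the new code is between roughly 1.05x and 1.38x as fast as the old code, depending on the case.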

Change 826918 merged by jenkins-bot:

[mediawiki/extensions/SecureLinkFixer@master] Improve performance of string operation on domain lookup

https://gerrit.wikimedia.org/r/826918

Legoktm assigned this task to Umherirrender.