Unpack Bulgarian, Lithuanian, Persian Elasticsearch Analyzers
Closed, Resolved
Public
5 Estimated Story Points


See parent task for details.

(These three were chosen next, more or less at random.)

Event Timeline

TJones triaged this task as High priority. Dec 13 2022, 6:56 PM
TJones set the point value for this task to 5.
TJones moved this task from needs triage to Language Stuff on the Discovery-Search board.

Change 884106 had a related patch set uploaded (by Tjones; author: Tjones):

[mediawiki/extensions/CirrusSearch@master] Unpack Bulgarian, Lithuanian, Persian Analyzers


Full write-up on MediaWiki.

  • Pretty straightforward
  • Lots of mixed-script tokens in Bulgarian
  • Filtered out some zero-length tokens in Persian
  • Soft-hyphens are very popular this season
  • Mostly the usual.
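
For readers unfamiliar with the terms above, here is a rough illustration in plain Python of the three token-level issues mentioned: soft hyphens, zero-length tokens, and mixed-script tokens. This is only a sketch under stated assumptions, not the CirrusSearch or Elasticsearch implementation. All function names are hypothetical, and the script check is a crude heuristic that takes the first word of each letter's Unicode character name.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"  # invisible character that can split otherwise identical words


def strip_soft_hyphens(text: str) -> str:
    """Remove soft hyphens so that 'soft<SHY>hyphen' matches 'softhyphen'."""
    return text.replace(SOFT_HYPHEN, "")


def drop_empty_tokens(tokens: list[str]) -> list[str]:
    """Discard zero-length tokens, which can be left over after normalization."""
    return [t for t in tokens if t]


def scripts_in(token: str) -> set[str]:
    """Approximate the set of scripts in a token (hypothetical heuristic).

    Uses the first word of the Unicode character name of each letter,
    e.g. 'LATIN' or 'CYRILLIC'.
    """
    scripts = set()
    for ch in token:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def is_mixed_script(token: str) -> bool:
    """True if the token contains letters from more than one script."""
    return len(scripts_in(token)) > 1
```

For example, `is_mixed_script("c\u0430t")` flags a token that looks like Latin "cat" but has a Cyrillic "а" in the middle, which is the kind of token that is hard to spot by eye.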

Change 884106 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Unpack Bulgarian, Lithuanian, Persian Analyzers
