Unpack Bulgarian, Lithuanian, Persian Elasticsearch Analyzers
Closed, Resolved · Public · 5 Estimated Story Points

Description

See parent task for details.

(These were chosen next more or less at random.)

Event Timeline

TJones triaged this task as High priority. Dec 13 2022, 6:56 PM
TJones set the point value for this task to 5.
TJones moved this task from needs triage to Language Stuff on the Discovery-Search board.

Change 884106 had a related patch set uploaded (by Tjones; author: Tjones):

[mediawiki/extensions/CirrusSearch@master] Unpack Bulgarian, Lithuanian, Persian Analyzers

https://gerrit.wikimedia.org/r/884106

Full write-up on MediaWiki.

  • Pretty straightforward
  • Lots of mixed-script tokens in Bulgarian
  • Filtered out some zero-length tokens in Persian
  • Soft-hyphens are very popular this season
  • Mostly the usual.
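
For readers unfamiliar with "unpacking", here is a rough sketch of the idea in Python. It is not the CirrusSearch patch itself (that code is PHP and lives in the Gerrit change). The settings dictionary spells out the built-in Bulgarian analyzer as a custom analyzer, using the stop and stemmer filter definitions from the Elasticsearch language analyzer documentation. The analyzer name "text" and the helper names unpacked_bulgarian_settings and clean_tokens are assumptions made for illustration. The second helper only imitates, in plain Python, the kind of cleanup mentioned above (stripping soft hyphens and dropping zero-length tokens); the real work is done by Elasticsearch filters.

```python
SOFT_HYPHEN = "\u00ad"  # invisible hyphenation hint, U+00AD


def unpacked_bulgarian_settings():
    """Build index settings that spell out the Bulgarian analyzer as a custom analyzer.

    The analyzer name "text" is a hypothetical choice for this sketch.
    """
    return {
        "analysis": {
            "filter": {
                "bulgarian_stop": {"type": "stop", "stopwords": "_bulgarian_"},
                "bulgarian_stemmer": {"type": "stemmer", "language": "bulgarian"},
            },
            "analyzer": {
                "text": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Once unpacked, extra filters can be added to this chain.
                    "filter": ["lowercase", "bulgarian_stop", "bulgarian_stemmer"],
                }
            },
        }
    }


def clean_tokens(tokens):
    """Strip soft hyphens from each token, then drop tokens that are empty."""
    cleaned = (token.replace(SOFT_HYPHEN, "") for token in tokens)
    return [token for token in cleaned if token]
```

The point of unpacking is that the filter chain becomes an ordinary list, so further normalization steps can be slotted in before or after the stemmer, which a monolithic built-in analyzer does not allow.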

Change 884106 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Unpack Bulgarian, Lithuanian, Persian Analyzers

https://gerrit.wikimedia.org/r/884106