Reindex Bulgarian, Lithuanian, Persian wikis to enable unpacked analyzers
Closed, Resolved · Public · 3 Estimated Story Points

Description

Once T325090 is deployed, reindex Bulgarian-, Lithuanian-, and Persian-language wikis to enable unpacked analyzers.

Current wiki counts:

Bulgarian (bg): 6 wikis
Lithuanian (lt): 5 wikis
Persian (fa): 7 wikis

Event Timeline

TJones created this task.
TJones set the point value for this task to 3.

Bulgarian wikis are reindexed. Lithuanian wikis should be done in an hour or two. I'll work on Persian tomorrow and finish my reports on all three after that.

Full report on MediaWiki.

Summary:

  • Moderate impact in Bulgarian, split between ICU folding of diacritics and homoglyph matching.
  • Moderate impact in Persian, due to ICU folding of Persian diacritics.
  • Minimal impact in Lithuanian, but a few extra results from ICU folding.
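
For readers unfamiliar with the two effects named in the summary, here is a rough sketch in plain Python. It is not the analyzer configuration deployed on the wikis. `fold_diacritics` only approximates ICU folding by stripping combining marks after NFD decomposition, and `map_homoglyphs` uses a small hypothetical Latin-to-Cyrillic table invented for this illustration.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Strip combining marks after NFD decomposition.

    A rough stand-in for ICU folding; real ICU folding covers far more
    cases and can be configured with per-language exemptions.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical, deliberately tiny table of Latin letters that look like
# Cyrillic ones (assumption for illustration only).
LATIN_TO_CYRILLIC = {
    "a": "\u0430",  # Latin a -> Cyrillic а
    "c": "\u0441",  # Latin c -> Cyrillic с
    "e": "\u0435",  # Latin e -> Cyrillic е
    "o": "\u043e",  # Latin o -> Cyrillic о
    "p": "\u0440",  # Latin p -> Cyrillic р
    "x": "\u0445",  # Latin x -> Cyrillic х
}


def map_homoglyphs(token: str) -> str:
    """Replace look-alike Latin letters with their Cyrillic counterparts."""
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in token)


if __name__ == "__main__":
    # Bulgarian: ѝ (U+045D) folds to и (U+0438).
    print(fold_diacritics("\u045d"))
    # Lithuanian: ė (U+0117) folds to e.
    print(fold_diacritics("\u0117"))
    # Persian: a word written with a kasra (U+0650) loses the vowel mark.
    print(fold_diacritics("\u06a9\u0650\u062a\u0627\u0628"))
    # Bulgarian word typed with a Latin "c" as its first letter.
    print(map_homoglyphs("c\u0435\u0441\u0438\u044f"))
```

Note that this naive folding also turns й (U+0439) into и, since й decomposes to и plus a combining breve. A production analyzer would likely need to exempt letters like that, which the sketch does not attempt.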