File type filter for Media Search
Closed, Resolved, Public

Description

As a Commons user, I want to be able to filter search results by file type so I can narrow results down to only the file type I want.

Acceptance Criteria:

  • A "File Type" filter is added that allows users to only see results of a specific file type
  • The file type filter should only appear on the Images, Audio, Video, and Other tabs (the Other tab is out of scope for this ticket and will be handled in T257699)
  • The file type filter should have the following options:
    • On the Images tab: tiff, png, gif, jpg, webp, xcf, svg
    • On the Audio tab: mid, flac, wav, mp3, ogg
    • On the Video tab: webm, mpg, ogv (this should have the display label 'ogv' but the value 'ogg')
    • On the Other tab: pdf, djvu, stl
  • The file type filter should match the following design on desktop. This task does not cover the mobile UI (this will be done as part of T258615)

filters.jpg (755×1 px, 335 KB)
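
The per-tab options in the acceptance criteria could be captured in a small lookup table. What follows is a minimal sketch in Python, not the extension's actual code: the names `FileTypeOption`, `FILE_TYPE_OPTIONS`, and `value_for`, as well as the tab keys, are hypothetical. It assumes each option's value equals its display label, except for 'ogv', where the criteria call for the value 'ogg'. The values the search backend really expects may differ.

```python
from __future__ import annotations

from typing import NamedTuple


class FileTypeOption(NamedTuple):
    label: str  # text shown in the filter menu
    value: str  # value handed to the search query


def _options(*labels: str) -> list[FileTypeOption]:
    # Assumption: the value equals the label unless the ticket says otherwise.
    return [FileTypeOption(label, label) for label in labels]


# Hypothetical tab keys; option lists are copied from the acceptance criteria.
FILE_TYPE_OPTIONS: dict[str, list[FileTypeOption]] = {
    "image": _options("tiff", "png", "gif", "jpg", "webp", "xcf", "svg"),
    "audio": _options("mid", "flac", "wav", "mp3", "ogg"),
    # 'ogv' is the one option whose label and value differ.
    "video": _options("webm", "mpg") + [FileTypeOption("ogv", "ogg")],
    "other": _options("pdf", "djvu", "stl"),
}


def value_for(tab: str, label: str) -> str:
    """Return the search value behind a display label on the given tab."""
    for option in FILE_TYPE_OPTIONS[tab]:
        if option.label == label:
            return option.value
    raise KeyError(f"{label!r} is not offered on the {tab!r} tab")
```

Keeping label and value as separate fields means the 'ogv' case needs no special handling in whatever renders the menu.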

Event Timeline

@CBogen / @Ramsey-WMF, could I get a second pair of eyes on the file types per tab in the acceptance criteria here? Do those look right to you? I checked the allowed file types on Special:Upload and had to look a few of them up so I wanted to make sure I didn't miss anything.

AnneT removed AnneT as the assignee of this task. Aug 18 2020, 7:31 PM

JPEY? 🤔

The only other concern I have (which might be minor) is this: in our current advanced search UI we combine the extension variants into just one (for the mimetype). So for TIFF we just look for tiff, not tiff and tif. Behind the scenes it's doing a filemime:xxx search, and that is looking for specific strings (filemime:jpeg works, filemime:jpg doesn't).
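
To illustrate the idea of collapsing extension variants into one filemime term, here is a rough Python sketch. The table and the function name `filemime_term` are hypothetical; only the tiff/tif and jpeg/jpg cases come from the comment above, and any other extension is simply passed through unchanged, which may not match what the search backend accepts.

```python
# Hypothetical table: extension variant -> string used in the filemime: term.
# Only these two families are taken from the discussion.
MIME_TERM_BY_EXTENSION = {
    "tif": "tiff",
    "tiff": "tiff",
    "jpg": "jpeg",
    "jpeg": "jpeg",
}


def filemime_term(extension: str) -> str:
    """Build a filemime: search term from a file extension."""
    ext = extension.lower().lstrip(".")
    # Assumption: extensions without a known variant are used as-is.
    return f"filemime:{MIME_TERM_BY_EXTENSION.get(ext, ext)}"
```

With this table, `filemime_term("jpg")` and `filemime_term("jpeg")` both produce `filemime:jpeg`, so the UI only needs one option per type.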

I can see arguments for not doing mimetype searches and trying to match exact strings in the filename. Ultimately it probably depends on what the actual search mechanism is going to be though. Filemime, from what I understand, is pretty cheap performance-wise. I don't know if doing regular expressions on filenames to match the extension perfectly makes a big difference in performance, but it might!

As an example, there is currently no UI on Commons for finding files that have the .opus extension. You can kinda do it sloppily with intitle: /".opus"/, but that also matches files with weird double extensions like ".opus.ogg", which is... kinda the same thing, but maybe not what the user is looking for. So then you have to add the $ to anchor the .opus to the very end of the string.
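
The difference the $ makes can be shown with ordinary regular expressions. This sketch uses Python's `re` module, whose syntax is not identical to what the search engine's intitle: regex accepts, and the titles are made up for illustration.

```python
import re

# Hypothetical file titles, including one with a double extension.
titles = ["Birdsong.opus", "Interview.opus.ogg", "Opus 27.flac"]

unanchored = re.compile(r"\.opus")   # matches ".opus" anywhere in the title
anchored = re.compile(r"\.opus$")    # matches ".opus" only at the very end

loose = [t for t in titles if unanchored.search(t)]
strict = [t for t in titles if anchored.search(t)]

print(loose)   # titles containing ".opus" anywhere
print(strict)  # titles ending in ".opus"
```

The unanchored pattern picks up "Interview.opus.ogg" as well as "Birdsong.opus"; the anchored one keeps only "Birdsong.opus".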

So I'll leave this one to the engineers 😄 But since this is a file *type* filter and not a file *extension* filter, I'd lean towards doing it the way advanced search currently does (with filemime), and not worry about getting the extensions exactly right.

Thanks for all the info, @Ramsey-WMF! This is super helpful.

It looks like opus files will be included when searching for filemime:ogg as long as filetype:audio is not included as a filter. My proposal would be for us to use filemime just like advanced search and, if a filemime type is selected, remove the filetype filter so we can accommodate special cases like this. The filetype filter is really only needed when we want to search for all image types, all audio types, etc.

Of course, if there are a bunch of other special cases for which this won't work, we'll have to rethink things. But if it's how advanced search currently works I think we're good.

Eh, never mind, that won't work: since searching for filemime:ogg returns .ogg, .oga, and .ogv files, we need that filetype filter so we're not showing videos on the audio tab.
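
A toy model makes the problem concrete. The sketch below is not how the search backend works; it filters a hypothetical in-memory list, where `media_type` stands in for what a filetype: filter matches and `mime_subtype` for what a filemime: filter matches. All file names and field names are invented for illustration.

```python
from typing import List, NamedTuple, Optional


class MediaFile(NamedTuple):
    title: str
    media_type: str    # stand-in for what filetype: would match
    mime_subtype: str  # stand-in for what filemime: would match


# Hypothetical sample data.
FILES = [
    MediaFile("Speech.ogg", "audio", "ogg"),
    MediaFile("Chorus.oga", "audio", "ogg"),
    MediaFile("Launch.ogv", "video", "ogg"),
    MediaFile("Theme.flac", "audio", "flac"),
]


def search(files: List[MediaFile], mime_subtype: str,
           media_type: Optional[str] = None) -> List[str]:
    """Return titles matching the mime filter and, if given, the type filter."""
    return [
        f.title
        for f in files
        if f.mime_subtype == mime_subtype
        and (media_type is None or f.media_type == media_type)
    ]
```

In this model, `search(FILES, "ogg")` returns the video "Launch.ogv" alongside the two audio files, while `search(FILES, "ogg", "audio")` returns only "Speech.ogg" and "Chorus.oga", which is why both filters are kept together on the Audio tab.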

So, perhaps we should just replicate what advanced search is doing for now, and consider adding support for .opus files within this filter down the road if it's needed.

Change 621386 had a related patch set uploaded (by Anne Tomasevich; owner: Anne Tomasevich):
[mediawiki/extensions/WikibaseMediaInfo@master] Add mime type filter

https://gerrit.wikimedia.org/r/621386

Change 621386 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Add mime type filter

https://gerrit.wikimedia.org/r/621386

Checked on Commons wmf.6: everything matches the specs (T261365 was filed separately).

The file type filter should have the following options:

On the Images tab: tiff, png, gif, jpg, webp, xcf, svg

Screen Shot 2020-08-26 at 5.41.09 PM.png (628×792 px, 73 KB)

On the Audio tab: mid, flac, wav, mp3, ogg

Screen Shot 2020-08-26 at 5.40.58 PM.png (631×1 px, 383 KB)

On the Video tab: webm, mpg, ogv (this should have the display label 'ogv' but the value 'ogg')

Screen Shot 2020-08-26 at 5.41.21 PM.png (650×769 px, 197 KB)

The Other tab has not been implemented yet.