Auto-addition of inferrable categories


User:SchlurcherBot is great. Could you change it, or fork a complementary bot, to add categories that are inferrable from the structured data that this bot adds?

It already reads and parses the date field, so it would be nice if it then added "Videos of 2023" when that is the year the bot writes into the "inception" field. The file could also be put in a hidden subcat like "Uncategorized videos of 2023" so people can check these.

Likewise, it already writes the display resolution to the SD but does not add the respective Category:Videos by display resolution subcat. Unless nearly all videos are in there, I don't see how this category (and its subcats) could be useful. If it were added to videos, one could use it for statistics, petscans and maybe other things. The same goes for the WebM videos category, which is currently up for deletion. Most WebM videos are missing from it, so the category is essentially useless. (Note that these two are exceptional cases; most WMC categories are useful.) If files were in there, one could for example use this as a workaround to find videos in petscan, which currently can't filter for videos except when combined with the Category:Videos by file format cat. The bot could also populate Category:4K videos (this isn't just about adding these when writing the SD).
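To illustrate the kind of inference meant here, a minimal sketch in Python follows. It is not SchlurcherBot's actual code: the function names, the Wikibase-style date string, and the 3840×2160 threshold for Category:4K videos are all assumptions made for illustration.

```python
import re
from typing import List, Optional


def year_category(inception: str, hidden: bool = False) -> Optional[str]:
    """Build a year category from an ISO-like date such as '+2023-07-17T00:00:00Z'.

    Returns None when no four-digit year can be read from the value.
    """
    match = re.match(r"^\+?(\d{4})-", inception)
    if not match:
        return None
    # The hidden variant follows the "Uncategorized videos of 2023" idea above.
    prefix = "Uncategorized videos" if hidden else "Videos"
    return f"Category:{prefix} of {match.group(1)}"


def infer_categories(inception: str, width: int, height: int,
                     hidden_year: bool = False) -> List[str]:
    """Infer categories from values the bot already writes to structured data."""
    categories = []
    year_cat = year_category(inception, hidden_year)
    if year_cat:
        categories.append(year_cat)
    # Assumed threshold: the longer side is at least 3840 px and the shorter
    # side at least 2160 px, so portrait videos are covered as well.
    if max(width, height) >= 3840 and min(width, height) >= 2160:
        categories.append("Category:4K videos")
    return categories


print(infer_categories("+2023-07-17T00:00:00Z", 3840, 2160))
```

A real bot would also have to check that the file is not already in the category (or one of its subcats) before editing.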

I think you may be best placed for the task, as the bot already adds relevant structured data, so it can clearly read this data and is quite good at adding it. One could also add further inferrable cats like Videos without audio, Black and white videos (if it can read the content to some degree), or "Videos of 2024 from the United States" (depending on the license tag or other categories of the file) and so on. It would be a big endeavour, but adding categories that are inferrable from other categories of a file would be very useful. Please let me know if you have any thoughts on that.
This would go further than the original scope asked about above, but it would be very useful, and getting a bot working on more such tasks may be the most straightforward way to achieve it. One example is that a video in a subcategory of Category:Black and white films should go into Category:Black and white videos; another is that a file in Category:Animals in water should be moved to Category:Elephants in water if it is also in a subcat of elephants. There can be rare exceptions, but having things auto-categorized with occasional errors would be better than having things missing and requiring lots of manual maintenance/subcategorization. There would also be ways to deal with the errors: for example, for video files in Category:Short films the bot could create a 'suggestion' to add Category:Short films videos, and files could be moved out of ill-inferred categories, where usually another cat of the file is false.
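The elephants example could be sketched roughly as follows. This is only an illustration under assumptions: the toy category tree, the rule format and the function names are hypothetical, and a real bot would read the category tree from the wiki rather than from a hard-coded dict.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy category tree: child category -> its parent categories.
PARENTS: Dict[str, Set[str]] = {
    "African elephants": {"Elephants"},
    "Elephants": {"Animals"},
}

# Hypothetical rule format: (broad category, topic category, target category).
RULES: List[Tuple[str, str, str]] = [
    ("Animals in water", "Elephants", "Elephants in water"),
]


def ancestors(category: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every category above `category`, tolerating cycles in the tree."""
    seen: Set[str] = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def suggest_moves(file_cats: Set[str], rules: List[Tuple[str, str, str]],
                  parents: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return (from, to) suggestions instead of editing the file directly."""
    suggestions = []
    for broad, topic, target in rules:
        if broad not in file_cats or target in file_cats:
            continue
        # The file counts as being "about" the topic if any of its categories
        # is the topic itself or sits somewhere below it.
        if any(c == topic or topic in ancestors(c, parents) for c in file_cats):
            suggestions.append((broad, target))
    return suggestions


print(suggest_moves({"Animals in water", "African elephants"}, RULES, PARENTS))
```

Returning suggestions rather than making the move leaves room for a human check in the rare exceptional cases.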
On a related note, there is often false data in structured data, but unlike with categories, people usually don't see and correct it. And even where depicts data is set and that data is true, the key things depicted may not be included. So syncing depicts (/main subject?) statements with categories is something that I think is much needed if the "depicts" SD is ever to be useful (rather than a duplicative time/effort sink), which I'm not sure of at all.

Maybe focusing this discussion on inferrable technical-criteria cats for videos would be best for now. Here I suggested that video2commons adds as many of these as possible right away when uploading, such as "Videos with English subtitles" if it imports en subtitles. It is similar for GIFs, where Category:Animated GIF files is missing on many files. You could ignore the above indented paragraph. Could you change the bot so that it adds inferrable categories about technical criteria to new videos that it adds SD to, and maybe later expand that to e.g. a new bot that goes through files retrospectively? Thanks a lot for your efforts developing this bot! Prototyperspective (talk) 22:36, 17 July 2024 (UTC)

Special:Diff/900916198


Hi, SchlurcherBot produced this diff. That is all well and good. But what happens when the graphic is overwritten with an improved version? Then all the statements in this diff are wrong at a stroke. Maybe SchlurcherBot will come by again at some point and make an update, but in the meantime we would have these false statements standing the whole time? I don't think that is good: false statements simply do not belong in Wikipedia, Commons and Wikidata, and we should filter them out before we even make the edit. What is your opinion on this? Would an RC bot perhaps be better suited to this task? Kind regards, – Doc Taxon (Talk) 12:51, 21 July 2024 (UTC)