;{{int:Talk}}
* it’s somewhat trivial to implement; it just amounts to a call of something like [[:mw:Manual:Pywikibot/template.py]] on the main Talk namespace. I would do it myself if my bot still had a bot flag, but it’s easier to ask someone who just has to call the command line.
: {{comment}} So all we have to do is remove {{tl|Item documentation}} from the top of all item talk pages, like [[Talk:Q3950|this]]? --[[User:Kanashimi|Kanashimi]] ([[User talk:Kanashimi|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 21:41, 25 May 2022 (UTC)
;Request process <!-- Section for bot operator only -->
If you have a bot request, add a new section using the button and explain exactly what you want. To reduce the processing time, first discuss the legitimacy of your request with the community in the Project chat or on the WikiProject's talk page. Please refer to previous discussions justifying the task in your request.
I am wondering whether it will always be acceptable to add the IDs that are found to the item that has the original Treccani ID. It seems to me that the correspondence is good for instances of human beings, human settlements, geographical features, and biological taxa, to name a few. However, I think it would be unwise to attempt an automated bot run for countries:
I will continue to run against single items for testing purposes, and add features to the script to control it running against chosen batches of items. William Avery (talk) 22:22, 28 November 2021 (UTC)[reply]
Case 1 can be checked by looking for "+"; when present, the prefix should be compared with the relevant country calling code (P474) and, if it matches, removed.
Case 2 can be checked by looking for "(" and ")" with zeros inside; if matched, it should be removed. (A sketch of both checks follows below.)
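A minimal sketch of the two checks in Python, assuming the stored value is available as a plain string and the country calling code (P474) of the item's country (e.g. "+49") has already been looked up; the sample value is hypothetical:
<syntaxhighlight lang="python">
import re

def clean_dialing_code(value: str, country_calling_code: str) -> str:
    """Apply the two checks described above to one stored value.

    value                -- the claim value, e.g. "+49 (0)30" (hypothetical)
    country_calling_code -- the country calling code (P474) of the item's
                            country, e.g. "+49"
    """
    cleaned = value.strip()

    # Case 1: a leading "+..." prefix matching the country calling code
    # should be removed.
    if cleaned.startswith("+") and cleaned.startswith(country_calling_code):
        cleaned = cleaned[len(country_calling_code):].strip()

    # Case 2: a parenthesised group containing only zeros, e.g. "(0)",
    # should be removed.
    cleaned = re.sub(r"\(\s*0+\s*\)", "", cleaned).strip()

    return cleaned

# clean_dialing_code("+49 (0)30", "+49")  ->  "30"
</syntaxhighlight>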
I'm not sure if this is the best place to propose it, but when reviewing the URLs of a query with this script:
<syntaxhighlight lang="python">
import requests
from concurrent.futures import ThreadPoolExecutor

# Checks the link of an item, if it is down then saves it in the variable "novalid"
def check_url_item(item):
    # Some sites may return error if a browser useragent is not indicated
    useragent = 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77'
    item_url = item["url"]["value"]
    print("Checking %s" % item_url, end="\r")
    req = requests.head(item_url, headers={'User-Agent': useragent}, allow_redirects=True)
    if req.status_code == 404:
        print("The url %s in the element %s returned error" % (item_url, item["item"]["value"]))
        novalid.append(item)

base_query = """SELECT DISTINCT ?item ?url ?value
{
  %s
  BIND(IF(ISBLANK(?dbvalue), "", ?dbvalue) AS ?value)
  BIND(REPLACE(?dbvalue, '(^.*)', ?url_format) AS ?url)
}"""

union_template = """ {{
  ?item p:{0} ?statement .
  OPTIONAL {{ ?statement ps:{0} ?dbvalue }}
  wd:{0} wdt:P1630 ?url_format.
 }}"""

properties = [
    "P2942",  # Dailymotion channel
    "P6466",  # Hulu movies
    "P6467",  # Hulu series
]

# Items with links that return errors will be saved here
novalid = []

query = base_query % "\n UNION\n".join([union_template.format(prop) for prop in properties])
req = requests.get('https://query.wikidata.org/sparql',
                   params={'format': 'json', 'query': query})
data = req.json()

# Schedule and run 25 checks concurrently while iterating over items
check_pool = ThreadPoolExecutor(max_workers=25)
result = check_pool.map(check_url_item, data["results"]["bindings"])
</syntaxhighlight>
I have noticed that almost half are invalid. I do not know if in these cases it is better to delete or archive them but a bot should periodically perform this task since the catalogs of streaming services tend to be very changeable (probably many of these broken links are due to movies/series whose license was not renewed). Unfortunately I could only include Hulu and Dailymotion since the rest of the services have the following problems:
Thanks to a recent import, we currently have more than 1.2 million items where the only identifier is Freebase ID (P646). However, checking https://freebase.toolforge.org/, some of them have identifiers available there.
That would be great, I haven't seen the bot in action yet, I am still plugging away by hand as I come across them. --RAN (talk) 20:20, 28 May 2021 (UTC)[reply]
Indeed, my bot still does that (every Wednesday). In fact, it has evolved since: it also merges (seemingly) duplicate dates (that issue with -00-00 vs. -01-01 etc.). But it does not change ranks, and it even avoids statements with non-normal rank. --Matěj Suchánek (talk) 10:26, 30 May 2021 (UTC)[reply]
@Matěj Suchánek: Are you interested in picking this task up? It does kinda overlap with the task Jura mentioned. Actually, hmm, there is some subtlety here that I can see being tricky (multiple dates with different qualifiers sometimes shouldn't be merged, e.g. for start time (P580)s with an applies to part (P518)). If not, I may still do it. BrokenSegue (talk) 12:40, 30 May 2021 (UTC)[reply]
Sorry, I am not right now. I guess it's easy now that we have Ranker (Q105394978), which can be driven by SPARQL. (Or maybe not that easy if the qualifier is also required, but QS can do this part.) I made up a query which can be used as a basis.
What if the less complete date has a reference and the other does not? Preferred statements should always be sourced. If there is no evidence for the more precise date, it should be either removed or sourced (and then up-rank'd). --Matěj Suchánek (talk) 13:12, 30 May 2021 (UTC)[reply]
Excellent! I know there are several bots trying to fill in references for dates, but they are mostly pulling data from sources that give year-only dates. At one time I calculated that about 20% of year-only dates are off by a year because they are back calculated from the age at death in an obituary. --RAN (talk) 00:37, 1 June 2021 (UTC)[reply]
Do you know who is operating these bots? Wikibase in theory supports adding uncertainty in dates but in practice I believe the correct way to add a date with that kind of uncertainty is to use e.g. earliest date (P1319). BrokenSegue (talk) 01:31, 1 June 2021 (UTC)[reply]
@Vojtěch Dostál: it seems that preferred rank is also added when dates aren't in the same year. I don't think this should be done.
No, I think it was an edit of yours, but I might be mistaken. If the request is being done by Matěj, I suppose we can close this anyways. --- Jura15:00, 27 November 2021 (UTC)[reply]
Comment (translated from German): One could add that the current version can be queried quite easily and quickly via OpenRefine reconciliation or via https://d-nb.info/gnd/100045642/about/lds.ttl (gndo:preferredNameForThePerson). (User:Emu)
I tried to make a query to find refs using external-ID that are not reflected as main statement, but hit timeouts. If you uncomment the first commented-out line below, you get a timeout, even though ?statement has only one incoming link:
How can I change the grammatical features of a form? (I operate a bot; I just need to know the commands.) I have the list of lexemes. I reckon this shouldn't be too hard, I'm just not familiar with the commands to make the changes.
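One possible route, assuming the bot talks to the MediaWiki API directly: the WikibaseLexeme extension provides the wbleditformelements module, which replaces a form's representations and grammatical features in one call. The lexeme/form IDs and the feature item below are purely illustrative, and a real run needs an authenticated session; the sketch refetches the current representations so they are not lost.
<syntaxhighlight lang="python">
import json
import requests

API = "https://www.wikidata.org/w/api.php"
session = requests.Session()
# ... log in here first (action=login with a bot password, or OAuth) ...

def set_grammatical_features(lexeme_id, form_id, feature_qids):
    """Replace the grammatical features of one form, keeping its representations.

    Illustrative call: set_grammatical_features("L2087", "L2087-F1", ["Q146786"])
    (Q146786 is said to be "plural"; verify the feature items before use.)
    """
    # 1. Fetch the current form data so the representations can be resubmitted.
    lexeme = session.get(API, params={
        "action": "wbgetentities", "ids": lexeme_id, "format": "json",
    }).json()["entities"][lexeme_id]
    form = next(f for f in lexeme["forms"] if f["id"] == form_id)

    # 2. Resubmit the form with the new set of grammatical features.
    token = session.get(API, params={
        "action": "query", "meta": "tokens", "format": "json",
    }).json()["query"]["tokens"]["csrftoken"]
    return session.post(API, data={
        "action": "wbleditformelements",
        "formId": form_id,
        "data": json.dumps({
            "representations": form["representations"],
            "grammaticalFeatures": feature_qids,
        }),
        "token": token,
        "format": "json",
        "bot": 1,
    }).json()
</syntaxhighlight>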
to clarify you want all items with that description replaced with that other description? Is there discussion around this? I can do it easily but no idea if this is an "Accepted" change. BrokenSegue (talk) 19:37, 15 August 2021 (UTC)[reply]
@Takhirgeran Umar So you'd like all descriptions which have this precise string in Chechen: "куцкеп Википеди" to be replaced with this precise string: "Викимедин проектан кеп" OK? Is that so? @BrokenSegue can probably do that very easily but we need to be sure what we're doing because Google Translate isn't very useful there so we have to take your word for it. Vojtěch Dostál (talk) 20:41, 25 November 2021 (UTC)[reply]
A while back, we generated missing parts for duos. Each duo would generally have one item for each member. This finds some that lack parts. Maybe some more filtering needs to be done.
@Jura1: There was one iteration in October [2] and I've scheduled one more for November. But there is a hard constraint for the bot that it must find at least one label for both new items. It can be helped by adding more specific class items, like Special:Diff/1518556965, Special:Diff/1518552635. But this must be done by hand and the information needs to be in the Wikipedia articles.
Both samples have a Wikipedia article linked to them. I think it does make sense to create an item for each individual, but whether we have one for them together depends on Wikipedia.
Sorry for reading this too quickly and assuming you meant the other. If it helps your bot, sure. Personally I tend to add the more general "duo", but additional processing might be easier with more specific items. As "cousin" can mean many things, I generally avoid adding it to kinship to subject (P1039). --- Jura12:28, 31 October 2021 (UTC)[reply]
{{outdent}}
At Wikidata:Database reports/duos without parts, I filtered those for now. P31 with "double act" (Q1141470) isn't included yet. There are now columns that identify from P31 the subtype (sibling, couple, other) and the field (currently music only). Also, aliases are displayed. That can make it easier to create new items. Also, to complete the ones I created, I added "inferred from" "duo" (diff, query). --- Jura09:48, 9 November 2021 (UTC)[reply]
@Jura: I have just noticed there are two hierarchies for what we consider "duo":
We used to have a separate type for items that used to describe two persons without them actually actively working together ("group composed of two persons") [5].
Hi there! By all means, go right ahead if it helps 👍
I wish the bot would be a little more discerning when it comes to musical duos, though. In fact, I think we could skip creating placeholder member items for musical duos altogether; they result in way too many duplicates. Musical duo members will often already exist under their proper names ("Maxi" of "Mini & Maxi" might not go by that name in real life), or they might already be connected using member of instead of part of (much better imo). Moebeus (talk) 12:27, 27 November 2021 (UTC)[reply]
@Moebeus: The query at Wikidata:Database reports/duos without parts already skips the ones with "member of". I don't think it's a problem to have the item for the person known as "Maxi" labelled "Maxi" (this can later be edited or the real name added as an alias). If you think musical duos are already mostly covered, we can skip those. --- Jura 12:33, 27 November 2021 (UTC)[reply]
Thank you! I would be very happy and grateful if you skipped musical duos, I've spent quite a bit of time merging duplicate member items for those. (nothing crazy, but a minor annoyance). Enjoy your Saturdays, both Moebeus (talk) 12:43, 27 November 2021 (UTC)[reply]
Oh, sorry for that. Please don't hesitate to complain about it ..
About the unmerging: if we do, I suppose duos would typically have several P31, one about what links the members (married couple, siblings, etc.) and another for what they do.
I assume we rarely have items about married couples that aren't active in some field together (business, arts, etc.). An exception might be a few items for Commons like Q63195554.
Without really having studied the issue that makes a lot of sense to me, the word "duo" is sometimes confused with "duet" etc., while "group of two people" would be about as clear as it can get? Moebeus (talk) 21:48, 27 November 2021 (UTC)[reply]
complete removal would only lead to re-adding of the same statements… I've already cleaned hundreds of so-called "French" ethnic group claims, only to see them come back after months - a lot of contributors tend to use P172 instead of P27… Hsarrazin (talk) 16:43, 23 October 2021 (UTC)[reply]
If we were to make it, say, a daily job, then we would not accumulate larger amounts of unsourced claims anymore and the users who add these unsourced claims would also learn quickly to adapt to the new situation. —MisterSynergy (talk) 18:30, 23 October 2021 (UTC)[reply]
I think that unsourced statements should be deleted, since sourcing of this property is mandatory. As for Wikipedia-imported claims, it is best to keep them as deprecated, as they are the most likely to come back. Fralambert (talk) 18:51, 23 October 2021 (UTC)[reply]
disagree. this isn't a case of "claims that can't be sourced" this is a case of "claims that aren't sufficiently sourced". many of these claims probably could be correctly sourced. BrokenSegue (talk) 02:27, 30 October 2021 (UTC)[reply]
"imported from" isn't considered sourcing/proper references and for these statements is a requirement that references be added. If you think you are able to do so, please proceed. We could revisit the question in a month and clean up whatever you didn't correctly reference. --- Jura07:26, 30 October 2021 (UTC)[reply]
Apparently Wikitree has a flag "no more marriages" for this (according to User:Lesko987a), but it's generally not filled. I think that for readers this is visible by the absence of spouse unknown (even if the person has no spouse). --- Jura12:03, 27 November 2021 (UTC)[reply]
Adminbot deleting items containing only a redirect: Task description
In mid 2020 @Dexbot: (by @Ladsgroup:) was authorised to execute the following task: "Deleting items that used to have a sitelink that is deleted (and removed now) and don't have any claims or backlinks in to any items or properties."
I would propose to extend the task in the following way:
a periodical check of all items having 0 statements and 1 sitelink
This would reduce the problem of "unclassified items", an ontology issue consisting of items having no P31 and no P279. --Epìdosis 13:09, 31 October 2021 (UTC)[reply]
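For the periodic check, a minimal sketch of how the candidate set could be pulled from WDQS (in Python, as in the script further up the page); whether the single sitelink is a redirect cannot be seen in WDQS and has to be checked against the client wiki afterwards:
<syntaxhighlight lang="python">
import requests

# Candidates for the extended task: items with 0 statements and exactly
# 1 sitelink. Whether that single sitelink is a redirect is not visible in
# WDQS and must be checked against the client wiki in a second step.
QUERY = """
SELECT ?item WHERE {
  ?item wikibase:statements 0 ;
        wikibase:sitelinks 1 .
}
LIMIT 1000
"""

r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "redirect-only-items-check (example)"},
)
for row in r.json()["results"]["bindings"]:
    print(row["item"]["value"])
</syntaxhighlight>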
Adminbot deleting items containing only a redirect: Discussion
The criterium "item with one sitelink that is a redirect, and no statements" applies to ~48.900 items. This is roughly the number of items we discuss here.
We should consider keeping items where the sitelink is a redirect that carries a template from Template:Soft redirect with Wikidata item (Q16956589). While this situation probably needs some work nevertheless, I think these would at least be valuable to keep.
The ~48.900 cases are distributed across ~350 Wikimedia projects. The result is not dominated by a single wiki; enwiki has most cases (~12.900), second are ptwiki and arwiki (~2000 cases each).
Most affected items are rather old; more than 80% are more than 4 years old; this indicates that we talk about a residual import problem from earlier Wikidata times. A regular job is probably not necessary as we do not have a substantial amount of new cases.
On a side note: there are ~495.000 redirect sitelinks linked to Wikidata items in total.
This sounds good and I was considering doing something like this but I would suggest a small change. Check the history of the item and make sure it wasn't vandalized to this state. I'd suggest heuristics of the form (though anything in this direction would be fine):
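As an illustration of the kind of check meant (not the heuristics actually proposed), a minimal sketch using the plain MediaWiki API that flags items which were once much larger than they are now; the threshold and function name are assumptions:
<syntaxhighlight lang="python">
import requests

API = "https://www.wikidata.org/w/api.php"

def possibly_vandalised(qid: str, shrink_factor: float = 0.5) -> bool:
    """Rough check: was the item ever much larger (in bytes) than it is now?

    If so, content may have been blanked or vandalised away, and the item
    should go to a human review queue instead of being deleted automatically.
    The 0.5 threshold is an arbitrary illustrative value.
    """
    r = requests.get(API, params={
        "action": "query",
        "prop": "revisions",
        "titles": qid,
        "rvprop": "size",
        "rvlimit": "max",
        "format": "json",
        "formatversion": 2,
    })
    revisions = r.json()["query"]["pages"][0].get("revisions", [])
    if not revisions:
        return False
    sizes = [rev["size"] for rev in revisions]
    current, largest = sizes[0], max(sizes)  # API returns newest revision first
    return current < shrink_factor * largest
</syntaxhighlight>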
@MisterSynergy, BrokenSegue: Thank you very much! The statistics are very interesting (nearly 50k items is a bit more than I expected!) and the suggestion about vandalized items makes perfect sense. Opening a discussion in WP:PC or in some WikiProject (Ontology? something else?) is fine for me, of course. --Epìdosis 07:27, 1 November 2021 (UTC)[reply]
@MisterSynergy: It could be interesting to have a break-down of all items without statements and one sitelink by site or by age. Is this similar?
I'm not entirely convinced that the fact that items with redirects are generally older is conclusive that the problem no longer persists (an item needs to be created, exist for some time without statements being added and then the sitelink converted into a redirect). It's true that the flood account that used to create empty items in bulk has been replaced by a more efficient bot that attempts to add statements. --- Jura10:09, 1 November 2021 (UTC)[reply]
Good point. I was indeed assuming that these items have been created with already-existing redirect sitelinks, rather than with articles that were transformed into redirects later. After manually checking a few cases, the latter scenario is not that uncommon in fact. So, a somewhat regular job would be fine as well; fortunately, the queries are not too expensive to run. —MisterSynergy (talk) 14:08, 1 November 2021 (UTC)[reply]
Another factor might also be how well link updates on page moves work. Supposedly this doesn't always work well and some wikis might allow more new users to move pages.
Not really in scope of this task, but eventually some analysis on the 450000 other redirects should be done. --- Jura10:35, 2 November 2021 (UTC)[reply]
How should page moves be an issue here? The items we are discussing are pretty much empty anyways.
For the other 450.000 redirects, working redirect badges would be handy to have as they would make the situation much more accessible. Technically the badges are available, but we still cannot save redirect sitelinks and thus not use these badges.
If the redirects come from a page move, it can be that Wikidata didn't correctly update. This can be a technical problem (possibly resolved) or a user-rights problem (the user that moved the page isn't on Wikidata).
If the redirect comes from a page move, the redirect target is either not connected to Wikidata (sitelink should be updated), or connected to another item (which should be merged; or we delete the remaining empty item that only carries a redirect sitelink). There are some sanity checks to do, in order to avoid deleting something which shouldn't be deleted; this might be one aspect to look for.
The badges would be available in WDQS which would make querying much easier. Right now the only way to query redirect sitelinks is on a per-project basis via the MediaWiki SQL databases. This is in fact what I did to get the numbers initially posted here: query the ~1M items with one sitelink and no statements from WDQS once, then query each of the 900+ Wikimedia project SQL databases to provide me their redirects which are connected to Wikidata including the Wikidata item, then combine all the results with the ~1M list. Technically not overly difficult, but clearly not a straightforward way to work with redirect sitelinks.
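For illustration, the per-wiki step of that procedure could look roughly like this on Toolforge (pymysql against the wiki replicas; the host and database naming follow the usual replica conventions and should be treated as assumptions):
<syntaxhighlight lang="python">
import os
import pymysql

def redirects_with_items(dbname: str):
    """Return (page_title, item_id) rows for redirect pages of one wiki that
    are connected to a Wikidata item, read from the Toolforge wiki replicas."""
    conn = pymysql.connect(
        host=f"{dbname}.analytics.db.svc.wikimedia.cloud",
        database=f"{dbname}_p",
        read_default_file=os.path.expanduser("~/replica.my.cnf"),
        charset="utf8mb4",
    )
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT page_title, pp_value
            FROM page
            JOIN page_props ON pp_page = page_id
            WHERE page_is_redirect = 1
              AND pp_propname = 'wikibase_item'
            """
        )
        return cur.fetchall()

# e.g. redirects_with_items("enwiki"); intersecting the returned item IDs with
# the WDQS list of items having one sitelink and no statements gives the
# per-wiki share of the cases discussed here.
</syntaxhighlight>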
Just out of curiosity: Would you have a subset of items to check, ideally of different ages/with sitelinks to different wikis. BTW I'm fine with the task as outlined. --- Jura21:31, 2 November 2021 (UTC)[reply]
Not sure what exactly you want to have checked. The full list of all cases is available here for a while (tab separated txt, columns "wiki, qid, redirect page title"). You can either check this by yourself, or describe more precisely what you want to see and I am going to figure out whether this is feasible. —MisterSynergy (talk) 22:47, 2 November 2021 (UTC)[reply]
Other than that, there were no pagemoves in the ones I checked. All were pages converted into redirects at some point. --- Jura09:56, 3 November 2021 (UTC)[reply]
No opinion yet, since I was not able to look at this in more detail until now. Generally, I don't think we need to tag these items in order to prepare them for further processing. We do have a list (and can regenerate it easily at any time using e.g. this script). What we need is a robust strategy how an automated process can make a decision how to treat a given item on the fly. If that was possible, we could simply let it run over the known QIDs. It would also be okay to do this incrementally, e.g. only identify merge candidates and merge them, in order to reduce the dataset step by step for further inspection. I simply need some spare time to have a closer look. —MisterSynergy (talk) 14:10, 2 December 2021 (UTC)[reply]
Noclaimsbot has a process for handling items without claims based on templates on Wikipedia. It checks the articles linked to the newest 1000 items for templates and adds corresponding statements. For some wikis, this covers almost all items without statements (nlwiki, dewiki); for others it is still far off (enwiki). frwiki is getting closer. Both pipelines are currently clogged by such redirects. I doubt the added value of merging such items. Can users really be certain that they get what they may have been expecting, and why would they have used the empty item? I think this is different from Hoo Bot's past operation (cleanup of history prior to redirection of duplicates). --- Jura 14:26, 2 December 2021 (UTC)[reply]
@Jura1: Redirects created by moves can be safely merged (by bot); those created by merges are usually separate valid topics and should neither be merged nor deleted.--GZWDer (talk) 13:34, 2 December 2021 (UTC)[reply]
Many seem to have been created by GZWDer themselves. I don't understand why it would be for other users to do so if no use has been found for the items since. --- Jura 13:36, 2 December 2021 (UTC)[reply]
Previously, when someone asked how to deal with items whose only sitelink is a redirect, I proposed to handle them via a bot similar to Hoo Bot. This proposal did not receive much positive feedback (I cannot find the discussion, though). Such things will happen as long as we are still creating items from unlinked Wikipedia pages (no matter whether manually, semi-automatically or fully automatically).--GZWDer (talk) 13:44, 2 December 2021 (UTC)[reply]
Are you at least dealing with the items you created? (Other than telling people what they should do with them.) It appears that we have to deal with piles of items that neither you nor anybody else found useful over the last 5 or so years. --- Jura 13:54, 2 December 2021 (UTC)[reply]
As a compromise, I propose another solution: blank those items and redirect them to the target item. No data are actually merged and others may restore them if they want to work on these. The only disadvantage I can imagine is they will confuse external users who are using such QIDs, but the negative impact may be minimal as there are no meaningful data whatsoever in those items.--GZWDer (talk) 16:00, 2 December 2021 (UTC)[reply]
actually upon further thought I think redirecting might be problematic. we probably do not want to copy over the label as an alias (like the normal merge code does). lots of redirects are for things that really aren't the same as the thing they redirect to. BrokenSegue (talk) 03:24, 3 December 2021 (UTC)[reply]
Note: Aging is by age of item, not age of redirect or Wikipedia page creation. Wikidata started in 2012 and items for many older pages were created in 2013.
(5) Top ten wikis
{| class="wikitable"
! Wiki !! Items
|-
| enwiki || 12927
|-
| ptwiki || 2026
|-
| arwiki || 2011
|-
| jawiki || 1735
|-
| kowiki || 1602
|-
| hiwiki || 1472
|-
| zhwiki || 1463
|-
| tlwiki || 1444
|-
| frwiki || 1218
|-
| ukwiki || 1197
|-
! Total !! 27095
|}
request to add English descriptions to railway lines and stations (2021-11-10)
Could anyone create a bot that runs permanently and adds English descriptions to railway lines and stations where absent? There's a bot (User:Edoderoobot/Set-nl-description) that adds Dutch descriptions not only to railway lines and stations but to other types of objects. However, the bot owner does not like the idea of complicating the script code to include other languages (and I cannot blame them for that :) ).
The simplest task that I request could be as follows:
It's a pity that the descriptioner tool died in the past year, as such queries could easily be fulfilled by that great tool.
If this is still open next week, I'll write a small script to create those descriptions in English. Edoderoo (talk) 12:17, 17 November 2021 (UTC)[reply]
This script acts like a small descriptioner tool. Whoever knows how to run something on PAWS can benefit from it. For the railway lines it is running now; I'll do the others later on too. Edoderoo (talk) 20:14, 27 November 2021 (UTC)[reply]
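I don't know what the PAWS script above looks like, but a minimal pywikibot sketch of the general idea, assuming a fixed class (railway station, Q55488) and a fixed description string, could be along these lines; railway lines would work the same way with another class and text:
<syntaxhighlight lang="python">
import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

# Railway stations (Q55488) without an English description.
QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q55488 .
  FILTER NOT EXISTS {
    ?item schema:description ?d .
    FILTER(LANG(?d) = "en")
  }
}
LIMIT 200
"""

for item in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=repo):
    item.get()
    if "en" in item.descriptions:
        continue  # re-check, the query result may be stale
    item.editDescriptions({"en": "railway station"},
                          summary="add missing English description")
</syntaxhighlight>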
Follow page moves from a wiki (e.g. enwiki) and update sitelinks on Wikidata. See Project chat discussion above. Apparently a bot already does that for dewiki.
When those items were created, the pywikibot-framework forced this to have a value. I also recently figured out that this requirement was dropped (and that is good!).
I will pick this one up in the coming days/week, unless someone else already fixed it... Edoderoo (talk) 12:13, 17 November 2021 (UTC)[reply]
We cannot just clean up all of them without additional manual research; often the values are rather useless, but in many cases they were put there on purpose, and a bot can't tell the difference between the two. Edoderoo (talk) 21:56, 19 November 2021 (UTC)[reply]
For paintings I do not see an issue, as the dimensions of a frame will be measured pretty precisely, even when they are in mm. For all other items we first need to check ... maybe you can fix those too, but not all of them. Edoderoo (talk) 14:37, 23 November 2021 (UTC)[reply]
Support as one of the editors who has done work manually matching this catalog. I have several comments:
Nomenclature is a bilingual database. The labels should be imported in both English and French. Some items have Canadian French and Canadian English labels as well; these should be added as aliases if practical. This is one reason why importing directly from MnM is not my preferred solution.
Nomenclature includes "non-preferred terms" (see example). These can be imported as aliases if practical.
The Getty Art & Architecture link (in "other references to this object" on the user interface) can be used to prevent duplicate entries. If the AAT ID does not exist in Wikidata, it should be added to the new item. If the AAT ID does exist in Wikidata, the Nomenclature ID can be added to the existing item.
Nomenclature has some blank subclasses (example) which need to be skipped.
Thanks for the support and the excellent suggestion! We'll be publishing NOM as RDF entities in SKOS/SKOSXL soon, until then it's available as big RDF dumps, and I can make any tabular export desired (eg with the 4 languages dispatched to separate columns). Pinging @Crowjane7: who's one of the main editors --Vladimir Alexiev (talk) 06:37, 23 November 2021 (UTC)[reply]
I tried to figure out who created them and found some created by QuickStatementsBot [9] without any indication of the user who requested it. Similarly by Reinheitsgebot (without any indication of a MxM catalogue). Further by IPs who didn't add additional statements .. so some cleanup would probably help. --- Jura15:50, 27 November 2021 (UTC)[reply]
A true duplicate is an item with a sitelink to the same wiki page as another item. This should be impossible, but for technical reasons it happens.
Generally one of them isn't editable and remains without any statements.
Ah, wait, hold your horses ;-) There *are* Wikidata items already, but they are connected the old-fashioned (pre-Wikidata/2013) way. That indeed requires some scripting. Edoderoo (talk) 20:02, 28 November 2021 (UTC)[reply]
Hi, Thousands of files from the Cleveland Museum of Art were uploaded by Madreiling (not active since August 2019), and Wikidata items created for them. However,
The Wikidata item is not added in the Artwork template;
The Wikidata item is incomplete (creator not mentioned, medium and size missing, etc.);
Just to be clear, is the issue here specific to the way the CMA items were added to Wikidata, or is it about how User:BotMultichill uploads files found in Commons compatible image available at URL (P4765)? Many of the CMA images (at least when I wrote the Wikidata bot for them in 2020) were not uploaded by them at all, but by Multichill's bot that uploads anything using that property according to its own standard format. Dominic (talk) 00:17, 7 December 2021 (UTC)[reply]
There seem to be plenty of items for plwikisource subpages without any claims (60000+).
Similar to Q108957866 (and others from that work), one could at least add a published in (P1433) statement pointing to the item of the parent page; for that sample, that is Q108959596. (A sketch of this step follows below.)
If the applicable instance of (P31) and/or other statements can be determined, please add that too. Q108957866 is from a bilingual work, so I could figure it out myself.
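A minimal pywikibot sketch of the published in (P1433) part, assuming the item's plwikisource sitelink is a subpage like "Work/Chapter" and the parent page is itself connected to an item; determining the applicable P31 is left out:
<syntaxhighlight lang="python">
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata")
plwikisource = pywikibot.Site("pl", "wikisource")

def add_published_in(item_id: str) -> None:
    """Add published in (P1433) pointing to the item of the parent subpage."""
    item = pywikibot.ItemPage(repo, item_id)
    item.get()
    if "plwikisource" not in item.sitelinks or "P1433" in item.claims:
        return
    title = item.getSitelink("plwikisource")
    if "/" not in title:
        return  # not a subpage
    parent = pywikibot.Page(plwikisource, title.rsplit("/", 1)[0])
    try:
        parent_item = pywikibot.ItemPage.fromPage(parent)
    except pywikibot.exceptions.NoPageError:
        return  # parent page is not connected to any item
    claim = pywikibot.Claim(repo, "P1433")  # published in
    claim.setTarget(parent_item)
    item.addClaim(claim, summary="published in: item of the parent plwikisource page")
</syntaxhighlight>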
Desc: Recently we at viwiki moved a bunch of categories (for instance, Thể loại:Sinh 1980 to Thể loại:Sinh năm 1980) using a bot, but since it doesn't have a WD account these items didn't automatically update. Please update them.
Licence of data to import (if relevant)
Discussion
It is not clear what you expect us to do. I see that both your example categories still exist, and both even have items in them.
The second category does not (yet) have a Wikidata item, so do you want to move those, leaving the old/first-mentioned category without a WD item?
Or is it something completely different? Edoderoo (talk) 10:24, 15 December 2021 (UTC)[reply]
@Edoderoo: The old links are now redirects, so there's no need to keep them here. Please update items that contain Thể loại:(Sinh|Mất) \d{1,4} so that they match new links (Thể loại:(Sinh|Mất) năm \d{1,4}). NguoiDungKhongDinhDanh12:18, 15 December 2021 (UTC)[reply]
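A minimal pywikibot sketch of such an update, assuming the affected item IDs are collected separately (e.g. from the move log); the regex is the one given above, and the item ID in the usage note is hypothetical:
<syntaxhighlight lang="python">
import re
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata")
PATTERN = re.compile(r"^Thể loại:(Sinh|Mất) (\d{1,4})$")

def update_viwiki_sitelink(item_id: str) -> None:
    """Point the viwiki sitelink of one item at the renamed category."""
    item = pywikibot.ItemPage(repo, item_id)
    item.get()
    if "viwiki" not in item.sitelinks:
        return
    old_title = item.getSitelink("viwiki")
    m = PATTERN.match(old_title)
    if not m:
        return
    new_title = f"Thể loại:{m.group(1)} năm {m.group(2)}"
    # If the new title is already linked from another item, the edit will fail
    # and that pair should be reviewed (or merged) manually.
    item.setSitelink({"site": "viwiki", "title": new_title},
                     summary="update sitelink after category rename on viwiki")

# Usage on one (hypothetical) item id:
# update_viwiki_sitelink("Q8542761")
</syntaxhighlight>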
Request process
request to delete statements and sitelinks and merge items: dewiki duplicates (2021-12-16)
I've used P854 to reference Hungarian premiere dates of films from the Excel file maintained by the regulatory authorities, and these references occasionally appear in infoboxes at huwiki. In the meantime, an item has been created for the file which provides more detailed information, allowing for better formatted references over at huwiki. Máté (talk) 08:37, 29 December 2021 (UTC)[reply]
Yes, Czech would be another language where this is useful (to me), but maybe we should discuss each language separately as the forms to add and the source to use can vary. --- Jura12:57, 4 February 2022 (UTC)[reply]
It would seem quite weird in Czech to have as many as 7 cases for all places in aliases. Aliases are considered to be some sort of synonyms, which grammatical cases are not, and they look funny when understood as such, for example at https://reasonator.toolforge.org/?&q=994271. Plus, it would introduce further complexity to searches, because some grammatical cases are spelled exactly the same as the primary versions of different places (e.g. Lhotky (Q2041531) vs "Lhotky" as a grammatical case of Lhotka (Q1471978)).
I don't think it is a good idea. Maybe this can help non-speakers, but for speakers it sounds weird. Much better would be to create lexemes for these names and link the lexeme with the name of the place. But I don't know if this solution is usable for searching. JAn Dudík (talk) 10:20, 5 February 2022 (UTC)[reply]
The idea isn't to add all cases, but the most frequently used ones. For Polish, it should probably include the form for locative case (see w:Locative_case#Polish).
I do think it's important to find Lhotka (Q1471978) if that is sometimes referred to as "Lhotky". What would be the benefit of only finding Lhotky (Q2041531)? Checking another database first isn't really practical.
As for all aliases, it's good to eventually add them in a structured form elsewhere too, but that's a different use case. --- Jura 10:26, 5 February 2022 (UTC)[reply]
Reconciliation in OpenRefine works on both labels and aliases. Therefore, addition of ambiguous grammatical cases will lead to OpenRefine suggesting a lot of unrelated names unnecessarily. Vojtěch Dostál (talk) 16:16, 5 February 2022 (UTC)[reply]
Generally speaking (i.e. without having any specific language including Polish in mind), I consider this a very bad idea. 1) It would be semantically very confusing, as the meaning of "alias" does not include "declension". 2) If the idea is to enable adding only some cases, it has to be defined what "most frequently used" means. To avoid even more chaos in Wikidata, it needs to be defined generally, not only for Polish, as other users may quickly follow the Polish example. 3) Agree with the above-mentioned "Lhotky" problem. 4) Once users start finding declensions in aliases of place names, some will surely start adding them to other kinds of items like personal names and others, opening the door for even bigger confusion and chaos: searching for Janov (Italian city) in the Slovak language might also lead to the personal name Jan (Slovak possessive: Janov).
What I would support would be adding another separate and semantically different column containing declensions into the table next to aliases, and allowing people to include them in or exclude them from their search. Another possibility would be linking the items to lexemes, as suggested above. It should not be impossible to make the search engine use information from the linked lexemes, if the user desires it. --Jan Kameníček (talk) 18:35, 5 February 2022 (UTC)[reply]
Can you explain what you mean by "Agree with the above-mentioned "Lhotky" problem"? Do you see a benefit in not finding the place? --- Jura 18:39, 5 February 2022 (UTC)[reply]
Comment Interesting points to ponder and take into account. I obviously agree with the main point: Czech declensions are less useful for Czech speakers than for others. Not quite sure what to make of problems one or the other search system may or may not have with them. I don't think we'd delete the given name "Rome" because of the city "Rome". A key point of Wikidata is that items are ambiguous and differentiated by descriptions and statements. --- Jura 09:50, 7 February 2022 (UTC)[reply]
Request process
Request to fix Spanish labels wrongly copied from English labels (2022-02-09)
In a relevant number of cases I have come across items with the following problem in their "es" labels: in 2013 KLBot2 copied the "en" label into the "es" label for people, sometimes incorrectly (e.g. for noble people). My proposal is: 1) find all items with instance of (P31) human (Q5) where the "en" and "es" labels are identical and which contain a sitelink to es.wikipedia; 2) check if the title of the es.wikipedia article, without any part in parentheses, corresponds to the "es" label; 3) if they are different, use the title of the es.wikipedia article as the "es" label and remove it from the "es" aliases if already present (e.g.). --Epìdosis 18:33, 9 February 2022 (UTC)[reply]
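A minimal pywikibot sketch of the per-item part of this proposal, assuming the candidate items (identical "en"/"es" labels, instance of human, sitelink to eswiki) are collected via SPARQL or a dump scan first:
<syntaxhighlight lang="python">
import re
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata")

def fix_es_label(item_id: str) -> None:
    """Replace an "es" label copied from English with the eswiki article title."""
    item = pywikibot.ItemPage(repo, item_id)
    item.get()
    if item.labels.get("en") != item.labels.get("es"):
        return
    if "eswiki" not in item.sitelinks:
        return
    title = item.getSitelink("eswiki")
    # Drop a trailing disambiguator, e.g. "Juan Pérez (militar)" -> "Juan Pérez".
    new_label = re.sub(r"\s*\([^)]*\)\s*$", "", title).strip()
    if not new_label or new_label == item.labels.get("es"):
        return
    aliases = [a for a in item.aliases.get("es", []) if a != new_label]
    item.editEntity(
        {"labels": {"es": new_label}, "aliases": {"es": aliases}},
        summary="fix es label wrongly copied from en; use eswiki title",
    )
</syntaxhighlight>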
Hi, I'm noticing many inconsistencies on wikidata on cities and countries. Here are some of them:
1) https://www.wikidata.org/wiki/Talk:Q36678, many cities have as a "country" tag an object which is not a country. Many times this is due to territorial conflicts where people put the region (like "West Bank" here) as the country instead of a real country. This is true for many places; for example, Q4508661 includes "Louhansk" as a country.
2) Almost all cities in Romania are duplicated: Q100188/Q16898189, Q16898582/Q83404, Q2716722/Q16426101 etc.
Task description
1) A bot to remove all country (P17) statements linking to an entity which is not a country (a query sketch for finding such statements follows after this list).
2) A bot to remove all pages where cities have the same name/code in Romania. Another idea could be to revert everything @JWbot did as it is the bot which created these pages without paying attention.
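For task 1, a sketch of a query (run through Python, like the script further up the page) that lists country (P17) values which are not instances of country (Q6256) or sovereign state (Q3624078); over the full dataset this would need batching to avoid WDQS timeouts, and the results should be reviewed rather than removed blindly:
<syntaxhighlight lang="python">
import requests

# Statements where the country (P17) value is not an instance of a (subclass
# of) country (Q6256) or sovereign state (Q3624078). Over all items this will
# hit WDQS timeouts, so a real run would restrict the query (e.g. per region
# or per class of subject) and page through the results.
QUERY = """
SELECT ?place ?countryValue WHERE {
  ?place wdt:P17 ?countryValue .
  FILTER NOT EXISTS { ?countryValue wdt:P31/wdt:P279* wd:Q6256 . }
  FILTER NOT EXISTS { ?countryValue wdt:P31/wdt:P279* wd:Q3624078 . }
}
LIMIT 100
"""

r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "p17-review-sketch (example)"},
)
for row in r.json()["results"]["bindings"]:
    print(row["place"]["value"], "->", row["countryValue"]["value"])
</syntaxhighlight>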
There is a category in trwiki named "Taxonbars that need a Wikidata item": tr:Kategori:Vikiveri nesnesine ihtiyaç duyan taksonçubukları. I think the articles listed already have items created via other Wikipedias, such as svwiki. Can anyone link them with a bot?
C-SPAN person ID (P2190) is transitioning to a numeric format for reliability of linking, since the string format has been found to break: when C-SPAN changes the string, it doesn't always redirect. See the property discussion, as well as Project chat. Coordinated updates to templates using this property have been notified on Wikipedia for update and cleanup.
For the entries added prior to 26 Feb 2022 all matched numeric formats have been uploaded. Strings added after that date have not been checked.
I would like to feed Wikidata with Offizielle Deutsche Charts album ID (P10262) of all the albums on the website offiziellecharts.de/album-details-$1 (examples).
Problem: The names of the artist ("Interpret") and the album are in heading 1 (h1) and heading 2 (h2) of the HTML source code (example), and I don't know how to crawl this data.
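Assuming crawling is acceptable at all (see the concern below), pulling the two headings out of a page is straightforward with requests and BeautifulSoup; the URL pattern follows the formatter URL above, and the h1/h2 mapping is taken from the description:
<syntaxhighlight lang="python">
import requests
from bs4 import BeautifulSoup

def artist_and_album(album_id: str):
    """Fetch one album page and return the texts of its first h1/h2 headings.

    Assumption: the artist name is in the first <h1> and the album title in
    the first <h2>, as described above; a polite User-Agent and rate limiting
    would be needed for a real crawl.
    """
    url = f"https://www.offiziellecharts.de/album-details-{album_id}"
    html = requests.get(url, headers={"User-Agent": "P10262-import-sketch"}).text
    soup = BeautifulSoup(html, "html.parser")
    h1 = soup.find("h1")
    h2 = soup.find("h2")
    return (h1.get_text(strip=True) if h1 else None,
            h2.get_text(strip=True) if h2 else None)
</syntaxhighlight>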
Hello @Bigbossfarin, I'm not sure offiziellecharts.de would really appreciate having their whole website crawled. And I don't know if the license is OK with adding the data to Wikidata. Myst (talk) 19:29, 24 March 2022 (UTC)[reply]
@Epìdosis I'm so sorry but complicated duplicate clean ups require dedicated time for coding which I really don't have among this and a million other volunteer responsibilities :( Amir (talk) 03:24, 8 May 2022 (UTC)[reply]
This query returns all statements containing such references:
[24] I would like these references to be removed. Thanks — Martin (MSGJ · talk) 12:24, 13 May 2022 (UTC)[reply]
it’s of course required to remove the ~4000 template calls to {{tl|Item documentation}} on item talk pages, as they would be redundant. Deleting them altogether should be fine.
Licence of data to import (if relevant)
Discussion
it’s somewhat trivial to implement; it just amounts to a call of something like mw:Manual:Pywikibot/template.py on the main Talk namespace. I would do it myself if my bot still had a bot flag, but it’s easier to ask someone who just has to call the command line.