;{{int:Talk}}
* it’s somewhat trivial to implement; it just amounts to a call of something like [[:mw:Manual:Pywikibot/template.py]] on the main Talk namespace. I would do it myself if my bot still had a bot flag, but it’s easier to ask someone who just has to call the command line.
: {{comment}} So all we have to do is remove {{tl|Item documentation}} from the top of all item talk pages, like [[Talk:Q3950|this]]? --[[User:Kanashimi|Kanashimi]] ([[User talk:Kanashimi|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 21:41, 25 May 2022 (UTC)
;Request process <!-- Section for bot operator only -->
If you have a bot request, add a new section using the button and explain exactly what you want. To reduce the processing time, first discuss the legitimacy of your request with the community in the Project chat or on the WikiProject's talk page. Please refer to previous discussions justifying the task in your request.
I am wondering whether it will always be acceptable to add the IDs that are found to the item that has the original Treccani ID. It seems to me that the correspondence is good for instances of human beings, human settlements, geographical features, and biological taxa, to name a few. However, I think it would be unwise to attempt an automated bot run for countries:
I will continue to run against single items for testing purposes, and add features to the script to control it running against chosen batches of items. William Avery (talk) 22:22, 28 November 2021 (UTC)[reply]
Case 1 can be checked by looking for "+"; when present, the prefix should be compared with the relevant country calling code (P474) and, if it matches, removed.
Case 2 can be checked by looking for "(" and ")" with zeros inside; if matched, it should be removed. (A sketch of both checks follows below.)
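A minimal sketch of the two checks in Python, assuming the stored value is available as a plain string and the country calling code (P474) of the item's country (e.g. "+49") has already been looked up; the sample value is hypothetical:
<syntaxhighlight lang="python">
import re

def clean_dialing_code(value: str, country_calling_code: str) -> str:
    """Apply the two checks described above to one stored value.

    value                -- the claim value, e.g. "+49 (0)30" (hypothetical)
    country_calling_code -- the country calling code (P474) of the item's
                            country, e.g. "+49"
    """
    cleaned = value.strip()

    # Case 1: a leading "+..." prefix matching the country calling code
    # should be removed.
    if cleaned.startswith("+") and cleaned.startswith(country_calling_code):
        cleaned = cleaned[len(country_calling_code):].strip()

    # Case 2: a parenthesised group containing only zeros, e.g. "(0)",
    # should be removed.
    cleaned = re.sub(r"\(\s*0+\s*\)", "", cleaned).strip()

    return cleaned

# clean_dialing_code("+49 (0)30", "+49")  ->  "30"
</syntaxhighlight>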
I'm not sure if this is the best place to propose it, but when reviewing the URLs of a query with this script:
<syntaxhighlight lang="python">
import requests
from concurrent.futures import ThreadPoolExecutor

# Checks the link of an item, if it is down then saves it in the variable "novalid"
def check_url_item(item):
    # Some sites may return error if a browser useragent is not indicated
    useragent = 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77'
    item_url = item["url"]["value"]
    print("Checking %s" % item_url, end="\r")
    req = requests.head(item_url, headers={'User-Agent': useragent}, allow_redirects=True)
    if req.status_code == 404:
        print("The url %s in the element %s returned error" % (item_url, item["item"]["value"]))
        novalid.append(item)

base_query = """SELECT DISTINCT ?item ?url ?value
{
  %s
  BIND(IF(ISBLANK(?dbvalue), "", ?dbvalue) AS ?value)
  BIND(REPLACE(?dbvalue, '(^.*)', ?url_format) AS ?url)
}"""

union_template = """ {{
  ?item p:{0} ?statement .
  OPTIONAL {{ ?statement ps:{0} ?dbvalue }}
  wd:{0} wdt:P1630 ?url_format.
 }}"""

properties = [
    "P2942",  # Dailymotion channel
    "P6466",  # Hulu movies
    "P6467",  # Hulu series
]

# Items with links that return errors will be saved here
novalid = []

query = base_query % "\n UNION\n".join([union_template.format(prop) for prop in properties])
req = requests.get('https://query.wikidata.org/sparql',
                   params={'format': 'json', 'query': query})
data = req.json()

# Schedule and run 25 checks concurrently while iterating over items
check_pool = ThreadPoolExecutor(max_workers=25)
result = check_pool.map(check_url_item, data["results"]["bindings"])
</syntaxhighlight>
I have noticed that almost half are invalid. I do not know if in these cases it is better to delete or archive them but a bot should periodically perform this task since the catalogs of streaming services tend to be very changeable (probably many of these broken links are due to movies/series whose license was not renewed). Unfortunately I could only include Hulu and Dailymotion since the rest of the services have the following problems:
Thanks to a recent import, we currently have more than 1.2 million items where the only identifier is Freebase ID (P646). However, checking https://freebase.toolforge.org/, some of them have identifiers available there.
That would be great, I haven't seen the bot in action yet, I am still plugging away by hand as I come across them. --RAN (talk) 20:20, 28 May 2021 (UTC)[reply]
Indeed, my bot still does that (every Wednesday). In fact, it has evolved since: it also merges (seemingly) duplicate dates (that issue with -00-00 vs. -01-01 etc.). But it does not change ranks, and it even avoids statements with non-normal rank. --Matěj Suchánek (talk) 10:26, 30 May 2021 (UTC)[reply]
@Matěj Suchánek: Are you interested in picking this task up? It does kinda overlap with the task Jura mentioned. Actually, hmm, there is some subtlety here that I can see being tricky (multiple dates with different qualifiers sometimes shouldn't be merged, e.g. for start time (P580)s with an applies to part (P518)). If not, I may still do it. BrokenSegue (talk) 12:40, 30 May 2021 (UTC)[reply]
Sorry, I am not right now. I guess it's easy now that we have Ranker (Q105394978), which can be driven by SPARQL. (Or maybe not that easy if the qualifier is also required, but QS can do this part.) I made up a query which can be used as a basis.
What if the less complete date has a reference and the other does not? Preferred statements should always be sourced. If there is no evidence for the more precise date, it should be either removed or sourced (and then up-rank'd). --Matěj Suchánek (talk) 13:12, 30 May 2021 (UTC)[reply]
Excellent! I know there are several bots trying to fill in references for dates, but they are mostly pulling data from sources that give year-only dates. At one time I calculated that about 20% of year-only dates are off by a year because they are back calculated from the age at death in an obituary. --RAN (talk) 00:37, 1 June 2021 (UTC)[reply]
Do you know who is operating these bots? Wikibase in theory supports adding uncertainty in dates but in practice I believe the correct way to add a date with that kind of uncertainty is to use e.g. earliest date (P1319). BrokenSegue (talk) 01:31, 1 June 2021 (UTC)[reply]
@Vojtěch Dostál: it seems that preferred rank is also added when dates aren't in the same year. I don't think this should be done.
No, I think it was an edit of yours, but I might be mistaken. If the request is being done by Matěj, I suppose we can close this anyways. --- Jura15:00, 27 November 2021 (UTC)[reply]
Comment (translated from German): One could add that the current version can be queried quite easily and quickly via OpenRefine reconciliation or via https://d-nb.info/gnd/100045642/about/lds.ttl (gndo:preferredNameForThePerson). (User:Emu)
I tried to make a query to find refs using external-ID that are not reflected as main statement, but hit timeouts. If you uncomment the first commented-out line below, you get a timeout, even though ?statement has only one incoming link:
How can I change the grammatical features of a form? (I operate a bot; I just need to know the commands.) I have the list of lexemes. I reckon this shouldn't be too hard, I'm just not familiar with the commands to make the changes.
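One possible route, assuming the bot talks to the MediaWiki API directly: the WikibaseLexeme extension provides the wbleditformelements module, which replaces a form's representations and grammatical features in one call. The lexeme/form IDs and the feature item below are purely illustrative, and a real run needs an authenticated session; the sketch refetches the current representations so they are not lost.
<syntaxhighlight lang="python">
import json
import requests

API = "https://www.wikidata.org/w/api.php"
session = requests.Session()
# ... log in here first (action=login with a bot password, or OAuth) ...

def set_grammatical_features(lexeme_id, form_id, feature_qids):
    """Replace the grammatical features of one form, keeping its representations.

    Illustrative call: set_grammatical_features("L2087", "L2087-F1", ["Q146786"])
    (Q146786 is said to be "plural"; verify the feature items before use.)
    """
    # 1. Fetch the current form data so the representations can be resubmitted.
    lexeme = session.get(API, params={
        "action": "wbgetentities", "ids": lexeme_id, "format": "json",
    }).json()["entities"][lexeme_id]
    form = next(f for f in lexeme["forms"] if f["id"] == form_id)

    # 2. Resubmit the form with the new set of grammatical features.
    token = session.get(API, params={
        "action": "query", "meta": "tokens", "format": "json",
    }).json()["query"]["tokens"]["csrftoken"]
    return session.post(API, data={
        "action": "wbleditformelements",
        "formId": form_id,
        "data": json.dumps({
            "representations": form["representations"],
            "grammaticalFeatures": feature_qids,
        }),
        "token": token,
        "format": "json",
        "bot": 1,
    }).json()
</syntaxhighlight>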
to clarify you want all items with that description replaced with that other description? Is there discussion around this? I can do it easily but no idea if this is an "Accepted" change. BrokenSegue (talk) 19:37, 15 August 2021 (UTC)[reply]
@Takhirgeran Umar So you'd like all descriptions which have this precise string in Chechen: "куцкеп Википеди" to be replaced with this precise string: "Викимедин проектан кеп" OK? Is that so? @BrokenSegue can probably do that very easily but we need to be sure what we're doing because Google Translate isn't very useful there so we have to take your word for it. Vojtěch Dostál (talk) 20:41, 25 November 2021 (UTC)[reply]
A while back, we generated missing parts for duos. Each duo would generally have one item for each member. This finds some that lack parts. Maybe some more filtering needs to be done.
@Jura1: There was one iteration in October [2] and I've scheduled one more for November. But there is a hard constraint for the bot that it must find at least one label for both new items. It can be helped by adding more specific class items, like Special:Diff/1518556965, Special:Diff/1518552635. But this must be done by hand and the information needs to be in the Wikipedia articles.
Both samples have a Wikipedia article linked to them. I think it does make sense to create an item for each individual, but whether we have one for them together depends on Wikipedia.
Sorry for reading this too quickly and assuming you meant the other. If it helps your bot, sure. Personally I tend to add the more general "duo", but additional processing might be easier with more specific items. As "cousin" can mean many things, I generally avoid adding it to kinship to subject (P1039). --- Jura12:28, 31 October 2021 (UTC)[reply]
{{outdent}}
At Wikidata:Database reports/duos without parts, I filtered those for now. P31 with "double act" (Q1141470) isn't included yet. There are now columns that identify from P31 the subtype (sibling, couple, other) and the field (currently music only). Also, aliases are displayed. That can make it easier to create new items. Also, to complete the ones I created, I added "inferred from" "duo" (diff, query). --- Jura09:48, 9 November 2021 (UTC)[reply]
@Jura: I have just noticed there are two hierarchies for what we consider "duo":
We used to have a separate type for items that used to describe two persons without them actually actively working together ("group composed of two persons") [5].
Hi there! By all means, go right ahead if it helps 👍
I wish the bot would be a little more discerning when it comes to musical duos, though. In fact, I think we could skip creating placeholder member items for musical duos altogether; they result in way too many duplicates. Musical duo members will often already exist under their proper names ("Maxi" of "Mini & Maxi" might not go by that name in real life), or they might already be connected using member of instead of part of (much better imo). Moebeus (talk) 12:27, 27 November 2021 (UTC)[reply]
@Moebeus: The query at Wikidata:Database reports/duos without parts already skips the ones with "member of". I don't think it's a problem to have the item for the person known as "Maxi" labelled "Maxi" (this can later be edited or the real name added as an alias). If you think musical duos are already mostly covered, we can skip those. --- Jura 12:33, 27 November 2021 (UTC)[reply]
Thank you! I would be very happy and grateful if you skipped musical duos, I've spent quite a bit of time merging duplicate member items for those. (nothing crazy, but a minor annoyance). Enjoy your Saturdays, both Moebeus (talk) 12:43, 27 November 2021 (UTC)[reply]
Oh, sorry for that. Please don't hesitate to complain about it ..
About the unmerging: if we do, I suppose duos would typically have several P31, one about what links the members (married couple, siblings, etc.) and another for what they do.
I assume we rarely have items about married couples that aren't active in some field together (business, arts, etc.). An exception might be a few items for Commons like Q63195554.
Without really having studied the issue that makes a lot of sense to me, the word "duo" is sometimes confused with "duet" etc., while "group of two people" would be about as clear as it can get? Moebeus (talk) 21:48, 27 November 2021 (UTC)[reply]
complete removal would only lead to re-adding of the same statements… I've already cleaned hundreds of so-called "French" ethnic group claims, only to see them come back after months - a lot of contributors tend to use P172 instead of P27… Hsarrazin (talk) 16:43, 23 October 2021 (UTC)[reply]
If we were to make it, say, a daily job, then we would not accumulate larger amounts of unsourced claims anymore and the users who add these unsourced claims would also learn quickly to adapt to the new situation. —MisterSynergy (talk) 18:30, 23 October 2021 (UTC)[reply]
I think that unsourced statements should be deleted, since sourcing of this property is mandatory. As for Wikipedia-imported claims, it is best to keep them as deprecated, as they are the most likely to come back. Fralambert (talk) 18:51, 23 October 2021 (UTC)[reply]
disagree. this isn't a case of "claims that can't be sourced" this is a case of "claims that aren't sufficiently sourced". many of these claims probably could be correctly sourced. BrokenSegue (talk) 02:27, 30 October 2021 (UTC)[reply]
"imported from" isn't considered sourcing/proper references and for these statements is a requirement that references be added. If you think you are able to do so, please proceed. We could revisit the question in a month and clean up whatever you didn't correctly reference. --- Jura07:26, 30 October 2021 (UTC)[reply]
Apparently Wikitree has a flag "no more marriages" for this (according to User:Lesko987a), but it's generally not filled. I think that for readers this is visible by the absence of spouse unknown (even if the person has no spouse). --- Jura12:03, 27 November 2021 (UTC)[reply]
Adminbot deleting items containing only a redirect: Task description
In mid 2020 @Dexbot: (by @Ladsgroup:) was authorised to execute the following task: "Deleting items that used to have a sitelink that is deleted (and removed now) and don't have any claims or backlinks in to any items or properties."
I would propose to extend the task in the following way:
a periodical check of all items having 0 statements and 1 sitelink
This would reduce the problem of "unclassified items", an ontology issue consisting of items having no P31 and no P279. --Epìdosis 13:09, 31 October 2021 (UTC)[reply]
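For the periodic check, a minimal sketch of how the candidate set could be pulled from WDQS (in Python, as in the script further up the page); whether the single sitelink is a redirect cannot be seen in WDQS and has to be checked against the client wiki afterwards:
<syntaxhighlight lang="python">
import requests

# Candidates for the extended task: items with 0 statements and exactly
# 1 sitelink. Whether that single sitelink is a redirect is not visible in
# WDQS and must be checked against the client wiki in a second step.
QUERY = """
SELECT ?item WHERE {
  ?item wikibase:statements 0 ;
        wikibase:sitelinks 1 .
}
LIMIT 1000
"""

r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "redirect-only-items-check (example)"},
)
for row in r.json()["results"]["bindings"]:
    print(row["item"]["value"])
</syntaxhighlight>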
Adminbot deleting items containing only a redirect: Discussion
The criterium "item with one sitelink that is a redirect, and no statements" applies to ~48.900 items. This is roughly the number of items we discuss here.
We should consider keeping items where the sitelink is a redirect that carries a template from Template:Soft redirect with Wikidata item (Q16956589). While this situation probably needs some work nevertheless, I think these would at least be valuable to keep.
The ~48.900 cases are distributed across ~350 Wikimedia projects. The result is not dominated by a single wiki; enwiki has most cases (~12.900), second are ptwiki and arwiki (~2000 cases each).
Most affected items are rather old; more than 80% are more than 4 years old; this indicates that we talk about a residual import problem from earlier Wikidata times. A regular job is probably not necessary as we do not have a substantial amount of new cases.
On a side note: there are ~495.000 redirect sitelinks linked to Wikidata items in total.
This sounds good and I was considering doing something like this but I would suggest a small change. Check the history of the item and make sure it wasn't vandalized to this state. I'd suggest heuristics of the form (though anything in this direction would be fine):
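As an illustration of the kind of check meant (not the heuristics actually proposed), a minimal sketch using the plain MediaWiki API that flags items which were once much larger than they are now; the threshold and function name are assumptions:
<syntaxhighlight lang="python">
import requests

API = "https://www.wikidata.org/w/api.php"

def possibly_vandalised(qid: str, shrink_factor: float = 0.5) -> bool:
    """Rough check: was the item ever much larger (in bytes) than it is now?

    If so, content may have been blanked or vandalised away, and the item
    should go to a human review queue instead of being deleted automatically.
    The 0.5 threshold is an arbitrary illustrative value.
    """
    r = requests.get(API, params={
        "action": "query",
        "prop": "revisions",
        "titles": qid,
        "rvprop": "size",
        "rvlimit": "max",
        "format": "json",
        "formatversion": 2,
    })
    revisions = r.json()["query"]["pages"][0].get("revisions", [])
    if not revisions:
        return False
    sizes = [rev["size"] for rev in revisions]
    current, largest = sizes[0], max(sizes)  # API returns newest revision first
    return current < shrink_factor * largest
</syntaxhighlight>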
@MisterSynergy, BrokenSegue: Thank you very much! The statistics are very interesting (nearly 50k items is a bit more than I expected!) and the suggestion about vandalized items makes perfect sense. Opening a discussion in WP:PC or in some WikiProject (Ontology? something else?) is fine for me, of course. --Epìdosis 07:27, 1 November 2021 (UTC)[reply]
@MisterSynergy: It could be interesting to have a break-down of all items without statements and one sitelink by site or by age. Is this similar?
I'm not entirely convinced that the fact that items with redirects are generally older is conclusive that the problem no longer persists (an item needs to be created, exist for some time without statements being added and then the sitelink converted into a redirect). It's true that the flood account that used to create empty items in bulk has been replaced by a more efficient bot that attempts to add statements. --- Jura10:09, 1 November 2021 (UTC)[reply]
Good point. I was indeed assuming that these items have been created with already-existing redirect sitelinks, rather than with articles that were transformed into redirects later. After manually checking a few cases, the latter scenario is not that uncommon in fact. So, a somewhat regular job would be fine as well; fortunately, the queries are not too expensive to run. —MisterSynergy (talk) 14:08, 1 November 2021 (UTC)[reply]
Another factor might also be how well link updates on page moves work. Supposedly this doesn't always work well and some wikis might allow more new users to move pages.
Not really in scope of this task, but eventually some analysis on the 450000 other redirects should be done. --- Jura10:35, 2 November 2021 (UTC)[reply]
How should page moves be an issue here? The items we are discussing are pretty much empty anyways.
For the other 450.000 redirects, working redirect badges would be handy to have as they would make the situation much more accessible. Technically the badges are available, but we still cannot save redirect sitelinks and thus not use these badges.
If the redirects come from a page move, it can be that Wikidata didn't correctly update. This can be a technical problem (possibly resolved) or a user-rights problem (the user that moved the page isn't on Wikidata).
If the redirect comes from a page move, the redirect target is either not connected to Wikidata (sitelink should be updated), or connected to another item (which should be merged; or we delete the remaining empty item that only carries a redirect sitelink). There are some sanity checks to do, in order to avoid deleting something which shouldn't be deleted; this might be one aspect to look for.
The badges would be available in WDQS which would make querying much easier. Right now the only way to query redirect sitelinks is on a per-project basis via the MediaWiki SQL databases. This is in fact what I did to get the numbers initially posted here: query the ~1M items with one sitelink and no statements from WDQS once, then query each of the 900+ Wikimedia project SQL databases to provide me their redirects which are connected to Wikidata including the Wikidata item, then combine all the results with the ~1M list. Technically not overly difficult, but clearly not a straightforward way to work with redirect sitelinks.
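For illustration, the per-wiki step of that procedure could look roughly like this on Toolforge (pymysql against the wiki replicas; the host and database naming follow the usual replica conventions and should be treated as assumptions):
<syntaxhighlight lang="python">
import os
import pymysql

def redirects_with_items(dbname: str):
    """Return (page_title, item_id) rows for redirect pages of one wiki that
    are connected to a Wikidata item, read from the Toolforge wiki replicas."""
    conn = pymysql.connect(
        host=f"{dbname}.analytics.db.svc.wikimedia.cloud",
        database=f"{dbname}_p",
        read_default_file=os.path.expanduser("~/replica.my.cnf"),
        charset="utf8mb4",
    )
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT page_title, pp_value
            FROM page
            JOIN page_props ON pp_page = page_id
            WHERE page_is_redirect = 1
              AND pp_propname = 'wikibase_item'
            """
        )
        return cur.fetchall()

# e.g. redirects_with_items("enwiki"); intersecting the returned item IDs with
# the WDQS list of items having one sitelink and no statements gives the
# per-wiki share of the cases discussed here.
</syntaxhighlight>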
Just out of curiosity: Would you have a subset of items to check, ideally of different ages/with sitelinks to different wikis. BTW I'm fine with the task as outlined. --- Jura21:31, 2 November 2021 (UTC)[reply]
Not sure what exactly you want to have checked. The full list of all cases is available here for a while (tab separated txt, columns "wiki, qid, redirect page title"). You can either check this by yourself, or describe more precisely what you want to see and I am going to figure out whether this is feasible. —MisterSynergy (talk) 22:47, 2 November 2021 (UTC)[reply]
Other than that, there were no pagemoves in the ones I checked. All were pages converted into redirects at some point. --- Jura09:56, 3 November 2021 (UTC)[reply]
No opinion yet, since I was not able to look at this in more detail until now. Generally, I don't think we need to tag these items in order to prepare them for further processing. We do have a list (and can regenerate it easily at any time using e.g. this script). What we need is a robust strategy how an automated process can make a decision how to treat a given item on the fly. If that was possible, we could simply let it run over the known QIDs. It would also be okay to do this incrementally, e.g. only identify merge candidates and merge them, in order to reduce the dataset step by step for further inspection. I simply need some spare time to have a closer look. —MisterSynergy (talk) 14:10, 2 December 2021 (UTC)[reply]
Noclaimsbot has a process for handling items without claims based on templates on Wikipedia. It checks the articles linked to the newest 1000 items for templates and adds corresponding statements. For some wikis, this covers almost all items without statements (nlwiki, dewiki); for others it is still far off (enwiki). frwiki is getting closer. Both pipelines are currently clogged by such redirects. I doubt the added value of merging such items. Can users really be certain that they get what they may have been expecting, and why would they have used the empty item? I think this is different from Hoo Bot's past operation (cleanup of history prior to redirection of duplicates). --- Jura 14:26, 2 December 2021 (UTC)[reply]
@Jura1: Redirects created by moves can be safely merged (by bot); those created by merges are usually separate valid topics and should neither be merged nor deleted.--GZWDer (talk) 13:34, 2 December 2021 (UTC)[reply]
Many seem to have been created by GZWDer themselves. I don't understand why it would be for other users to do so if no use has been found for the items since. --- Jura 13:36, 2 December 2021 (UTC)[reply]
Previously, when someone asked how to deal with items whose only sitelink is a redirect, I proposed to handle them via a bot similar to Hoo Bot. This proposal did not receive much positive feedback (I cannot find the discussion, though). Such things will happen as long as we are still creating items from unlinked Wikipedia pages (no matter whether manually, semi-automatically or fully automatically).--GZWDer (talk) 13:44, 2 December 2021 (UTC)[reply]
Are you at least dealing with the items you created? (Other than telling people what they should do with them.) It appears that we have to deal with piles of items that neither you nor anybody else found useful over the last 5 or so years. --- Jura 13:54, 2 December 2021 (UTC)[reply]
As a compromise, I propose another solution: blank those items and redirect them to the target item. No data are actually merged and others may restore them if they want to work on these. The only disadvantage I can imagine is they will confuse external users who are using such QIDs, but the negative impact may be minimal as there are no meaningful data whatsoever in those items.--GZWDer (talk) 16:00, 2 December 2021 (UTC)[reply]
actually upon further thought I think redirecting might be problematic. we probably do not want to copy over the label as an alias (like the normal merge code does). lots of redirects are for things that really aren't the same as the thing they redirect to. BrokenSegue (talk) 03:24, 3 December 2021 (UTC)[reply]
Note: Aging is by age of item, not age of redirect or Wikipedia page creation. Wikidata started in 2012 and items for many older pages were created in 2013.
(5) Top ten wikis
{| class="wikitable"
! Wiki !! Items
|-
| enwiki || 12927
|-
| ptwiki || 2026
|-
| arwiki || 2011
|-
| jawiki || 1735
|-
| kowiki || 1602
|-
| hiwiki || 1472
|-
| zhwiki || 1463
|-
| tlwiki || 1444
|-
| frwiki || 1218
|-
| ukwiki || 1197
|-
! Total !! 27095
|}
request to add English descriptions to railway lines and stations (2021-11-10)
Could anyone create a bot that runs permanently and adds English descriptions to railway lines and stations where absent? There's a bot (User:Edoderoobot/Set-nl-description) that adds Dutch descriptions not only to railway lines and stations but to other types of objects. However, the bot owner does not like the idea of complicating the script code to include other languages (and I cannot blame them for that :) ).
The simplest task that I request could be as follows:
It's a pity that the descriptioner tool died in the past year, as such queries could easily be fulfilled by that great tool.
If this is still open next week, I'll write a small script to create those descriptions in English. Edoderoo (talk) 12:17, 17 November 2021 (UTC)[reply]
This script acts like a small descriptioner tool. Whoever knows how to run something on PAWS can benefit from it. For the railway lines it is running now; I'll do the others later on too. Edoderoo (talk) 20:14, 27 November 2021 (UTC)[reply]
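I don't know what the PAWS script above looks like, but a minimal pywikibot sketch of the general idea, assuming a fixed class (railway station, Q55488) and a fixed description string, could be along these lines; railway lines would work the same way with another class and text:
<syntaxhighlight lang="python">
import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

# Railway stations (Q55488) without an English description.
QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q55488 .
  FILTER NOT EXISTS {
    ?item schema:description ?d .
    FILTER(LANG(?d) = "en")
  }
}
LIMIT 200
"""

for item in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=repo):
    item.get()
    if "en" in item.descriptions:
        continue  # re-check, the query result may be stale
    item.editDescriptions({"en": "railway station"},
                          summary="add missing English description")
</syntaxhighlight>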
Follow page moves from a wiki (e.g. enwiki) and update sitelinks on Wikidata. See Project chat discussion above. Apparently a bot already does that for dewiki.
When those items were created, the pywikibot-framework forced this to have a value. I also recently figured out that this requirement was dropped (and that is good!).
I will pick this one up in the coming days/week, unless someone else already fixed it... Edoderoo (talk) 12:13, 17 November 2021 (UTC)[reply]
We cannot just clean up all of them without additional manual research; often the values are rather useless, but in many cases they were put there on purpose, and a bot can't tell the difference between the two. Edoderoo (talk) 21:56, 19 November 2021 (UTC)[reply]
For paintings I do not see an issue, as the dimensions of a frame will be measured pretty precisely, even when they are in mm. For all other items we first need to check ... maybe you can fix those too, but not all of them. Edoderoo (talk) 14:37, 23 November 2021 (UTC)[reply]
Support as one of the editors who has done work manually matching this catalog. I have several comments:
Nomenclature is a bilingual database. The labels should be imported in both English and French. Some items have Canadian French and Canadian English labels as well; these should be added as aliases if practical. This is one reason why importing directly from MnM is not my preferred solution.
Nomenclature includes "non-preferred terms" (see example). These can be imported as aliases if practical.
The Getty Art & Architecture link (in "other references to this object" on the user interface) can be used to prevent duplicate entries. If the AAT ID does not exist in Wikidata, it should be added to the new item. If the AAT ID does exist in Wikidata, the Nomenclature ID can be added to the existing item.
Nomenclature has some blank subclasses (example) which need to be skipped.
Thanks for the support and the excellent suggestion! We'll be publishing NOM as RDF entities in SKOS/SKOSXL soon, until then it's available as big RDF dumps, and I can make any tabular export desired (eg with the 4 languages dispatched to separate columns). Pinging @Crowjane7: who's one of the main editors --Vladimir Alexiev (talk) 06:37, 23 November 2021 (UTC)[reply]
I tried to figure out who created them and found some created by QuickStatementsBot [9] without any indication of the user who requested it. Similarly by Reinheitsgebot (without any indication of a MxM catalogue). Further by IPs who didn't add additional statements .. so some cleanup would probably help. --- Jura15:50, 27 November 2021 (UTC)[reply]
A true duplicate is an item with a sitelink to the same wiki page as another item. This should be impossible, but for technical reasons it happens.
Generally one of them isn't editable and remains without any statements.
Ah, wait, hold your horses ;-) There *are* Wikidata items already, but they are connected the old-fashioned (pre-Wikidata/2013) way. That indeed requires some scripting. Edoderoo (talk) 20:02, 28 November 2021 (UTC)[reply]
Hi, Thousands of files from the Cleveland Museum of Art were uploaded by Madreiling (not active since August 2019), and Wikidata items created for them. However,
The Wikidata item is not added in the Artwork template;
The Wikidata item is incomplete (creator not mentioned, medium and size missing, etc.);
Just to be clear, is the issue here specific to the way the CMA items were added to Wikidata, or is it about how User:BotMultichill uploads files found in Commons compatible image available at URL (P4765)? Many of the CMA images (at least when I wrote the Wikidata bot for them in 2020) were not uploaded by them at all, but by Multichill's bot that uploads anything using that property according to its own standard format. Dominic (talk) 00:17, 7 December 2021 (UTC)[reply]
There seem to be plenty of items for plwikisource subpages without any claims (60000+).
Similar to Q108957866 (and others from that work), one could at least add a published in (P1433) statement pointing to the item of the parent page; for that sample, that is Q108959596. (A sketch of this step follows below.)
If the applicable instance of (P31) and/or other statements can be determined, please add that too. Q108957866 is from a bilingual work, so I could figure it out myself.
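A minimal pywikibot sketch of the published in (P1433) part, assuming the item's plwikisource sitelink is a subpage like "Work/Chapter" and the parent page is itself connected to an item; determining the applicable P31 is left out:
<syntaxhighlight lang="python">
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata")
plwikisource = pywikibot.Site("pl", "wikisource")

def add_published_in(item_id: str) -> None:
    """Add published in (P1433) pointing to the item of the parent subpage."""
    item = pywikibot.ItemPage(repo, item_id)
    item.get()
    if "plwikisource" not in item.sitelinks or "P1433" in item.claims:
        return
    title = item.getSitelink("plwikisource")
    if "/" not in title:
        return  # not a subpage
    parent = pywikibot.Page(plwikisource, title.rsplit("/", 1)[0])
    try:
        parent_item = pywikibot.ItemPage.fromPage(parent)
    except pywikibot.exceptions.NoPageError:
        return  # parent page is not connected to any item
    claim = pywikibot.Claim(repo, "P1433")  # published in
    claim.setTarget(parent_item)
    item.addClaim(claim, summary="published in: item of the parent plwikisource page")
</syntaxhighlight>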
Desc: Recently we at viwiki moved a bunch of categories (for instance, Thể loại:Sinh 1980 to Thể loại:Sinh năm 1980) using a bot, but since it doesn't have a WD account these items didn't automatically update. Please update them.
Licence of data to import (if relevant)
Discussion
It is not clear what you expect us to do. I see that both your example categories still exist, and both even have items in them.
The second category does not (yet) have a Wikidata item, so do you want to move those, leaving the old/first-mentioned category without a WD item?
Or is it something completely different? Edoderoo (talk) 10:24, 15 December 2021 (UTC)[reply]
@Edoderoo: The old links are now redirects, so there's no need to keep them here. Please update items that contain Thể loại:(Sinh|Mất) \d{1,4} so that they match new links (Thể loại:(Sinh|Mất) năm \d{1,4}). NguoiDungKhongDinhDanh12:18, 15 December 2021 (UTC)[reply]
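A minimal pywikibot sketch of such an update, assuming the affected item IDs are collected separately (e.g. from the move log); the regex is the one given above, and the item ID in the usage note is hypothetical:
<syntaxhighlight lang="python">
import re
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata")
PATTERN = re.compile(r"^Thể loại:(Sinh|Mất) (\d{1,4})$")

def update_viwiki_sitelink(item_id: str) -> None:
    """Point the viwiki sitelink of one item at the renamed category."""
    item = pywikibot.ItemPage(repo, item_id)
    item.get()
    if "viwiki" not in item.sitelinks:
        return
    old_title = item.getSitelink("viwiki")
    m = PATTERN.match(old_title)
    if not m:
        return
    new_title = f"Thể loại:{m.group(1)} năm {m.group(2)}"
    # If the new title is already linked from another item, the edit will fail
    # and that pair should be reviewed (or merged) manually.
    item.setSitelink({"site": "viwiki", "title": new_title},
                     summary="update sitelink after category rename on viwiki")

# Usage on one (hypothetical) item id:
# update_viwiki_sitelink("Q8542761")
</syntaxhighlight>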
Request process
request to delete statements and sitelinks and merge items: dewiki duplicates (2021-12-16)
I've used P854 to reference Hungarian premiere dates of films from the Excel file maintained by the regulatory authorities, and these references occasionally appear in infoboxes at huwiki. In the meantime, an item has been created for the file which provides more detailed information, allowing for better formatted references over at huwiki. Máté (talk) 08:37, 29 December 2021 (UTC)[reply]
Yes, Czech would be another language where this is useful (to me), but maybe we should discuss each language separately as the forms to add and the source to use can vary. --- Jura12:57, 4 February 2022 (UTC)[reply]
It would seem quite weird in Czech to have as many as 7 cases for all places in aliases. Aliases are considered to be some sort of synonyms, which grammatical cases are not, and they look funny when understood as such, for example at https://reasonator.toolforge.org/?&q=994271. Plus, it would introduce further complexity to searches, because some grammatical cases are spelled exactly the same as the primary versions of different places (e.g. Lhotky (Q2041531) vs "Lhotky" as a grammatical case of Lhotka (Q1471978)).
I don't think it is a good idea. Maybe this can help non-speakers, but for speakers it sounds weird. Much better would be to create lexemes for these names and link the lexeme with the name of the place. But I don't know if this solution is usable for searching. JAn Dudík (talk) 10:20, 5 February 2022 (UTC)[reply]
The idea isn't to add all cases, but the most frequently used ones. For Polish, it should probably include the form for locative case (see w:Locative_case#Polish).
I do think it's important to find Lhotka (Q1471978) if that is sometimes referred to as "Lhotky". What would be the benefit of only finding Lhotky (Q2041531)? Checking another database first isn't really practical.
As for all aliases, it's good to eventually add them in a structured form elsewhere too, but that's a different use case. --- Jura 10:26, 5 February 2022 (UTC)[reply]
Reconciliation in OpenRefine works on both labels and aliases. Therefore, addition of ambiguous grammatical cases will lead to OpenRefine suggesting a lot of unrelated names unnecessarily. Vojtěch Dostál (talk) 16:16, 5 February 2022 (UTC)[reply]
Generally speaking (i.e. without having any specific language including Polish in mind), I consider this a very bad idea. 1) It would be semantically very confusing, as the meaning of "alias" does not include "declension". 2) If the idea is to enable adding only some cases, it has to be defined what "most frequently used" means. To avoid even more chaos in Wikidata, it needs to be defined generally, not only for Polish, as other users may quickly follow the Polish example. 3) Agree with the above-mentioned "Lhotky" problem. 4) Once users start finding declensions in aliases of place names, some will surely start adding them to other kinds of items like personal names and others, opening the door for even bigger confusion and chaos: searching for Janov (Italian city) in the Slovak language might also lead to the personal name Jan (Slovak possessive: Janov).
What I would support would be adding another separate and semantically different column containing declensions into the table next to aliases, and allowing people to include them in or exclude them from their search. Another possibility would be linking the items to lexemes, as suggested above. It should not be impossible to make the search engine use information from the linked lexemes, if the user desires it. --Jan Kameníček (talk) 18:35, 5 February 2022 (UTC)[reply]
Can you explain what you mean by "Agree with the above-mentioned "Lhotky" problem"? Do you see a benefit in not finding the place? --- Jura 18:39, 5 February 2022 (UTC)[reply]
Comment Interesting points to ponder and take into account. I obviously agree with the main point: Czech declensions are less useful for Czech speakers than for others. Not quite sure what to make of problems one or the other search system may or may not have with them. I don't think we'd delete the given name "Rome" because of the city "Rome". A key point of Wikidata is that items are ambiguous and differentiated by descriptions and statements. --- Jura 09:50, 7 February 2022 (UTC)[reply]
Request process
Request to fix Spanish labels wrongly copied from English labels (2022-02-09)
In a relevant number of cases I have come across items with the following problem in their "es" labels: in 2013 KLBot2 copied the "en" label into the "es" label for people, sometimes incorrectly (e.g. for noble people). My proposal is: 1) find all items with instance of (P31) human (Q5) where the "en" and "es" labels are identical and which contain a sitelink to es.wikipedia; 2) check if the title of the es.wikipedia article, without any part in parentheses, corresponds to the "es" label; 3) if they are different, use the title of the es.wikipedia article as the "es" label and remove it from the "es" aliases if already present (e.g.). --Epìdosis 18:33, 9 February 2022 (UTC)[reply]
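A minimal pywikibot sketch of the per-item part of this proposal, assuming the candidate items (identical "en"/"es" labels, instance of human, sitelink to eswiki) are collected via SPARQL or a dump scan first:
<syntaxhighlight lang="python">
import re
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata")

def fix_es_label(item_id: str) -> None:
    """Replace an "es" label copied from English with the eswiki article title."""
    item = pywikibot.ItemPage(repo, item_id)
    item.get()
    if item.labels.get("en") != item.labels.get("es"):
        return
    if "eswiki" not in item.sitelinks:
        return
    title = item.getSitelink("eswiki")
    # Drop a trailing disambiguator, e.g. "Juan Pérez (militar)" -> "Juan Pérez".
    new_label = re.sub(r"\s*\([^)]*\)\s*$", "", title).strip()
    if not new_label or new_label == item.labels.get("es"):
        return
    aliases = [a for a in item.aliases.get("es", []) if a != new_label]
    item.editEntity(
        {"labels": {"es": new_label}, "aliases": {"es": aliases}},
        summary="fix es label wrongly copied from en; use eswiki title",
    )
</syntaxhighlight>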
Hi, I'm noticing many inconsistencies on wikidata on cities and countries. Here are some of them:
1) https://www.wikidata.org/wiki/Talk:Q36678, many cities have as a "country" tag an object which is not a country. Many times this is due to territorial conflicts where people put the region (like "West Bank" here) as the country instead of a real country. This is true for many places; for example, Q4508661 includes "Louhansk" as a country.
2) Almost all cities in Romania are duplicated: Q100188/Q16898189, Q16898582/Q83404, Q2716722/Q16426101 etc.
Task description
1) A bot to remove all country (P17) statements linking to an entity which is not a country (a query sketch for finding such statements follows after this list).
2) A bot to remove all pages where cities have the same name/code in Romania. Another idea could be to revert everything @JWbot did as it is the bot which created these pages without paying attention.
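For task 1, a sketch of a query (run through Python, like the script further up the page) that lists country (P17) values which are not instances of country (Q6256) or sovereign state (Q3624078); over the full dataset this would need batching to avoid WDQS timeouts, and the results should be reviewed rather than removed blindly:
<syntaxhighlight lang="python">
import requests

# Statements where the country (P17) value is not an instance of a (subclass
# of) country (Q6256) or sovereign state (Q3624078). Over all items this will
# hit WDQS timeouts, so a real run would restrict the query (e.g. per region
# or per class of subject) and page through the results.
QUERY = """
SELECT ?place ?countryValue WHERE {
  ?place wdt:P17 ?countryValue .
  FILTER NOT EXISTS { ?countryValue wdt:P31/wdt:P279* wd:Q6256 . }
  FILTER NOT EXISTS { ?countryValue wdt:P31/wdt:P279* wd:Q3624078 . }
}
LIMIT 100
"""

r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "p17-review-sketch (example)"},
)
for row in r.json()["results"]["bindings"]:
    print(row["place"]["value"], "->", row["countryValue"]["value"])
</syntaxhighlight>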
There is a category in trwiki named "Taxonbars that need a Wikidata item": tr:Kategori:Vikiveri nesnesine ihtiyaç duyan taksonçubukları. I think the articles listed already have items created via other Wikipedias, such as svwiki. Can anyone link them with a bot?
C-SPAN person ID (P2190) is transitioning to a numeric format for reliability of linking, since the string format has been found to break: when C-SPAN changes the string, it doesn't always redirect. See the property discussion, as well as Project chat. Coordinated updates to templates using this property have been notified on Wikipedia for update and cleanup.
For the entries added prior to 26 Feb 2022 all matched numeric formats have been uploaded. Strings added after that date have not been checked.
I would like to feed Wikidata with Offizielle Deutsche Charts album ID (P10262) of all the albums on the website offiziellecharts.de/album-details-$1 (examples).
Problem: The names of the artist ("Interpret") and the album are in heading 1 (h1) and heading 2 (h2) of the HTML source code (example), and I don't know how to crawl this data.
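Assuming crawling is acceptable at all (see the concern below), pulling the two headings out of a page is straightforward with requests and BeautifulSoup; the URL pattern follows the formatter URL above, and the h1/h2 mapping is taken from the description:
<syntaxhighlight lang="python">
import requests
from bs4 import BeautifulSoup

def artist_and_album(album_id: str):
    """Fetch one album page and return the texts of its first h1/h2 headings.

    Assumption: the artist name is in the first <h1> and the album title in
    the first <h2>, as described above; a polite User-Agent and rate limiting
    would be needed for a real crawl.
    """
    url = f"https://www.offiziellecharts.de/album-details-{album_id}"
    html = requests.get(url, headers={"User-Agent": "P10262-import-sketch"}).text
    soup = BeautifulSoup(html, "html.parser")
    h1 = soup.find("h1")
    h2 = soup.find("h2")
    return (h1.get_text(strip=True) if h1 else None,
            h2.get_text(strip=True) if h2 else None)
</syntaxhighlight>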
Hello @Bigbossfarin, I'm not sure offiziellecharts.de would really appreciate having their whole website crawled. And I don't know if the license is OK with adding the data to Wikidata. Myst (talk) 19:29, 24 March 2022 (UTC)[reply]
@Epìdosis I'm so sorry but complicated duplicate clean ups require dedicated time for coding which I really don't have among this and a million other volunteer responsibilities :( Amir (talk) 03:24, 8 May 2022 (UTC)[reply]
This query returns all statements containing such references:
[24] I would like these references to be removed. Thanks — Martin (MSGJ · talk) 12:24, 13 May 2022 (UTC)[reply]
it’s of course required to remove the ~4000 template calls to {{tl|Item documentation}} on item talk pages, as they would be redundant. Deleting them altogether should be fine.
Licence of data to import (if relevant)
Discussion
it’s somewhat trivial to implement; it just amounts to a call of something like mw:Manual:Pywikibot/template.py on the main Talk namespace. I would do it myself if my bot still had a bot flag, but it’s easier to ask someone who just has to call the command line.