Manual talk:Replacer.py

From Botwiki

Replacer: replaces image links on any Wikimedia wiki. Tasks are fetched from a sysop-only page on Wikimedia Commons. Replacing a non-SVG image with an SVG (AnyImageTypeButSVG to SVG) is not supported because of a pending SupersededSVG deletion debate. Success and failure of edits are logged to a MySQL database on the toolserver. Edit messages can be localised and are read from the local wiki. If Template:Stop is present on the wiki page, the bot will stop working until that text has been removed.

Flaws: The bots use a lot of similar code. Because of that, similar issues may be documented.

  • Fails if CheckUsage is too busy. This should be recognised, and retries should be performed until a complete CheckUsage result has been obtained.
  • Each toolserver user is granted 15 connections to the MySQL database. Sometimes this limit is reached, and in that case success and failure are not logged. This should not happen.
  • If many files have to be replaced that are heavily used on one wiki, a large number of edits can be made to that wiki in a short time. This can upset (and has upset) users in a community. An edit limit of 3 edits per minute per wiki should be in place.
  • From the bot output it looks like it keeps on fetching the edit summary on multiple edits on a wiki. This causes unnecessary server load and should not occur.
  • Most main projects are supported, but especially the meta projects can give issues because of the implementation of project recognition.
  • Image name recognition is such that errors can occur in special cases. Example: A page contains images "Test 1.jpg" and "1.jpg". If "1.jpg" is removed, "Test 1.jpg" may end up as "Test ". (example unfortunately lost)
  • Although there is code present to prevent this issue, it has happened that a replacement was made when a local image with the same name was present. This should not happen. [1]
  • Bot crashes on 'Unhandled exception in thread started'
  • Sometimes when toolserver is really busy (or whatever the reason may be), a thread cannot be created. In this case a task is not executed. This should not happen.
  • If an image has more than 50 uses on a wiki and all uses come from a template that is not among the first 50 pages displayed, no text replacement will occur (can this be fixed at all?)
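The "Test 1.jpg" versus "1.jpg" flaw above comes from matching the bare file name anywhere in the text. A minimal sketch of a stricter match follows. It is not the code replacer.py uses: the function names are hypothetical, it is written for current Python rather than the Python 2.4 the bot runs on, and it only handles `[[Image:...]]` / `[[File:...]]` link syntax (galleries and template parameters would need separate handling).

```python
import re

def image_link_pattern(name):
    """Build a regex matching [[Image:name ...]] links for exactly this file name."""
    # MediaWiki treats spaces and underscores as equivalent in titles.
    parts = re.split(r"[ _]+", name.strip())
    escaped = r"[ _]+".join(re.escape(p) for p in parts)
    # Anchoring on the namespace prefix means "1.jpg" cannot match inside
    # "Test 1.jpg"; the lookahead requires the name to end at "|" or "]]".
    return re.compile(
        r"\[\[\s*(?i:image|file)\s*:\s*" + escaped + r"\s*(?=\||\]\])"
    )

def replace_image(text, old, new):
    """Replace links to `old` with links to `new`, leaving other file names alone."""
    # A function is used as replacement so that backslashes in `new` stay literal.
    return image_link_pattern(old).sub(lambda m: "[[Image:" + new, text)
```

Usage: `replace_image(page_text, "1.jpg", "2.jpg")` leaves a link to "Test 1.jpg" on the same page untouched. The sketch does not handle MediaWiki's case-insensitive first letter of a title.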

[1] http://en.wikipedia.org/w/index.php?title=User:TomStar81/World_War_II&diff=prev&oldid=129050292
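The limit of 3 edits per minute per wiki proposed in the flaws list could be enforced with a small sliding-window throttle. This is only a sketch under assumptions: the class name and its parameters are made up for illustration, it is written for current Python, and it is not thread-safe as written, so a threaded bot would need a lock around it.

```python
import time
from collections import defaultdict, deque

class PerWikiThrottle:
    """Allow at most `max_edits` edits per `period` seconds for each wiki."""

    def __init__(self, max_edits=3, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_edits = max_edits
        self.period = period
        self.clock = clock      # injectable so the throttle can be tested
        self.sleep = sleep
        self.history = defaultdict(deque)  # wiki -> timestamps of recent edits

    def wait_time(self, wiki):
        """Seconds to wait before the next edit to `wiki` is allowed."""
        now = self.clock()
        recent = self.history[wiki]
        # Forget edits that have left the sliding window.
        while recent and now - recent[0] >= self.period:
            recent.popleft()
        if len(recent) < self.max_edits:
            return 0.0
        return self.period - (now - recent[0])

    def acquire(self, wiki):
        """Block until an edit to `wiki` is allowed, then record it."""
        delay = self.wait_time(wiki)
        if delay > 0:
            self.sleep(delay)
        self.history[wiki].append(self.clock())
```

Calling `throttle.acquire("enwiki")` right before each page save would delay only edits to a wiki that has already received three edits in the last minute; other wikis are unaffected.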

General wishes

  • Simple web interface for database logs (last edits, last edits per language, all edits for a particular image)
  • Some kind of registration for the bots to enable starting it from cron if it is no longer running
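One common way to do the "registration" wished for above is a PID file that a cron job checks before starting the bot. The sketch below assumes a Unix host and uses hypothetical function names; nothing like it is known to exist in replacer.py.

```python
import os

def is_running(pidfile):
    """Return True if the PID stored in `pidfile` belongs to a live process."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # no file, or unreadable content: treat as not running
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists (Unix)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True

def register(pidfile):
    """Record the current process ID; call this once when the bot starts."""
    with open(pidfile, "w") as f:
        f.write(str(os.getpid()))
```

A cron wrapper would call `is_running(path)` and start the bot only when it returns False. A PID that has been reused by an unrelated process would be reported as running, so this is a heuristic rather than a guarantee.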

Reported errors by script:

Unhandled exception

Critical. Makes bot stop/crash.

Unhandled exception in thread started by <function reemplazar_imagen at 0x2b05f50c9668>
Traceback (most recent call last):
  File "replacer.py", line 205, in reemplazar_imagen
    userpage.put('#Redirect[[m:User:CommonsDelinker]]', '')
  File "/home/siebrand/pywikipedia/wikipedia.py", line 966, in put
    self.site().forceLogin()
  File "/home/siebrand/pywikipedia/wikipedia.py", line 2886, in forceLogin
    if loginMan.login(retry = True):
  File "/home/siebrand/pywikipedia/login.py", line 168, in login
    self.password = getpass.getpass(s.encode(config.console_encoding))
  File "/usr/lib/python2.4/getpass.py", line 37, in unix_getpass
No idea what causes this issue. I'm trying to reproduce it. Siebrand 01:22, 4 May 2007 (GMT)
Maybe it is a codec bug... I mean, UTF-8 may not have all characters, so maybe there is a strange character that gives this error (but I'm not sure...) --Filnik 12:10, 5 May 2007 (GMT)
Am I missing something, or where is the exact error? I think you forgot to paste one line. Bryan 20:12, 22 May 2007 (UTC)

max_user_connections

Appears to be non-critical. Bot continues.

Recording...
Unhandled exception in thread started by <function record at 0x2b2788007398>
Traceback (most recent call last):
  File "replacer.py", line 38, in record
    conn = MySQLdb.connect(host="sql",user="orgullo", passwd="*****",db="u_orgullo_logs", charset='utf8', use_unicode=1)
  File "/usr/lib/python2.4/site-packages/MySQLdb/__init__.py", line 75, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib/python2.4/site-packages/MySQLdb/connections.py", line 164, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (1226, "User 'orgullo' has exceeded the 'max_user_connections' resource (current value: 15)")
The solution would probably be to cap the connections and queue work. No idea how to implement that. Siebrand 23:55, 3 May 2007 (GMT)

I have to speak with you... I don't know what the bot does (I haven't heard of MySQLdb...) but maybe you can change the code to:

        # needs "import time" at the top of replacer.py
        conn = None
        while conn is None:
            try:
                conn = MySQLdb.connect(host="sql",user="orgullo", passwd="********",db="u_orgullo_logs", charset='utf8', use_unicode=1)
            except MySQLdb.OperationalError:
                # e.g. error 1226 (max_user_connections exceeded): wait, then retry
                wikipedia.output("MySQL error: sleeping for 10 seconds, then retrying...")
                time.sleep(10)

--Filnik 12:15, 5 May 2007 (GMT)
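Besides retrying, the "cap the connections and queue work" idea could be sketched with a queue and a single writer thread, so that logging never holds more than one connection. This is an illustration under assumptions, not the bot's code: `log_writer`, `log_queue` and the `connect`/`insert` callables are hypothetical, and the sketch targets current Python (the module was called `Queue` in Python 2.4).

```python
import queue
import threading

log_queue = queue.Queue()
_STOP = object()  # sentinel telling the writer thread to finish

def log_writer(connect, insert):
    """Drain log_queue over one connection until _STOP arrives.

    `connect` is a zero-argument callable returning a DB-API connection
    (for example a wrapper around MySQLdb.connect); `insert(conn, entry)`
    writes one log entry. Both are supplied by the caller.
    """
    conn = connect()
    try:
        while True:
            entry = log_queue.get()
            if entry is _STOP:
                break
            insert(conn, entry)
    finally:
        conn.close()  # release the connection even if an insert fails

def record(entry):
    """Called from the replacement threads: enqueue instead of connecting."""
    log_queue.put(entry)
```

The writer would be started once with `threading.Thread(target=log_writer, args=(connect, insert)).start()`, and stopped by putting `_STOP` on the queue. With one writer, the 15-connection limit cannot be hit by logging, at the cost of log entries being written slightly later than the edits they describe.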

The code looks like it closes the connection when it is done, but mytop tells me something different. I think the bug may be in the (not) closing of the connection. See the mytop output below. Siebrand 18:54, 5 May 2007 (GMT)
 1534541   orgullo         hemlock u_orgullo_         1  Sleep
 1534716   orgullo         hemlock u_orgullo_         1  Sleep
 1534816   orgullo         hemlock u_orgullo_         1  Sleep
 1534775   orgullo         hemlock u_orgullo_         2  Sleep
 1534771   orgullo         hemlock u_orgullo_         3  Sleep
 1534937   orgullo         hemlock u_orgullo_         4  Sleep
 1534555   orgullo         hemlock u_orgullo_         7  Sleep
 1534562   orgullo         hemlock u_orgullo_         8  Sleep
 1534639   orgullo         hemlock u_orgullo_         8  Sleep
 1534508   orgullo         hemlock u_orgullo_        10  Sleep
 1534545   orgullo         hemlock u_orgullo_        10  Sleep
 1534685   orgullo         hemlock u_orgullo_        10  Sleep
 1534907   orgullo         hemlock u_orgullo_        19  Sleep
 1534613   orgullo         hemlock u_orgullo_        25  Sleep
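If the sleeping connections above do come from a code path that skips `close()`, one way to rule that out is to close the connection in a context manager so it is released even when the insert raises. The sketch below is an assumption-laden illustration: `record_and_close` and the `log` table are hypothetical, and `sqlite3` stands in for MySQLdb so that it runs anywhere (MySQLdb would use `%s` placeholders instead of `?`).

```python
import sqlite3
from contextlib import closing

def record_and_close(connect, image, wiki, status):
    """Log one replacement; the connection is closed on success and on error."""
    with closing(connect()) as conn:  # calls conn.close() when the block exits
        cur = conn.cursor()
        cur.execute(
            "INSERT INTO log (image, wiki, status) VALUES (?, ?, ?)",
            (image, wiki, status),
        )
        conn.commit()
```

Here `connect` is any zero-argument callable returning a DB-API connection. If mytop still shows idle connections after such a change, the cause would have to be elsewhere.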

Uhm... I don't know... maybe Martinp23 can help you with MySQL ^__^ --Filnik 20:23, 5 May 2007 (GMT)
