Manual talk:Replacer.py
Replacer: replaces image links on any Wikimedia wiki. Tasks are fetched from a sysop-only page on Wikimedia Commons. Replacing an image of any other type with an SVG is not supported because of a pending SupersededSVG deletion debate. Success and failure of edits are logged to a MySQL database on the toolserver. Edit messages can be localised and are read from the local wiki. If Template:Stop is present on the wiki page, the bot will stop working until that text has been removed.
Flaws: The bots use a lot of similar code. Because of that, similar issues may be documented.
- Fails if CheckUsage is too busy. This should be recognised, and retries should be performed until a complete CheckUsage result is obtained.
- Each toolserver user is granted 15 connections to the MySQL database. Sometimes this limit is reached, and in that case success and failure are not logged. This should not happen.
- If many heavily used files have to be replaced on one wiki, many edits can be made to that wiki in a short time. This can upset (and has upset) users in a community. An edit limit of 3 edits per minute per wiki should be in place.
- From the bot output it looks like the bot fetches the edit summary again for every edit on a wiki. This causes unnecessary server load and should not occur.
- Most main projects are supported, but especially the meta projects can give issues because of the implementation of project recognition.
- Image name recognition is such that errors can occur in special cases. Example: a page contains the images "Test 1.jpg" and "1.jpg". If "1.jpg" is removed, "Test 1.jpg" may end up as "Test ". (The example has unfortunately been lost.)
- Although code is present to prevent this issue, a replacement has been made when a local image with the same name was present. This should not happen. [1]
- Bot crashes on 'Unhandled exception in thread started'
- Sometimes when the toolserver is really busy (or for whatever other reason), a thread cannot be created. In this case a task is not executed. This should not happen.
- If an image has more than 50 uses on a wiki and all uses come from a template that does not appear among the first 50 pages listed, no text replacement will occur. (Can this be fixed at all?)
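The "Test 1.jpg" / "1.jpg" case in the list above suggests that a file name should only be matched as the complete target of a link, never as a substring. Here is a minimal sketch of that idea in current Python; it is not the bot's actual code. The helper names `image_link_pattern` and `replace_image` are hypothetical, and the pattern only covers bracketed `[[Image:...]]` / `[[File:...]]` links, not galleries or template parameters.

```python
import re


def image_link_pattern(filename, namespaces=("Image", "File")):
    """Build a regex that matches `filename` only as the full target of an image link."""
    first, rest = filename[0], filename[1:]
    # MediaWiki ignores the case of the first letter of a title.
    if first.isalpha():
        first_re = "[%s%s]" % (first.upper(), first.lower())
    else:
        first_re = re.escape(first)
    # Spaces and underscores are interchangeable in titles.
    rest_re = "[ _]+".join(re.escape(part) for part in re.split("[ _]+", rest))
    ns_re = "|".join(re.escape(ns) for ns in namespaces)
    # The name must directly follow "[[Namespace:" and be followed by "|" or "]".
    return re.compile(
        r"(?P<prefix>\[\[\s*(?i:%s)\s*:\s*)%s(?P<suffix>\s*[|\]])"
        % (ns_re, first_re + rest_re)
    )


def replace_image(text, old, new):
    """Replace links to `old` with `new`, leaving other file names untouched."""
    pattern = image_link_pattern(old)
    return pattern.sub(lambda m: m.group("prefix") + new + m.group("suffix"), text)
```

Because the name is anchored between the namespace prefix and the closing `|` or `]`, a link to "Test 1.jpg" is left alone when "1.jpg" is the file being replaced.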
[1] http://en.wikipedia.org/w/index.php?title=User:TomStar81/World_War_II&diff=prev&oldid=129050292
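The limit of 3 edits per minute per wiki proposed in the list above could be enforced with a small sliding-window limiter shared by all worker threads. This is only a sketch in current Python, not taken from replacer.py; the class name `PerWikiRateLimiter` and its parameters are hypothetical, and the injectable `clock` and `sleep` exist only to make it easy to test.

```python
import threading
import time
from collections import defaultdict, deque


class PerWikiRateLimiter:
    """Allow at most `max_edits` edits per `period` seconds for each wiki."""

    def __init__(self, max_edits=3, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_edits = max_edits
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._recent = defaultdict(deque)  # wiki -> timestamps of recent edits

    def wait_time(self, wiki):
        """Reserve a slot and return 0.0 if one is free, else return the seconds to wait."""
        with self._lock:
            now = self._clock()
            stamps = self._recent[wiki]
            # Forget edits that have left the sliding window.
            while stamps and now - stamps[0] >= self.period:
                stamps.popleft()
            if len(stamps) < self.max_edits:
                stamps.append(now)
                return 0.0
            return self.period - (now - stamps[0])

    def acquire(self, wiki):
        """Block until an edit to `wiki` is allowed."""
        while True:
            delay = self.wait_time(wiki)
            if delay <= 0:
                return
            self._sleep(delay)
```

Each thread would call `limiter.acquire(wiki)` right before saving a page, so a burst of replacements on one wiki is spread out while other wikis are not slowed down.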
General wishes
- Simple web interface for database logs (last edits, last edits per language, all edits for a particular image)
- Some kind of registration for the bots to enable starting it from cron if it is no longer running
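One common way to get the "start it from cron if it is no longer running" behaviour is a lock file: cron starts the bot every few minutes, and the bot exits immediately if another instance already holds the lock. The following is a rough sketch for Unix systems, not an existing part of the bot; the function name `acquire_single_instance_lock` and the lock file path are hypothetical.

```python
import fcntl
import os


def acquire_single_instance_lock(path):
    """Return an open file holding an exclusive lock, or None if another instance holds it."""
    handle = open(path, "a+")
    try:
        # Non-blocking: fail right away instead of waiting for the other instance.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    # Record our PID so a human can see which process owns the lock.
    handle.seek(0)
    handle.truncate()
    handle.write(str(os.getpid()))
    handle.flush()
    return handle
```

The bot would call this once at startup and keep the returned file object open for its whole lifetime. The operating system releases the lock when the process ends, including after a crash, so the next cron run can start a fresh instance.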
Reported errors by script:
Unhandled exception
Critical. Makes bot stop/crash.
Unhandled exception in thread started by <function reemplazar_imagen at 0x2b05f50c9668>
Traceback (most recent call last):
  File "replacer.py", line 205, in reemplazar_imagen
    userpage.put('#Redirect[[m:User:CommonsDelinker]]', '')
  File "/home/siebrand/pywikipedia/wikipedia.py", line 966, in put
    self.site().forceLogin()
  File "/home/siebrand/pywikipedia/wikipedia.py", line 2886, in forceLogin
    if loginMan.login(retry = True):
  File "/home/siebrand/pywikipedia/login.py", line 168, in login
    self.password = getpass.getpass(s.encode(config.console_encoding))
  File "/usr/lib/python2.4/getpass.py", line 37, in unix_getpass
- No idea what causes this issue. I'm trying to reproduce it. Siebrand 01:22, 4 May 2007 (GMT)
- Maybe it is a codec bug... I mean, UTF-8 doesn't have all characters, so maybe there is a strange character that gives this error (but I'm not sure...) --Filnik 12:10, 5 May 2007 (GMT)
- Am I missing something, or where is the exact error? I think you forgot to paste one line. Bryan 20:12, 22 May 2007 (UTC)
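Whatever the root cause turns out to be, the crash could be made easier to diagnose by wrapping every thread target so that an exception is logged in full, including the final exception line that is missing from the paste above. This is a sketch in current Python under that assumption, not a fix for the underlying problem; `guarded` and its `on_error` callback are hypothetical names.

```python
import logging
import threading
import traceback

log = logging.getLogger("replacer")


def guarded(task, on_error=None):
    """Wrap a thread target so any exception is logged with its full traceback."""
    def runner(*args, **kwargs):
        try:
            task(*args, **kwargs)
        except Exception:
            details = traceback.format_exc()
            log.error("Task %s failed:\n%s", task.__name__, details)
            if on_error is not None:
                # Let the caller re-queue the task or count the failure.
                on_error(task.__name__, details)
    return runner
```

A worker would then be started with something like `threading.Thread(target=guarded(reemplazar_imagen), args=(...)).start()`, so a failing task is reported instead of ending its thread with an incomplete message.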
max_user_connections
Appears to be non-critical. Bot continues.
Recording...
Unhandled exception in thread started by <function record at 0x2b2788007398>
Traceback (most recent call last):
  File "replacer.py", line 38, in record
    conn = MySQLdb.connect(host="sql",user="orgullo", passwd="*****",db="u_orgullo_logs", charset='utf8', use_unicode=1)
  File "/usr/lib/python2.4/site-packages/MySQLdb/__init__.py", line 75, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib/python2.4/site-packages/MySQLdb/connections.py", line 164, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (1226, "User 'orgullo' has exceeded the 'max_user_connections' resource (current value: 15)")
- The solution would probably be to cap the connections and queue the work. No idea how to implement that. Siebrand 23:55, 3 May 2007 (GMT)
I have to talk with you... I don't know what the bot does (I haven't heard of MySQLdb...) but maybe you can change the code to:
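import time
import MySQLdb
import wikipedia  # pywikipedia framework

conn = None
while conn is None:
    try:
        conn = MySQLdb.connect(host="sql", user="orgullo", passwd="********", db="u_orgullo_logs", charset='utf8', use_unicode=1)
    except MySQLdb.OperationalError:
        # For example error 1226 (max_user_connections): wait, then try again.
        wikipedia.output("MySQL error: sleeping for 10 seconds, then retrying...")
        time.sleep(10)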
--Filnik 12:15, 5 May 2007 (GMT)
- The code looks like it closes the connection when it is done, but mytop tells me something different. I think the bug may be in the (not) closing of the connection. See below. Siebrand 18:54, 5 May 2007 (GMT)
1534541 orgullo hemlock u_orgullo_  1 Sleep
1534716 orgullo hemlock u_orgullo_  1 Sleep
1534816 orgullo hemlock u_orgullo_  1 Sleep
1534775 orgullo hemlock u_orgullo_  2 Sleep
1534771 orgullo hemlock u_orgullo_  3 Sleep
1534937 orgullo hemlock u_orgullo_  4 Sleep
1534555 orgullo hemlock u_orgullo_  7 Sleep
1534562 orgullo hemlock u_orgullo_  8 Sleep
1534639 orgullo hemlock u_orgullo_  8 Sleep
1534508 orgullo hemlock u_orgullo_ 10 Sleep
1534545 orgullo hemlock u_orgullo_ 10 Sleep
1534685 orgullo hemlock u_orgullo_ 10 Sleep
1534907 orgullo hemlock u_orgullo_ 19 Sleep
1534613 orgullo hemlock u_orgullo_ 25 Sleep
Uhm... I don't know... maybe Martinp23 can help you with MySQL ^__^ --Filnik 20:23, 5 May 2007 (GMT)
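One possible way to "cap the connections and queue work", as suggested earlier in this thread, is to let a single worker thread own the only database connection and have every other thread put its log rows on a queue. The sketch below is written in current Python and is not taken from replacer.py. The class name `LogWriter` and its arguments are hypothetical: `connect` is any function returning a DB-API connection (for example a wrapper around `MySQLdb.connect`), and `insert_sql` is an INSERT statement using that driver's placeholder style.

```python
import queue
import threading


class LogWriter(threading.Thread):
    """Single worker that owns the only database connection and writes queued log rows."""

    def __init__(self, connect, insert_sql):
        super().__init__(daemon=True)
        self._connect = connect
        self._insert_sql = insert_sql
        self._rows = queue.Queue()

    def record(self, row):
        """Called from any thread: queue one row instead of opening a new connection."""
        self._rows.put(row)

    def stop(self):
        """Write everything queued so far, then close the connection."""
        self._rows.put(None)  # sentinel: tells the worker to finish
        self.join()

    def run(self):
        conn = self._connect()
        try:
            while True:
                row = self._rows.get()
                if row is None:
                    break
                cursor = conn.cursor()
                cursor.execute(self._insert_sql, row)
                cursor.close()
                conn.commit()
        finally:
            # Always release the connection, even if a write fails.
            conn.close()
```

With this layout the logging side never uses more than one of the 15 allowed connections, and the `finally` block makes sure that connection is closed rather than left sleeping. A row that fails to insert would still end the worker in this sketch, so real code would probably want to catch and report that per row.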