/dev/random
As part of the 2011 Wikimedia Summer of Research, we uncovered a possible correlation between the decline in new active editors that began in 2007 and the rise of warnings issued to new users by bots and automated tools, which started in 2006.
http://blog.wikimedia.org/2012/03/27/analysis-of-the-quality-of-newcomers-in-wikipedia-over-time/

L=A=N=G=U=A=G=E 
James Joyce dictionary 
deception   
detection of Alzheimer's = Memoirs of My Nervous Illness by Daniel Paul Schreber
http://en.wikipedia.org/wiki/Daniel_Paul_Schreber
http://www.luftgangster.de/schreber/start.html



http://en.wikipedia.org/wiki/War_on_Terror

Wikipedia / MediaWiki API: http://www.mediawiki.org/wiki/API:Main_page

Documentation for pattern.en (the English module of Pattern): http://www.clips.ua.ac.be/pages/pattern-en


Tasks!

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&titles=Main%20Page

Getting the history of a Wikipedia page.

Revision properties:
http://www.mediawiki.org/wiki/API:Properties#revisions_.2F_rv
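
A minimal sketch of how a revisions query URL like the one under Tasks can be assembled from its parameters. The helper name build_revisions_url and its defaults are made up for illustration; the parameter names (action, prop, rvprop, format, titles, rvlimit) are the ones the API documents. Written for Python 3.

```python
from urllib.parse import quote, urlencode

API_ENDPOINT = 'http://en.wikipedia.org/w/api.php'


def build_revisions_url(title, rvprop=('content',), fmt='xml', rvlimit=None):
    """Build a prop=revisions query URL (hypothetical helper, not part of any library)."""
    params = [
        ('action', 'query'),
        ('prop', 'revisions'),
        # Several revision properties are joined with '|', e.g. timestamp|user.
        ('rvprop', '|'.join(rvprop)),
        ('format', fmt),
        ('titles', title),
    ]
    if rvlimit is not None:
        params.append(('rvlimit', str(rvlimit)))
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return API_ENDPOINT + '?' + urlencode(params, quote_via=quote)


print(build_revisions_url('Main Page'))
print(build_revisions_url('War_on_Terror', ('timestamp', 'user'), 'json', 10))
```

Called with the defaults, this reproduces the Main Page URL listed under Tasks.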


Useful wiki API URLs:

Python diffing
http://localhost/doc/python2.7/html/library/difflib.html?highlight=diff#difflib
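
A small sketch of what diffing two revisions with difflib could look like. The function name revision_diff and the sample texts are invented for illustration; difflib.unified_diff is the standard-library call. Written for Python 3.

```python
import difflib


def revision_diff(old_text, new_text, old_label='old', new_label='new'):
    """Return a unified diff between two revision texts as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm='',  # the input lines carry no newlines, so add none to the headers
    ))


# Made-up revision texts, only to show the shape of the output.
old = "The war began in 2001.\nIt is ongoing."
new = "The war began in 2001.\nIt was declared over in 2013."
for line in revision_diff(old, new, 'rev 1', 'rev 2'):
    print(line)
```

Lines starting with '-' exist only in the older text, lines starting with '+' only in the newer one.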

How subjective is the Wikipedia article on Neutrality?
Wikipedia article: Neutrality
  
'Neutral' is 0.39375 subjective. 
'Politics and social science' is 0.195238095238 subjective. 
'Mathematics and natural science' is 0.338293650794 subjective. 
'Geographic locations' is 0.0 subjective. 
'Other and related senses' is 0.3875 subjective.

Wikipedia article: Subjectivity

'Subjectivity' is 0.405013736264 subjective and -0.00144230769231 positive. 
'Society' is 0.414166666667 subjective and 0.015 positive. 
'Self' is 0.37 subjective and -0.0388888888889 positive. 
'See also' is 0.333333333333 subjective and -0.166666666667 positive. 
'References' is 0.0 subjective and 0.0 positive. 
'Further reading' is 0.176136363636 subjective and 0.0340909090909 positive.
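
A sketch of how per-section lines like the ones above could be produced. It assumes each section is available as a (title, text) pair and that the scorer returns a (polarity, subjectivity) tuple, which is the shape pattern.en's sentiment() returns. toy_scorer is a made-up stand-in so the example runs without Pattern installed; its numbers mean nothing. Written for Python 3.

```python
def section_report(sections, scorer):
    """Return one report line per (title, text) pair.

    scorer is any callable mapping text to a (polarity, subjectivity) tuple.
    """
    lines = []
    for title, text in sections:
        polarity, subjectivity = scorer(text)
        lines.append("'%s' is %s subjective and %s positive."
                     % (title, subjectivity, polarity))
    return lines


def toy_scorer(text):
    """Hypothetical scorer: counts a few opinion words (illustration only)."""
    words = text.lower().split()
    if not words:
        return 0.0, 0.0
    positive = sum(w in ('good', 'best') for w in words)
    negative = sum(w in ('bad', 'worst') for w in words)
    polarity = (positive - negative) / len(words)
    subjectivity = (positive + negative) / len(words)
    return polarity, subjectivity


# Made-up section texts, not taken from any article.
sections = [('Self', 'good bad good good'), ('References', '')]
for line in section_report(sections, toy_scorer):
    print(line)
```

With Pattern installed, passing sentiment (from pattern.en import sentiment) as the scorer should give real scores.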

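# ;) / tourette.py
# Read pattern.en's PROFANITY word list aloud with the festival speech synthesizer.
import os

from pattern.en.wordlist import PROFANITY

for word in PROFANITY:
    print word
    # Pipe each word into festival's text-to-speech mode.
    os.system('echo "' + word + '" | festival --tts --pipe')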



#!/usr/bin/python
#getting: time|user|content
import urllib
import json
from csv import writer

pagetitle = 'War_on_Terror'
#baseq = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvli
baseq = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvlim
q=baseq
count = 0
csvfile = open('revisions.csv', 'wb')
w = writer(csvfile, dialect='excel')
while True:
    results = json.load(urllib.urlopen(q))
    p = results['query']['pages']
    for key in p: pass
    revs = p[key]['revisions']
    count += len(revs)
    print revs[-1]['timestamp']
    print len(revs)
    for r in revs:
        w.writerow((r['revid'], r['timestamp'], r['user'].encode('utf-8'), r['co
    rvcontinue = None
    if 'query-continue' in results:
        if 'revisions' in results['query-continue']:
            if 'rvcontinue' in results['query-continue']['revisions']:
                rvcontinue = results['query-continue']['revisions']['rvcontinue']
                q = baseq+"&rvcontinue="+str(rvcontinue)
    if rvcontinue==None:
        break
    break
print "done"
print count, "total revs"
csvfile.close()
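
A self-contained sketch of the same idea (page through a revision history, write one CSV row per revision), written for Python 3. Several things here are assumptions, not taken from the script above: the rvlimit of 50, the rvprop list, the User-Agent string, the choice of the edit comment as the fourth column, and the use of the API's newer top-level 'continue' block in place of 'query-continue'. The fetch function is passed in so the paging logic can be tried without network access.

```python
import csv
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = 'https://en.wikipedia.org/w/api.php'


def fetch_json(params):
    """Send one API request and decode the JSON reply (needs network access)."""
    request = Request(API + '?' + urlencode(params),
                      headers={'User-Agent': 'revision-history-sketch/0.1'})
    with urlopen(request) as response:
        return json.load(response)


def iter_revisions(title, fetch=fetch_json, limit=50):
    """Yield revision dicts for one page, following continuation tokens."""
    params = {
        'action': 'query', 'prop': 'revisions', 'format': 'json',
        'titles': title, 'rvlimit': limit,
        'rvprop': 'ids|timestamp|user|comment',  # assumed property list
    }
    while True:
        results = fetch(params)
        for page in results['query']['pages'].values():
            for rev in page.get('revisions', []):
                yield rev
        if 'continue' not in results:
            break  # last batch reached
        # Merge the continuation values (e.g. rvcontinue) into the next request.
        params = dict(params, **results['continue'])


def write_revisions(revisions, out):
    """Write revid, timestamp, user, comment rows to an open text file."""
    w = csv.writer(out, dialect='excel')
    count = 0
    for r in revisions:
        # 'user' and 'comment' can be missing when they are hidden.
        w.writerow((r['revid'], r['timestamp'],
                    r.get('user', ''), r.get('comment', '')))
        count += 1
    return count
```

A possible call, which does need network access: open 'revisions.csv' with open('revisions.csv', 'w', newline='', encoding='utf-8') and pass iter_revisions('War_on_Terror') together with that file to write_revisions.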


# GETTING DATASETS FOR USING TAXONOMIES