Using Google Visualization API with own data source

The Google Visualization API allows you to create charts and maps based on data you provide. This data can live in a Google Spreadsheet or come from a data source you write yourself. The visualizations themselves are mostly written in JavaScript, although some are written in Flash.

In this article we will create an HTML page with multiple charts in it and a data source that is generated by a Python program. We start with the HTML page, which has two named divs in it (visualization1 and visualization2) and includes the JavaScript code that loads the two charts into these divs.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
      <title>
         Google Visualization API
      </title>
      <script type="text/javascript" src="http://www.google.com/jsapi"></script>
      <script type="text/javascript">
         google.load('visualization', '1', {packages: ['columnchart', 'linechart']});
      </script>
      <script type="text/javascript">

         var query1, visualization1;
         var query2, visualization2;

         function initialize() {
            visualization1 = new google.visualization.ColumnChart(document.getElementById('visualization1'));
            query1 = new google.visualization.Query('http://jansipke.nl/res/visualization/chart-data.py');
            query1.setRefreshInterval(5);
            query1.send(drawVisualization1);

            visualization2 = new google.visualization.LineChart(document.getElementById('visualization2'));
            query2 = new google.visualization.Query('http://jansipke.nl/res/visualization/chart-data.py');
            query2.setRefreshInterval(5);
            query2.send(drawVisualization2);
         }

         function drawVisualization1(response) {
            if (response.isError()) {
               alert('Error in query: ' + response.getMessage() + ' ' + response.getDetailedMessage());
               return;
            }
            visualization1.draw(response.getDataTable(), {legend: 'bottom', title: 'ColumnChart'});
         }

         function drawVisualization2(response) {
            if (response.isError()) {
               alert('Error in query: ' + response.getMessage() + ' ' + response.getDetailedMessage());
               return;
            }
            visualization2.draw(response.getDataTable(), {legend: 'bottom', title: 'LineChart'});
         }

         google.setOnLoadCallback(initialize);
      </script>
   </head>
   <body>
      <div>
         <div id="visualization1" style="height: 250px; width: 400px; border: 1px solid; float: left;" />
      </div>
      <div>
         <div id="visualization2" style="height: 250px; width: 400px; border: 1px solid; float: left; margin-left: 10px" />
      </div>
   </body>
</html>

There are two things that you really need to do before this works correctly:

  • Make sure that the parent node of the div that holds your chart is not also the parent of one of the other divs that hold a chart. In the HTML page above we wrapped each chart div in its own div to accomplish this.
  • The chart is updated with new data every 5 seconds. It is a known bug of the Google Visualization API that the data needs to be fetched from a different server than the one that is hosting the HTML page for the refresh to actually work. In the HTML page above the data source chart-data.py is therefore fetched from jansipke.nl and the HTML is fetched from www.jansipke.nl.

We will create our own data source in Python. There is a description of how you should do this on the Writing Your Own Data Source page from Google. One of the most common mistakes in writing these data sources is forgetting to read the request identifier (reqId) and return its value in the response. The API needs this to distinguish between responses for different charts on the same page.

import random

def index(req):
   # Look for the request identifier (reqId) inside the tqx parameter.
   # The raw query string is not URL-decoded, so ":" shows up as "%3A".
   reqId = None
   if (req.args):
      for arg in req.args.split("&"):
         if ("=" not in arg):
            continue
         (key, value) = arg.split("=", 1)
         if (key == "tqx"):
            for parameter in value.split(";"):
               if (parameter.find("%3A") > 0):
                  (par_key, par_value) = parameter.split("%3A", 1)
                  if (par_key == "reqId"):
                     reqId = par_value

   # Random values for the last column, so that each refresh looks different
   a = str(random.randint(1, 3))
   b = str(random.randint(1, 3))
   c = str(random.randint(1, 3))
   d = str(random.randint(1, 3))

   # Build the response; reqId is only echoed back if the request contained one
   s = ""
   s += "google.visualization.Query.setResponse(\n"
   s += "{\n"
   if (reqId != None):
      s += "   reqId:'" + reqId + "',\n"
   s += "   status:'ok',\n"
   s += "   table:\n"
   s += "   {\n"
   s += "      cols:\n"
   s += "      [\n"
   s += "         {id:'Col1',label:'',type:'string'},\n"
   s += "         {id:'Col2',label:'Label1',type:'number'},\n"
   s += "         {id:'Col3',label:'Label2',type:'number'},\n"
   s += "         {id:'Col4',label:'Label3',type:'number'}\n"
   s += "      ],\n"
   s += "      rows:\n"
   s += "      [\n"
   s += "         {c:[{v:'a',f:'a'},{v:1.0,f:'1'},{v:1.0,f:'1'},{v:" + a + ",f:'1'}]},\n"
   s += "         {c:[{v:'b',f:'b'},{v:2.0,f:'2'},{v:1.5,f:'1'},{v:" + b + ",f:'1'}]},\n"
   s += "         {c:[{v:'c',f:'c'},{v:3.0,f:'3'},{v:2.5,f:'1'},{v:" + c + ",f:'1'}]},\n"
   s += "         {c:[{v:'d',f:'d'},{v:4.0,f:'1'},{v:2.0,f:'1'},{v:" + d + ",f:'1'}]}\n"
   s += "      ]\n"
   s += "   }\n"
   s += "});"

   return s

We can test this data source by following the link without parameters and following the link with the reqId parameter present:

Notice that the first one does not have reqId present in the response, but the second one does.
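
The reqId handling can also be tried out locally. The following is only a rough Python 3 sketch that leans on the standard library for URL decoding and JSON serialisation; the helper names extract_req_id and build_response are made up for this illustration, the table is trimmed to two columns, and it is not the code that runs on the server.

```python
import json
from urllib.parse import parse_qs


def extract_req_id(query_string):
    """Return the reqId found in the tqx parameter, or None if it is absent."""
    params = parse_qs(query_string or "")
    for tqx in params.get("tqx", []):
        # Once URL-decoded, tqx looks like "version:0.6;reqId:1".
        for pair in tqx.split(";"):
            key, sep, value = pair.partition(":")
            if sep and key == "reqId":
                return value
    return None


def build_response(query_string, rows):
    """Wrap a small two-column table in a setResponse() call."""
    response = {}
    req_id = extract_req_id(query_string)
    if req_id is not None:
        response["reqId"] = req_id
    response["status"] = "ok"
    response["table"] = {
        "cols": [
            {"id": "Col1", "label": "", "type": "string"},
            {"id": "Col2", "label": "Label1", "type": "number"},
        ],
        "rows": [{"c": [{"v": name}, {"v": value}]} for name, value in rows],
    }
    return "google.visualization.Query.setResponse(" + json.dumps(response) + ");"


# Without and with a request identifier in the query string.
print(build_response("", [("a", 1.0)]))
print(build_response("tqx=reqId%3A1", [("a", 1.0)]))
```

Only the second call should echo a reqId, which mirrors the behaviour described above.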

Update: it seems that the refreshing of data only happens correctly in Firefox and Opera. IE doesn’t refresh at all and Chrome only refreshes once. Oh joy!

Using FANN with Python

In the previous article we showed how to install the FANN artificial neural network library on Ubuntu. In this article we will use the library.

There are typically two parts in using artificial neural networks:

  • A training part, where the neural network is trained with a training dataset. This dataset is chosen in such a way that it is representative of the real cases that it will see in the running part.
  • An execution part, where the neural network is executed on a real dataset. If the neural network was trained correctly, it can now give answers for input it has seen in the training dataset, but also for input it has never seen.

The example we use is the well-known XOR operation. If we let -1 represent a false value and 1 represent a true value, the XOR operation gives the following output for each input:

   input 1   input 2   output
     -1        -1        -1
     -1         1         1
      1        -1         1
      1         1        -1

We will start by creating a training dataset. FANN expects a dataset to be in a specific format. The first line of this file contains three numbers. The first number tells FANN how many samples it can expect. The second number tells it how many input values there are for one sample. The third number tells FANN how many outputs there are for one sample. The rest of the file contains the samples, where the inputs are placed on one line and the corresponding output is placed on the next line. Our training dataset then looks like this:

4 2 1
-1 -1
-1
-1 1
1
1 -1
1
1 1
-1
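
The file format described above is simple enough to generate from Python. This is a minimal Python 3 sketch, assuming every sample has the same number of inputs and outputs; the function name format_fann_data is invented for this example and is not part of FANN.

```python
def format_fann_data(samples):
    """Format (inputs, outputs) pairs as the text of a FANN training file."""
    num_inputs = len(samples[0][0])
    num_outputs = len(samples[0][1])
    # Header line: number of samples, inputs per sample, outputs per sample.
    lines = ["%d %d %d" % (len(samples), num_inputs, num_outputs)]
    for inputs, outputs in samples:
        lines.append(" ".join(str(v) for v in inputs))
        lines.append(" ".join(str(v) for v in outputs))
    return "\n".join(lines) + "\n"


xor_samples = [
    ((-1, -1), (-1,)),
    ((-1, 1), (1,)),
    ((1, -1), (1,)),
    ((1, 1), (-1,)),
]

print(format_fann_data(xor_samples), end="")
```

Writing the returned string to a file gives the same content as the dataset listed above.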

We save this file as xor.data. Now we can write the training part:

#!/usr/bin/python

import libfann

connection_rate = 1
learning_rate = 0.7
num_input = 2
num_hidden = 4
num_output = 1

desired_error = 0.0001
max_iterations = 100000
iterations_between_reports = 1000

ann = libfann.neural_net()
ann.create_sparse_array(connection_rate, (num_input, num_hidden, num_output))
ann.set_learning_rate(learning_rate)
ann.set_activation_function_output(libfann.SIGMOID_SYMMETRIC_STEPWISE)

ann.train_on_file("xor.data", max_iterations, iterations_between_reports, desired_error)

ann.save("xor.net")

The neural network is saved in the file xor.net. We use this file in the execution part:

#!/usr/bin/python

import libfann

ann = libfann.neural_net()
ann.create_from_file("xor.net")

print ann.run([1, -1])

The output of this run is 1, which is the correct answer.
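
In practice a trained network tends to return a floating-point value close to 1 or -1 rather than the exact number. Assuming run returns a list with one float per output neuron, a small helper (the name to_truth_value is made up here) could map that value back to the -1/1 convention. This is a Python 3 sketch with hypothetical numbers, not output from a real run.

```python
def to_truth_value(outputs, threshold=0.0):
    """Map the first network output to 1 (true) or -1 (false)."""
    return 1 if outputs[0] > threshold else -1


# Hypothetical network outputs, not taken from a real run.
print(to_truth_value([0.97]))
print(to_truth_value([-0.99]))
```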

Installing FANN with Python bindings on Ubuntu

The Fast Artificial Neural Network Library (FANN) is a neural network library, which implements multilayer artificial neural networks in C with support for both fully connected and sparsely connected networks. It has a Python binding that allows you to use its functionality from within Python, but with the bits that need speed implemented in C.

We are going to install the FANN library on Ubuntu and install the Python binding. Get and unzip the library:

wget http://downloads.sourceforge.net/project/fann/fann/2.1.0beta/fann-2.1.0beta.zip
sudo apt-get install unzip
unzip fann-2.1.0beta.zip

Configure, make and install the library:

cd fann-2.1.0/
sudo apt-get install gcc make
./configure
make
sudo make install

Install the Python bindings:

cd python/
sudo apt-get install g++ python-dev swig
sudo python setup.py install

The Python files are now located in a build directory. Copy them to a place where you can use them, e.g. your home directory:

cd build/lib.linux-i686-2.6/pyfann/
cp libfann.py ~
cp _libfann.so ~

And finally test that Python can now work with the library, start up Python and type:

import libfann
print dir(libfann)

This should print out all the functions of the library.
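
If the import fails, it can help to check whether Python is able to locate the module at all. A small Python 3 sketch of such a check, where the helper name module_available is an assumption of this example:

```python
import importlib.util


def module_available(name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(name) is not None


# Prints False when libfann.py is not on the module search path.
print(module_available("libfann"))
```

Running this from the directory that contains libfann.py and from another directory shows whether the copy step above put the files where Python looks for them.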

Examples of public SOAP web services

In the previous article we showed how to use the suds library in Python to access SOAP web services. Here are some extra examples of public SOAP web services. Some of the outputs are edited for readability. We start with a recap of the one we used in the previous article.

Map IP address to country

import suds

url = "http://www.ecubicle.net/iptocountry.asmx?wsdl"
client = suds.client.Client(url)

print client.service.FindCountryAsString("194.145.200.104")

<?xml version="1.0"?>
<IPAddressService>
   <country>Netherlands</country>
</IPAddressService>

Country Information

import suds

url = "http://webservices.oorsprong.org/websamples.countryinfo/CountryInfoService.wso?WSDL"
client = suds.client.Client(url)

print client.service.ListOfCountryNamesByCode()
print client.service.CountryISOCode("Netherlands")
print client.service.CapitalCity("NL")
print client.service.CountryCurrency("NL")

(ArrayOftCountryCodeAndName){
   tCountryCodeAndName[] =
      (tCountryCodeAndName){
         sISOCode = "AD"
         sName = "Andorra"
      },
...
     (tCountryCodeAndName){
         sISOCode = "ZW"
         sName = "Zimbabwe"
      },
 }

NL

Amsterdam

(tCurrency){
   sISOCode = "EUR"
   sName = "Euro"
 }

Whois Service

import suds

url = "http://www.ecubicle.net/whois_service.asmx?WSDL"
client = suds.client.Client(url)

print client.service.Whois("whois.tucows.com", 43, "google.com")

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

   Domain Name: GOOGLE.COM
   Registrar: MARKMONITOR INC.

...

ISBN Test

import suds

url = "http://webservices.daehosting.com/services/isbnservice.wso?WSDL"
client = suds.client.Client(url)

print client.service.IsValidISBN13("9789059650886")

True
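
For comparison, the same answer can be computed locally. The sketch below assumes the service applies the standard ISBN-13 check-digit rule (digits weighted alternately by 1 and 3, total divisible by 10); the function name is_valid_isbn13 is ours and the code is Python 3.

```python
def is_valid_isbn13(isbn):
    """Check an ISBN-13 using the standard alternating 1/3 weighting."""
    digits = isbn.replace("-", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


print(is_valid_isbn13("9789059650886"))
```

For the ISBN used above the weighted sum is 150, so the check passes.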

US Zip Validator

import suds

url = "http://www.webservicemart.com/uszip.asmx?WSDL"
client = suds.client.Client(url)

print client.service.ValidateZip("90210")

<result code="200">
   <item
      zip="90210"
      state="CA"
      latitude="34.0888"
      longitude="-118.40612"
   />
</result>

Number Conversion

import suds

url = "http://www.dataaccess.com/webservicesserver/numberconversion.wso?WSDL"
client = suds.client.Client(url)

print client.service.NumberToWords(2931)

two thousand nine hundred and thirty one

Python SOAP client with suds

The library suds allows Python to make SOAP calls (that is, Python is the web service client).

We start by installing the suds library on an Ubuntu machine. The Python setuptools are needed to install suds.

sudo apt-get install python-setuptools

Then we download, unpack and install suds.

wget https://fedorahosted.org/releases/s/u/suds/python-suds-0.3.7.tar.gz
tar -zxvf python-suds-0.3.7.tar.gz
cd python-suds-0.3.7
sudo python setup.py install

The library is now ready to use. We start by importing the suds library, creating a client based on a SOAP url, and asking the library to print the SOAP web service methods that are available to us.

import suds
url = "http://www.ecubicle.net/iptocountry.asmx?wsdl"
client = suds.client.Client(url)
print client

From the output of the last print command, we learn that there is a method called FindCountryAsString that takes one argument: the IP address.

print client.service.FindCountryAsString("194.145.200.104")

And it shows (edited for readability):

<?xml version="1.0"?>
<IPAddressService>
   <country>Netherlands</country>
</IPAddressService>

Normally you want to have the contents of the SOAP body. This is what suds provides in a very elegant way. However, you’re a bit stuck when you want to get something from the SOAP header. The author of suds realised this and made a backdoor to get the information anyway. We start by showing what the function last_received contains:

print client.last_received()

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope>
   <soap:Header>
      <ResponseHeader xmlns="">
         <resultCode>1000</resultCode>
         <resultDescription>Success</resultDescription>
      </ResponseHeader>
   </soap:Header>
   <soap:Body>
...
   </soap:Body>
</soap:Envelope>

We can get portions of this data by doing some XML handling. Let’s say we want to print the resultCode:

print client.last_received().getChild("soap:Envelope").getChild("soap:Header").getChild("ResponseHeader").getChild("resultCode").getText()
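
The same lookup can be tried without a live service by parsing a saved envelope with the standard library. This is a Python 3 sketch; the envelope is a cut-down copy of the one shown above, and the soap namespace URI is an assumption (the usual SOAP 1.1 one), since the edited output does not show the declaration. The helper name header_value is made up for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative envelope; the xmlns:soap declaration is an assumption.
ENVELOPE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
   <soap:Header>
      <ResponseHeader xmlns="">
         <resultCode>1000</resultCode>
         <resultDescription>Success</resultDescription>
      </ResponseHeader>
   </soap:Header>
   <soap:Body/>
</soap:Envelope>"""

NAMESPACES = {"soap": "http://schemas.xmlsoap.org/soap/envelope/"}


def header_value(envelope_xml, name):
    """Return the text of a child of ResponseHeader, or None if it is missing."""
    root = ET.fromstring(envelope_xml)
    return root.findtext("soap:Header/ResponseHeader/" + name, namespaces=NAMESPACES)


print(header_value(ENVELOPE, "resultCode"))
```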

Using Python to add new posts in WordPress

WordPress is a web publishing platform that allows you to add and edit content on your website. This can be done by visiting the administration interface, wp-admin. However, there is also an XML-RPC programming interface that allows you to do the same from any programming language that supports this interface.

In this short article I will show how to add new posts to WordPress from Python. There are several libraries available that allow you to view and edit content, but they were all inappropriate for me. I wanted to be able to convert my old website into WordPress posts and have the published date set to the date I originally posted items on my old site. However, every library I tried either didn’t have the possibility to set the published date, or WordPress didn’t understand it.

Here’s the short piece of code that allowed me to add new posts to WordPress, without the use of any third-party library:

import datetime, xmlrpclib

wp_url = "http://www.example.com/xmlrpc.php"
wp_username = "someuser"
wp_password = "secret"
wp_blogid = ""

status_draft = 0
status_published = 1

server = xmlrpclib.ServerProxy(wp_url)

title = "Title with spaces"
content = "Body with lots of content"
date_created = xmlrpclib.DateTime(datetime.datetime.strptime("2009-10-20 21:08", "%Y-%m-%d %H:%M"))
categories = ["somecategory"]
tags = ["sometag", "othertag"]
data = {'title': title, 'description': content, 'dateCreated': date_created, 'categories': categories, 'mt_keywords': tags}

post_id = server.metaWeblog.newPost(wp_blogid, wp_username, wp_password, data, status_published)
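
To see what is actually sent to WordPress, and in particular how the published date is encoded, the request can be serialised without contacting a server. This is a Python 3 sketch (xmlrpclib is called xmlrpc.client there) that reuses the placeholder credentials from above; it only builds the XML and does not post anything.

```python
import datetime
import xmlrpc.client

date_created = xmlrpc.client.DateTime(
    datetime.datetime.strptime("2009-10-20 21:08", "%Y-%m-%d %H:%M"))

data = {
    "title": "Title with spaces",
    "description": "Body with lots of content",
    "dateCreated": date_created,
    "categories": ["somecategory"],
    "mt_keywords": ["sometag", "othertag"],
}

# Same argument order as the newPost call above: blog id, user, password,
# post data, publish flag. The credentials are placeholders.
payload = xmlrpc.client.dumps(
    ("", "someuser", "secret", data, 1),
    methodname="metaWeblog.newPost",
)
print(payload)
```

The dateCreated member shows up as a dateTime.iso8601 value in the printed XML.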

Threads in Python

A thread, sometimes called an execution context or a lightweight process, is a single sequential flow of control within a program. You use threads to isolate tasks. Each thread runs independently of the others, but at the same time.

We want to run a certain function a specified number of times per second. A class should be given such a function and some parameters, such as the maximum number of threads, the number of calls to start per second, and a total duration.

import threading, time

class Fire:

   def soldier(self, soldier_nr):

      while (1):

         self.work_lock.acquire()

         if (self.still_working):

            self.write_lock.acquire()
            self.nr_active_soldiers += 1
            self.write_lock.release()

            start_time_counter = self.time_counter

            self.function.__call__(self.arguments)

            end_time_counter = self.time_counter

            self.write_lock.acquire()
            self.nr_active_soldiers -= 1
            self.statistics.append((soldier_nr, start_time_counter, end_time_counter))
            self.write_lock.release()

         else:

            break

   def __init__(self, function, arguments, nr_threads, per_second, duration):

       self.function           = function
       self.arguments          = arguments
       self.nr_threads         = nr_threads
       self.per_second         = per_second
       self.duration           = duration
       self.work_lock          = threading.Semaphore(0)
       self.write_lock         = threading.Semaphore(1)
       self.still_working      = 1
       self.nr_active_soldiers = 0
       self.time_counter       = 0
       self.statistics         = []

   def fire(self):

       print 'Start'

       for i in range(self.nr_threads):
          thread1 = threading.Thread(target=self.soldier, args=(i, ))
          thread1.start()

       for second in range(self.duration):
          for i in range(self.per_second):
             self.work_lock.release()
          self.write_lock.acquire()
          if (len(self.statistics) > 0):
             (soldier_nr, start_time_counter, end_time_counter) = self.statistics[-1]
             delta_time = end_time_counter - start_time_counter
          else:
             delta_time = -1
          print 'Active', self.nr_active_soldiers, 'Time', delta_time
          self.write_lock.release()
          time.sleep(1)
          self.time_counter += 1

       self.still_working = 0
       for i in range(self.nr_threads):
          self.work_lock.release()

       time.sleep(5)

       output_file = open('output.txt', 'wb')

       for statistic in self.statistics:
           output_file.write(str(statistic) + '\n')

       output_file.close()

       print 'End'

if (__name__ == '__main__'):

   def waste_time(arguments):

       time.sleep(2.5)

   nr_threads = 20
   per_second = 3
   duration   = 15

   firing_squad = Fire(waste_time, None, nr_threads, per_second, duration)
   firing_squad.fire()
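
A variation on the same idea for Python 3 is sketched below. It swaps the semaphore and the still_working flag for a queue.Queue with one None sentinel per worker, which lets the main thread join the workers instead of sleeping for a fixed 5 seconds. The function name fire and its parameters are chosen for this example only; it is not a drop-in replacement for the class above.

```python
import queue
import threading
import time


def fire(task, nr_threads, per_tick, ticks, tick_seconds=1.0):
    """Queue per_tick calls of task every tick; return (worker, seconds) pairs."""
    jobs = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker(worker_nr):
        while True:
            job = jobs.get()
            if job is None:  # sentinel: no more work for this worker
                break
            started = time.monotonic()
            task()
            elapsed = time.monotonic() - started
            with results_lock:
                results.append((worker_nr, elapsed))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(nr_threads)]
    for thread in threads:
        thread.start()

    # Hand out a fixed number of jobs per tick.
    for _ in range(ticks):
        for _ in range(per_tick):
            jobs.put(True)
        time.sleep(tick_seconds)

    # One sentinel per worker, queued behind all real jobs.
    for _ in threads:
        jobs.put(None)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    stats = fire(lambda: time.sleep(0.05), nr_threads=4, per_tick=2, ticks=3, tick_seconds=0.1)
    print(len(stats), "calls completed")
```

Because the sentinels are queued after all jobs, every queued call is completed before the workers exit. If all workers are busy, new jobs simply wait in the queue until a worker is free.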