I’ve done quite a bit of web development with Django, and one thing that doesn’t really come up when doing standard CRUD work is that almost all web sites are linear: take a request, process it and spit out a response. Sure, your underlying architecture uses threading, but that’s invisible to the humble web developer, and most of the time you never touch it directly.

Recently I’ve been working on an application, a genuine client/server system that didn’t use HTTP, didn’t use SQL databases and relied solely on good ol’ memory to function continuously and respond to multiple requests quickly without locking up a client.

This is where I learned that threading is awesome, and learned to love the Queue.

What is threading?

Threading is when you take a function or a process that takes a long time and is quite independent from the rest of your application and let it run in its own ‘thread’ of execution. The function splits off from your main execution loop and finishes its business of its own accord, while still sharing memory with the rest of your program. The benefits? It frees up your application to stay responsive, so it doesn’t ignore other user input while the slow work is executing.

Take for example a web server. If it were a linear application it would not accept new requests while it was still processing a page request from a user. A threaded web server instead takes the request, launches a thread to handle it and immediately returns to a listening state for the next one, without having to process the request in its main loop.
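
Here is a rough sketch of that "hand off and keep listening" idea. It is not a real web server: there are no sockets, the requests are just a list of made-up paths, and handle_request() and serve() are names I invented for illustration. It is written for Python 3.

```python
import threading
import time

handled = []                      # responses collected from the worker threads
handled_lock = threading.Lock()   # protects the shared list


def handle_request(request_id):
    """Pretend to build a page: slow work that runs in its own thread."""
    time.sleep(0.1)  # stand-in for slow page generation
    with handled_lock:
        handled.append(request_id)


def serve(requests):
    """Hand each incoming request to a new thread and go straight back to 'listening'."""
    workers = []
    for request_id in requests:
        worker = threading.Thread(target=handle_request, args=(request_id,))
        worker.start()  # returns immediately; the work carries on in the background
        workers.append(worker)
    return workers


workers = serve(["/home", "/about", "/contact"])

# Wait for every response to finish before the program exits
for worker in workers:
    worker.join()

print("Handled: %s" % sorted(handled))
```

The loop in serve() never waits for a page to finish, which is the whole point: the slow part happens in the worker threads.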

A threading example in Python

import threading
import time


class count_stuff(threading.Thread):
    """
        A thread class that will count a number, sleep and output that number
    """

    def __init__(self, start_num, end):
        self.num = start_num
        self.end = end  # keep the end value so run() can see it
        threading.Thread.__init__(self)

    def run(self):
        while True:
            if self.num != self.end:
                self.num += 1
                # Parenthesised print works in both Python 2 and Python 3
                print("Outputting: %s" % self.num)
                time.sleep(5)
            else:
                break


myThread = count_stuff(1, 5)
myThread.start()

Let’s walk through the code:

  1. First we import threading and time: the threading module provides Thread, the base class that allows you to run your function/object as a thread, and time lets us sleep between steps
  2. Since we want to pass values to initialise the thread, we override the __init__() function. Remember to call the base class __init__() function as well, otherwise you will get an error when you start the thread
  3. The run() function is the real meat of your class: this is the code that executes in the new thread once you call start(). In this case, as long as the number has not reached our end value we increment it, output it and sleep 5 seconds; once it equals the end value we break
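
If you want to see what happens when the base class __init__() call is left out, here is a small sketch. The class name forgetful_thread is made up for this example, and the code is written for Python 3; the exact wording of the error message may differ between Python versions.

```python
import threading


class forgetful_thread(threading.Thread):
    """Hypothetical example class that forgets to call the base __init__()."""

    def __init__(self, start_num):
        self.num = start_num  # the base class __init__() is never called

    def run(self):
        print("Outputting: %s" % self.num)


error_message = None
try:
    # start() refuses to run a Thread that was never initialised properly
    forgetful_thread(1).start()
except RuntimeError as exc:
    error_message = str(exc)
    print("Could not start thread: %s" % error_message)
```

Creating the object works fine; the failure only shows up at start(), which can make it confusing to track down.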

Pretty simple right?

The kicker is that you might want to do something with the output of your calculation separately in another part of your application – this is where a Queue comes in.

What is a Queue?

A Queue is a thread-safe data structure held in memory: several threads can put items in and take items out at the same time without corrupting it. (By contrast, if you started 5 threads of the above and had them all write to a text file, two or more threads could try to write at the same time and garble the output or cause an error.) Python’s Queue module provides three orderings: FIFO (Queue), LIFO (LifoQueue) and sorted by priority (PriorityQueue).
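
To make the three orderings concrete, here is a small sketch that puts the same numbers into each kind of queue and reads them back. It is written for Python 3, where the module is named queue (lowercase) rather than Queue; drain() is a helper I made up for the example.

```python
import queue  # this module is called Queue in Python 2


def drain(q):
    """Remove and return every item currently in the queue, in retrieval order."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items


fifo = queue.Queue()          # first in, first out
lifo = queue.LifoQueue()      # last in, first out
prio = queue.PriorityQueue()  # lowest value comes out first

for item in [3, 1, 2]:
    fifo.put(item)
    lifo.put(item)
    prio.put(item)

fifo_order = drain(fifo)  # [3, 1, 2]
lifo_order = drain(lifo)  # [2, 1, 3]
prio_order = drain(prio)  # [1, 2, 3]

print("FIFO: %s  LIFO: %s  Priority: %s" % (fifo_order, lifo_order, prio_order))
```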

So, how do we write and pull data from a queue?

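import threading
import time

try:
    import Queue             # Python 2
except ImportError:
    import queue as Queue    # Python 3 renamed the module to "queue"

myQueue = Queue.Queue()


class count_stuff(threading.Thread):
    """
        A thread class that will count a number, sleep and put that number on a queue
    """

    def __init__(self, start_num, end, q):
        self.num = start_num
        self.end = end  # keep the end value so run() can see it
        self.q = q
        threading.Thread.__init__(self)

    def run(self):
        while True:
            if self.num != self.end:
                self.num += 1
                # Hand the value to whoever is reading the queue
                self.q.put(self.num)
                time.sleep(5)
            else:
                break


myThread = count_stuff(1, 5, myQueue)
myThread.start()

# Poll until the thread has finished and the queue has been drained
while myThread.is_alive() or not myQueue.empty():
    if not myQueue.empty():
        val = myQueue.get()
        print("Outputting: %s" % val)
    time.sleep(2)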

Let’s walk through the differences here:

  1. First we import Queue
  2. We create a Queue object
  3. We modify the __init__() function to take a new parameter – here we pass it the queue object for modification
  4. When the thread runs, it will put its output into the queue using the put() method
  5. In our main execution section we have added a loop that will poll the queue and pull out data. Queues can behave in complex ways (get() and put() can block, for example), so I’d recommend reading the docs to get more detail (after all this is just an intro)
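
As one example of that blocking behaviour, here is a sketch of an alternative to checking empty() and sleeping: get() can wait for an item itself, and raises an Empty exception if nothing arrives within the timeout. This is written for Python 3 (module name queue), uses short delays so it finishes quickly, and assumes that "nothing arrived for half a second" means the producer is done, which will not be true for every application.

```python
import queue
import threading
import time

results = queue.Queue()


def produce():
    """Put the numbers 1 to 3 on the queue, pausing briefly between each."""
    for num in range(1, 4):
        time.sleep(0.05)
        results.put(num)


producer = threading.Thread(target=produce)
producer.start()

received = []
while True:
    try:
        # Block for up to half a second waiting for the next value
        val = results.get(timeout=0.5)
    except queue.Empty:
        # Nothing arrived in time; assume the producer has finished
        break
    received.append(val)
    print("Outputting: %s" % val)

producer.join()
```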

The most interesting part of this code is the last section, where we use a loop to continually poll the queue. Naturally a queue could also be polled by other threads in your application, which allows for message passing between running threads.
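
Here is a minimal sketch of two threads passing messages through a queue, with no polling from the main program at all. The function names counter() and doubler() and the DONE marker are my own inventions for the example, and the code targets Python 3 (module name queue). Using a special marker object to say "no more messages" is one common convention, not the only way to do it.

```python
import queue
import threading

DONE = object()  # marker object meaning "no more messages are coming"


def counter(out_q, start_num, end):
    """Producer: put each number after start_num, up to end, on the queue."""
    for num in range(start_num + 1, end + 1):
        out_q.put(num)
    out_q.put(DONE)


def doubler(in_q, results):
    """Consumer: read numbers from the queue and store their doubles."""
    while True:
        item = in_q.get()  # blocks until a message is available
        if item is DONE:
            break
        results.append(item * 2)


messages = queue.Queue()
doubled = []

consumer = threading.Thread(target=doubler, args=(messages, doubled))
producer = threading.Thread(target=counter, args=(messages, 1, 5))
consumer.start()
producer.start()

producer.join()
consumer.join()

print(doubled)
```

Neither thread knows anything about the other; all they share is the queue.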

Queues are everywhere. Great examples are Hadoop and Amazon Simple Queue Service (SQS): both offer messaging systems that replicate the functionality of a Queue at the cluster level, which is awesome.


Enjoy,
Martin