17 Line Markov Chain

14/03/2017

In my most recent project ./code --poetry I created a code poem called irc.py that (as well as looking like an IRC chat itself) mimics IRC conversations when run. It looks something like this:

[Image: the irc.py code poem]

The output is randomized each time, but for a single instance, running the program looks something like this:

[Image: sample output from running irc.py]

Essentially, the program uses a simple Markov chain and a database of IRC logs accumulated from various IRC channels over the years to simulate the conversations. Reading that Wikipedia article, you would be forgiven for thinking that Markov chains are limited to the realm of mathematics graduates and PhDs, but they are actually pretty simple to implement in a language like Python and can be coded up in just a few lines. Let us see what the Markov chain I implemented above looks like without all the poetry and decoration obfuscating it. But first, let us switch up the dataset a bit and use this big database of poetry I collected a while back instead.

With the data downloaded, the first step is to read it in and separate it out into words. Newlines are important in poetry so for this data we'll use a regex to make sure we only split on spaces.

import re
with open('poetry.txt') as f:
    words = re.split(' +', f.read())
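To see why splitting only on spaces matters, here is a small sketch on a made-up two-line sample (the sample string is my own illustration, not taken from poetry.txt). The newline stays glued to the words around it, so the chain can later reproduce line breaks as if they were part of a word.

```python
import re

# Hypothetical sample text, standing in for the contents of poetry.txt
sample = "Roses are red,\nViolets  are blue"

# Split on runs of spaces only, so the newline stays inside a token
tokens = re.split(' +', sample)
print(tokens)

# For comparison, str.split() with no arguments also splits on newlines,
# which would throw the line structure away
plain = sample.split()
print(plain)
```

Note that the double space before "are" does not produce an empty token, because the pattern ' +' matches a whole run of spaces at once.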

The essence of a Markov chain is a transition function: a function which takes some previously generated words and produces the next one. To emulate this function we'll just use a lookup table. The input to the table will be the two previous words, and the output will be a list of potential following words to choose from.

We can do this easily in Python using a defaultdict: a dictionary which supplies a default value for missing entries. In our case the default will be an empty list. We'll loop over every run of three consecutive words in our database and append the third word to the list of potential outputs for the first two words.

from collections import defaultdict

transition = defaultdict(list)

for w0, w1, w2 in zip(words[0:], words[1:], words[2:]):
    transition[w0, w1].append(w2)
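It may help to see what the table looks like on a tiny input. This is only a sketch on a toy sentence I made up (the names toy_words and toy_transition are hypothetical), using the same loop as above.

```python
from collections import defaultdict

# Hypothetical toy corpus, already split into words
toy_words = "the cat sat on the cat sat down".split(' ')

toy_transition = defaultdict(list)
for w0, w1, w2 in zip(toy_words[0:], toy_words[1:], toy_words[2:]):
    toy_transition[w0, w1].append(w2)

# Show each pair of words and the words seen following it
for pair, followers in toy_transition.items():
    print(pair, '->', followers)
```

The pair ('the', 'cat') maps to ['sat', 'sat'] and ('cat', 'sat') maps to ['on', 'down']. Duplicates are kept in the lists on purpose: a word that follows a pair more often in the data appears more often in the list, so a uniform random pick from it is automatically weighted by frequency.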

We'll then pick some random starting location and find the three words located there:

from random import randint, choice
i = randint(0, len(words)-3)
w0, w1, w2 = words[i:i+3]

Then all we need to do is print out our current word w2 (followed by a space), update our previous word variables w0 and w1, and choose a next word from the list found via our transition function.

from sys import stdout

for _ in range(500):
    stdout.write(w2+' ')
    w0, w1, w2 = w1, w2, choice(transition[w1, w2])
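One edge case worth knowing about: if the chain reaches the very last pair of words in the file and that pair appears nowhere else, the lookup returns an empty list and choice raises an IndexError. With a large dataset this is rare, but here is one possible way to guard against it. This is a sketch rather than part of the original program: the generate function, its parameters and the restart-at-a-random-spot strategy are all my own assumptions, and it expects at least three words of input.

```python
from random import Random
from collections import defaultdict

def generate(words, length, rng):
    """Return `length` generated words, restarting at a random spot on a dead end."""
    transition = defaultdict(list)
    for w0, w1, w2 in zip(words[0:], words[1:], words[2:]):
        transition[w0, w1].append(w2)

    i = rng.randint(0, len(words) - 3)
    w0, w1, w2 = words[i:i+3]
    out = []
    for _ in range(length):
        out.append(w2)
        # .get() looks the pair up without inserting an empty list for it
        followers = transition.get((w1, w2))
        if followers:
            w0, w1, w2 = w1, w2, rng.choice(followers)
        else:
            # Dead end: no word ever followed this pair, so jump elsewhere
            i = rng.randint(0, len(words) - 3)
            w0, w1, w2 = words[i:i+3]
    return out

# Toy input where the final pair ('b', 'd') has no followers
demo = generate("a b c a b d".split(' '), 20, Random(0))
print(' '.join(demo))
```

Passing in a Random instance also makes a run repeatable when you give it a fixed seed, which is handy if you want to show someone the same poem twice.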

That's it! Here is the whole thing.

import re
from sys import stdout
from random import randint, choice
from collections import defaultdict

with open('poetry.txt') as f:
    words = re.split(' +', f.read())

transition = defaultdict(list)
for w0, w1, w2 in zip(words[0:], words[1:], words[2:]):
    transition[w0, w1].append(w2)

i = randint(0, len(words)-3)
w0, w1, w2 = words[i:i+3]
for _ in range(500):
    stdout.write(w2+' ')
    w0, w1, w2 = w1, w2, choice(transition[w1, w2])

You can adjust the value 500 to produce shorter or longer poetry. Now you can produce some beautiful poetry like this!

Half hidden in helmet and shield, -
Point me the happiness that's mine;
That I am not
Going on with light!
Flowers she had, all she sees.
They cannot light the lakes,
When trumpets loose the string
And let them bring her.

In the long, long way and the jelly-bean
Play cockalorum-and-the-hen,
When the cool sound of sin ?
And him who would attack with terror seemed to shake.
There lyeth the body and my limbs gushed full of tunes that were born on the golden morrow
Beam'd upward from the world.

O contentment, make me cry on Willie, my boy is leading, calls his mate, ne any does not recede.
I foment them
With earnest eyes and I am full of thieves for thieves' use: give me room! give loneliness and fidgety revenge.
Nobody knew where he lay
A band of very great boon
To all God's utterance,
We hardly knew a dearer rate;
And took, but not a beggar, in fixed eyes,
Lifting distressful hands as in the painted leaves of the Nightingale.

Along the sprawled city's modern swarm and propagate the sure esteem
To cheated youth's midsummer dream,
When every friend is one and the gay, man,
Fal, la, la, &c.;

Here's a jest!
what word will come to the following two distichs of my heart,
Lord of my own breath. I'm alone and none other than him, so I am but a richer emerald glows:
Into each flower-cup
Her cool dews are round her name.

Bowl of gems
Transports me with your blocks?
Castles and palaces, temples and towers,
Cut shorter many a Nymph or Goddess of the hands of unknown shores
And laughed as he was, are we.