19.11.2017 17:17

Getting Started With Tries in Python

Author: Elliot Forbes | @Elliot_F

This tutorial uses Python 3.6 in order to convey the concepts that I will be covering.

A Trie in computer science is a tree structure that allows you to do things such as very quick lookup of words within the english language. Typically if you were to write a word processor that did spell checks against words in a document, you would implement a trie and perform a very quick lookup to check whether or not the words in your word document are indeed valid words.

Implementing a Very Simple Trie

When it comes to implementing a trie we typically use a series of nested hash tables. In Python the closest thing we have to a hash table is a dict() which enables us to store over 1 billion different key-value pairs, enough for the 172,000 currently employed in the English language.

Let’s take a quick look at how we would implement a simple trie structure in Python. We’ll define a make_trie() function which will take in a list of words and then from this list of words, it will create a trie structure.

_end = '*'

# takes in a list of words
def make_trie(*words):
    # creates our root dict() 
    trie = dict()

    for word in words:
        # create a temporary dict based off our root
        # dict object
        temp_dict = trie
        
        for letter in word:
            # update our temporary dict and add our current
            # letter and a sub-dictionary
            temp_dict = temp_dict.setdefault(letter, {})
        
        # If our word is finished, add {'*': '*'}  
        # this tells us our word is finished
        temp_dict[_end] = _end
    return trie

# Test our trie creation
trie = make_trie('hi', 'hello', 'howdy')
# print out our new trie 
print(trie)

If we then attempt to run this you will see that we create a trie that has one root element: h.

 $ python3.6 trie.py
{'h': {'i': {'*': '*'}, 'e': {'l': {'l': {'o': {'*': '*'}}}}, 'o': {'w': {'d': {'y': {'*': '*'}}}}}}

Implementing a find_word() function

So now that we have created a trie structure we need to implement a mechanism that checks to see if a word exists within our trie.

def find_word(trie, word):
    sub_trie = trie
    for letter in word:
        if letter in sub_trie:
            sub_trie = sub_trie[letter]
        else: 
            return False
    else:
        if _end in sub_trie:
            return True
        else:
            return False

Implementing an add_word() function

If we wanted to implement a function that took in an existing trie and returned a new trie that contained a new word that has been passed in, we could do something similar to this:

def add_word(trie, word):
    # We want to catch if a word already
    # exists within our trie structure
    if find_word(trie, word):
        print("Word Already Exists")
        # if it does just return our original trie
        return trie

    # if it doesn't we want to add the new word
    temp_trie = trie
    for letter in word:
        # iterate through our word and see how
        # much of the word we already have within our
        # structure
        if letter in temp_trie:
            temp_trie = temp_trie[letter]
        else:
            temp_trie = temp_trie.setdefault(letter, {})
    
    # Terminate our new word
    temp_trie[_end] = _end
    # return our new trie object
    return temp_trie

Complete Code Sample

Below you’ll find the complete code sample for this tutorial. I’ve fleshed this out into a Python class that you can use much like you would a list or a queue.

class Trie():
    
    def __init__(self):
        self._end = '*'
        self.trie = dict()

    def __repr__(self):
        return repr(self.trie)

    def make_trie(self, *words):
        trie = dict()
        for word in words:
            temp_dict = trie
            for letter in word:
                temp_dict = temp_dict.setdefault(letter, {})
            temp_dict[self._end] = self._end
        return trie

    def find_word(self, word):
        sub_trie = self.trie

        for letter in word:
            if letter in sub_trie:
                sub_trie = sub_trie[letter]
            else: 
                return False
        else:
            if self._end in sub_trie:
                return True
            else:
                return False

    def add_word(self, word):
        if self.find_word(word):
            print("Word Already Exists")
            return self.trie

        temp_trie = self.trie
        for letter in word:
            if letter in temp_trie:
                temp_trie = temp_trie[letter]
            else:
                temp_trie = temp_trie.setdefault(letter, {})
        temp_trie[self._end] = self._end
        return temp_trie



my_trie = Trie()
my_trie.add_word('head')
my_trie.add_word('hi')
my_trie.add_word('howdy')
print(my_trie)

print(my_trie.find_word("hi"))
print(my_trie.find_word("how"))
print(my_trie.find_word("head"))

Conclusion

If you found this tutorial useful then please let me know in the comments section below. Conversely if you need anything further explained then I will also be happy to give you additional help!

Subscribe to the Youtube Channel!

Subscribe to our YouTube channel which is constantly being updated with new programming related tutorials.

Subscribe!