Security Fundamentals > Hash Functions

Part 1: A Gentle Introduction

Motivation

I learned about hash functions in the same way that I imagine many of you did: a side-note in the implementation of a Hash Table. I was in University, and we were studying data structures and alogrithms. Hash Tables appeared alongside arrays, linked lists, queues, stacks, trees, and more. We spent a lot of time learning about the remarkable amortized runtime perormance of Hash Tables, and my Professor gave us a short lesson describing how to build a Hash Table with an array, a linked list and a hash function (keep your eyes peeled for part 2 in this series if you are interested!).

It was amazing to see the impressive algorithmic applications of Hash Tables, but I failed to recognize that the hash function had other applications!

Twenty years later, and I look back at that moment with humour. I had seen the forest, but missed the trees (certainly not the first, nor the last time that I did that). Hash Tables are one of the most important tools that we have in algorithmic optimization, they power our Objects in JavaScript and dictionaries in both Ruby and Python. It’s all too easy to focus on the Hash Table, with all it’s glamour, and to overlook the humble hash function at the heart of it all.

Hash functions have broad applications in software development, but they are especially important in the field of security. From data-integrity checking, to cryptographic signing, to proof of work: hash functions are one of the three primitive operations in security!

I am fascinated by hash functions, and I truly believe that understanding them has made me a more thoughtful and potent developer.

In this blog post I want to introduce the hash function, play around with a couple simple ones, and hopefully set you on the journey of discovery that I almost missed all those years ago.

Definitions

Hash functions have a simple, humble definition:

“A hash function is a function that maps an arbitrary input to a fixed-size value”

One input, one output

It is easy to see then, why they are so easy to overlook. Let’s take a second to make sure we understand what this definition means.


read more...

Blog

Thoughts, Ideas and Lessons from the Common Sense Software team