Hash Tables

Objectives

  • Define a hash table
  • Define a hash function
  • Describe when to use hash tables instead of arrays

Definition

A hash table is a data structure that maps a set of keys to a set of values. It goes by many different names, including:

  • Hash Table
  • Hash
  • Hash map
  • Map
  • Table
  • Dictionary

Hash Tables in JavaScript

You have already been using them. Where?

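const myObj = {};
myObj['key'] = 'value';
myObj[1] = 'another value'; // the number 1 is converted to the string '1'
myObj['1']; // returns the string 'another value'
const test = {};
test[myObj] = 'the last value'; // the object key is converted to the string '[object Object]'
test['[object Object]']; // returns the string 'the last value'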

Internal Implementation

Fast lookup is very important in hash tables. What data structure do you know that provides this ability?

How?

What's wrong with using an array and indexing into it directly with numeric keys?

Your array may end up being very sparse
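To make the sparseness concrete, here is a small sketch. The variable names and the two IDs are made up for illustration; the point is only that the array's length follows the largest index, not the number of stored values.

```javascript
// Hypothetical example: two numeric IDs used directly as array indices.
const directIndex = [];
directIndex[7] = 'small id';
directIndex[1000000] = 'large id';

// Count only the indices that actually hold a value.
const storedCount = Object.keys(directIndex).length;

console.log(directIndex.length); // 1000001 slots spanned by the array
console.log(storedCount);        // 2 values actually stored
```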

Hash Functions

A hash function maps a large set of possible keys to a smaller space, such as the indices of an array.

hash_key = (key * LARGE_PRIME) % smaller_array_size
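The formula above can be sketched in JavaScript roughly as follows. The value of LARGE_PRIME and the function names hashNumber and hashString are assumptions made for this example, and the string version (folding character codes with a multiplier of 31) is one common approach rather than the only one.

```javascript
// Hypothetical constant chosen for illustration only.
const LARGE_PRIME = 1000003;

// Hash a non-negative integer key into the range [0, arraySize).
// The result stays exact only while key * LARGE_PRIME is below
// Number.MAX_SAFE_INTEGER, so this sketch suits small keys.
function hashNumber(key, arraySize) {
  return (key * LARGE_PRIME) % arraySize;
}

// Hash a string by folding its character codes into one integer,
// reducing modulo arraySize at each step to keep the number small.
function hashString(key, arraySize) {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) % arraySize;
  }
  return hash;
}
```

With an array size of 16, the keys 42 and 58 differ by exactly 16, so hashNumber sends both to the same index. Two different keys landing on one index is the situation the next section deals with.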

Hash Collisions

A collision occurs in the hash table when two keys map to the same index.

Chaining

Chaining is one way to resolve collisions in a hash table. Instead of holding a single key and value, each array element holds a secondary data structure that stores every entry hashing to that index. A linked list is a common choice, but others can be used, such as a binary search tree or even another hash table. Whenever an element is inserted, both the key and the value are stored in the data structure at that index, so a lookup can compare keys to find the right entry.
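Here is a minimal sketch of chaining. To keep it short, each bucket is a plain JavaScript array instead of a linked list, and the class name ChainedHashTable, its methods, and the string hash inside _index are all assumptions made for this example.

```javascript
class ChainedHashTable {
  constructor(size = 8) {
    // Every slot starts with its own empty bucket.
    this.buckets = Array.from({ length: size }, () => []);
  }

  // Hypothetical hash: fold the characters of the key's string form.
  _index(key) {
    const str = String(key);
    let hash = 0;
    for (let i = 0; i < str.length; i++) {
      hash = (hash * 31 + str.charCodeAt(i)) % this.buckets.length;
    }
    return hash;
  }

  // Store the key together with the value so colliding keys can be told apart.
  set(key, value) {
    const bucket = this.buckets[this._index(key)];
    const entry = bucket.find((pair) => pair[0] === key);
    if (entry) {
      entry[1] = value; // key already present: overwrite its value
    } else {
      bucket.push([key, value]);
    }
  }

  // Scan only the one bucket the key hashes to.
  get(key) {
    const bucket = this.buckets[this._index(key)];
    const entry = bucket.find((pair) => pair[0] === key);
    return entry ? entry[1] : undefined;
  }
}

// Usage: a table of size 1 forces every key into the same bucket.
const chained = new ChainedHashTable(1);
chained.set('apple', 1);
chained.set('pear', 2);
console.log(chained.get('apple'), chained.get('pear'));
```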

Linear Probing

Rather than resolving a collision with an extra data structure, linear probing puts the key and value in a different spot in the same array. If there is a collision at index i, the algorithm tries index i + 1, then index i + 2, and so on (wrapping around to the start of the array if it reaches the end) until it finds an open slot. To find out whether a key is in the hash, the algorithm first hashes the key to an index. If the key is stored at that index, it is found. If not, the algorithm continues looking linearly through the array until either the key is found or an empty slot is reached. An empty slot means the key and value are not in the array.
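The insert and lookup steps described above could look something like this. It is a simplified sketch under a few assumptions: the table has a fixed size and never grows, removal is left out, and the class name LinearProbingTable and its string hash are invented for the example.

```javascript
class LinearProbingTable {
  constructor(size = 8) {
    // null marks an open slot.
    this.slots = new Array(size).fill(null);
  }

  // Hypothetical hash: fold the characters of the key's string form.
  _index(key) {
    const str = String(key);
    let hash = 0;
    for (let i = 0; i < str.length; i++) {
      hash = (hash * 31 + str.charCodeAt(i)) % this.slots.length;
    }
    return hash;
  }

  // Try index i, then i + 1, i + 2, ... wrapping around the array.
  // Returns false if every slot is taken by a different key.
  set(key, value) {
    const start = this._index(key);
    for (let step = 0; step < this.slots.length; step++) {
      const i = (start + step) % this.slots.length;
      const slot = this.slots[i];
      if (slot === null || slot.key === key) {
        this.slots[i] = { key, value };
        return true;
      }
    }
    return false;
  }

  // Probe from the hashed index until the key or an open slot is found.
  get(key) {
    const start = this._index(key);
    for (let step = 0; step < this.slots.length; step++) {
      const slot = this.slots[(start + step) % this.slots.length];
      if (slot === null) return undefined; // open slot: key is not stored
      if (slot.key === key) return slot.value;
    }
    return undefined; // every slot was checked without a match
  }
}

// Usage: with 4 slots, 'a' and 'e' both hash to index 1,
// so 'e' is placed in the next open slot, index 2.
const probing = new LinearProbingTable(4);
probing.set('a', 1);
probing.set('e', 2);
console.log(probing.get('a'), probing.get('e'));
```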

Big O Runtime of Hash Tables

  • Inserting: O(1) on average, O(n) at worst
  • Removing: O(1) at best and on average, O(n) at worst
  • Accessing a value using a key: O(1) on average, O(n) at worst
  • Finding a value (without the key): O(n)
  • Space complexity: O(n)