Data Structures and Algorithms: Hash Tables

Posted by r78b6d on 2024-10-08


CRICOS Provider Code: 00301J

Note:

  • hashArray stores the key, value and state (used, free, or previously-used) of every hashEntry. We must store both the key and value since we need to check hashArray to tell if there is a collision and we should keep probing until we find the right key.
  • put(), hasKey() and get() must take the passed-in key and call hash() to convert the key into an integer. This integer is then used as the index for hashArray. Java Students: If you use a private inner class for DSAHashEntry, then put(DSAHashEntry) will need to be private, otherwise it will be public.
  • There are many hash functions in existence, but all hash functions must be repeatable (i.e., the same key will always give the same index). A good hash function is fast and will distribute keys evenly inside hashArray.
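
As a rough illustration of a repeatable hash function, the sketch below folds the characters of a string key into an index. The name `simple_hash`, the multiplier 31 and the table size 11 are assumptions made for this example; they are not taken from the lecture notes.

```python
def simple_hash(key: str, table_size: int) -> int:
    """Map a string key to an index in the range [0, table_size)."""
    hash_value = 0
    for ch in key:
        # Multiply-and-add so that the order of characters affects the result.
        hash_value = (31 * hash_value + ord(ch)) % table_size
    return hash_value


# Repeatable: hashing the same key twice gives the same index.
assert simple_hash("Key1", 11) == simple_hash("Key1", 11)
# Character order matters here, so "ab" and "ba" land on different indices.
assert simple_hash("ab", 11) != simple_hash("ba", 11)
```

Taking the remainder inside the loop keeps the intermediate value small, which matters more in languages with fixed-width integers than it does in Python.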

Hash Tables

Updated: 21st July, 2023

Aims

  • To implement a hash table.
  • To make the above hash table automatically resize.
  • To save the hash table and reload it from a file.

Before the Practical

  • Read this practical sheet fully before starting.

Activities

  1. Hash Table Implementation

Following the lecture slides as a guide, create a DSAHashTable class and a companion class called DSAHashEntry to implement a hash table with a simple hash function. Use linear probing first since it’s easier to think about, then convert to double-hashing. Assume the keys are strings and the values are Objects.
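
Linear probing and double-hashing differ only in the step taken between probes, which the following sketch tries to show. The helper names `probe_sequence` and `step_hash`, and the particular form of the secondary hash, are assumptions for illustration; the lecture slides may define the step differently.

```python
def probe_sequence(start: int, step: int, table_size: int) -> list:
    """Indices visited when probing from start with a fixed step."""
    return [(start + i * step) % table_size for i in range(table_size)]


def step_hash(key: str, max_step: int) -> int:
    """Secondary hash for double-hashing; the result is never 0."""
    total = sum(ord(ch) for ch in key)
    return max_step - (total % max_step)


# Linear probing: the step is always 1.
linear = probe_sequence(3, 1, 7)
# Double-hashing: the step depends on the key.
double = probe_sequence(3, step_hash("Key2", 5), 7)

assert linear == [3, 4, 5, 6, 0, 1, 2]
assert double == [3, 6, 2, 5, 1, 4, 0]
```

Both sequences visit every slot because the table size (7) is prime, so any step between 1 and 6 shares no factor with it.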


Note:

  • Of course, how evenly a hash function spreads keys depends on the distribution of the keys as well, so it’s not easy to say what a good hash function will be without knowing the keys. For the purpose of this practical, just use one of the hash functions from the lecture notes.
  • Use linear probing or double-hashing to handle collisions when inserting.
  • hasKey(), get() and remove() will need to use the same approach since they also need to find the right item. It’s probably a good idea to make a private find() method that does the probing for these three functions and returns the index to use. Use the DSAHashEntry state to tell you when to stop probing.
  • Be aware that remove() with probing methods adds the problem that it can break probing unless additional measures are taken. In particular, say we added Key1, then Key2 which collides with Key1, so we linearly probe and add Key2 to the next entry. If we remove Key1, later attempts to get Key2 will fail because Key2 maps to where Key1 used to be. Since it is now null, probing will abort and imply that Key2 doesn’t exist. The solution is to use the state field in DSAHashEntry that tracks whether the entry has been used before or not.
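
The removal problem described above can be sketched with a small fixed-size table that uses linear probing. The class names `Entry` and `ProbingTable`, the state constants and the placeholder hash (a sum of character codes) are all assumptions for this example, not the required DSAHashTable design.

```python
FREE, USED, FORMERLY_USED = 0, 1, 2  # entry states


class Entry:
    def __init__(self):
        self.key = None
        self.value = None
        self.state = FREE


class ProbingTable:
    """Fixed-size table with linear probing (no resizing)."""

    def __init__(self, size=7):
        self.slots = [Entry() for _ in range(size)]

    def _hash(self, key):
        # Placeholder hash: sum of character codes.
        return sum(ord(ch) for ch in key) % len(self.slots)

    def _find(self, key):
        """Return the index holding key, or -1 if it is absent."""
        idx = self._hash(key)
        for _ in range(len(self.slots)):
            entry = self.slots[idx]
            if entry.state == FREE:
                return -1  # never used, so the key cannot be further along
            if entry.state == USED and entry.key == key:
                return idx
            # Used by another key, or formerly used: keep probing.
            idx = (idx + 1) % len(self.slots)
        return -1

    def put(self, key, value):
        idx = self._find(key)
        if idx == -1:
            # Take the first slot on the probe path that is not in use.
            idx = self._hash(key)
            for _ in range(len(self.slots)):
                if self.slots[idx].state != USED:
                    break
                idx = (idx + 1) % len(self.slots)
            else:
                raise OverflowError("table is full")
        entry = self.slots[idx]
        entry.key, entry.value, entry.state = key, value, USED

    def get(self, key):
        idx = self._find(key)
        if idx == -1:
            raise KeyError(key)
        return self.slots[idx].value

    def has_key(self, key):
        return self._find(key) != -1

    def remove(self, key):
        idx = self._find(key)
        if idx == -1:
            raise KeyError(key)
        entry = self.slots[idx]
        # Mark as formerly used, not free, so probing continues past this slot.
        entry.key, entry.value, entry.state = None, None, FORMERLY_USED


table = ProbingTable()
table.put("ab", 1)
table.put("ba", 2)  # same character sum as "ab", so it collides and probes on
table.remove("ab")
assert table.get("ba") == 2  # still reachable through the formerly-used slot
```

If remove() reset the state to FREE instead, the final lookup would stop at the emptied slot and report "ba" as missing.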
  2. Resizing a Hash Table

Modify your DSAHashTable to allow it to resize. There are various ways to determine when and how to resize a hash table. The simplest way to determine when is to set an upper and lower threshold value for the load factor. When the load factor falls outside these thresholds, the put() or remove() methods should call resize(size) automatically.

  • Remember, resizing will be computationally expensive (what is it in Big-O?), so it is important not to set the upper threshold too low. Also, collisions occur more frequently at higher load factors, thus it is equally important not to set it too high. Do some research to find "good" values.
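
A minimal sketch of the threshold check follows, taking the load factor to be the number of stored entries divided by the array length. The function name `needs_resize` and the two threshold values are placeholders assumed for this example, not recommended settings.

```python
UPPER_THRESHOLD = 0.7  # assumed placeholder; research a suitable value
LOWER_THRESHOLD = 0.2  # assumed placeholder


def needs_resize(count: int, capacity: int) -> str:
    """Return "grow", "shrink" or "ok" for the current load factor."""
    load_factor = count / capacity
    if load_factor > UPPER_THRESHOLD:
        return "grow"
    if load_factor < LOWER_THRESHOLD:
        return "shrink"
    return "ok"


assert needs_resize(8, 11) == "grow"    # 8/11 is about 0.73
assert needs_resize(5, 11) == "ok"      # 5/11 is about 0.45
assert needs_resize(2, 11) == "shrink"  # 2/11 is about 0.18
```

In practice a minimum capacity is probably also needed, otherwise a nearly empty table would keep asking to shrink.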

A simple way to resize is to create a new array, then iterate over hashArray (ignoring unused and previously used slots), re-hashing each entry into the new array (using put()).

  • To select a suitable size for the new array, you can either use a "look up" table of suitable primes or re-calculate a new prime after doubling/halving the previous size.

Test your resize functionality with a small hash table size, just so you know it will work when you increase the size of the table.
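
The "re-calculate a new prime" option and the re-hashing loop might look like the sketch below. To keep it short, each slot is modelled as either None or a (key, value) tuple, which is a simplification assumed for this example rather than the DSAHashEntry layout; the names `is_prime`, `next_prime` and `rehash` are likewise hypothetical, and `new_size` is assumed to exceed the number of stored entries.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for table sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n: int) -> int:
    """Smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


def rehash(old_slots, new_size, hash_func):
    """Re-insert every stored (key, value) pair into a new array."""
    new_slots = [None] * new_size
    for slot in old_slots:
        if slot is None:
            continue  # skip slots that hold no entry
        key, _ = slot
        idx = hash_func(key) % new_size
        while new_slots[idx] is not None:  # linear probing
            idx = (idx + 1) % new_size
        new_slots[idx] = slot
    return new_slots


# Doubling a table of size 7 gives 14; the next prime from there is 17.
assert next_prime(2 * 7) == 17
```

Indices must be recomputed because they depend on the array length, which is why the entries cannot simply be copied across.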

  3. File I/O

To truly test your hash table implementation, you will need a large dataset. Read in RandomNames7000.csv as input to insert values into your hash table. There are some duplicates in the file, so your program should be able to handle them. It is also useful to be able to save the hash table. The save order is not important, so just iterate through the keys and values in the order they are stored in the hash table and write them to a .csv.
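
One possible shape for the load and save routines is sketched below using Python's csv module. The column layout of RandomNames7000.csv is not stated here, so the example assumes two columns (key, then value) and uses made-up rows; a plain dict stands in for the hash table, and letting a later duplicate overwrite the earlier one is just one way to handle duplicates.

```python
import csv
import io


def load_csv(stream, put):
    """Call put(key, value) for each two-column row; return the row count."""
    rows = 0
    for row in csv.reader(stream):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        put(row[0], row[1])
        rows += 1
    return rows


def save_csv(stream, pairs):
    """Write (key, value) pairs to stream in the order given."""
    writer = csv.writer(stream)
    for key, value in pairs:
        writer.writerow([key, value])


# Stand-in for the hash table: a dict, where a duplicate key overwrites.
table = {}
source = io.StringIO("k1,Alice\nk2,Bob\nk1,Alice\n")  # made-up sample rows
rows_read = load_csv(source, table.__setitem__)

output = io.StringIO()
save_csv(output, table.items())

assert rows_read == 3 and len(table) == 2
assert output.getvalue().splitlines() == ["k1,Alice", "k2,Bob"]
```

With real files, the csv documentation recommends opening them with `newline=""`, for example `open("names.csv", newline="")` where the file name is hypothetical.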

Submission Deliverable

  • Your code is due 2 weeks from your current tutorial session. You will demonstrate your work to your tutors during that session. If you have completed the practical earlier, you can demonstrate your work during the next session.
  • You must submit your code and any test data that you have been using electronically via Blackboard under the Assessments section before your demonstration. Java students, please do not submit the *.class files.

Marking Guide

Your submission will be marked as follows:

  • [6] Your DSAHashTable and DSAHashEntry are implemented correctly.
  • [4] Your hash function is well thought out and properly implemented. This means that it meets at least the first three criteria of a good hash function and you can argue that it at least partially meets the last.
  • [5] Your hash table resizes as you put and remove hash entries.
  • [5] You can read in and save .csv files.

End of Worksheet
