Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

In this assignment, you are going to build a hash table for integers using the f

ID: 3575470 • Letter: I

Question

In this assignment, you are going to build a hash table for integers using the file.dat as input.

The input file contains 200,000 integers. You will read in the first 100,000 integers and hash them into an array using modulo-division hashing. The values range from 100,000 to 500,000 and contain duplicate numbers. You will use quadratic probe collision detection. In order to simplify the problem, you will only probe 10 times. If you can’t store (or find) the result after 10 probs, you won’t store the value. If you can’t store a value, you need to generate output that tells the number can’t be stored. After you create the hash table, you will read the next 100,000 items and see if you can find the key in the hashed array. You need to indicate how many times you probed to find the data.

Some things to consider:

How large should the array be? (Hint: Remember that we discussed using storage that is about 25% larger. )

What would be a good number to use for hashing? (Hint: Prime numbers work better. Consider using 100,003.

Input will be from file.dat. You will need to read in 100,000 and hash them.

Output will be to a file. There could be 2 parts to the output. First, any numbers that can’t be stored. Second, any numbers that can’t be found in the hash table or that take more than 1 probe to find.

Sample output file:

Creating Hash Table

Not stored:

150,000

450,000

Searching Hash Table

Found               123457              5 probes

Not found         150,000

Explanation / Answer

main.cpp

// T4.LC++
/*******************************************************************************

Description of the problem:
    In this assignment, you are going to build a hash table for integers using
    the file.dat as input. The input file contains 200,000 integers. You will
    read in the first 100,000 integers and hash them into an array using
    modulo-division hashing. The values range from 100,000 to 500,000 and
    contain duplicate numbers. You will use quadratic probe collision detection.
    In order to simplify the problem, you will only probe 10 times. If you can’t
    store (or find) the result after 10 probes, you won’t store the value. If
    you can’t store a value, you need to generate output that tells the number
    can’t be stored. After you create the hash table, you will read the next
    100,000 items and see if you can find the key in the hashed array. You need
    to indicate how many times you probed to find the data.

Pseudo Code:
    int main
        Opens data file (input)
        opens results file (output)
        Stores items that aren't placed into the hash table
        after probing, show results of how many times an item had to probe
            before being found

Notes:

*******************************************************************************/

// %%%%% libraries %%%%%
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>

// %%%%% files %%%%%
#include "table.hpp"

using namespace std;

// %%%%% definitions %%%%%
#define DEBUG false
#define debugVar(n,var) if(DEBUG) cout<<" > DEBUG: "<<n<<" = "<<var<<" <";
#define debugSay(msg) if(DEBUG) cout<<" > DEBUG: "<<msg<<" <";

#define arraySize 125000 // Set initial array size

// %%%%% code %%%%%

int main()
{
    Hash myHash;

    int num;
    ifstream in("file.dat");
    // Opens the data file
    ofstream out("results.dat");
    // Opens the output file

    out << "Not stored: ";
    for(int i = 0; i < 100000; i++){
        in >> num;
        if( !myHash.insert( num ) ){
            out << num << " ";
        }
    }

    out << " Searching Hash Table ";
    for(int i = 0; i < 100000; i++){
        in >> num;
        int found = myHash.search( num );
        if( found == -1 ){
            out << "Not found " << num << " ";
        }
        else{
            out << "Found " << num << " " << found << " probes" << " ";
        }
    }

    return 0;
}

table.cpp

#include "table.hpp"

using namespace std;

// %%%%% definitions %%%%%
#define arraySize 125000 // Set initial array size

// %%%%% code %%%%%

Hash::Hash()
{
    for(int i = 0; i < arraySize; i++)
    {
        array[i] = -1; // -1 meaning no number stored
    }
    MOD = 100003;
}

bool Hash::insert(int num)
{
    int loc = num%MOD;
    if( array[loc] != -1 )
    {
        for(int i = 1; i <= 10; i++)
        {
            int newLoc = loc + i*i;
            if( array[ newLoc ] == -1 )
            {
                array[ newLoc ] = num;
                return true;
            }
        }
        // All probes failed, the value will not be stored
        return false;
    }
    else{
        array[ loc ] = num;
    }
    return true;
}

int Hash::search( int num )
{
    int loc = num%MOD;
    if( array[loc] == -1 )
    {
        return -1;
    }
    if( array[loc] == num )
    {
        return 0;
    }
    else
    {
        for(int i = 1; i <= 10; i++ )
        {
            int newLoc = loc + i * i ;
            if( array[newLoc] == -1 )
            {
                return -1;
            }
            if( array[newLoc] == num )
            {
                return i;
            }
        }
    }
    return -1;
}


table.hpp


using namespace std;

#define arraySize 125000 // Set initial array size

class Hash
{
    int MOD;
    int array[arraySize]; // arraySize defined in definitions
public:
    Hash();
    bool insert(int num);
    int search(int num);
};