Hashing in DBMS


In a large database, data is stored at various locations. Locating a specific record with a linear search or a binary search becomes tedious and time-consuming as the database grows. Hashing solves this problem.

Hashing is a technique that uses a hash function to compute the exact location of a data record, so the record can be found in a minimum amount of time.

For example, suppose a college database stores the records of many students in alphabetical order. Even with this ordering, locating a record with a linear search still means scanning the records one by one on every lookup.

Why do we need Hashing?

In a DBMS, hashing is a technique to directly compute the location of the desired data on the disk without using an index structure. Data is stored in data blocks whose addresses are generated by applying a hash function to the search key. The memory location where these records are stored is known as a data block or data bucket.
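The idea of generating a bucket address from a key can be illustrated with a minimal sketch. It assumes integer keys, a fixed number of buckets, and the modulo method as the hash function; the names `NUM_BUCKETS`, `bucket_address`, `insert_record`, and `find_record` are hypothetical and chosen only for this example.

```python
# Minimal sketch of hashing a key straight to a bucket address.
# All names here are illustrative assumptions, not part of any DBMS API.

NUM_BUCKETS = 8  # assumed fixed number of data buckets


def bucket_address(key: int) -> int:
    """Hash function: map a search key to a bucket address (modulo method)."""
    return key % NUM_BUCKETS


# Each bucket is a list standing in for a data block on disk.
buckets = [[] for _ in range(NUM_BUCKETS)]


def insert_record(key: int, record: str) -> None:
    """Store the record in the bucket the hash function points to."""
    buckets[bucket_address(key)].append((key, record))


def find_record(key: int):
    """Look only in the one bucket the hash function points to."""
    for stored_key, record in buckets[bucket_address(key)]:
        if stored_key == key:
            return record
    return None


insert_record(103, "Alice")
insert_record(217, "Bob")
print(bucket_address(103))  # 103 % 8 = 7
print(find_record(217))     # Bob
```

The lookup never touches the other buckets, which is what lets hashing skip both a full scan and an index traversal.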

Here are the situations in a DBMS where you need to apply the hashing method:

For a huge database structure, it is costly to search the index values through all of its levels and only then reach the destination data block to get the desired data.

The hashing method is used to index and retrieve items in a database because it is faster to search for a specific item using the shorter hashed key than using its original value.

Hashing is an ideal method to calculate the direct location of a data record on the disk without using an index structure.

It is also a helpful technique for implementing dictionaries.


Important Terminologies used in Hashing

Here are the important terms used in hashing:

Data bucket: Data buckets are memory locations where the records are stored. A data bucket is also known as a unit of storage.

Key: A DBMS key is an attribute or set of attributes that helps you identify a row (tuple) in a relation (table). It also allows you to find the relationship between two tables.

Hash function: A hash function is a mapping function that maps the set of all search keys to the addresses where the actual records are placed.

Linear probing: Linear probing uses a fixed interval between probes. In this method, the next available data block is used to enter the new record, instead of overwriting the older record.

Quadratic probing: Quadratic probing helps you determine the new bucket address. The interval between probes is obtained by adding the successive outputs of a quadratic polynomial to the starting value given by the original hash computation.

Hash index: The hash index is the address of the data block. The hash function that produces it can range from a simple mathematical function to a complex one.

Double hashing: Double hashing is a computer programming method used in hash tables to resolve hash collisions.

Bucket overflow: The condition of bucket overflow is called a collision. This is a fatal state for any static hash function.
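The three collision-handling terms above (linear probing, quadratic probing, and double hashing) differ only in how the next slot is chosen after a collision. The sketch below is one possible illustration, assuming integer keys, a small table with one record per slot, and modulo-based hash functions; `TABLE_SIZE`, `h1`, `h2`, and the probe function names are hypothetical.

```python
# Sketch of three probe strategies on a small table with one record per slot.
# All names and the choice of hash functions are illustrative assumptions.

TABLE_SIZE = 7  # prime, so the double-hashing step can reach every slot


def h1(key: int) -> int:
    """Primary hash: the starting slot for a key."""
    return key % TABLE_SIZE


def h2(key: int) -> int:
    """Second hash for double hashing; never 0, so the probe always advances."""
    return 1 + (key % (TABLE_SIZE - 1))


def linear_probe(key: int, i: int) -> int:
    """Fixed interval of 1 between probes."""
    return (h1(key) + i) % TABLE_SIZE


def quadratic_probe(key: int, i: int) -> int:
    """Offset grows as the quadratic polynomial i * i."""
    return (h1(key) + i * i) % TABLE_SIZE


def double_hash_probe(key: int, i: int) -> int:
    """Interval between probes is given by the second hash function."""
    return (h1(key) + i * h2(key)) % TABLE_SIZE


def insert(table, key, probe):
    """Place key in the first free slot along its probe sequence."""
    for i in range(TABLE_SIZE):
        slot = probe(key, i)
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no free slot found along the probe sequence")


table = [None] * TABLE_SIZE
print(insert(table, 10, linear_probe))       # 10 % 7 = 3, slot 3 is free
print(insert(table, 17, linear_probe))       # slot 3 is taken, next slot is 4
print(insert(table, 24, quadratic_probe))    # 3 and 4 are taken, (3 + 4) % 7 = 0
print(insert(table, 31, double_hash_probe))  # 3 is taken, step h2(31) = 2 gives 5
```

All four keys hash to slot 3, so each insert after the first shows how its strategy moves away from the occupied slot rather than overwriting the older record.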