Basic concepts
Hash functions: Direct hashing, Modulo division, Digit extraction, Mid-square, Folding
Collision resolution: Open addressing, Linked list resolution
Chapter 12
Hash
Data Structures and Algorithms
Dr Rang Nguyen Faculty of Computer Science and Engineering
University of Technology, VNU-HCM
Outcomes
• L.O.5.1 - Depict the following concepts: hash table, key, collision, and collision resolution.
• L.O.5.2 - Describe hash functions using pseudocode and give examples to show their algorithms.
• L.O.5.3 - Describe collision resolution methods using pseudocode and give examples to show their algorithms.
• L.O.5.4 - Implement hash tables using C/C++.
• L.O.5.5 - Analyze the complexity and develop experiments (programs) to evaluate the methods supplied for hash tables.
• L.O.1.2 - Analyze algorithms and use Big-O notation to characterize the computational complexity of algorithms.
Basic concepts
• Sequential search: $O(n)$
• Binary search: $O(\log_2 n)$
→ Both require several key comparisons before the target is found.
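To make the comparison counts concrete, the following is a minimal sketch that counts how many elements each search inspects. The function names are illustrative assumptions and do not come from the slides.

```cpp
#include <cstddef>
#include <vector>

// Sequential search: returns the number of elements inspected
// until the target is found (or the whole list has been scanned).
std::size_t sequentialComparisons(const std::vector<int>& a, int target) {
    std::size_t count = 0;
    for (int x : a) {
        ++count;
        if (x == target) break;
    }
    return count;
}

// Binary search on a sorted vector: returns the number of middle
// elements inspected. The search range is the half-open interval [lo, hi).
std::size_t binaryComparisons(const std::vector<int>& a, int target) {
    std::size_t count = 0;
    std::size_t lo = 0, hi = a.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++count;
        if (a[mid] == target) break;
        if (a[mid] < target) lo = mid + 1;
        else hi = mid;
    }
    return count;
}
```

For a sorted list of 1000 keys, looking up the last key inspects 1000 elements sequentially but at most 10 with binary search; neither reaches the target in a single step.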
Is there a search algorithm whose complexity is $O(1)$?
Yes: hashing, which computes the address of a key directly from the key itself.
Figure: Each key has only one address
• Ideal hashing:
  • No location collision
  • Compact address space
• Collision: the location of the data to be inserted is already occupied by a synonym, that is, another key that hashes to the same address.
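As a small illustration of synonyms, the sketch below assumes a hypothetical hash function (the remainder of the key divided by the table size); any other hash function could be substituted.

```cpp
// Hypothetical hash function used only for illustration:
// the remainder of the key divided by the table size.
int hashOf(int key, int tableSize) {
    return key % tableSize;
}

// Two distinct keys are synonyms if they hash to the same address.
// Inserting one after the other causes a collision.
bool areSynonyms(int k1, int k2, int tableSize) {
    return k1 != k2 && hashOf(k1, tableSize) == hashOf(k2, tableSize);
}
```

With a table of size 7, keys 15 and 22 both map to address 1, so inserting 22 after 15 produces a collision; keys 15 and 16 map to different addresses and do not collide.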
Hash functions
Direct Hashing
The address is the key itself:
hash(Key) = Key
• Advantage: there is no collision.
• Disadvantage: the address space (storage size) is as large as the key space.
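A minimal sketch of a direct-hash table follows, assuming keys are integers in the range 0 to keySpace - 1. The class name `DirectTable` and its members are hypothetical and chosen for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Direct-hash table: the record with key k is stored at address k.
// Assumes every key lies in [0, keySpace).
class DirectTable {
public:
    explicit DirectTable(std::size_t keySpace)
        : used_(keySpace, false), values_(keySpace) {}

    static std::size_t hash(std::size_t key) { return key; }  // hash(Key) = Key

    void insert(std::size_t key, const std::string& value) {
        values_.at(hash(key)) = value;  // at() throws if the key is outside the key space
        used_[hash(key)] = true;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::string* search(std::size_t key) const {
        if (key >= used_.size() || !used_[key]) return nullptr;
        return &values_[key];
    }

    // The storage size always equals the size of the key space.
    std::size_t addressSpace() const { return values_.size(); }

private:
    std::vector<bool> used_;
    std::vector<std::string> values_;
};
```

Both operations touch exactly one slot and no two keys share a slot, but a table for 100 possible keys reserves 100 slots even if only one record is stored.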
Modulo division
Address = Key mod listSize
• Collisions tend to be fewer if listSize is a prime number.
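The formula can be sketched in a few lines. The function name `moduloHash` and the sample values below are illustrative assumptions.

```cpp
#include <cassert>

// Address = Key mod listSize; the result always lies in [0, listSize).
unsigned long moduloHash(unsigned long key, unsigned long listSize) {
    assert(listSize > 0);  // division by zero is undefined
    return key % listSize;
}
```

For example, key 121267 with listSize 307 gives address 2. The keys 10, 20, 30, 40 all map to address 0 when listSize is 10, but to the distinct addresses 10, 9, 8, 7 when listSize is the prime 11, which illustrates why a prime list size tends to spread patterned keys more evenly.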
Mid-square
Address = middle digits of $Key^2$
Example:
$9452 \times 9452 = 89340304$; the middle four digits give the address 3403.
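The example can be reproduced with a short sketch. It assumes the square fits in an `unsigned long long` and works on the decimal digits of the square; the function name `midSquare` and the `digits` parameter are illustrative.

```cpp
#include <cstddef>
#include <string>

// Mid-square: square the key and take `digits` decimal digits from the
// middle of the square. When the middle cannot be centred exactly, the
// window leans one digit to the left.
unsigned long long midSquare(unsigned long long key, std::size_t digits) {
    const unsigned long long square = key * key;  // assumed not to overflow
    const std::string sq = std::to_string(square);
    if (sq.size() <= digits) return square;       // square too short: use all of it
    const std::size_t start = (sq.size() - digits) / 2;
    return std::stoull(sq.substr(start, digits));
}
```

Calling `midSquare(9452, 4)` squares the key to 89340304 and extracts the four digits starting at position 2, which yields 3403 as on the slide.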
Folding
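The key is divided into parts whose size matches the address size; the parts are then added together to form the address, and any carry beyond the address size is discarded.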
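One common variant, usually called fold shift, adds the parts and discards any carry beyond the address size. The sketch below assumes that variant and takes the key as a string of decimal digits; the function name `foldShift` is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Fold shift (assumed variant): split the key's decimal digits, left to
// right, into parts of `addressDigits` digits, add the parts, and keep
// only the lowest `addressDigits` digits of the sum.
unsigned long foldShift(const std::string& keyDigits, std::size_t addressDigits) {
    assert(addressDigits > 0);

    unsigned long modulus = 1;  // 10^addressDigits
    for (std::size_t i = 0; i < addressDigits; ++i) modulus *= 10;

    unsigned long sum = 0;
    for (std::size_t pos = 0; pos < keyDigits.size(); pos += addressDigits) {
        // substr() stops at the end of the string, so the last part may be shorter.
        sum += std::stoul(keyDigits.substr(pos, addressDigits));
    }
    return sum % modulus;  // discard the carry
}
```

For the key 123456789 and a three-digit address, the parts are 123, 456 and 789; their sum is 1368, and discarding the carry leaves the address 368.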
Collision resolution
• Except for direct hashing, none of the hash functions above is a one-to-one mapping, so collisions can occur.
→ Collision resolution methods are required.
• Each collision resolution method can be used independently of the hash function chosen.
Open addressing
When a collision occurs, an unoccupied element of the table is searched for, and the new element is placed in it.
Algorithm hashInsert(ref T <array>, val k <key>)
Inserts key k into table T.
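The body of the algorithm is not reproduced in the text, so the following is only a sketch of one possible C++ realisation. It assumes non-negative integer keys, a sentinel value `EMPTY` for unoccupied slots, and a linear probe function `hp`; all three are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;  // assumed sentinel for an unoccupied slot; keys must be non-negative

// Assumed probe function: the address tried on probe i for key k in a table of size m.
std::size_t hp(int k, std::size_t i, std::size_t m) {
    return (static_cast<std::size_t>(k) % m + i) % m;
}

// Inserts key k into table T.
// Returns the address where k was stored, or -1 if every slot is occupied.
int hashInsert(std::vector<int>& T, int k) {
    const std::size_t m = T.size();
    for (std::size_t i = 0; i < m; ++i) {
        const std::size_t j = hp(k, i, m);
        if (T[j] == EMPTY) {
            T[j] = k;
            return static_cast<int>(j);
        }
    }
    return -1;  // table overflow
}
```

With a table of size 7, inserting 10, 17 and 24 in that order stores them at addresses 3, 4 and 5, because all three keys share the home address 3. This sketch does not check for duplicate keys.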
Algorithm hashSearch(val T <array>, val k <key>)
Searches for key k in table T.
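As with insertion, the body of the search algorithm is not reproduced in the text. A possible sketch follows, under the same illustrative assumptions: non-negative integer keys, an `EMPTY` sentinel, a linear probe function `hp`, and no deletions from the table.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;  // assumed sentinel for an unoccupied slot; keys must be non-negative

// Assumed probe function: the address tried on probe i for key k in a table of size m.
std::size_t hp(int k, std::size_t i, std::size_t m) {
    return (static_cast<std::size_t>(k) % m + i) % m;
}

// Searches for key k in table T by following the same probe sequence
// that insertion would follow.
// Returns the address of k, or -1 if k is not in the table.
int hashSearch(const std::vector<int>& T, int k) {
    const std::size_t m = T.size();
    for (std::size_t i = 0; i < m; ++i) {
        const std::size_t j = hp(k, i, m);
        if (T[j] == k) return static_cast<int>(j);
        if (T[j] == EMPTY) return -1;  // an empty slot ends the probe sequence
    }
    return -1;  // every slot was probed without finding k
}
```

The search can stop at the first empty slot because an insertion of k would have used that slot. This reasoning holds only if keys are never removed; supporting deletion would need an extra marker for deleted slots.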
Linear Probing
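• When a home address is occupied, go to the next address (the current address + 1), wrapping around at the end of the table:
$hp(k, i) = (h(k) + i) \bmod m$
where $h(k)$ is the home address of key $k$, $i$ is the probe number (0, 1, 2, ...), and $m$ is the table size.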
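• Advantages:
  • quite simple to implement
  • data tend to remain near their home address (significant for disk addresses)
• Disadvantages:
  • produces primary clustering: occupied addresses build up into long contiguous runs, so later insertions and searches need more probes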
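Linear probing and the clustering it produces can be observed with a small experiment. The sketch assumes h(k) = k mod m as the home-address function, non-negative integer keys, and a sentinel `EMPTY_SLOT`; these names are illustrative.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY_SLOT = -1;  // assumed sentinel; keys must be non-negative

// hp(k, i) = (h(k) + i) mod m, with h(k) = k mod m assumed.
std::size_t linearProbe(int k, std::size_t i, std::size_t m) {
    return (static_cast<std::size_t>(k) % m + i) % m;
}

// Inserts k using linear probing.
// Returns the number of probes used, or 0 if the table is full.
std::size_t insertLinear(std::vector<int>& table, int k) {
    const std::size_t m = table.size();
    for (std::size_t i = 0; i < m; ++i) {
        const std::size_t j = linearProbe(k, i, m);
        if (table[j] == EMPTY_SLOT) {
            table[j] = k;
            return i + 1;
        }
    }
    return 0;
}
```

In a table of size 11, the synonyms 22, 33 and 44 (home address 0) occupy addresses 0, 1 and 2. Key 12 has a different home address (1), yet it needs three probes and lands at address 3, because it runs into the cluster built by the other keys.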
Quadratic Probing
• The address increment is the collision probe number squared:
$hp(k, i) = (h(k) + i^2) \bmod m$
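• Advantages:
  • generally works better than linear probing, since it avoids primary clustering
• Disadvantages:
  • time required to square numbers
  • produces secondary clustering: keys with the same home address follow the same probe sequence,
    $h(k_1) = h(k_2) \Rightarrow hp(k_1, i) = hp(k_2, i)$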
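The quadratic probe sequence and the secondary clustering it causes can be sketched as follows, assuming h(k) = k mod m as the home-address function. The function names are illustrative.

```cpp
#include <cstddef>
#include <vector>

// hp(k, i) = (h(k) + i^2) mod m, with h(k) = k mod m assumed.
std::size_t quadraticProbe(std::size_t k, std::size_t i, std::size_t m) {
    return (k % m + i * i) % m;
}

// The first `count` addresses of the probe sequence for key k.
std::vector<std::size_t> probeSequence(std::size_t k, std::size_t count, std::size_t m) {
    std::vector<std::size_t> seq;
    for (std::size_t i = 0; i < count; ++i) {
        seq.push_back(quadraticProbe(k, i, m));
    }
    return seq;
}
```

With m = 11, keys 22 and 33 share the home address 0 and therefore the same sequence 0, 1, 4, 9, which is the secondary clustering stated above. Key 12, whose home address is 1, follows the different sequence 1, 2, 5, 10.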
Double Hashing
• Using two hash functions:
$hp(k, i) = (h_1(k) + i \cdot h_2(k)) \bmod m$
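The slides do not specify the two hash functions, so the sketch below assumes a common textbook choice: h1(k) = k mod m and h2(k) = 1 + (k mod (m - 1)), with m prime. The second function never returns 0, so every probe moves to a new address.

```cpp
#include <cstddef>

// Assumed primary hash: the home address.
std::size_t h1(std::size_t k, std::size_t m) { return k % m; }

// Assumed secondary hash: the step size, always in [1, m - 1]. Requires m >= 2.
std::size_t h2(std::size_t k, std::size_t m) { return 1 + k % (m - 1); }

// hp(k, i) = (h1(k) + i * h2(k)) mod m
std::size_t doubleHashProbe(std::size_t k, std::size_t i, std::size_t m) {
    return (h1(k, m) + i * h2(k, m)) % m;
}
```

With m = 11, keys 22 and 33 share the home address 0 but use step sizes 3 and 4, giving the sequences 0, 3, 6, 9 and 0, 4, 8, 1. Synonyms therefore no longer follow the same probe sequence, unlike in quadratic probing.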
Linked List Resolution
• Major disadvantage of open addressing: each collision resolution increases the probability of future collisions.
→ Use linked lists to store synonyms.
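A minimal sketch of linked list resolution follows, assuming non-negative integer keys and h(k) = k mod m as the hash function. The class name `ChainedTable` and its members are hypothetical.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Each table address holds a linked list of the synonyms that hash there.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t m) : buckets_(m) {}

    // Adds the key to the front of the list at its home address.
    void insert(int key) { buckets_[home(key)].push_front(key); }

    // Only the list at the key's home address needs to be scanned.
    bool search(int key) const {
        for (int k : buckets_[home(key)]) {
            if (k == key) return true;
        }
        return false;
    }

    // Number of synonyms stored at the given address.
    std::size_t chainLength(std::size_t address) const {
        return buckets_.at(address).size();
    }

private:
    // Assumed hash function: key mod table size (keys must be non-negative).
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % buckets_.size();
    }

    std::vector<std::list<int>> buckets_;
};
```

With a table of size 7, keys 10, 17 and 24 all end up in the list at address 3, while key 5 sits alone at address 5. A collision lengthens only one list and never takes a slot that belongs to another home address.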