10.8 Hash Table
A hash table
(also known as a hash map) uses space complexity to store data. By leveraging a hash function, it maps positions to achieve approximate time complexity for insertion, lookup, and deletion operations. Hash tables can be utilized for tasks such as frequency counting and content recording.
If the elements are finite and their range is small, a fixed-size array can be used to store or count elements. For instance, to count the occurrences of all letters in a string, you can use an array of length 26, where the hash function maps each letter to its position in the alphabet. This approach reduces the space complexity to a constant.
1. Two Sum
Problem Description
Given an (unsorted) array of integers, find the indices of two numbers that add up to a specific target. Assume that there is exactly one solution.
Input and Output Example
Input is a one-dimensional integer array and a target value. Output is a one-dimensional array of size 2, representing the indices of the two numbers.
Input: nums = [2, 7, 15, 11], target = 9
Output: [0, 1]
In this example, the value at index 0 (2) and the value at index 1 (7) sum up to 9.
Solution Explanation
We can use a hash table to store values we have already seen along with their indices. For each index i
, we check if target - nums[i]
exists in the hash table. If it does, it means the two numbers sum up to the target.
- C++
- Python
vector<int> twoSum(vector<int>& nums, int target) {
unordered_map<int, int> cache; // <value, index>
for (int i = 0; i < nums.size(); ++i) {
int num1 = nums[i], num2 = target - num1;
if (cache.contains(num2)) {
return vector<int>{cache[num2], i};
}
cache[num1] = i;
}
return {};
}
def twoSum(nums: List[int], target: int) -> List[int]:
cache = dict() # <value, index>
for i, num1 in enumerate(nums):
num2 = target - num1
if num2 in cache:
return [cache[num2], i]
cache[num1] = i
return []
128. Longest Consecutive Sequence
Problem Description
Given an array of integers, find the length of the longest consecutive sequence that can be formed using the numbers in the array.
Input and Output Example
Input is an array of integers, and output is an integer representing the length of the consecutive sequence.
Input: [100, 4, 200, 1, 3, 2]
Output: 4
In this example, the longest consecutive sequence is [1,2,3,4].
Solution Explanation
We can put all the numbers into a hash table and repeatedly pick any value from the hash table. For each value, we remove all its consecutive numbers before and after it, and update the length of the longest consecutive sequence. Repeating this process allows us to find all consecutive number sequences.
- C++
- Python
int longestConsecutive(vector<int>& nums) {
unordered_set<int> cache(nums.begin(), nums.end());
int max_len = 0;
while (!cache.empty()) {
int cur = *(cache.begin());
cache.erase(cur);
int l = cur - 1, r = cur + 1;
while (cache.contains(l)) {
cache.erase(l--);
}
while (cache.contains(r)) {
cache.erase(r++);
}
max_len = max(max_len, r - l - 1);
}
return max_len;
}
def longestConsecutive(nums: List[int]) -> int:
cache = set(nums)
max_len = 0
while len(cache) > 0:
cur = next(iter(cache))
cache.remove(cur)
l, r = cur - 1, cur + 1
while l in cache:
cache.remove(l)
l -= 1
while r in cache:
cache.remove(r)
r += 1
max_len = max(max_len, r - l - 1)
return max_len
149. Max Points on a Line
Problem Description
Given some points in a 2D plane, find the maximum number of points that lie on the same straight line.
Input and Output Example
Input is a 2D integer array representing the x and y coordinates of each point. Output is an integer representing the maximum number of points on the same line.
Input: [[1,1],[3,2],[5,3],[4,1],[2,3],[1,4]]
^
|
| o
| o o
| o
| o o
+------------------->
0 1 2 3 4 5 6
Output: 4
In this example, the line contains four points.
Solution Explanation
For each point, we use a hash table to count the number of points with the same slope relative to that point. A line is uniquely determined by one point and its slope. Additionally, we must account for cases where the slope does not exist and for duplicate coordinates.
This problem also uses a small optimization: when iterating through each point at position i
, we only consider points after i
since points before i
have already been considered.
- C++
- Python
int maxPoints(vector<vector<int>>& points) {
int max_count = 0, n = points.size();
for (int i = 0; i < n; ++i) {
unordered_map<double, int> cache; // <slope, number of points>
int same_xy = 1, same_y = 1;
for (int j = i + 1; j < n; ++j) {
if (points[i][1] == points[j][1]) {
++same_y;
if (points[i][0] == points[j][0]) {
++same_xy;
}
} else {
double dx = points[i][0] - points[j][0],
dy = points[i][1] - points[j][1];
++cache[dx / dy];
}
}
max_count = max(max_count, same_y);
for (auto item : cache) {
max_count = max(max_count, same_xy + item.second);
}
}
return max_count;
}
def maxPoints(points: List[List[int]]) -> int:
max_count, n = 0, len(points)
for i, point1 in enumerate(points):
cache = dict() # <slope, number of points>
same_xy, same_y = 1, 1
for point2 in points[i + 1:]:
if point1[1] == point2[1]:
same_y += 1
if point1[0] == point2[0]:
same_xy += 1
else:
dx, dy = point1[0] - point2[0], point1[1] - point2[1]
cache[dx / dy] = cache.get(dx / dy, 0) + 1
max_count = max(max_count, same_y)
for count in cache.values():
max_count = max(max_count, same_xy + count)
return max_count