Problem
We are given an integer array nums and an integer k. We need to return the k most frequent elements in the array. The answer is guaranteed to be unique, and we can return the result in any order.
Example 1
Input: nums = [1,2,2,3,3,3], k = 2
Output: [2,3]Explanation:
1appears once2appears twice3appears three times
The two most frequent elements are 2 and 3.
Example 2
Input: nums = [7,7], k = 1
Output: [7]Approach 1: Sorting by Frequency
The simplest way to solve this problem is:
- Count how many times each number appears.
- Store each number with its frequency.
- Sort by frequency.
- Pick the top
kelements.
from typing import List
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
count = {}
for num in nums:
count[num] = 1 + count.get(num, 0)
arr = []
for num, cnt in count.items():
arr.append([cnt, num])
arr.sort()
res = []
while len(res) < k:
res.append(arr.pop()[1])
return resExplanation
First, we build a frequency map:
count[num] = 1 + count.get(num, 0)For example:
nums = [1,2,2,3,3,3]The frequency map becomes:
{
1: 1,
2: 2,
3: 3
}Then we store the frequency first and the number second:
[frequency, number]So the array becomes:
[[1,1], [2,2], [3,3]]After sorting, the largest frequencies are at the end. We keep popping from the end until we have k elements.
Complexity
Let n be the length of nums.
Time complexity:
O(n log n)Space complexity:
O(n)Approach 2: Min Heap
Instead of sorting all elements, we can keep only the top k frequent elements in a heap.
Python’s heapq is a min heap. That means the smallest frequency stays at the top.
If the heap size becomes greater than k, we remove the least frequent element.
import heapq
from typing import List
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
count = {}
for num in nums:
count[num] = 1 + count.get(num, 0)
heap = []
for num in count.keys():
heapq.heappush(heap, (count[num], num))
if len(heap) > k:
heapq.heappop(heap)
res = []
for i in range(k):
res.append(heapq.heappop(heap)[1])
return resExplanation
We first count the frequency of every number.
Then, for each unique number, we push this into the heap:
(frequency, number)If the heap size becomes greater than k, we remove the smallest frequency:
heapq.heappop(heap)This ensures the heap always stores only the k most frequent elements.
Complexity
Let n be the length of nums, and m be the number of unique elements.
Time complexity:
O(n + m log k)Space complexity:
O(n)This is better than sorting when k is much smaller than the number of unique elements.
Approach 3: Bucket Sort
This is the most optimal approach.
The maximum possible frequency of any number is len(nums). So we can create buckets where the index represents the frequency.
For example:
freq[3]will store all numbers that appear 3 times.
from typing import List
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
count = {}
freq = [[] for num in range(len(nums) + 1)]
for num in nums:
count[num] = 1 + count.get(num, 0)
for num, cnt in count.items():
freq[cnt].append(num)
res = []
for i in range(len(freq) - 1, 0, -1):
for num in freq[i]:
res.append(num)
if len(res) == k:
return resExplanation
First, we count the frequency of each number:
{
1: 1,
2: 2,
3: 3
}Then we place each number into a bucket based on its frequency:
freq[1] = [1]
freq[2] = [2]
freq[3] = [3]Now we iterate from the highest possible frequency down to 1.
The first elements we find are the most frequent ones.
For example:
nums = [1,2,2,3,3,3]
k = 2We check:
freq[6]
freq[5]
freq[4]
freq[3] -> [3]
freq[2] -> [2]So the answer is:
[3,2]This is valid because the output can be in any order.
Complexity
Time complexity:
O(n)Space complexity:
O(n)This is the best solution because we avoid sorting and heap operations.
Final Thoughts
There are multiple ways to solve this problem.
The sorting solution is the easiest to understand.
The heap solution is useful when we only want to keep track of the top k elements.
The bucket sort solution is the most efficient because it runs in linear time.
For interviews, the bucket sort approach is usually the best answer because it gives us: O(n) time complexity.