Heap

Task management

How does an operating system manage the processes that are running at the same time?

For example: listening to music, downloading files, and writing an essay.

Queue

  • A short task may have to wait a long time behind earlier tasks.
  • A high-priority task cannot jump ahead of the tasks already queued.

ArrayList

Sort the whole list every time a new task arrives? O(NlogN) per arrival.
Or insert each new task at its sorted position? O(N) per arrival, because elements must be shifted.

Heap (min, max)

Very fast for getting the lowest/highest priority element.
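As a minimal sketch of heap-based task management, the snippet below keeps tasks in Python's heapq (a min-heap), so the task with the smallest priority number always comes out first. The task names come from the example above; the priority numbers and the helper names add_task and next_task are assumptions made for illustration.

```python
import heapq
import itertools

tasks = []                    # heap of (priority, arrival order, name)
arrival = itertools.count()   # tie-breaker: earlier arrivals come out first

def add_task(name, priority):
    """Queue a task; a smaller priority number means more urgent (assumed convention)."""
    heapq.heappush(tasks, (priority, next(arrival), name))

def next_task():
    """Remove and return the name of the most urgent task."""
    _, _, name = heapq.heappop(tasks)
    return name

add_task("download files", 3)
add_task("write essay", 2)
add_task("listen to music", 1)

order = [next_task() for _ in range(3)]
print(order)
```

Each add_task and next_task call costs O(logN), regardless of the order in which tasks arrive.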

Heap (min, max)

  • Specialized tree-based data structure.
  • Heap property: if A is the parent of B, the key of A is ordered with respect to the key of B, and the same ordering holds for every parent-child pair in the heap.
    • min-heap: every parent is less than or equal to its children, so the root is the minimum.
    • max-heap: every parent is greater than or equal to its children, so the root is the maximum.
  • A common implementation is the binary heap, which is based on a complete binary tree.

Heap (min, max)

max-heap

       100
      /    \
    19      36
   /  \    /
 17    3  25

min-heap

        2
      /    \
    19      3
   /  \    /  \
 20   31  5    8

Heap (min, max)

  • Min-Heap
    • Root is the smallest node of the whole tree.
  • Max-Heap
    • Root is the largest node of the whole tree.
  • Complete Binary Tree
    • A complete binary tree has no gaps, so it can be stored level by level in a plain array. If you build a heap yourself, an array is all you need.
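A minimal sketch of the array layout, assuming 0-based indexing: the children of index i sit at 2i + 1 and 2i + 2, and its parent at (i - 1) // 2. The two arrays are the max-heap and min-heap trees drawn earlier, read level by level; the helper names are illustrative.

```python
def parent(i):
    return (i - 1) // 2

def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

def is_max_heap(a):
    """Return True if every element is <= its parent."""
    return all(a[parent(i)] >= a[i] for i in range(1, len(a)))

def is_min_heap(a):
    """Return True if every element is >= its parent."""
    return all(a[parent(i)] <= a[i] for i in range(1, len(a)))

max_heap = [100, 19, 36, 17, 3, 25]
min_heap = [2, 19, 3, 20, 31, 5, 8]

print(is_max_heap(max_heap), is_min_heap(min_heap))
print(max_heap[left(1)], max_heap[right(1)])   # children of 19
```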

Basic Operations

  • insert, (offer)
  • delete
    • delete root, (poll)
  • initialize

Basic Operations

  • insert
    • Append the new element at the end of the array (the next free leaf).
    • Then bubble it up: swap it with its parent while it violates the heap order.
    • It rises at most to the root, so the number of swaps is bounded by the depth, log(N).
    • O(logN)

Basic Operations

  • insert -> bubble up

                90

          /             \

       70             50

     /      \         /    \

  65      44     30 20

  /  \     / 

35 21 8 

Insert 80: append it as the last leaf, then bubble it up.

                90

          /             \

       70             50

     /      \         /    \

  65      44     30 20

  /  \     /  \   

35 21 8 80  

                90

          /             \

       80             50

     /      \         /    \

  65      70     30 20

  /  \     /  \   

35 21 8 44  
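The bubble-up steps drawn above can be sketched as follows for a max-heap stored in a list. The function name heap_insert is an assumption for illustration; the data is the tree from the diagram, read level by level.

```python
def heap_insert(heap, value):
    """Insert value into a max-heap stored as a list, in place."""
    heap.append(value)              # the new element becomes the last leaf
    i = len(heap) - 1
    while i > 0:
        p = (i - 1) // 2            # index of the parent
        if heap[p] >= heap[i]:
            break                   # heap order holds, stop
        heap[p], heap[i] = heap[i], heap[p]
        i = p

heap = [90, 70, 50, 65, 44, 30, 20, 35, 21, 8]
heap_insert(heap, 80)
print(heap)
```

80 swaps with 44 and then with 70, and stops below 90, which matches the last tree in the diagram.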

Basic Operations

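  • insert
  • delete (pop, poll)
    • Swap the node to delete with the last element, drop the last position, then sift the swapped-in element down.
    • After swapping, it sifts down at most to a leaf, so the number of swaps is bounded by the depth, log(N).
    • O(logN).
    • delete root, O(logN)
      • O(1) to get the largest/smallest element.
      • O(logN) to restore the heap order.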

Basic Operations

  • insert
  • delete

                90

          /             \

       70             50

     /      \         /    \

  65      44     30 20

  /  \     / 

35 21 8 

                90

          /             \

        8              50

     /      \         /    \

  65      44     30 20

  /  \      

35 21  

                90

          /             \

       65             50

     /      \         /    \

  35      44     30 20

  /  \      

 8  21  
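A minimal sketch of deletion from a max-heap stored in a list, assuming 0-based indexing. The function names are illustrative. The first call removes 70 (index 1) as in the diagram above; the second removes the root. The sift_up call covers the case where the swapped-in element is larger than its new parent, which can happen when deleting a non-root node.

```python
def sift_down(heap, i):
    """Move heap[i] down until both children are no larger than it."""
    n = len(heap)
    while True:
        largest = i
        for c in (2 * i + 1, 2 * i + 2):
            if c < n and heap[c] > heap[largest]:
                largest = c
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest

def sift_up(heap, i):
    """Move heap[i] up while it is larger than its parent."""
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        p = (i - 1) // 2
        heap[p], heap[i] = heap[i], heap[p]
        i = p

def delete_at(heap, i):
    """Remove and return heap[i], keeping the max-heap order."""
    removed = heap[i]
    last = heap.pop()
    if i < len(heap):          # the deleted node was not the last position
        heap[i] = last
        sift_up(heap, i)
        sift_down(heap, i)
    return removed

heap = [90, 70, 50, 65, 44, 30, 20, 35, 21, 8]
first = delete_at(heap, 1)     # delete 70, as in the diagram
after_first = list(heap)
second = delete_at(heap, 0)    # delete the root (poll)
print(first, after_first)
print(second, heap)
```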

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        26

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        26

      /

   45

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        45

      /

   26

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        45

      /      \

   26      21

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        45

      /      \

   26      21

   /

37

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        45

      /      \

   37      21

   /

26

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        45

      /      \

   37      21

   / \

26 89

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        45

      /      \

   89      21

   / \

26 37

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        89

      /      \

   45      21

   / \

26 37

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        89

      /      \

   45      21

   / \      /

26 37 12

Basic Operations

  • insert
  • delete
  • initialize
    • Insert elements one by one, O(NlogN).
    • 26, 45, 21, 37, 89, 12, 9

        89

      /      \

   45      21

   / \      /  \

26 37 12  9
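The one-by-one construction above can be reproduced with Python's heapq. Since heapq is a min-heap, this sketch pushes negated values to imitate a max-heap; that negation trick is an assumption of the example, not part of the slides.

```python
import heapq

values = [26, 45, 21, 37, 89, 12, 9]

heap = []
for v in values:
    heapq.heappush(heap, -v)    # each push bubbles up, O(logN)

as_max_heap = [-x for x in heap]
print(as_max_heap)
```

Read level by level, the resulting array is the final tree of the walkthrough: 89 at the root, then 45 and 21, then 26, 37, 12, 9.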

Basic Operations

  • insert
  • delete
  • initialize
    • sift down each non-leaf node, level by level from the bottom up, O(N)
    • 45, 36, 18, 53, 72, 30, 48, 93, 15, 35

              45

          /          \

        36        18

       /   \       /   \

   53     72 30 48

   /  \    /

93 15 35

Basic Operations

  • insert
  • delete
  • initialize
    • sift down level by level, O(N)
    • 45, 36, 18, 53, 72, 30, 48, 93, 15, 35

              45

          /          \

        36        18

       /   \       /   \

   53     72 30 48

   /  \    /

93 15 35

              45

          /          \

        36        18

       /   \       /   \

   93     72 30 48

   /  \    /

53 15 35

Basic Operations

  • insert
  • delete
  • initialize
    • sift down level by level, O(N)
    • 45, 36, 18, 53, 72, 30, 48, 93, 15, 35

              45

          /          \

        36        18

       /   \       /   \

   93     72 30 48

   /  \    /

53 15 35

              45

          /          \

        36        48

       /   \       /   \

   93     72 30 18

   /  \    /

53 15 35

Basic Operations

  • insert
  • delete
  • initialize
    • sift down level by level, O(N)
    • 45, 36, 18, 53, 72, 30, 48, 93, 15, 35

              45

          /          \

        36        48

       /   \       /   \

   93     72 30 18

   /  \    /

53 15 35

              45

          /          \

        93        48

       /   \       /   \

   36     72 30 18

   /  \    /

53 15 35

              45

          /          \

        93        48

       /   \       /   \

   53     72 30 18

   /  \    /

36 15 35

Basic Operations

  • insert
  • delete
  • initialize
    • sift down level by level, O(N)
    • 45, 36, 18, 53, 72, 30, 48, 93, 15, 35

              45

          /          \

        93        48

       /   \       /   \

   53     72 30 18

   /  \    /

36 15 35

              93

          /          \

        45        48

       /   \       /   \

   53     72 30 18

   /  \    /

36 15 35

              93

          /          \

        72        48

       /   \       /   \

   53     45 30 18

   /  \    /

36 15 35

Basic Operations

  • insert
  • delete
  • initialize
    • sift down level by level, O(N)
    • Last level: leaves do not need to move.
    • Second-to-last level: each node sifts down at most once; the level above it, at most twice; and so on up to the root.
T = \frac{n}{4} \times 1 + \frac{n}{8} \times 2 + \dots + 1 \times (\log n - 1)
2T = \frac{n}{2} \times 1 + \frac{n}{4} \times 2 + \dots + 2 \times (\log n - 1)
T = 2T - T = \frac{n}{2} + \frac{n}{4} + \frac{n}{8} + \dots + 2 - (\log n - 1) < n = O(n)
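A minimal sketch of the bottom-up construction for a max-heap, assuming 0-based array indexing. The function names are illustrative; the input is the sequence from the walkthrough above.

```python
def sift_down(a, i, n):
    """Move a[i] down within a[:n] until both children are no larger than it."""
    while True:
        largest = i
        for c in (2 * i + 1, 2 * i + 2):
            if c < n and a[c] > a[largest]:
                largest = c
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_max_heap(a):
    """Heapify a in place: sift down every non-leaf node, last one first."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # n // 2 - 1 is the last non-leaf
        sift_down(a, i, n)

data = [45, 36, 18, 53, 72, 30, 48, 93, 15, 35]
build_max_heap(data)
print(data)
```

The leaves (the second half of the array) are skipped entirely, which is why the total work stays linear.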

Basic Operations

  • insert, O(logN)
  • delete, O(logN)
  • initialize
    • Insert elements one by one, O(NlogN).
    • Sift down level by level from the bottom up, O(N).

Heap Summary

  • insert, O(logN)
  • delete, O(logN)
  • initialize, O(N)
  • Good at maintaining a dynamic data stream when the largest or smallest element (or the element at a certain rank) is needed at any time, but the data does not have to be fully sorted.

Merge Sorted Array/List

  • Two sorted arrays/lists
    • Two pointers: take the smaller of the two current elements and advance that pointer.
  • K sorted arrays/lists
    • Comparing k pointers by hand costs O(k) per output element.
    • Need a data structure that tracks the k current elements and returns the smallest (or largest) one efficiently.
    • Heap
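For the two-list case, the two-pointer merge can be sketched as below. The function name merge_two and the sample lists are assumptions for illustration.

```python
def merge_two(a, b):
    """Merge two ascending lists into one ascending list."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1              # advance the pointer at the smaller element
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])           # at most one of these two is non-empty
    out.extend(b[j:])
    return out

merged_two = merge_two([2, 5, 8], [-2, 0, 7, 9])
print(merged_two)
```

With k lists, the single comparison in the loop would become a scan over k candidates, which is the step a heap replaces.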

Merge k Sorted List

Merge k sorted linked lists and return it as one sorted list. Analyze and describe its complexity.

 

public ListNode merge(ListNode[] lists);

Merge k Sorted List

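Insert all non-empty head nodes into minHeap.

while (minHeap is not empty)

     root = minHeap.pop();

     Append root to result.

     If root.next exists, insert root.next into minHeap.

return result

Time: O(N log k) for N total nodes, since the heap never holds more than k nodes.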

Merge k Sorted List

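import heapq
from typing import List, Optional

# ListNode is assumed to be the usual singly linked list node with .val and .next.
class Solution:
    def mergeKLists(self, lists: List[Optional[ListNode]]) -> Optional[ListNode]:
        # Heap entries are (value, list index, node). The unique list index
        # breaks ties, so two ListNode objects are never compared directly.
        q = []
        for index, head in enumerate(lists):
            if head:
                heapq.heappush(q, (head.val, index, head))

        dummy = ListNode(-1)
        cur = dummy
        while q:
            _, i, node = heapq.heappop(q)
            cur.next = node   # append the smallest remaining node
            cur = cur.next
            if node.next:
                heapq.heappush(q, (node.next.val, i, node.next))

        return dummy.next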
            

Merge k Sorted List

This can be used in external merge sort: sort each chunk that fits in memory, then merge the sorted chunks.

In practice, a tournament tree is usually used instead of a heap for external merge sort.

Unsorted chunks:

10,2,8,5,20,30

7,9,0,-2,35,21

1,90,80,15,-1,6

Each chunk sorted:

2,5,8,10,20,30

-2,0,7,9,21,35

-1,1,6,15,80,90

Merged output:

-2,-1,0,1,2,5

6,7,8,9,10,15

20,21,30,35,80,90
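A small in-memory sketch of the same flow, using the three chunks above. It relies on heapq.merge from the Python standard library, which merges already-sorted inputs lazily with a heap; a real external sort would read and write the chunks on disk, which this sketch leaves out.

```python
import heapq

chunks = [
    [10, 2, 8, 5, 20, 30],
    [7, 9, 0, -2, 35, 21],
    [1, 90, 80, 15, -1, 6],
]

# Step 1: sort each chunk on its own (each chunk is assumed to fit in memory).
sorted_chunks = [sorted(chunk) for chunk in chunks]

# Step 2: k-way merge of the sorted chunks.
merged = list(heapq.merge(*sorted_chunks))
print(merged)
```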

Find Maximum Number

  • O(N) to find the maximum once.
  • What if the list is updated very often, i.e., every time the maximum is removed, a new number arrives?
    • Rescan: compare n times again, O(N) per update.
    • Sort? The order must be restored after every update.
      • O(NlogN) for the first sort, then O(N) per insertion into the sorted array.
      • Inserting a number is expensive.
    • Heap: O(logN) per removal or insertion.
      • Good at dealing with a data stream.
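A minimal sketch of the "remove the maximum, a new number arrives" pattern with Python's heapq. Because heapq is a min-heap, the values are stored negated; the helper name replace_max and the sample numbers are assumptions for illustration.

```python
import heapq

nums = [5, 1, 9, 3]
heap = [-x for x in nums]
heapq.heapify(heap)                      # O(N) build

def replace_max(heap, new_value):
    """Remove the current maximum, insert new_value, return the removed maximum."""
    return -heapq.heapreplace(heap, -new_value)   # one O(logN) operation

removed_first = replace_max(heap, 4)     # 9 leaves, 4 arrives
max_after_first = -heap[0]
removed_second = replace_max(heap, 10)   # 5 leaves, 10 arrives
max_after_second = -heap[0]
print(removed_first, max_after_first, removed_second, max_after_second)
```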

Sliding Window Maximum

Given an array nums, there is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position.

 

For example,

Given nums = [1,3,-1,-3,5,3,6,7], and k = 3.

Return [3,3,5,5,6,7].

 

Note:

You may assume k is always valid, i.e., 1 ≤ k ≤ the input array's size for a non-empty array.

Sliding Window Maximum

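import heapq
from typing import List

class Solution:
    def maxSlidingWindow(self, nums: List[int], k: int) -> List[int]:
        n = len(nums)
        # Note: Python's default priority queue (heapq) is a min-heap, so
        # store negated values. Each entry is (-value, index).
        q = [(-nums[i], i) for i in range(k)]
        heapq.heapify(q)

        ans = [-q[0][0]]
        for i in range(k, n):
            heapq.heappush(q, (-nums[i], i))
            # Lazily discard maxima whose index has left the window.
            while q[0][1] <= i - k:
                heapq.heappop(q)
            ans.append(-q[0][0])

        return ans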

Sliding Window Maximum

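from collections import deque
from typing import List

class Solution:
    def maxSlidingWindow(self, nums: List[int], k: int) -> List[int]:
        if not nums:
            return []

        queue = deque()  # indices; their values are in decreasing order
        result = [0] * (len(nums) - k + 1)

        for i in range(len(nums)):
            # Drop the front index once it falls out of the window.
            if queue and queue[0] == i - k:
                queue.popleft()

            # Smaller values before nums[i] can never be a window maximum.
            while queue and nums[queue[-1]] < nums[i]:
                queue.pop()

            queue.append(i)

            if i >= k - 1:
                result[i - k + 1] = nums[queue[0]]

        return result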

A better way: use a deque of indices to get O(n) time complexity, compared with O(nlogn) for the heap version.

Kth Largest Element in an Array

Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.

 

For example,
Given [3,2,1,5,6,4] and k = 2, return 5.

 

Note:
You may assume k is always valid, 1 ≤ k ≤ array's length.

Kth Largest Element in an Array

  • Find the largest element k times. O(kn)
  • Sort. O(nlogn)
  • Min-Heap
    • Build a min-heap with a capacity of k.
    • Feed all numbers through the heap.
      • While the heap is not full, insert the number.
      • Once the heap is full, replace the root only when the number is larger than the root.
    • The heap then holds the k largest numbers, and its root is the kth largest element.

Kth Largest Element in an Array

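import heapq
from typing import List

class Solution:
    def findKthLargest(self, nums: List[int], k: int) -> int:
        # Min-heap holding the k largest numbers seen so far.
        minHeap = nums[:k]
        heapq.heapify(minHeap)

        for i in range(k, len(nums)):
            if nums[i] > minHeap[0]:
                # Pop the root and push nums[i] in a single O(log k) step.
                heapq.heapreplace(minHeap, nums[i])

        # The smallest of the k largest numbers is the kth largest.
        return minHeap[0]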

Kth Largest Element in an Array

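  • Find the largest element k times. O(kn)
  • Sort. O(nlogn)
  • Min-Heap. O(nlogk)
  • Recursion (quickselect). O(n) on average
    • Randomly select one number as the pivot and partition the array around it to find its final position. O(n)
    • If pos equals the target position, the pivot is the answer.
    • If pos > target, recurse into the left part.
    • Otherwise, recurse into the right part.
    • Expected time: n + n/2 + n/4 + ... ≤ 2n = O(n); the worst case is O(n^2).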
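The recursive idea above is commonly known as quickselect. The sketch below is one possible iterative version with a random pivot and a Lomuto-style partition; the function name find_kth_largest and the choice to work on a copy in ascending order are assumptions of this example.

```python
import random
from typing import List

def find_kth_largest(nums: List[int], k: int) -> int:
    """Return the kth largest element; expected O(n), worst case O(n^2)."""
    items = list(nums)               # work on a copy
    target = len(items) - k          # index of the answer in ascending order
    lo, hi = 0, len(items) - 1
    while True:
        # Move a random pivot to the end, then partition items[lo:hi + 1].
        pivot_index = random.randint(lo, hi)
        items[pivot_index], items[hi] = items[hi], items[pivot_index]
        pivot = items[hi]
        store = lo
        for i in range(lo, hi):
            if items[i] < pivot:
                items[i], items[store] = items[store], items[i]
                store += 1
        items[store], items[hi] = items[hi], items[store]   # pivot's final position

        if store == target:
            return items[store]
        if store < target:
            lo = store + 1           # the answer is to the right of the pivot
        else:
            hi = store - 1           # the answer is to the left of the pivot

kth = find_kth_largest([3, 2, 1, 5, 6, 4], 2)
print(kth)
```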

Find Median in Data Stream

The median is the middle value of an ordered integer list. If the list has an even number of elements, there is no single middle value, so the median is the mean of the two middle values.

Examples:
[2,3,4] , the median is 3

[2,3], the median is (2 + 3) / 2 = 2.5

Design a data structure that supports the following two operations:

- void addNum(int num) - Add an integer from the data stream to the data structure.
- double findMedian() - Return the median of all elements so far.

Find Median in Data Stream

class MedianFinder {

   public void addNum(int num);

   public double findMedian();

}

Find Median in Data Stream

maxHeap (small nums)  <=  Median  <=  minHeap (big nums)

  • The sizes of minHeap and maxHeap should stay balanced (diff <= 1).
  • minHeap.peek >= maxHeap.peek

Find Median in Data Stream

maxHeap (small nums)  <=  Median  <=  minHeap (big nums)

  • If num > minHeap.peek
    • Add into minHeap
  • Otherwise (including when minHeap is empty)
    • Add into maxHeap
  • Maintain size diff: if one heap has 2 more elements than the other, move its root to the other heap.
  • Example: [4, 2, 3, 1, 5]

Find Median in Data Stream

Walking through [4, 2, 3, 1, 5]:

  • Add 4: minHeap is empty, so 4 goes into maxHeap.
    • maxHeap = [4], minHeap = []
  • Add 2: 2 goes into maxHeap; maxHeap now has 2 more elements, so its root 4 moves to minHeap.
    • maxHeap = [2], minHeap = [4]
  • Add 3: 3 is not greater than minHeap.peek (4), so it goes into maxHeap.
    • maxHeap = [3, 2], minHeap = [4]
  • Add 1: 1 goes into maxHeap; maxHeap now has 2 more elements, so its root 3 moves to minHeap.
    • maxHeap = [2, 1], minHeap = [3, 4]
  • Add 5: 5 is greater than minHeap.peek (3), so it goes into minHeap.
    • maxHeap = [2, 1], minHeap = [3, 4, 5]
  • minHeap has one more element, so the median is minHeap.peek = 3.

Find Median in Data Stream

import heapq

class MedianFinder:

    def __init__(self):
        self.minHeap = []  # larger half; root is its smallest value
        self.maxHeap = []  # smaller half, stored negated; root is its largest value

    def addNum(self, num: int) -> None:
        if len(self.minHeap) > 0 and num > self.minHeap[0]:
            heapq.heappush(self.minHeap, num)
        else:
            heapq.heappush(self.maxHeap, -num)

        # Rebalance so the sizes never differ by more than 1.
        if len(self.minHeap) - len(self.maxHeap) == 2:
            heapq.heappush(self.maxHeap, -heapq.heappop(self.minHeap))
        elif len(self.maxHeap) - len(self.minHeap) == 2:
            heapq.heappush(self.minHeap, -heapq.heappop(self.maxHeap))


    def findMedian(self) -> float:
        # With an odd count, the larger heap holds the median at its root.
        if len(self.minHeap) > len(self.maxHeap):
            return self.minHeap[0]
        if len(self.minHeap) < len(self.maxHeap):
            return -self.maxHeap[0]

        # Even count: average the two middle values (maxHeap root is negated).
        return (self.minHeap[0] - self.maxHeap[0]) / 2.0

Find Median in Data Stream

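import heapq

class MedianFinder:

    def __init__(self):
        self.minHeap = []  # larger half
        self.maxHeap = []  # smaller half, stored negated

    def addNum(self, num: int) -> None:
        # Pass num through minHeap first, so the value moved into maxHeap
        # is never larger than anything left in minHeap.
        heapq.heappush(self.minHeap, num)
        heapq.heappush(self.maxHeap, -heapq.heappop(self.minHeap))

        # Keep len(maxHeap) - len(minHeap) at 0 or 1.
        if len(self.maxHeap) - len(self.minHeap) > 1:
            heapq.heappush(self.minHeap, -heapq.heappop(self.maxHeap))


    def findMedian(self) -> float:
        if len(self.minHeap) == len(self.maxHeap):
            return (self.minHeap[0] - self.maxHeap[0]) / 2.0
        # Odd count: maxHeap holds the extra element, which is the median.
        return -self.maxHeap[0]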

Heap Summary

  • Data stream
    • The data needs to be maintained/updated all the time.
  • The max or min is needed, but full sorting is not.
  • Other use cases
    • Heap sort
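As a brief sketch of the heap sort use case: build a heap in O(N), then pop the minimum N times at O(logN) each, for O(NlogN) overall. This version uses Python's heapq and returns a new list rather than sorting in place, which is a simplification chosen for this example; the function name heap_sort is illustrative.

```python
import heapq

def heap_sort(items):
    """Return the items in ascending order using a min-heap."""
    heap = list(items)          # copy, so the input is left untouched
    heapq.heapify(heap)         # O(N) build
    return [heapq.heappop(heap) for _ in range(len(heap))]

result = heap_sort([26, 45, 21, 37, 89, 12, 9])
print(result)
```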