-
Notifications
You must be signed in to change notification settings - Fork 121
/
chapterHeap.tex
128 lines (109 loc) · 4.2 KB
/
chapterHeap.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
\chapter{Heap}
\section{Introduction}
Heap-ordered. Binary heap is one of the implementations of Priority Queue (ADT). The core relationship of elements in the heap:
$A_{2i} \leq A_{i} \geq A_{2i+1}$.
\begin{figure}[hbtp]
\centering
\subfloat{\includegraphics[width=\linewidth]{heap}}
\caption{Heap}
\label{fig:heap}
\end{figure}
\section{Operations}
Assume the root \textbf{starts} at $a[1]$ rather than $a[0]$.
\\
Basic operations:
\begin{enumerate}
\item sink()/ sift\_down() - recursive
\item swim()/ sift\_up() - recursive
\item build()/ heapify() - bottom-up sink()
\end{enumerate}
\subsection{Sink (sift\_down)}
Core clue: compare parent to the \textit{larger} child (because we want to maintain the heap invariant).
\begin{python}
def sink(self, idx):
while 2*idx <= self.N:
c = 2*idx
if c+1 <= self.N and self.less(c, c+1):
c += 1
if not self.less(idx, c):
return
self.swap(idx, c)
idx = c
\end{python}
We can return the \pyinline{idx} at the end to report the final index of the element.
\subsection{Swim (sift\_up)}
Core clue: compare child to its parent.
\begin{python}
def swim(self, idx):
while idx > 1 and self.less(idx/2, idx):
pi = idx/2
self.swap(pi, idx)
idx = pi
\end{python}
\subsection{Heapify}
Core clue: bottom-up sink().
\begin{python}
def heapify(self):
for i in range(self.N/2, 0, -1):
self.sink(i);
\end{python}
\runinhead{Complexity.} Heapifying \textbf{a sorted array} is the worst case for heap construction, because the root of each subheap considered sinks all the way to the bottom. The worst case complexity $\sim 2N$.
Building a heap is $O(N)$ rather than $O(N \lg N)$. Intuitively, the deeper the level, the more the nodes, but the less the level to sink down.
At most $\big\lceil\frac{n}{2^{h+1}}\big\rceil$ nodes of any height $h$.
Proof:
\begin{align*}
\because \sum_{i=0}^{+\infty} {ix^i} =\frac{x}{(1-x)^2} \\
\therefore \sum_{h=0}^{\lfloor\lg n\rfloor}{\Big\lceil\frac{n}{2^{h+1}}\Big\rceil
O(h)} &= O\Bigg(n\sum_{h=0}^{\lfloor\lg n\rfloor}{\frac{h}{2^h}}\Bigg) \\
&= O(n)
\end{align*}
\section{Implementation}
\subsection{General}
The self-implemented binary heap's index usually starts at 1 rather than 0.
The array representation of heap is in \textbf{level-order}.
The main reason that we can use an array to represent the heap-ordered tree in a binary heap is because the tree is \textbf{complete}.
Suppose that we represent a BST containing N keys using an array, with $a[0]$ empty, the root at $a[1]$. The two children of $a[k]$ will be at $a[2k]$ and $a[2k+1]$. Then, the length of the array might need to be as large as $2^N$.
It is possible to have 3-heap. A 3-heap is an array representation (using 1-based indexing) of a complete 3-way tree.
The children of $a[k]$ are $a[3k-1]$, $a[3k]$, and $a[3k+1]$.
\begin{figure}[hbtp]
\centering
\subfloat{\includegraphics[width=\linewidth]{heapRepr}}
\caption{Heap representation}
\label{fig:heap}
\end{figure}
\subsection{Python Heapq}
Python only has built in min-heap. To use max-heap, you can:
\begin{enumerate}
\item Invert the number: 1 becomes -1.
(usually the best solution)\item Wrap the data into another class and override \textbf{comparators}: \_\_cmp\_\_ or \_\_lt\_\_
\end{enumerate}
The following code presents the wrapping method:
\begin{python}
class HeapValue:
def __init__(self, val):
self.val = val
self.deleted = False # lazy delete
def __cmp__(self, other):
# Reverse order by height to get max-heap
assert isinstance(other, Value)
return other.val - self.val
\end{python}
Normally the deletion by value in Python is $O(n)$, to achieve $O(\lg n)$ we can use \textbf{lazy deletion}. Before take the top of the heap, we do the following:
\begin{python}
while heap and heap[0].deleted:
heapq.heappop(heap)
\end{python}
\subsection{Java Priority Queue}
\begin{java}
// min-heap
PriorityQueue<Integer> pq = new PriorityQueue<>(
(o1, o2) -> o1-o2
);
// max-heap
PriorityQueue<Integer> pq = new PriorityQueue<>(
(o1, o2) -> o2-o1
);
\end{java}
\section{Derivatives}
\subsection{Heap of Linked Lists}
Maintain a heap of linked lists, pop the min head, and push the head's next back to the heap.