Segment Tree Intro

Use case:

Given an array , we want to:

  • find the sum of elements in
  • update the values of an elment in the array

the Segment Tree can process both queries in time.

The space complexity is in the worst case. However, it could be reduced to

Segment-Tree-Iterative

It is the internal arrangement of the segmentree (1-indexed) which is built by bottom-up iterative method. Few facts to mention:

  • the actual values are stored in
  • If the index of current item is , then the childs are

Implementation (recursive version)

Segment-Tree.drawio

  • The number in center of the cell is the index of vertex.
  • The range in purple is the left and right bound of the segment.

How to build tree?

Note that:

  • The left bound and right bound of the leaf node satisfy: left_bound == right_bound. And that’s where we end the recursion. The boundry of the original data is where is the number of elements.
  • The index of vertex starts from 1. So in the main function, we call the build function like build(data,1, 0, N-1)
1
2
3
4
5
6
7
8
9
10
void build(int a[], int v, int tl, int tr) {
if (tl == tr) {
t[v] = a[tl];
} else {
int tm = (tl + tr) / 2;
build(a, v*2, tl, tm);
build(a, v*2+1, tm+1, tr);
t[v] = t[v*2] + t[v*2+1];
}
}

Update the value?

  • pos is 0-indexed and determines which child node we go into. (acts like binary search)
1
2
3
4
5
6
7
8
9
10
11
12
void update(int v, int tl, int tr, int pos, int new_val) {
if (tl == tr) {
t[v] = new_val;
} else {
int tm = (tl + tr) / 2;
if (pos <= tm)
update(v*2, tl, tm, pos, new_val);
else
update(v*2+1, tm+1, tr, pos, new_val);
t[v] = t[v*2] + t[v*2+1];
}
}

Range Sum Query?

  • tl,tr is the boundry of the segment
  • l,r is the boundry of the query
1
2
3
4
5
6
7
8
9
10
int sum(int v, int tl, int tr, int l, int r) {
if (l > r)
return 0;
if (l == tl && r == tr) {
return t[v];
}
int tm = (tl + tr) / 2;
return sum(v*2, tl, tm, l, min(r, tm))
+ sum(v*2+1, tm+1, tr, max(l, tm+1), r);
}

Other variations

The values of items are determined based on the specific task. In above example, we want to query the sum of elements in given range. Likewise, we can also make queries of:

  • Maxium (and the number of times it appears)
  • the greatest common divisor(GCD) / least common multiple(LCM)
  • counting the number of zeros, searching for the k-th zero. (basically it perform like binary search by descending the segment tree. The internal nodes stores the sum of zeros in both child nodes.)

Searching for an array prefix with a given amount.

Given value , we have to quickly find the first(smallest) such that the sum of first elements of array
When we descend the tree, we compare with the value of left and right child. If , then the is in right part of the tree.

Finding sub-segments with the maximum sum

Given the range for each query, we have the find a sub-segment such that and the sum of elements in this range is maximal. We also want to modify individual element of the array.

This time, in each vertex we store:

  • the sum of the segment
  • the maximum prefix sum
  • the maximum suffix sum
  • the sum of the maximal sub-segment

The answer for the current vertex is either:

  • a[left].ans, which means the optimal is entirely placed in the left child.
  • a[right].ans, which means the optimal is entirely placed in the right child.
  • a[left].ans+a[right].ans, which means the optimal subsegments interects with both children.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class Node {
int sum;
int prefix_sum;
int suffix_sum;
int maximal_subseg;
}

Node combine(Node left, Node right) {
Node newNode;
newNode.sum = left.sum + right.sum;
newNode.prefix_sum = max(left.prefix_sum, left.sum + right.prefix_sum);
newNode.suffix_sum = max(right.suffix_sum, left.suffix_sum + right.sum);
newNode.maximal_subseg = max(left.maximal_subseg+right.maximal_subseg,max(left.maximal_subseg,right.maximal_subseg));
return newNode;
}

Node query(int v, int tl, int tr, int l, int r) {
if (l > r)
return Node(0);
if (l == tl && r == tr)
return t[v];
int tm = (tl + tr) / 2;
return combine(query(v*2, tl, tm, l, min(r, tm)),
query(v*2+1, tm+1, tr, max(l, tm+1), r));
}

Range updates (Lazy Propagation)

Segment Tree allows applying modification queries to an entire segment of contiguous elements, and perform the query in the same time .

Assignment on segments

Suppose the modification query asks to assign each element of a certain segment to value

Previously, we have to modify every left nodes in the range and also updates its parent node. In the worst case, the modification costs .

Alternatively, we could make a lazy update(like delay writing): we only change some segments, and leave others unchanged. A marked vertex means that every element of the corresponding segment is assigned to that value, and actually also the complete subtree should only contain this value. Before we want to query a value in the range, the actual modification happens by pushing the information to the child nodes.

Summarizing we get: for any queries (a modification or reading query) during the descent along the tree we should always push information from the current vertex into both of its children. We can understand this in such a way, that when we descent the tree we apply delayed modifications, but exactly as much as necessary (so not to degrade the complexity of O(log⁡n).

Adding on segments

Core of lazy propagation

1
2
3
4
5
6
7
8
9
10
void push_down(int v,int lb,int rb) {
if(marked[v]) {
int mid = lb + ((rb-lb) >> 1);
tree[2*v] += marked[v]*(mid-lb+1);
tree[2*v+1] += marked[v]*(rb-mid);
marked[v*2] += marked[v];
marked[v*2+1] += marked[v];
marked[v] = 0;
}
}

What the above code basically do is push the information to its children. The information includes mark[] and tree[].

  • mark[] tells how many we should add to the range of the elements.
  • modifying tree[] means when the query are made on this range(precisely the range of segment in the tree), we could always use the updated value without descending the tree.

Add to range

1
2
3
4
5
6
7
8
9
10
11
12
13
14
void ADD(int v, int lb, int rb, int ql, int qr, int add) {
if (ql > qr) return;

if (lb == ql && rb == qr) {
tree[v] += (qr-ql+1)*add;
marked[v] += add;
return;
}
push_down(v,lb,rb); // push down information to child.
int mid = lb + ((rb-lb) >> 1); // prevent overflow
ADD(v * 2, lb, mid, ql, min(mid, qr), add);
ADD(v * 2+1, mid + 1, rb, max(mid + 1, ql), qr, add);
tree[v] = tree[v * 2] + tree[v * 2 + 1]; // push up
}
  • ql,qr represent the left and right bound of the query(inclusive). In other words, the function will add to range
  • lb,rb represent the left and right bound of the segment. They’re only related to the tree node you’re visiting.
  • When you find the range of segment which precisely match the query range, you should also update the sum of the segment. i.e. tree[v]. Because this information is used to update parents’ sum of range.

Range Sum query

1
2
3
4
5
6
7
8
9
10
ll RSQ(int v, int lb, int rb, int ql, int qr) {
if (ql > qr) return 0;
if (lb == ql && rb == qr) {
return tree[v];
}
push_down(v,lb,rb); // push down information to child.
int mid = lb + ((rb-lb) >> 1); // prevent overflow
return RSQ(v * 2, lb, mid, ql, min(mid, qr)) +
RSQ(v * 2 + 1, mid + 1, rb, max(mid + 1, ql), qr);
}