Implementation of LCA using Euler Tour

pavanscoding · June 28, 2022, 4:33am

I had a question about Benq’s implementation of LCA using the Euler tour. So in CPH, it said that you are supposed to find the node with the minimum depth in between the two nodes in the Euler tour array in which you are trying to find the LCA. However, in Benq’s implementation, he finds the node that has the earliest appearance in the Euler tour array. They also did this in the cp-algo tutorial on reducing LCA to RMQ. I was wondering if you guys could help me where I went wrong in understanding the Euler tour technique on LCA. Thanks so much!!

pavanscoding · June 29, 2022, 4:23am

bump

TheBeast5520 · June 29, 2022, 5:05am

I also had a similar question where the implementations of Euler Tour varied from implementation to implementation. I think it just depends on your end goal. Depending on what you are trying to do, you can alter the details of how you are doing the Euler Tour to best help your implementation. The main thing to take away is you can flatten a tree into an array which can then allow you to apply some other algorithms (such as storing the first appearance of each node to quickly find subtree sizes).

pavanscoding · June 29, 2022, 5:23am

I also don’t understand how finding the earliest node is the LCA. I was hoping you could help me with that.

TheBeast5520 · June 29, 2022, 5:52am

I’m no expert on this, but the way I’ve seen it done is:
Do a DFS down the tree, and every time you come on a node, put it in the tree. For example, if a tree had two nodes 1, 2 rooted at 1 (with one edge from 1 to 2), the Euler Tour would be,
121
If the tree was defined with two edges, 1 to 2 and 1 to 3 (rooted at 1), the Euler Tour would be
12131
Now, you also keep track of the depths of each node, so for the above it would be
12121 since 1 is at depth 1 and the other two nodes have depth 2.

If you want to find the LCA of a and b, then you would do a range minimum query on the depths of all the nodes between two appearances of a and b in the Euler Tour Array. The reason this works is that to travel from a to b, you essentially have to “come back up” to the LCA of the two nodes before you can travel back down to the second node, which means it would have appeared at least once in the array.

If you didn’t get that too much, you can try this article: https://usaco.guide/CPH.pdf#page=177.

pavanscoding · June 29, 2022, 7:06am

Ohh ok, that makes more sense. How does Benq’s implementation work? Why is the node with the earliest appearance the LCA?

TheBeast5520 · June 29, 2022, 7:24am

If you look at his implementation, you will notice that he does it the same way. He imports range minimum query from another file, then adds a node every time it appears. I’m assuming you are talking abt this implementation: USACO/LCArmq (10.2).h at master · bqi343/USACO · GitHub

Also notice that it shouldn’t matter which appearance of a node you take, as long as you take all the nodes in between the two appearances you chose.

pavanscoding · June 29, 2022, 7:28am

Oh no I was talking about this one:

int n; // The number of nodes in the graph
vector<int> graph[100000];
int timer = 0, tin[100000], euler_tour[200000];
int segtree[800000]; // Segment tree for RMQ

void dfs(int node = 0, int parent = -1) {
	tin[node] = timer; // The time when we first visit a node
	euler_tour[timer++] = node;
	for (int i : graph[node]) {
		if (i != parent) {
			dfs(i, node);
			euler_tour[timer++] = node;
		}
	}
}

int mn_tin(int x, int y) {
	if (x == -1) return y;
	if (y == -1) return x;
	return (tin[x] < tin[y] ? x : y);
}

// Build the segment tree: run `build()` after running dfs`
void build(int node = 1, int l = 0, int r = timer - 1) {
	if (l == r) segtree[node] = euler_tour[l];
	else {
		int mid = (l + r) / 2;
		build(node * 2, l, mid);
		build(node * 2 + 1, mid + 1, r);
		segtree[node] = mn_tin(segtree[node * 2], segtree[node * 2 + 1]);
	}
}

int query(int a, int b, int node = 1, int l = 0, int r = timer - 1) {
	if (l > b || r < a) return -1;
	if (l >= a && r <= b) return segtree[node];
	int mid = (l + r) / 2;
	return mn_tin(query(a, b, node * 2, l, mid), query(a, b, node * 2 + 1, mid + 1, r));
}

// Make sure you run $dfs` and `build()` before you run this
int lca(int a, int b) {
	if (tin[a] > tin[b]) swap(a, b);
	return query(tin[a], tin[b]);
}

TheBeast5520 · June 30, 2022, 12:51am

I’m pretty sure this is also the same thing:

The dfs function creates a Euler Tour where it puts down the node every time it comes across it, like I described above.

The mn_tin function is essentially a minimum comparator function.

The build and query functions form the segment tree.

The lca function just returns the minimum from the Euler Tour by calling the query function.

pavanscoding · July 11, 2022, 9:59am

So based on what CP handbook had to say about finding the LCA using tree depths, here is my implementation.

My implementation

#include <bits/stdc++.h>
using namespace std;

const int mxn=2e5+1;
int n,q,depth[mxn],first[mxn],timer=1,euler[2*mxn];
vector<int> adj[mxn];

struct SEG{
    int N;vector<int> tree;
    SEG(int N_){ N=N_+1;tree.resize(2*N,1e9); }
    int comb(int a, int b){ return depth[a]<depth[b]? a:b; }
    void pull(int p){ tree[p]=comb(tree[2*p],tree[2*p+1]); }
    void update(int p, int val){ tree[p+=N]=val;for(p/=2;p;p/=2) pull(p); }
    int query(int l, int r){
        int ll=1e9, rr=1e9;
        for(l+=N,r+=N+1;l<r;l/=2,r/=2){
            if(l&1) ll=comb(ll,tree[l++]);
            if(r&1) rr=comb(rr,tree[--r]);
        }
        return comb(ll,rr);
    }
};

void dfs(int node, int parent, int d=0){
    first[node]=timer;
    euler[timer++]=node;
    depth[node]=d;
    for(int i: adj[node]){
        if(i!=parent){
            dfs(i,node,d+1);
            euler[timer++]=node;
        }
    }
}

int lca(int a, int b, SEG s){
    return s.query(min(first[a],first[b]), max(first[a],first[b]));
}

int main(){
    cin.tie(0)->sync_with_stdio(0);
    cin >> n >> q;
    for(int i=1;i<n;i++){
        int a; cin >> a;
        adj[a].push_back(i);
        adj[i].push_back(a);
    }
    dfs(1,0);
    SEG seg(2*n-1);
    for(int i=1;i<2*n;i++)
        seg.update(i, euler[i]);
    while(q--){
        int a, b;cin >> a >> b;
        cout << lca(a,b,seg)<<'\n';
    }
}

Note that the stuff in the main function is for solving this problem; CSES - Company Queries II. Now, I’m pretty sure that my implementation is wrong as I keep getting the wrong output. The segment tree is just basically finding the node with the least depth (which will be the LCA) between the first occurrence in the Euler tour of each of the nodes in which we are trying the find the LCA. I was wondering if you could tell me where I went wrong in implementing the Euler tour.

Also, another question about LCA’s. Suppose we have nodes n1, n2. Suppose that the LCA of them is n3. Now suppose we have another arbitrary node n4. Is the LCA of n3 and n4 also the LCA of n1,n2, and n4? In other words, are LCA’s transitive. If it is, I was wondering if you could give me a proof.

Thanks!

pavanscoding · July 18, 2022, 6:21pm

bump