I implemented binary lifting for this problem, and it passes every test case except the last one. What is going wrong, and what else needs to be optimized?
I had exactly the same problem. Switching from a 2D vector to a 2D array got it within the time limit, with the last test case taking 0.9 s. Writing the same solution in C passed comfortably, at around 0.6 s for the last test case.
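For reference, here is a minimal sketch of what the array-based table might look like. It is not the code from either post. It assumes the task is a k-th ancestor query on a rooted tree with 1-based nodes, where 0 means "no ancestor"; `MAXN`, `LOG`, `up`, `build` and `kth_ancestor` are hypothetical names and limits, so adjust them to the actual constraints.

```cpp
#include <cassert>

// Assumed limits (hypothetical, not from the problem statement).
const int MAXN = 200005;  // maximum number of nodes, 1-based
const int LOG  = 18;      // 2^18 > MAXN, enough for any depth in such a tree

// One statically allocated block instead of vector<vector<int>>.
// up[j][v] is the 2^j-th ancestor of v, or 0 if it does not exist.
// Static storage is zero-initialized, so up[j][0] == 0 for every j.
static int up[LOG][MAXN];

// Fill levels 1..LOG-1; the caller must set up[0][v] = parent of v first.
void build(int n) {
    for (int j = 1; j < LOG; ++j)
        for (int v = 1; v <= n; ++v)
            up[j][v] = up[j - 1][up[j - 1][v]];
}

// Return the k-th ancestor of v, or 0 if v has fewer than k ancestors.
int kth_ancestor(int v, int k) {
    assert(k >= 0);
    if (k >= (1 << LOG)) return 0;  // deeper than any tree within MAXN
    for (int j = 0; j < LOG && v != 0; ++j)
        if ((k >> j) & 1) v = up[j][v];
    return v;
}
```

To use it, read each node's parent into `up[0][v]` (0 for the root), call `build(n)` once, then answer each query with `kth_ancestor(v, k)`. The table is indexed level-first here so that `build` writes each row sequentially; whether that or the node-first order is faster depends on the access pattern, so it is worth timing both. If I/O dominates the runtime, faster input reading is another thing to try.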