Robot Instructions TLE on USACO server, but not on local

Problem Info

Robot Instructions: USACO

My Work

#include <bits/stdc++.h>
using namespace std;
#define int long long
using vi = vector<int>;
using pi = pair<int, int>;
#define pb push_back
#define f first
#define s second

vector<pi> tmp;
int nd, n, rx, ry;//rx is the goal x; ry is the goal y
map<pi, map<int, int>> tot;//tot[i][j] is the number of ways to get pair i with j pairs
vi ret;
void gen(int st, int tx, int ty, int prs){//st is the starting index, tx is the total x value of this subset, ty is the total y value of the subset, prs is the number of pairs in this subset
        //if you are looking at the second half of the possible pairs then you want to find the number of ways so that i+(the current subset) = (rx, ry)
        if(nd==n){
                int c = 0;
                for(pi i: tot[{rx-tx, ry-ty}]){
                        c++;
                        ret.at(i.f+prs)+=i.s;
                }
                assert(c<41);//complexity of gen should be at most 40*2^20
        }else{
                tot[{tx, ty}][prs]++;
        }
        for(int i = st+1; i<nd; i++){//generate all subsets from st to nd
                gen(i, tx+tmp[i].f, ty+tmp[i].s, prs+1);
        }
}
signed main(){
	ios::sync_with_stdio(false);
	cin.tie(0);
        //freopen("test.in", "r", stdin);
        cin>>n>>rx>>ry;
        for(int i = 0; i<n; i++){
                int a, b;
                cin>>a>>b;
                tmp.pb({a, b});
                ret.pb(0);
        }
        ret.pb(0);
        nd = n/2;
        gen(-1, 0, 0, 0);//generates all subsets with the first n/2 elements
        nd = n;
        gen(n/2-1, 0, 0, 0);//generates all subsets from the n/2th element to the last element
        for(int i =1; i<=n; i++){
                cout<<ret[i]<<endl;
        }
}

The code splits the array into two separate arrays of equal length and then tests if you can sum up to the resulting pair (rx, ry).

What I’ve Tried

I downloaded the usaco test data and it worked fine for me. I also compared the speed of benq’s offical solution to my local solution and I didn’t notice any difference in execution speed. I use onlinegdb, so I know it isn’t just my computer being fast.

Question

Why is my code running slow on the USACO server? It runs perfectly fine on onlinegdb compiler using usaco’s downloaded data. My assert statement makes sure that the number of times a loop is run is at most 41.
PS: you probably should make it so that people cannot flag their own questions.

Perhaps there might be some undefined behavior somewhere? Not completely sure though.
Also, try replacing endl with '\n'.

How are you measuring execution speed? When I run your solution and my solution on test case 7 on USACO Guide IDE I get times of 1.97s and 0.38s, respectively.

Yours: https://ide.usaco.guide/MxQruRP4dwhwAwJCk69

Mine: https://ide.usaco.guide/MxQs5RlYTLG8Zk_hXSM

I measured speed by running and seeing if the code took a while to run.

still far less than 4s. For this problem the time limit was inflated:


I don’t think the USACO servers would be slower than the usaco.guide compiler. In addition, my code should run in O(n*2^n), which theoretically fits.

I’m not using any arrays, so I don’t think that would be an issue. I tried using \n and it didn’t help speed up much (It wouldn’t because I am only outputing 40 lines anyways), but here is my result after that:


My code should run in O(n*2^n).

It’s not uncommon for the USACO servers to be 2x slower (or even worse).