TLE on O(nlogn) solution

Problem Info

Fair Photography

My Work

  1. Make a prefix sum of the cows (sorted by position) with a white cow having a value of 1 and a spotted cow with a value of -1.
  2. Create a stack of values that stores pairs of (a,pos), pos being the farthest selectable position with a prefix sum value less than or equal to a. (and greater than the ‘a’ on the next pair, if it exists.) (This can be done in O(n))
  3. Go through the prefix sum array, and for each element find the farthest element (aka with the largest index) that’s greater than or equal to it by binary searching through the previously created stack. (This ensures that the range selected contains at least as many white cows as spotted cows.) Use these two indices to calculate one possible range of cows.

(Sidenote: I split the stack into two smaller stacks to make sure the number of cows in the range selected was always even.)

#include <iostream>
#include <set>
#include <vector>
#include <cmath>
#include <algorithm>
#include <queue>
#include <iterator>
#include <cstring>
#include <fstream>
#include <map>
#include <bitset>
#include <cassert>

using namespace std;

#define fr(ee) for(int i = 0; i<ee; i++)
#define fill(a,b)(memset(a,b,sizeof(a)));
#define mp(a,b)(make_pair(a,b))
#define pb(a) push_back(a)

using l = long long;
using vi = vector<int>;
using vvi = vector<vector<int> >;
using pii = pair<int, int>;

int MIN = -2147483648;
int MOD = 1000000007;

template<class T> inline int sz(T container) {
	return (int)container.size();
}

void read(int &n) {

	cin >> n;
}
int pi() {
	int N;
	read(N);
	return N;
}
l pl() {
	l N;
	cin >> N;
	return N;
}
char pc() {
	char c;
	cin >> c;
	return c;
}
string ps() {
	string s;
	cin >> s;
	return s;
}
l powf(l a, l b) {
	if (b == 0)return 1;
	l val = powf(a, (l)(b / 2));
	if (b & 1) {

		return ((val * val) % MOD * a) % MOD;
	} else {
		return (val * val) % MOD;
	}
}
void IO(string fileName) {
	freopen((fileName + ".in").c_str(), "r", stdin);
	freopen((fileName + ".out").c_str(), "w", stdout);
}
#define p (pi())
#define fast ios::sync_with_stdio(0);cin.tie(0)
/*

5
8 W
11 S
3 W
10 W
5 S


 */
void add_stack(vector<pii> &stack, int* pre, int* pos, int i) {

	int size = sz(stack);
	while (size > 0 &&( stack.at(size- 1).first <= pre[i])) {
		stack.pop_back();
		size--;
	}
	if (size == 0 || (stack.at(size - 1).first > pre[i])) {
		stack.pb(mp(pre[i], pos[i]));
	}
}
int main() {
	fast;
	// IO("fairphoto");
	int N = p;
	int pre[N];
	int pos[N];
	pii cows[N];
	int s = 0;
	fr(N){
		int posa = p;
		char type = pc();
		int t = (type == 'W') ? 1 : -1;
		cows[i] = mp(posa,t);
	}
	sort(cows,cows+N);
	fr(N) {
		s += cows[i].second;
		pre[i] = s;
		pos[i] = cows[i].first;
	}
	vector<pii> e_stack;
	vector<pii> o_stack;

//Code runs up to here
	fr(N) {
		if(i%2){
			add_stack(o_stack,pre,pos,i);
		}else{
			add_stack(e_stack,pre,pos,i);
		}
		// cout<<i<<endl;
	}
	int m = 0;
//Code doesn't get past here in the TLE cases
	fr(N){
		int val = (i==0)?0:pre[i-1];
		vector<pii> stack = (i%2)?e_stack:o_stack;
		if(sz(stack)==0)continue;
		int a = 0;
		int b = sz(stack)-1;
		while(a<=b){
			int mid = (a+b)/2;
			if(stack.at(mid).first<val){
				b = mid-1;
			}else{
				a = mid+1;
			}
		}
		if(b==-1)continue;


		m = max(m,stack.at(b).second-pos[i]);

	}
	cout<<m<<endl;

	return 0;
}

Question

The strange thing is that although it’s a nlog(n) solution, it times out on several test cases. After placing some assert statements in the code, I noticed that the code seemingly stops in the middle (indicated by the comments) with no apparent reason. The issue is likely in the add_stack method, but does anyone know why?

Have you tried running the test cases locally on your machine?

Yeah it runs fine… just a bit slow for some reason.

		vector<pii> stack = (i%2)?e_stack:o_stack;

This operation takes O(N) since it’s making a new copy, which results in O(N^2) in total. One way to fix this is to store a vector<pii> stack[2];, where stack[0] and stack[1] represents e_stack and o_stack respectively.

1 Like

Oops, even with the fix above, it still TLEs. How to fix:

write a faster stack (this fixed the TLE for me). std::vectors are pretty slow.

actually, the problem is the sz() function

template<class T> inline int sz(T container) {
	return (int)container.size();
}

^ this makes a copy of container every call, so changing it to

template<class T> inline int sz(const T& container) {
	return (int)container.size();
}

works.
or you could just make a macro: #define sz(x) int(x.size()).

Wait what do you mean by a faster stack?

Yep this fixed it for me! :slight_smile:
Also, wouldn’t it be easier to just do

vector<pii> &stack = (i%2)?e_stack:o_stack;

to prevent making a copy of the original stack?

upd: nevermind I see your point with the stack array as well.

wait actually both fixes were necessary. Thanks for the help!

My bad. When I coded it, it passed, and I thought that the reason was because std::vector was slower (by a constant factor), but it turned out that with the fix in sz() (my last post), it worked in both cases. Anyway, the “faster stack” would look something like

Code
struct _stack {
	pii x[100000];
	int sz = 0;

	int size() {
		return sz;
	}

	void pop_back() {
		sz--;
	}
	pii back() {
		return x[sz - 1];
	}
	void push_back(pii aa) {
		x[sz++] = aa;
	}
	pii operator[](int i) {
		return x[i];
	}
	pii at(int i) {
		return x[i];
	}
};

Tbh I’m not sure but these are my guesses on why this passed (without the fix in sz()):

  1. The whole structure might not have been copied again in sz() because of pii x[100000];
  2. The compiler might have optimized the “copy” in sz() because of
	int size() {
		return sz;
	}

since it only needs the sz variable from the struct.

Interesting…
Yeah I guess there are a couple of ways to prevent the stack from copying itself over and over again during execution.