Wormhole Sort timeout on server but not on my end

Hi guys! I have a question about my solution to the Wormhole Sort problem and why it works within the time limit on my system but not on the grading server.

Problem Info

My Work

#include <bits/stdc++.h>
using namespace std;

bool test(int x,unordered_map<int,int> &l,vector<pair<int,pair<int,int>>> &holes) //returns true if x works as minimum width
{
   vector<unordered_set<int>> adjmat; //initialize adjacency list
   for (int i = 0;i<l.size();i++)
   {
      unordered_set<int> a;
      adjmat.push_back(a);
   }

   auto iter = lower_bound(holes.begin(),holes.end(),make_pair(x,make_pair(0,0)));
   //adds to adjacency list for wormholes with width x or higher
   while (iter<holes.end())
   {
      adjmat[(*iter).second.first].insert((*iter).second.second);
      adjmat[(*iter).second.second].insert((*iter).second.first);
      iter++;
   }

   unordered_set<int> visited;
   for (int i = 0;i<l.size();i++)
   {
      if (visited.find(i)!=visited.end())
         continue;
      
      //dfs
      unordered_set<int> component;
      stack<int> s;
      s.push(i);
      while (!s.empty())
      {
         int a = s.top();
         s.pop();
         if (visited.find(a)==visited.end())
         {
            visited.insert(a);
            component.insert(a);
            for (auto j:adjmat[a])
               s.push(j);
         }
      }

      //checks if cows in this component can be sorted - essentially same as set comparison
      for (int j:component)
      {
         if (component.find(l[j]) == component.end())
            return false;
      }
   }

   return true;
}

int main()
{
   ios_base::sync_with_stdio(false);
   cin.tie(NULL);
   
   freopen("wormsort.in","r",stdin);
   freopen("wormsort.out","w",stdout);

   int n,m;
   cin >> n >> m;
   vector<int> cows(n);
   unordered_map<int,int> lookup; //lookup table for index of each cow
   for (int i = 0;i<n;i++)
   {
      int x;
      cin >> x;
      cows[i] = x-1; //cows and wormhole positions are 0 based
      lookup[x-1]=i;
   }

   //print -1 if sorted
   if (is_sorted(cows.begin(),cows.end()))
   {
      cout << "-1\n";
      return 0;
   }

   vector<pair<int,pair<int,int>>> wormholes(m);
   for (int i = 0;i<m;i++)
   {
      int x,y,w;
      cin >> x >> y >> w;
      wormholes[i]={w,{x-1,y-1}};
   }

   sort(wormholes.begin(),wormholes.end());

   //binary search
   int l=0, r=1e9+1;
   while (l<r)
   {
      int mid = (l+r+1)/2;
      if (test(mid,lookup,wormholes))
      {
         l=mid;
      }
      else
      {
         r=mid-1;
      }
   }

   cout << l << "\n";
}

I use a binary search to find the answer, and in order to test whether the answer is smaller or larger than the current value tested, I use DFS. For each test, it generates an adjacency list using wormholes with width of at least the value I’m testing. It then finds all components in the graph, testing to see if each component contains the correct cows.


Interestingly, the solution jumps straight from 28ms to a timeout between cases 5 and 6.

I’ve downloaded the test cases and confirmed that I get the right answers, and with the O2 flag, they finish in under 2s. My CPU isn’t super buffed or anything; it’s an i3-7100. Why would the server suddenly time out in these cases that work perfectly fine on my end? Do I simply need to speed up my code more? If so, how could I do that? It would definitely lie in my test function somewhere, but where or how exactly I have no idea.

In response to how to speed up your code: unordered_set and unordered_map are very slow, usually even slower than their ordered counterparts (map and set).

You can look at https://usaco.guide/gold/faster-hashmap?lang=cpp if you really need a hash map, but it’s rarely needed (usually map or vector suffices).

Ah, thanks for the info! I assumed unordered_set and unordered_map were faster because they are O(1.) I’ll give this a go with different containers later :slight_smile:

I’ve improved my code. It now finishes in time on all cases except the last one. I guess the grader is just a good amount slower than my system.

#include <bits/stdc++.h>
using namespace std;

bool test(int x,vector<int> &l,vector<pair<int,pair<int,int>>> &holes) //returns true if x works as minimum width
{
   vector<vector<int>> adjmat; //initialize adjacency list
   for (int i = 0;i<l.size();i++)
   {
      vector<int> a;
      adjmat.push_back(a);
   }

   auto iter = lower_bound(holes.begin(),holes.end(),make_pair(x,make_pair(0,0)));
   //adds to adjacency list for wormholes with width x or higher
   while (iter<holes.end())
   {
      adjmat[(*iter).second.first].push_back((*iter).second.second);
      adjmat[(*iter).second.second].push_back((*iter).second.first);
      iter++;
   }

   vector<bool> visited(l.size());
   for (int i = 0;i<l.size();i++)
   {
      if (visited[i])
         continue;
      
      //dfs
      set<int> component;
      stack<int> s;
      s.push(i);
      while (!s.empty())
      {
         int a = s.top();
         s.pop();
         if (!visited[a])
         {
            visited[a]=true;
            component.insert(a);
            for (auto j:adjmat[a])
               s.push(j);
         }
      }

      //checks if cows in this component can be sorted - essentially same as set comparison
      for (int j:component)
      {
         if (component.find(l[j]) == component.end())
            return false;
      }
   }

   return true;
}

int main()
{
   ios_base::sync_with_stdio(false);
   cin.tie(NULL);
   
   freopen("wormsort.in","r",stdin);
   freopen("wormsort.out","w",stdout);

   int n,m;
   cin >> n >> m;
   vector<int> cows(n);
   vector<int> lookup(n); //lookup table for index of each cow
   for (int i = 0;i<n;i++)
   {
      int x;
      cin >> x;
      cows[i] = x-1; //cows and wormhole positions are 0 based
      lookup[x-1]=i;
   }

   //print -1 if sorted
   if (is_sorted(cows.begin(),cows.end()))
   {
      cout << "-1\n";
      return 0;
   }

   vector<pair<int,pair<int,int>>> wormholes(m);
   for (int i = 0;i<m;i++)
   {
      int x,y,w;
      cin >> x >> y >> w;
      wormholes[i]={w,{x-1,y-1}};
   }

   sort(wormholes.begin(),wormholes.end());

   //binary search
   int l=0, r=1e9+1;
   while (l<r)
   {
      int mid = (l+r+1)/2;
      if (test(mid,lookup,wormholes))
      {
         l=mid;
      }
      else
      {
         r=mid-1;
      }
   }

   cout << l << "\n";
}

Summary of changes:

  • Changed the lookup table from a map to a vector
  • Changed adjacency list from a vector of unordered_set to a 2D vector
  • For DFS, changed visited list from unordered_set to set and changed the list of nodes in a component from unordered_set to set

I must be pretty close to a proper solution since the 9th case works in 1.5s. Are there any other quick speedups I could implement?

(I’ve also just realized why the original solution suddenly jumped in time at case 6 - cases 3 to 5 had size constraints and the rest had no constraints.)

Try switching your lookup table to a vector as well.

Some other notes:

  • Not sure if doing freopen(...) after you’ve untied cin resets fast IO, but it might be worth doing freopen before ios_base....
  • If you need a data structure that contains exactly 3 integers, you can use array<int, 3> or tuple<int, int, int> instead of pair<int, pair<int, int>>. If you use a tuple, make sure to use structured bindings for cleaner code.
  • You can declare wormholes and lookup at the top of your code so you don’t have to pass them into all your functions and re-type their types.