In the guide it says: “We are not sure whether it is possible to modify the above approach to receive full credit (please let us know if you manage to succeed)! Of course, it is possible to pass this problem in Python using DSU (a Gold topic):”
With much help from pajenegod, I have succeeded in making python pass 10/10 testcases reliably (~3.7s/4s), using the above approach. I believe it could still be optimised further.
Here is the code:
def main():
f = open("wormsort.in","rb")
n,m = map(int, f.readline().split())
loc = [*map(int, f.readline().split())]
edges = [[] for _ in range(n)]
weights = []
lo,hi = 0, m - 1
def valid(loc, minW):
component = [-1]*n
numcomps = 0
for i in range(n):
if component[i] != component[loc[i] - 1]:
return False
elif component[i] == -1:
todo = [i]
component[i] = numcomps
for node in todo:
for child, weight in edges[node]:
if component[child] == -1 and weight >= minW:
component[child] = numcomps
todo.append(child)
numcomps += 1
return True
for line in f:
a,b,w = map(int, line.split())
edges[a - 1].append((b - 1, w))
edges[b - 1].append((a - 1, w))
weights.append(w)
weights.sort()
while lo != hi:
mid = (lo + hi) // 2
if valid(loc, weights[mid]):
lo = mid + 1
else:
hi = mid
open("wormsort.out","w").write(f"{-1 if lo == 0 else weights[lo-1]}\n")
main()
The main bottlenecks in the original solution are:
valid() function
sorting the tuples
For the optimisations involved:
yes, this is faster
yes, reading and dealing with bytes is faster than strings, for example split() will be faster as well. But I/O isn’t really the bottleneck here.
reinitializing loc isnt really the bottleneck, does not affect the speed much
yes, the if check is there so that we can terminate early + we dont need to do 2 for loops. I think using if elif is faster than if if also.
yes, in fact this is quite a major speed up, as sorting tuples in python is slow. We only need to sort the weights
Some other optimisations (going from most to least important):
for the graph traversal: the original one given was slower, as it had to keep calling len() and it uses a while loop. A for-each loop is way faster here.
using 1 for-each loop (for line in f) to initialise everything. The other solution uses multiple for i in range loops which is slower. for-each loops are also faster than for i in range loops
using map(int, lst) instead of map(lambda x: int(x) - 1, lst), and then -1 at the end. This is because map(int) is faster than map(lambda x).
Here are some other optimisations(?) which I’m not too sure if it affects the solution:
checking for component[child] == -1 first during the “and” check, as I feel that it would have a higher chance of shortcircuiting the solution.
passing in parameters into valid(), not too sure if it increases the speed of reference.