General: Choosing a Language: Python solution for USACO Silver, Wormhole Sort

drugkeeper · November 30, 2022, 9:28am

In the guide it says: “We are not sure whether it is possible to modify the above approach to receive full credit (please let us know if you manage to succeed)! Of course, it is possible to pass this problem in Python using DSU (a Gold topic):”

With much help from pajenegod, I have succeeded in making python pass 10/10 testcases reliably (~3.7s/4s), using the above approach. I believe it could still be optimised further.

Here is the code:

def main():
	f = open("wormsort.in","rb")
	n,m = map(int, f.readline().split())
	loc = [*map(int, f.readline().split())]
	edges = [[] for _ in range(n)]
	weights = []
	lo,hi = 0, m - 1

	def valid(loc, minW):
		component = [-1]*n
		numcomps = 0
		for i in range(n):
			if component[i] != component[loc[i] - 1]:
				return False
			elif component[i] == -1:
				todo = [i]
				component[i] = numcomps
				for node in todo:
					for child, weight in edges[node]:
						if component[child] == -1 and weight >= minW:
							component[child] = numcomps
							todo.append(child)
				numcomps += 1
		return True

	for line in f:
		a,b,w = map(int, line.split())
		edges[a - 1].append((b - 1, w))
		edges[b - 1].append((a - 1, w))
		weights.append(w)
	weights.sort()

	while lo != hi:
		mid = (lo + hi) // 2
		if valid(loc, weights[mid]):
			lo = mid + 1
		else:
			hi = mid

	open("wormsort.out","w").write(f"{-1 if lo == 0 else weights[lo-1]}\n")

main()

Benq · November 30, 2022, 5:47pm

Cool, feel free to submit a PR with the updated code.

I noticed the following optimizations compared to the guide solution:

Wrapping everything with main.
Using rb instead of r
Not reinitializing loc
if component[i] != component[loc[i] - 1] is checked within the first for loop instead of afterward
Only integers are sorted (rather than triples of integers).

Did I miss anything? There are some other changes to the code, but I don’t think they affect how fast it is.

Benq · November 30, 2022, 6:42pm

Minor thing: The code doesn’t actually output the correct answer if it is -1 and there is more than one edge.

drugkeeper · December 1, 2022, 3:01am

Hi Benq, thanks for looking at my solution.

The main bottlenecks in the original solution are:

valid() function
sorting the tuples

For the optimisations involved:

yes, this is faster
yes, reading and dealing with bytes is faster than strings, for example split() will be faster as well. But I/O isn’t really the bottleneck here.
reinitializing loc isnt really the bottleneck, does not affect the speed much
yes, the if check is there so that we can terminate early + we dont need to do 2 for loops. I think using if elif is faster than if if also.
yes, in fact this is quite a major speed up, as sorting tuples in python is slow. We only need to sort the weights

Some other optimisations (going from most to least important):

for the graph traversal: the original one given was slower, as it had to keep calling len() and it uses a while loop. A for-each loop is way faster here.
using 1 for-each loop (for line in f) to initialise everything. The other solution uses multiple for i in range loops which is slower. for-each loops are also faster than for i in range loops
using map(int, lst) instead of map(lambda x: int(x) - 1, lst), and then -1 at the end. This is because map(int) is faster than map(lambda x).

Here are some other optimisations(?) which I’m not too sure if it affects the solution:

checking for component[child] == -1 first during the “and” check, as I feel that it would have a higher chance of shortcircuiting the solution.
passing in parameters into valid(), not too sure if it increases the speed of reference.