I'm stuck on the problem Bovine Genomics under the Complete Search section of the guide.
My understanding of the problem is as follows: if the characters at any three distinct positions in spotty[n] (s_i, s_j, s_k) equal the characters at the same positions in plain[n] (p_i, p_j, p_k), then that set of positions will not be considered distinct.
I came up with the following approach, which checks every possible triple of positions:
string spotty[n], plain[n];
// after taking input
int ans = 0;
for (int i = 0; i < m; i++) {
    for (int j = i + 1; j < m; j++) {
        for (int k = j + 1; k < m; k++) {
            bool check = false;
            for (int a = 0; a < n; a++) {
                if (spotty[a][i] != plain[a][i] ||
                    spotty[a][j] != plain[a][j] ||
                    spotty[a][k] != plain[a][k]) check = true;
            }
            if (check) ans++;
        }
    }
}
cout << ans << endl;
For the sample test case, this code block returns ans as 36, while the correct value is 22. The Official Analysis states that we need to check whether we have already visited any similar triples. I am not able to wrap my head around this part: don't we just need to check that, for any three distinct positions, the sequence of characters in the spotty genome doesn't match the characters in the plain genome?
How exactly does the official analysis code determine that a set has already been visited? I understand that converting each character to a base-4 digit and then comparing the combined values will confirm whether a set is distinct or not, but how does this help it keep track of the sets it has already considered (as stated in the analysis)?
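To make the question concrete, here is a minimal sketch of how I currently read the approach in the analysis. It is my own guess, not the official code: the names `digit`, `encode`, and `countSets`, the 64-entry `seen` array, and the A/C/G/T alphabet are all assumptions on my part. In this reading, "visited" might refer to the character triples that appear in any spotty cow at the current three positions, so that every plain cow is compared against every spotty cow rather than only the one with the same index.

```cpp
#include <array>
#include <string>
#include <vector>

// Map a nucleotide to a base-4 digit (assumed alphabet: A, C, G, T).
int digit(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;  // treated as 'T'
    }
}

// Encode the characters at positions i, j, k as one number in [0, 64).
int encode(const std::string& g, int i, int j, int k) {
    return digit(g[i]) * 16 + digit(g[j]) * 4 + digit(g[k]);
}

// Count position triples where no plain cow shares its three characters
// with any spotty cow (hypothetical helper, not from the analysis).
int countSets(const std::vector<std::string>& spotty,
              const std::vector<std::string>& plain) {
    int m = spotty.empty() ? 0 : static_cast<int>(spotty[0].size());
    int ans = 0;
    for (int i = 0; i < m; i++) {
        for (int j = i + 1; j < m; j++) {
            for (int k = j + 1; k < m; k++) {
                // Reset for every triple of positions: which encoded
                // character triples occur among the spotty cows here.
                std::array<bool, 64> seen{};
                for (const std::string& s : spotty) {
                    seen[encode(s, i, j, k)] = true;
                }
                // The triple fails if ANY plain cow matches ANY spotty cow.
                bool ok = true;
                for (const std::string& p : plain) {
                    if (seen[encode(p, i, j, k)]) {
                        ok = false;
                        break;
                    }
                }
                if (ok) ans++;
            }
        }
    }
    return ans;
}
```

If this reading is right, the difference from my loop would be that mine only compares spotty[a] with plain[a], whereas the sketch rejects a set of positions as soon as a plain cow matches any spotty cow at those positions.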