Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Inspired by the example of that great Cornellian, Vladimir Nabokov, some of your

ID: 3852435 • Letter: I

Question

Inspired by the example of that great Cornellian, Vladimir Nabokov, some of your friends have become amateur lepidopterists (they study butterflies). Often when they return from a trip with specimens of butterflies, it is very difficult for them to tell how many distinct species they've caught-thanks to the fact that many species look very similar to one another One day they return with n butterflies, and they believe that each belongs to one of two different species, which we'll call A and B for purposes of this discussion. They'd like to divide the n specimens into two groups-those that belong to A and those that belong to B-but it's very hard for them to directly label any one specimen. So they decide to adopt the following approach. For each pair of specimens i and j, they study them carefully side by side. If they're confident enough in their judgment, then they label the pair (i. ) either "same" (meaning they believe them both to come from the same species) or "different" (meaning they believe them to come from different species). They also have the option of rendering no judgment on a given pair, in which case we'll call the pair ambiguous. So now theyhave the collection of n specimens, as well as a collection of m judgments (either same" or "different") for the pairs that were not declared to be ambiguous. They'd like to know if this data is consistent with the idea that each butterfly is from one of species A or B. So more concretely, we'll declare the m judgments to be consistent if it is possible to label each specimen either A or B in such a way that for each pair (i. /) labeled "same," it is the case that i and j have the same label; and for each pair (i, /) labeled "different," it is the case that i and j have different labels. They're in the middle of tediously working out whether their judgments are consistent when one of them realizes that you probably have an algorithm that would answer this question right away Give an algorithm with running time O(m + n) that determines whether the m judgments are consistent.

Explanation / Answer

Answer for the given problem statement:

the idea is to modify BFS. Assume the graph is connected. If not, the same
algorithm can be run on each connected component. Pick a starting node s and an arbitrary label,
say A, for the starting node. When running BFS, when a new node v is explored via edge (u, v) (i.e.,
when node v is put in layer Li+1 by node u Li), use the label of u and the judgment on edge (u, v) to
label v. This process will uniquely label all nodes based on tree edges and the arbitrary choice of label
A for s. Then, check all non-tree edges (u, v) to see if the endpoints (u, v) are consistent—either same
or different—with the judgment on the edge. If so, report “consistent”. If not, report “inconsistent.”
To argue correctness, we argue two cases. First, if the algorithm reports “consistent”, then the judgments
are consistent because the algorithm has labeled all nodes using the labels A or B and: (1)
the labeling process guarantees that the labels are consistent with tree ediges, and (2) the algorithm
explicitly checks that the labels are consistent with non-tree edges. Second, if the algorithm reports
“inconsistent”, the the judgments are inconsistent. This is true because the nodes were labeled in
the unique way possible based on only the tree edges (up to switching the labels A and B)—and the
resulting labeling is inconsistent on some non-tree edge. There is no other way to change the labels to
make all edges consistent.
The algorithm takes O(m + n) time because it is a modification of BFS, with only constant-time
operations added within the execution of BFS, and then O(m) time to check the non-tree edges after
BFS completes.