Disjoint partitioning with binary representation

Let’s just start with a problem.

Problem 1. Alice has hidden $n=2 \times 10^5$ balls numbered from $1$ to $n$. Exactly two of them, numbered $p$ and $q$, are colored red and the rest are white. Bob needs to report to Alice a set $S$ of 18 pairs of sets, $\{ (A_0, B_0), (A_1,B_1), \dots, (A_{17}, B_{17})\}$, such that $A_i, B_i \subseteq \{1,2,\dots , n\}$ and $A_i \cap B_i = \varnothing$ for $0 \leq i \leq 17$. Bob wins if $\exists i$ such that $p \in A_i, q \in B_i$ or $p \in B_i, q \in A_i$, that is, $p$ and $q$ are in different sets of the same pair. You need to construct the set $S$ for Bob.

Solution. There’s an easy probabilistic solution to it. Randomly choose $\frac{n}{2}$ integers and put them in $A_0$, and put the rest of them in $B_0$. The probability that $p$ and $q$ fall in the same set is around $0.5$, so if we do the same independently for $(A_1, B_1), \dots, (A_{17}, B_{17})$, the probability of success becomes around $1-0.5^{18}$, which is very close to $1$. But we want a deterministic solution, and that’s where our trick comes in.

The trick: The whole idea can be wrapped up in one line: Place all integers with the $i$-th bit turned off in $A_i$ and the rest in $B_i$. Formally, let $x_i$ denote the $i$-th bit of $x$ (counting from $0$ at the least significant bit), where $x \in \{1,2,\dots, n\}$. For all $i$, set $A_i:= \{ x \mid x_i=0 \}$ and $B_i := \{ x \mid x_i = 1\}$. And it magically solves the problem! How? Well, since $p \neq q$, $\exists i$ where $p_i \neq q_i$ (that is, their $i$-th bits differ). That means $p$ and $q$ cannot be in the same set of that pair: one of them must be in $A_i$ and the other must be in $B_i$. Since $2\times 10^5 < 2^{18}$, every ball number fits in 18 bits, so the index of a bit where they differ is always smaller than $18$.
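The construction above can be sketched in a few lines of Python. This is only an illustration: the function names `build_pairs` and `separating_index` are made up here, and the number of pairs is a parameter so that small cases can be checked by hand.

```python
def build_pairs(n, bits=18):
    """Return [(A_0, B_0), ..., (A_{bits-1}, B_{bits-1})] for balls 1..n.

    A_i holds the numbers whose i-th bit is 0, B_i those whose i-th bit is 1.
    """
    pairs = []
    for i in range(bits):
        a = {x for x in range(1, n + 1) if not (x >> i) & 1}
        b = {x for x in range(1, n + 1) if (x >> i) & 1}
        pairs.append((a, b))
    return pairs


def separating_index(p, q):
    """Index of the lowest bit where p and q differ (requires p != q)."""
    diff = p ^ q
    # diff & -diff isolates the lowest set bit of diff.
    return (diff & -diff).bit_length() - 1
```

For example, `separating_index(5, 7)` is `1`, since $5 = 101_2$ and $7 = 111_2$ first differ at bit $1$, so the pair $(A_1, B_1)$ separates them.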

That’s it! This idea of partitioning into disjoint sets based on bits comes in handy in many interactive problems.

Well, I first saw this trick in a shortest path problem, which was probably like this:

Problem 2. You are given a weighted graph $G(V,E)$ $(|V|, |E| \leq 2\times 10^5)$ and a set $A \subseteq V$ of special nodes. You need to find $\min\limits_{u,v \in A,\ u \neq v} D(u, v)$, where $D(u, v)$ is the shortest distance from node $u$ to $v$.

Solution. This problem does have an $O(n \log n)$ solution (with $n$ bounding both $|V|$ and $|E|$), which requires some observations if I remember correctly. But the bit trick gives us a very simple $O (n \log^2 n)$ solution.

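If we choose a set $X \subset A$ and run multisource Dijkstra from the nodes in $X$, we get, for every $v \in A\setminus X$, the distance from the nearest node of $X$ to $v$. If $P$ and $Q$ are the closest pair of nodes in $A$, and if we can somehow put one of them among the source nodes and the other among the destination nodes, the minimum over the destinations is exactly the shortest pair distance! The bit trick will do that for us. We will run multisource Dijkstra $\lceil \log_2 n \rceil$ times. In the $i$-th run, set $X_i:=\{u \in A \mid u_i = 0\}$ (here $u_i$ is the $i$-th bit of the label of $u$), run Dijkstra from the nodes in $X_i$, and take the minimum distance over $A \setminus X_i$. Like the last time, since $P \neq Q$, $\exists i$ such that $P_i \neq Q_i$, so in that run one of them will be in the source nodes and the other will be in the destination nodes. No run can report less than the true answer, because every reported value is a real distance between two distinct special nodes, so the minimum over all runs is the shortest pair distance! (If the graph is directed, also run from $A \setminus X_i$ towards $X_i$.)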
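Here is a minimal sketch of that idea, under some assumptions that are not in the problem statement: the graph is undirected, nodes are labelled $0$ to $n-1$, weights are non-negative, and edges come as `(u, v, w)` triples. The function names are made up for illustration.

```python
import heapq


def multisource_dijkstra(adj, sources):
    """Distance from the nearest source to every reachable node."""
    dist = {}
    heap = [(0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if u in dist:
            continue  # already settled with a smaller or equal distance
        dist[u] = d
        for v, w in adj[u]:
            if v not in dist:
                heapq.heappush(heap, (d + w, v))
    return dist


def closest_special_pair(n, edges, special):
    """Minimum distance between two distinct special nodes (inf if none)."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:  # assumption: undirected graph
        adj[u].append((v, w))
        adj[v].append((u, w))
    best = float("inf")
    bits = max(1, (n - 1).bit_length())
    for i in range(bits):
        sources = [u for u in special if not (u >> i) & 1]
        targets = [u for u in special if (u >> i) & 1]
        if not sources or not targets:
            continue
        dist = multisource_dijkstra(adj, sources)
        for v in targets:
            if v in dist:
                best = min(best, dist[v])
    return best
```

Each of the $\lceil \log_2 n \rceil$ rounds is one ordinary Dijkstra run, which is where the extra $\log$ factor in the running time comes from.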

Problem 3. (CEOI16 ICC) There’s a hidden graph of $n$ $(n \leq 100)$ nodes. Initially, the graph has no edges. Until the graph becomes connected, the Terran will add an edge $(u,v)$ such that $u$ and $v$ were previously disconnected (so a total of $n-1$ edges will be added, eventually forming a tree). After each edge is added, we have to find out $u$ and $v$. Allowed Query: Given two disjoint sets $A, B$ of nodes, the computer returns whether there exist $u \in A, v \in B$ such that $u$ and $v$ are directly connected by an edge. You can ask at most $1650$ queries overall.

Solution. We will solve this for one edge only, as we can repeat the same algorithm for the rest of them. If there are currently $z$ components, with the bit trick, we can find two sets $X, Y$ such that $u\in X$ and $v \in Y$ in at most $\lceil \log_2 z \rceil$ queries. Pick one node from each component (preferably the representative node if you use DSU to maintain the graph), and apply the bit trick to those (well, before querying, don’t forget to push the other nodes of each component into the same set as its representative). Now binary search on $X$ to find $u$. Do the same on $Y$ for $v$. (I skipped some details, I’m sure you can figure them out :D). This solution takes around $3 \log_2 n$ queries per edge. I got 90 with this solution. However, after I checked the bits in random order while partitioning the representative nodes and ended the loop as soon as I found a valid division, I got 100. (This is another useful trick: try randomizing things and see if it helps :D)
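The single-edge step could look roughly like the sketch below. It is not the actual grader interface: `query(A, B)` is a hypothetical callback standing in for the allowed query, `components` is a list of node lists describing the components before the new edge was added, and the random bit order mentioned above is left out.

```python
def narrow(cands, other, query):
    """Binary search for the single node of cands with an edge into other."""
    while len(cands) > 1:
        half = cands[: len(cands) // 2]
        cands = half if query(half, other) else cands[len(half):]
    return cands[0]


def find_new_edge(components, query):
    """Find the one edge that joins two different components."""
    z = len(components)
    bits = max(1, (z - 1).bit_length())
    for b in range(bits):
        # Whole components go to X or Y according to bit b of their index,
        # so only the new edge can cross between X and Y.
        X = [v for idx, comp in enumerate(components)
             if not (idx >> b) & 1 for v in comp]
        Y = [v for idx, comp in enumerate(components)
             if (idx >> b) & 1 for v in comp]
        if X and Y and query(X, Y):
            break
    else:
        raise ValueError("no edge between different components")
    u = narrow(X, Y, query)    # endpoint on the X side
    v = narrow(Y, [u], query)  # endpoint on the Y side
    return u, v
```

Stopping at the first bit that separates the two components is what makes the order in which bits are tried matter for the total query count.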

Here’s another nice problem, almost copy-pasted from Mamnoon Siam’s Note.

Problem 4. (The penguin’s game) There’s a hidden array $a$ (its size is at most $1000$). The computer has selected two indices $i$ and $j$ such that $a_i=a_j=x$ and $a_{k} = y$ for all $k \notin \{i,j\}$ $(1 \leq x, y \leq 10^9, x \neq y)$. Find $i, j$ using at most 19 queries. Allowed Query: given a set $S$ of indices, the computer returns $\oplus_{k \in S} a_k$ (that is, the bitwise xor of $a_k$ over all $k \in S$).

Solution. Well, plain binary search won’t work here. Note that if we ask with a set $S$, we can deduce the parity of the number of $x$’s in $S$. So, using our trick we can determine which bits of $i$ and $j$ are the same and which aren’t. For every bit $b$, we can simply put all indices with the $b$-th bit on in $S$ and see whether it contains an even number of $x$’s. If so, the $b$-th bits of $i$ and $j$ are equal; otherwise they differ. Now, if we can somehow find one of them, we have the necessary information to determine the other one, and we are done. Since $i \neq j$, they differ in at least one bit $b$. So the set of indices with the $b$-th bit on contains exactly one of $i$ and $j$. Find it using binary search on that set, again with the parity check. Also, this set’s size is at most about half of the original array’s, which saves one query in the binary search: about $10$ queries for the bits plus $9$ for the binary search.
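A rough sketch of this solution follows. It assumes that $x$ and $y$ are known to the solver and that indices run from $1$ to $n$; `query(S)` is a hypothetical callback returning the xor of $a_k$ over $k \in S$. With $x$ and $y$ known, the answer is $x$ or $x \oplus y$ exactly when $S$ contains an odd number of $x$’s (it is $0$ or $y$ otherwise).

```python
def find_special_indices(n, x, y, query):
    """Return the two indices holding x, as a sorted list."""

    def odd_count(s):
        # True iff s contains exactly one of the two special indices.
        return bool(s) and query(s) in (x, x ^ y)

    # Step 1: learn i xor j, one bit per query.
    diff = 0
    for b in range(n.bit_length()):
        s = [k for k in range(1, n + 1) if (k >> b) & 1]
        if odd_count(s):
            diff |= 1 << b

    # Step 2: the indices with bit d set contain exactly one special index.
    d = (diff & -diff).bit_length() - 1  # lowest bit where i and j differ
    cands = [k for k in range(1, n + 1) if (k >> d) & 1]
    while len(cands) > 1:
        half = cands[: len(cands) // 2]
        cands = half if odd_count(half) else cands[len(half):]

    i = cands[0]
    return sorted((i, i ^ diff))
```

Once one index is known, the other is simply that index xor-ed with the recovered difference mask, so no further queries are needed.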

Problem 5. (POI Remont) You are given a color scheme $a_1, a_2,\dots, a_n$ ($a_i \leq k$) for $n$ vertical strips of a wall ($a_i$ is the color of the $i$-th strip). A roller can paint two consecutive strips. You have a total of $k^2$ rollers, one for each pair of colors. You can use each roller at most once and each strip can be colored multiple times, but with the same color every time. Determine whether it is possible to complete the color scheme.

Solution. Construct a new array $B$ of pairs of colors (or rollers) of length $(n -1)$ and set $B_i:=(a_i, a_{i+1})$. Obviously, we need to color the strips using the pairs from $B$. Let $C_i$ denote whether we used the $i$-th roller or not. It is easy to see that $C_i$ and $C_{i+1}$ cannot both be $0$ for any $i$, otherwise there is no way you can color the $(i+1)$-th strip with color $a_{i+1}$. We can actually write this condition in CNF: $(C_1 \vee C_2) \wedge (C_2 \vee C_3) \wedge \dots \wedge (C_{n-2} \vee C_{n-1}).$ This leads us to the well-known 2-SAT problem, but we have to add some additional conditions. First of all, $C_1$ and $C_{n-1}$ must be true since there’s no other way to color the first and last strip. Then comes the main challenge: since each roller can be used at most once, for each unique roller $x \in B$, we’ll have a set $S_x:= \{i \mid B_i=x\}$, and there can be at most one $k \in S_x$ such that $C_k = 1$, and for all $i\in S_x \setminus \{k\}$, $C_i$ must be $0$. But how to force this condition in 2-SAT?
Take one such set $S=\{s_1, s_2, \dots, s_m\}$. For at most one $k \in S$ to have $C_k$ true, $(\neg C_i \vee \neg C_j)$ must be true for all distinct $i, j \in S$. But that leads us to $O(n^2)$ new edges in our implication graph, which is too slow. And once again, the bit trick saves us!
Note that $(\neg C_i \vee \neg C_j)$ means $C_i \implies \neg C_j$ and $C_j \implies \neg C_i$. But since adding these edges among all pairs is costly, we will add two layers of dummy nodes for $S$ in the implication graph, one node per bit in each layer (so about $\log n$ nodes in each layer). Let’s denote them by $D_{0,0}, D_{0,1}, \dots, D_{0, L-1}$ and $D_{1,0}, D_{1,1}, \dots, D_{1,L-1}$, where $L$ is the number of bits. Each set $S_x$ gets its own dummy nodes; sharing them between sets would wrongly forbid using two different rollers together. Now for all $i \in S$, for each bit $b$, add two directed edges: $D_{i_b, b} \to \neg C_i$ and $C_i \to D_{i_b \oplus 1, b}$ (here $i_b$ is the $b$-th bit of $i$). Since any two distinct $i, j \in S$ differ in some bit $b$, we have $i_b \oplus 1 = j_b$, so the paths $C_i \rightarrow D_{j_b, b} \rightarrow \neg C_j$ and $C_j \rightarrow D_{i_b, b} \rightarrow \neg C_i$ exist in the graph. That means the implications $C_i \implies \neg C_j$ and $C_j \implies \neg C_i$ are added for all distinct $i,j \in S$, using only $O(|S| \log n)$ edges, which is all we needed!
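The edge construction for a single set can be sketched as below. Several things here are choices made for the example, not part of the original solution: literals are encoded as integers (`2*v` for variable `v`, `2*v + 1` for its negation), the bits are taken from each element's rank inside the set rather than from its index (any distinct numbering works, and ranks keep the dummy layer at about $\log |S|$ nodes), and every implication is stored together with its contrapositive, as SCC-based 2-SAT solvers usually expect.

```python
def neg(lit):
    """Negation of a literal in the 2*v / 2*v + 1 encoding."""
    return lit ^ 1


def add_implication(edges, a, b):
    """Add a -> b together with its contrapositive (not b) -> (not a)."""
    edges.append((a, b))
    edges.append((neg(b), neg(a)))


def add_at_most_one(lits, num_vars, edges):
    """Force at most one literal of lits to be true.

    Allocates 2 * bits fresh dummy variables starting at index num_vars
    and returns the new variable count.
    """
    bits = max(1, (len(lits) - 1).bit_length())
    # dummy[c][b] is the positive literal of dummy node D_{c,b}.
    dummy = [[2 * (num_vars + c * bits + b) for b in range(bits)]
             for c in range(2)]
    for rank, lit in enumerate(lits):
        for b in range(bits):
            bit = (rank >> b) & 1
            add_implication(edges, lit, dummy[bit ^ 1][b])  # C -> D
            add_implication(edges, dummy[bit][b], neg(lit))  # D -> not C
    return num_vars + 2 * bits
```

For a set of $5$ elements this uses $3$ bits, so it adds $6$ dummy variables and $5 \cdot 3 \cdot 2 = 30$ implications ($60$ stored edges with contrapositives), instead of one clause per pair.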