Disjoint partitioning with binary representation

Let’s just start with a problem.

Problem 1. Alice has hidden n=2×105n=2 \times 10^5 balls numbered from 11 to nn. Exactly two of them, numbered pp and qq, are colored red and the rest are white. Bob needs to report Alice a set SS of 18 pairs of sets, {(A0,B0),(A1,B1),(A17,B17)}\{ (A_0, B_0), (A_1,B_1), \dots (A_{17}, B_{17})\} such that Ai,Bi{1,2,,n}A_i, B_i \subseteq \{1,2,\dots , n\} and AiBi=ϕA_i \cap B_i = \phi for 0i170 \leq i \leq 17. Bob wins if i\exists i such that pAi,qBip \in A_i, q \in B_i or pBi,qAip \in B_i, q \in A_i, that is pp and qq are in different sets of the same pair. You need to construct the set SS for Bob.

Solution. There’s an easy probabilistic solution to it. Randomly choose n2\frac{n}{2} integers and put them in A0A_0, put the rest of them in B0B_0. Now the probability of pp and qq to fall in the same set is around 0.50.5, so if we continue doing the same for (A1,B1)(A17,B17)(A_1, B_1) \dots (A_{17}, B_{17}), our probability of success becomes around 10.5181-0.5^{18}, which is very close to 11. But we want a deterministic solution, and that’s where our trick comes in.

The trick: The whole idea can be wrapped up in one line: Place all integers with the ii-th bit turned off in AiA_i and the rest in BiB_i. Formally, let xix_i denote the ii-th bit of xx, where x{1,2,,n}x \in \{1,2,\dots, n\}. For all ii, set Ai:={xxi=0}A_i:= \{ x \mid x_i=0 \} and Bi:={xxi=1}B_i := \{ x \mid x_i = 1\}. And it magically solves the problem! How? Well, since pqp \neq q, i\exists i where piqip_i \neq q_i (that is, their ii-th bits differ). That means both pp and qq cannot be in the same set- one of them must be in AiA_i and the other must be in BiB_i. Since 2×105<2182\times 10^5 < 2^{18}, the bit they differ at will always be smaller than 1818.

That’s it! This idea of partitioning into disjoint sets based on bits comes in handy in many interactive problems.

Well, I first saw this trick in a shortest path problem, which was probably like this:

Problem 2. You are given a weighted graph G(V,E)G(V,E), (V,E2×105)(|V|, |E| \leq 2\times 10^5) and a set AVA \subseteq V of special nodes. You need to find minu,vAD(u,v)\min\limits_{u,v \in A} D(u, v), where D(u,v)D(u, v) is the shortest distance from node uu to vv.

Solution. This problem does have a O(nlogn)O(n \log n) solution which requires some observations if I remember correctly. But the bit trick gives us a very simple O(nlog2n)O (n \log^2 n) solution.

If we choose a set XAX \subset A and run multisource Dijkstra from nodes in XX, we find the shortest path from all uXu \in X to vAXv \in A\setminus X. If PP and QQ are the closest pair of nodes in AA, and if we can somehow put one of them in the source nodes and the other in the destination nodes, we will get the shortest pair distance! The bit trick will do that for us. We will run logn\log n multisource Dijkstra. During the ii-th time, set Xi:={uAui=0}X_i:=\{u \in A \mid u_i = 0\} and run Dijkstra from the nodes in XiX_i. Like the last time, since PQP \neq Q, i\exists i such that PiQiP_i \neq Q_i, one of them will be in the source nodes and the other will be in the destination nodes, giving us the shortest pair distance!

Problem 3. (CEOI16 ICC) There’s a hidden graph of nn (n100)(n \leq 100) nodes. Initially, the graph has no edges. Until the graph becomes connected, the Terran will add an edge (u,v)(u,v) such that uu and vv were previously disconnected (so, total (n1)(n-1) edges will be added, eventually forming a tree). After adding each edge, we will have to find out uu and vv. Allowed Query: Given two disjoint sets A,BA, B of nodes, computer will return whether there exists uA,vBu \in A, v \in B such that uu and vv are connected directly by an edge. You can ask at most 16501650 queries overall.

Solution. We will solve this for one edge only, as we can repeat the same algorithm for the rest of them. If there are currently zz components, with the bit trick, we can find two sets X,YX, Y such that uXu\in X and vYv \in Y in logz\lceil \log z \rceil queries. Pick one node from each component (preferably the representative node if you use DSU to maintain the graph), and apply the bit trick (well, before querying, don’t forget to push the other nodes of the components to their respective set as well). Now binary search on XX and find uu. Do the same for vv. (I skipped some details, I’m sure you can figure it out :D). This solution takes around 3logn3 \log n queries per edge. I got 90 with this solution. However, after I checked the bits in random order while partitioning the representative nodes and ended the loop as I already found a valid division, I got 100. (This is another useful trick, often try randomizing things and see if it helps :D)

Here’s another nice problem, almost copy-pasted from Mamnoon Siam’s Note.

Problem 4. (The penguin’s game) There’s a hidden array a (size is at most 10001000). Computer has selected two indices ii and jj such that ai=aj=xa_i=a_j=x and ak{i,j}=ya_{k \notin \{i,j\}} = y (1x,y109,xy)(1 \leq x, y \leq 10^9, x \neq y). Find i,ji, j using at most 19 queries. Allowed Query: given a set SS, computer returns kSak\oplus_{k \in S} a_k (that is, the bitwise xor of all integers in SS).

Solution. Well, binary search won’t work here. Note that If we ask with a set SS, we can deduce the parity of the number of xx in SS. So, using our trick we can determine which bits of ii and jj are the same and which aren’t. For that, for every bit ii, we can simply put all integers with ii-th bit on in SS and see if there are even numbers of xx in it. If so, the ii-th bit of them are equal; otherwise not. Now, if we can somehow find one of them, we have the necessary information to determine the other one, and we are done. Since iji \neq j, they will differ at least in one bit. So, we have a set of indices where exactly one of ii or jj lies. Find where that is using binary search. Also, this set’s size is at most half of the original array’s- that saves one query in binary search.

Problem 5. (POI Remont) You are given a color scheme a1,a2,,ana_1, a_2,\dots, a_n (aik)a_i \leq k) for nn vertical strips of a wall (aia_i is the color of the ii-th strip). A roller can paint two consecutive strips. You have total k2k^2 rollers, one for each pair of colors. You can use each roller at most once and each strip can be colored multiple times, but with the same color every time. Determine whether it is possible to complete the color scheme.

Solution. Construct a new array BB of pairs of colors (or rollers) of length (n1)(n -1) and set Bi:=(ai,ai+1)B_i:=(a_i, a_{i+1}). Obviously, we need to color the strips using the pairs from BB. Let CiC_i denote whether we used the ii-th roller or not. It is easy to see both CiC_i and Ci+1C_{i+1} cannot be 00 for any ii, otherwise there is no way you can color the (i+1)(i+1)-th strip with color ai+1a_{i+1}. We can actually write this condition in CNF form: (C1C2)(C2C3)(Cn2Cn1).(C_1 \vee C_2) \wedge (C_2 \vee C_3) \wedge \dots (C_{n-2} \vee C_{n-1}).This leads us to the well-known 2-SAT problem, but we have to add some additional conditions. First of all, C1C_1 and Cn1C_{n-1} must be true since there’s no other way to color the first and last strip. Then comes the main challenge: since each roller can be used at most once, for each unique roller xBx \in B, we’ll have a set Sx:={iBi=x}S_x:= \{i \mid B_i=x\}, and there can be at most one kSxk \in S_x such that Ck=1C_k = 1, and for all iSx{k},Cii\in S_x \setminus \{k\}, C_i must be 00. But how to force this condition in 2-SAT?
To exist atmost one kS={S1,S2,Sm}k \in S=\{S_1, S_2, \dots S_m\} such that CkC_k is true, (¬Ci¬Cj)(\neg C_i \vee \neg C_j) must be true for all valid i,jSi, j \in S. But that leads us to O(n2)O(n^2) new edges in our implication graph, which is too slow. And once again, the bit trick saves us!
Note that (¬Ci¬Cj)(\neg C_i \vee \neg C_j) means Ci    ¬CjC_i \implies \neg C_j and Cj    ¬CiC_j \implies \neg C_i. But since adding these edges among all pairs is costly, we will add two layers of dummy nodes (logn\log n nodes in each layer) in the implication graph. Let’s denote them by D0,0,D0,1D0,mD_{0,0}, D_{0,1} \dots D_{0, m} and D1,0,D1,1,,D1,mD_{1,0}, D_{1,1}, \dots, D_{1,m}. Now for all iSi \in S, for each bit jj, add two directed edges: (Dij,j,¬i)(D_{i_j, j}, \neg i) and (i,Dij1,j)(i, D_{i_j \oplus 1, j}) (here iji_j is the jj-th bit of ii). Now, since for all i,ji, j, there exists a bit in which they differ, iD¬ji \rightarrow D \rightarrow \neg j and jD¬ij \rightarrow D \rightarrow \neg i edges exist in the graph (here DD is a node from the dummy layer). That means Ci    ¬CjC_i \implies \neg C_j and Cj    ¬CiC_j \implies \neg C_i implications are added for all i,ji,j, which is all we needed!