Submission #824198

# Submission time Handle Problem Language Result Execution time Memory
824198 2023-08-13T17:42:45 Z jophyyjh Bit Shift Registers (IOI21_registers) C++17
89 / 100
4 ms 1220 KB
/**
 * Hmmm, not a very interesting problem for me, but I think I tried my best. First of all,
 * it's clear that the first subproblem we need to solve is to find the min of two numbers.
 * Comparing two ints involves finding the most significant bit in which they differ, and
 * pick the num with a 1 on that bit. A bit difficult (without control statements), right?
 * 
 * I may have overcomplicated stuff, but here's how I solved it. Two-complement is the way
 * most modern computers store negative numbers. If we have b bits, we can store
 * [-2^{b-1}, 2^{b-1}). In memory, non-negative numbers are just stored in their binary
 * format. -1, however, is 11...11. Why? Can't it just be 100...001 (using the first bit
 * to repr negative)? Indeed we can. But two-complement allows us to do sth really cool!
 * 11...11 is chosen to be -1 because it IS -1 modulo 2^b. This means addition involving
 * negative numbers can be carried out in the same way as non-negative numbers (where
 * overflow is omitted, i.e. addition under modulo), if we know how to find the negation
 * of a value! I believe this is the most important understanding of two-complement! The
 * remaining stuff can be easily found online.
 * 
 * So, when we want to compare two int, we'll just move them into two separate registers.
 * Under two-complement, it's well-known that -x = ~x + 1, so we can do subtraction. Note
 * that our values are in [0, 2^b). After subtraction, it should be (-2^b, 2^b), but in
 * memory, it's [0, 2^{b+1}). As usual, we can see if the (b+1)-th bit is a 1 to see if a
 * num is negative modulo 2^{b+1}. Do a bit shift to filter out that bit and we will have
 * compared two integers.
 * 
 * For simplicity and extensibility, I chose to implement swap() for both s=0 and s=1.
 * swap() swaps two num iff the num in front is greater than the num at the back. Most
 * importantly, the part of comparing and swapping of different pairs can be carried out
 * simultaneously. Therefore, we can do divide-and-conquer to finish [S1-4]. In practice,
 * there're some more things to consider. First, we can impl a "multiply by 0/1" operation:
 * by taking advantage of -1 -> 11...11 and -0 -> 00...0, we can do a AND operation with
 * a num and a 0/1's negation. Finally, in order to swap (x,y), I first compute their xor,
 * and multiply this by the 0/1 representing whether the element is front is greater, and
 * change both x,y with the xor. This technique of using xor is also well-known:
 *              swap(x,y):      x ^= y  -->  y ^= x  -->  x ^= y
 * 
 * [After some hints]
 * Ahhh, bubble sort can be parallelized in a few ways. For example, we can do these swap
 * pairs first: (0,1)(2,3)(4,5).... Then, in another phase, do (1,2)(3,4)(5,6)... Repeat
 * for n times.
 * 
 * Find min:    21 * ceil(log_2(n)) + 3         (tight for me, failed [S2])
 *     Sort:    21 * n                          (not tight)
 * Implementation 1     (Full solution except [S2]; bit operations, maths, two-complement)
*/

#include <bits/stdc++.h>
#include "registers.h"

typedef std::vector<int>	vec;

const int b = 2000;


// Do a min-swap with 21 steps on all the following pairs:
// {first_idx[0], first_idx[0] + diff}, {first_idx[1], first_idx[1] + diff}, ...
void swap(const vec& first_idx, int diff, int k) {
    // Setup the masks
    std::vector<bool> mask(b, 0), mask2(b, 0), ones(b, 0);
    for (int f : first_idx) {
        ones[f * k] = 1;
        for (int i = f * k; i < (f + 1) * k; i++)
            mask[i] = mask2[i] = 1;
        mask2[(f + 1) * k] = 1;
    }
    append_store(3, mask);
    append_store(4, ones);
    append_store(5, mask2);

    // Move the first elements to register 1, the second elements to register 2.
    append_move(1, 0);
    append_right(2, 0, diff * k);
    append_and(1, 1, 3);
    append_and(2, 2, 3);
    // append_print(1);
    // append_print(2);

    if (k >= 2) {
        // Subtraction with two-complement
        append_not(6, 1);       // flip
        append_and(6, 6, 5);
        append_add(6, 6, 2);
        append_add(6, 6, 4);    // +1
        // append_print(6);

        // Extract sign
        append_right(7, 6, k);
        append_and(7, 7, 4);
        append_not(7, 7);
        append_and(7, 7, 5);
        append_add(7, 7, 4);
        // append_print(7);
    } else {    // k=1 is a special case
        append_not(7, 2);
        append_and(7, 7, 1);
    }

    // Compute xor if a swap is needed
    append_xor(8, 1, 2);
    append_and(8, 8, 7);
    // append_print(8);

    // Swap by xor
    append_left(9, 8, diff * k);
    append_or(9, 9, 8);
    append_xor(0, 0, 9);
}

void construct_instructions(int s, int n, int k, int q) {
    if (s == 0) {
        // Extend the array with INF s.t. its len is a power of 2
        int two_pow = 1;
        while (two_pow < n)
            two_pow *= 2;
        if (two_pow > n) {
            append_not(1, 1);       // register 1 is initially 0
            append_left(1, 1, n * k);
            append_or(0, 0, 1);
        }
        n = two_pow;
        // Divide-and-conquer to find min
        for (int step = 2; step <= n; step *= 2) {
            vec indices;
            for (int a = 0; a < n; a += step)
                indices.push_back(a);
            swap(indices, step / 2, k);
        }
    } else {
        for (int r = 0; r < n; r++) {
            vec indices;
            for (int i = 0; i < n - 1; i++) {
                if (i % 2 == r % 2)
                    indices.push_back(i);
            }
            swap(indices, 1, k);
        }
    }
}
# Verdict Execution time Memory Grader output
1 Correct 0 ms 212 KB Output is correct
# Verdict Execution time Memory Grader output
1 Incorrect 0 ms 212 KB Wrong answer detected in grader
2 Halted 0 ms 0 KB -
# Verdict Execution time Memory Grader output
1 Correct 0 ms 212 KB Output is correct
2 Correct 0 ms 340 KB Output is correct
3 Correct 1 ms 340 KB Output is correct
4 Correct 1 ms 340 KB Output is correct
# Verdict Execution time Memory Grader output
1 Correct 1 ms 340 KB Output is correct
2 Correct 1 ms 340 KB Output is correct
3 Correct 1 ms 340 KB Output is correct
4 Correct 1 ms 340 KB Output is correct
5 Correct 1 ms 340 KB Output is correct
6 Correct 0 ms 340 KB Output is correct
# Verdict Execution time Memory Grader output
1 Correct 1 ms 340 KB Output is correct
2 Correct 1 ms 340 KB Output is correct
# Verdict Execution time Memory Grader output
1 Correct 1 ms 340 KB Output is correct
2 Correct 1 ms 340 KB Output is correct
3 Correct 3 ms 1220 KB Output is correct
4 Correct 4 ms 980 KB Output is correct
5 Correct 2 ms 852 KB Output is correct
6 Correct 2 ms 852 KB Output is correct
7 Correct 2 ms 852 KB Output is correct