This submission is migrated from previous version of oj.uz, which used different machine for grading. This submission may have different result if resubmitted.
/**
 * Cute problem~ I started by experimenting different examples. Nope. I noticed nothing,
 * and was stuck for a while. I initially guessed that maybe Aryan never moves a book to
 * an intermediate table, rather than directly to its destination, but I quickly dismissed
 * the thought because of (2,3,0,1). In fact, the min num of steps to solve (2,3,0,1) is 8,
 * not 10. Then, I wonder, can I prove that 8 is the minimum? Yep! That was how I found the
 * way. I chose to consider some monovariant/invariant. A common one for sorting is
 * sum{|i - p_i|}, i.e. the sum of distances between a book and its destination. This sum
 * is exactly 8 for (2,3,0,1), so 8 is indeed the min.
 * 
 * sum{|i-p_i|} is a minimum because every step of Aryan can only bring a book at most 1
 * table closer to its destination. The min is achieved iff Aryan takes a book towards its
 * right place at every step, regardless of whether a book has been put down in an
 * intermediate table. What's a convenient way to achieve sum{|i-p_i|}? Note that a
 * permutation can be seen as the collection of some cycles. For each cycle, we can just
 * pick the books in it, in order. Let's call such traversal the "min-cost traversal" of
 * the cycle. For example, if the perm is just one cycle of size n, then the sum is indeed
 * sum{|i-p_i|}, achieved with just one complete "min-cost traversal".
 * 
 * For convenience, let's call sum{|i-p_i|} the "principal cost". Note that the principal
 * cost isn't always the min num of steps. Let c_1, c_2, ..., c_k be all the non-trivial
 * cycles (len>1, containing elements that have to be moved) representing our perm. Let the
 * "repr range" of c_i be [smallest_j_in_c_i, greatest_j_in_c_i]. We've already
 * established that a single cycle can be handled with one min-cost traversal. If the repr
 * ranges of c_i and c_j intersect, it can be proved that their min-cost traversal can be
 * "combined" (with no additional steps needed), which means the k cycles' repr ranges can
 * now be seen as a few disjoint repr ranges. The gaps between these disjoint ranges are
 * definitely not covered in the principal cost, but Aryan must walk through each gap at
 * least twice (since he has to return to s too).
 * 
 * Finally, if p[i] != i, we're done; but if p[i] = i, the two costs above aren't enough.
 * Call a position i bad iff p[i] != i. Specifically, we can prove that an additional cost
 * is needed iff all bad positions are to the right of s or are all to the left of s. These
 * three parts combined yields the total min num of steps.
 * 
 * Time Complexity: O(n * log(n))           (maths, permutation)
 * Implementation 1                 (Full solution)
*/
#include <bits/stdc++.h>
#include "books.h"
typedef long long   ll;
typedef std::vector<int>    vec;
ll minimum_walk(vec perm, int s) {
    int n = perm.size();
    std::vector<bool> visited(n, false);
    vec ls, rs;
    int cs = 0;
    ll cost = 0;
    for (int src = 0; src < n; src++) {
        if (visited[src] || src == perm[src])
            continue;
        int l = n, r = -1;
        for (int pt = src;; pt = perm[pt]) {
            visited[pt] = true;
            cost += std::abs(pt - perm[pt]), l = std::min(l, pt), r = std::max(r, pt);
            if (perm[pt] == src)
                break;
        }
        ls.push_back(l);
        rs.push_back(r);
        cs++;
    }
    if (cs == 0)
        return 0;
    
    std::sort(ls.begin(), ls.end());
    std::sort(rs.begin(), rs.end());
    if (s < ls.front())
        cost += 2 * (ls.front() - s);
    if (rs.back() < s)
        cost += 2 * (s - rs.back());
    for (int t = ls.front(), i = 0, j = 0, level = 0; t < rs.back(); t++) {
        while (i < cs && ls[i] <= t)
            level++, i++;
        while (j < cs && rs[j] <= t)
            level--, j++;
        if (level == 0)
            cost += 2;
    }
    return cost;
}
| # | Verdict  | Execution time | Memory | Grader output | 
|---|
| Fetching results... | 
| # | Verdict  | Execution time | Memory | Grader output | 
|---|
| Fetching results... | 
| # | Verdict  | Execution time | Memory | Grader output | 
|---|
| Fetching results... | 
| # | Verdict  | Execution time | Memory | Grader output | 
|---|
| Fetching results... | 
| # | Verdict  | Execution time | Memory | Grader output | 
|---|
| Fetching results... |