Function core::slice::rotate::ptr_rotate
source · pub unsafe fn ptr_rotate<T>(left: usize, mid: *mut T, right: usize)
Expand description
Rotates the range [mid-left, mid+right)
such that the element at mid
becomes the first
element. Equivalently, rotates the range left
elements to the left or right
elements to the
right.
Safety
The specified range must be valid for reading and writing.
Algorithm
Algorithm 1 is used for small values of left + right
or for large T
. The elements are moved
into their final positions one at a time starting at mid - left
and advancing by right
steps
modulo left + right
, such that only one temporary is needed. Eventually, we arrive back at
mid - left
. However, if gcd(left + right, right)
is not 1, the above steps skipped over
elements. For example:
left = 10, right = 6
the `^` indicates an element in its final place
6 7 8 9 10 11 12 13 14 15 . 0 1 2 3 4 5
after using one step of the above algorithm (The X will be overwritten at the end of the round,
and 12 is stored in a temporary):
X 7 8 9 10 11 6 13 14 15 . 0 1 2 3 4 5
^
after using another step (now 2 is in the temporary):
X 7 8 9 10 11 6 13 14 15 . 0 1 12 3 4 5
^ ^
after the third step (the steps wrap around, and 8 is in the temporary):
X 7 2 9 10 11 6 13 14 15 . 0 1 12 3 4 5
^ ^ ^
after 7 more steps, the round ends with the temporary 0 getting put in the X:
0 7 2 9 4 11 6 13 8 15 . 10 1 12 3 14 5
^ ^ ^ ^ ^ ^ ^ ^
Fortunately, the number of skipped over elements between finalized elements is always equal, so
we can just offset our starting position and do more rounds (the total number of rounds is the
gcd(left + right, right)
value). The end result is that all elements are finalized once and
only once.
Algorithm 2 is used if left + right
is large but min(left, right)
is small enough to
fit onto a stack buffer. The min(left, right)
elements are copied onto the buffer, memmove
is applied to the others, and the ones on the buffer are moved back into the hole on the
opposite side of where they originated.
Algorithms that can be vectorized outperform the above once left + right
becomes large enough.
Algorithm 1 can be vectorized by chunking and performing many rounds at once, but there are too
few rounds on average until left + right
is enormous, and the worst case of a single
round is always there. Instead, algorithm 3 utilizes repeated swapping of
min(left, right)
elements until a smaller rotate problem is left.
left = 11, right = 4
[4 5 6 7 8 9 10 11 12 13 14 . 0 1 2 3]
^ ^ ^ ^ ^ ^ ^ ^ swapping the right most elements with elements to the left
[4 5 6 7 8 9 10 . 0 1 2 3] 11 12 13 14
^ ^ ^ ^ ^ ^ ^ ^ swapping these
[4 5 6 . 0 1 2 3] 7 8 9 10 11 12 13 14
we cannot swap any more, but a smaller rotation problem is left to solve
when left < right
the swapping happens from the left instead.