It does not make much sense to me, but GCC somehow optimises the
inline assembler even though the output is very obviously used and
having observable side effects.
This reverts commit 09731fbfc3a914ec4f6ffad60aa9062db6a8f6aa.
So far, AV_READ_TIME would return the cycle counter. This posed two
problems:
1) On recent systems, it would just raise an illegal instruction
exception. Indeed RDCYCLE is blocked in user space to ward off some
side channel attacks. In particular, this would cause the random
number generator to crash.
2) It does not match the x86 behaviour and the apparent original intent
of AV_READ_TIME in the functional code base (outside test cases).
So this replaces the cycle counter with the time counter. The unit is
a platform-dependent constant fraction of time, and the value should be
stable across harts (RISC-V lingo for physical CPU thread).
This uses the architected RISC-V 64-bit cycle counter from the
RISC-V unprivileged instruction set.
In 64-bit and 128-bit, this is a straightforward CSR read.
In 32-bit mode, the 64-bit value is exposed as two CSRs, which
cannot be read atomically, so a loop is necessary to detect and fix up
the race condition where the bottom half wraps exactly between the two
reads.