Several Built-in Functions Provided by GCC

Date: 2017-10-30 15:58:55


References

https://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Other-Builtins.html#Other-Builtins

https://en.wikipedia.org/wiki/Find_first_set#CTZ

| Tool/library | Name | Type | Input type(s) | Notes | Result for zero input |
|---|---|---|---|---|---|
| POSIX.1 compliant libc, 4.3BSD libc, OS X 10.3 libc^[2]^[21] | `ffs` | Library function | int | Includes glibc. POSIX does not supply the complementary log base 2 / clz. | 0 |
| FreeBSD 5.3 libc, OS X 10.4 libc^[22] | `ffsl` `fls` `flsl` | Library function | int, long | fls ("find last set") computes (log base 2) + 1. | 0 |
| FreeBSD 7.1 libc^[23] | `ffsll` `flsll` | Library function | long long | | 0 |
| GCC | `__builtin_ffs[l,ll,imax]` | Built-in functions | unsigned int, unsigned long, unsigned long long, uintmax_t | | 0 |
| GCC 3.4.0^[24]^[25], Clang 5.x^[26]^[27] | `__builtin_clz[l,ll,imax]` `__builtin_ctz[l,ll,imax]` | Built-in functions | | | undefined |
| Visual Studio 2005 | `_BitScanForward`^[28] `_BitScanReverse`^[29] | Compiler intrinsics | unsigned long, unsigned __int64 | Separate return value to indicate zero input | 0 |
| Visual Studio 2008 | `__lzcnt`^[30] | Compiler intrinsic | unsigned short, unsigned int, unsigned __int64 | Relies on x64-only lzcnt instruction | Input size in bits |
| Intel C++ Compiler | `_bit_scan_forward` `_bit_scan_reverse`^[31] | Compiler intrinsics | int | | undefined |
| NVIDIA CUDA^[32] | `__clz` | Functions | 32-bit, 64-bit | Compiles to fewer instructions on the GeForce 400 Series | 32 |
| NVIDIA CUDA^[32] | `__ffs` | Functions | 32-bit, 64-bit | Compiles to fewer instructions on the GeForce 400 Series | 0 |
| LLVM | `llvm.ctlz.*` `llvm.cttz.*`^[33] | Intrinsic | 8, 16, 32, 64, 256 | LLVM assembly language | Input size if arg 2 is 0, else undefined |
| GHC 7.10 (base 4.8), in `Data.Bits` | `countLeadingZeros` `countTrailingZeros` | Library function | `FiniteBits b => b` | Haskell programming language | Input size in bits |

— Built-in Function: int __builtin_ffs (unsigned int x)

　　Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero.

　　In other words: the 1-based position of the first 1-bit, counting from the right.
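
The 1-based result is easy to trip over, so here is a minimal sketch of converting it to a 0-based bit index. The helper name `lowest_set_bit_index` is hypothetical and not part of GCC. The cast is there because newer GCC releases declare the parameter as `int` rather than `unsigned int`.

```c
/* Hypothetical helper (not a GCC function): 0-based index of the lowest
   set bit of x, or -1 when x is zero.
   __builtin_ffs returns a 1-based position and returns 0 for a zero input,
   so subtracting one covers both cases. */
static int lowest_set_bit_index(unsigned int x)
{
    return __builtin_ffs((int)x) - 1;
}
```

For example, `lowest_set_bit_index(0x18)` yields 3, because 0x18 is binary 11000.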

— Built-in Function: int __builtin_clz (unsigned int x)

　　Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.

　　In other words: the number of 0-bits counting from the left.
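
A common use of the leading-zero count is an integer base-2 logarithm. The sketch below assumes nothing beyond the built-in itself. `floor_log2` is a hypothetical name, and the explicit zero check is there because the result for zero is undefined.

```c
#include <limits.h>

/* Hypothetical helper: floor(log2(x)) for x > 0, or -1 for x == 0.
   The zero check is required because __builtin_clz(0) is undefined. */
static int floor_log2(unsigned int x)
{
    if (x == 0)
        return -1;
    /* Index of the highest set bit = (width - 1) - leading zeros. */
    return (int)(sizeof(unsigned int) * CHAR_BIT) - 1 - __builtin_clz(x);
}
```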

— Built-in Function: int __builtin_ctz (unsigned int x)

　　Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.

　　In other words: the number of 0-bits counting from the right.
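
One way the trailing-zero count might be used is to walk over the set bits of a word. This is only a sketch: `set_bit_indices` is a hypothetical helper, and the caller is assumed to supply a buffer with one slot per bit of `unsigned int`.

```c
/* Hypothetical helper: writes the 0-based indices of the set bits of x into
   out (lowest bit first) and returns how many indices were written.
   out must have room for one entry per bit of unsigned int. */
static int set_bit_indices(unsigned int x, int *out)
{
    int n = 0;

    while (x != 0) {                 /* guard: __builtin_ctz(0) is undefined */
        out[n++] = __builtin_ctz(x); /* index of the lowest set bit */
        x &= x - 1;                  /* clear that bit */
    }
    return n;
}
```

The loop body runs once per set bit rather than once per bit position, which is the usual reason to prefer this over testing every bit.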

— Built-in Function: int __builtin_popcount (unsigned int x)

　　Returns the number of 1-bits in x.

　　In other words: the number of 1s in x.
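
For comparison, the sketch below places a portable bit-counting loop next to the built-in and shows one typical use, a power-of-two test. Both function names are hypothetical.

```c
/* Portable reference: clears the lowest set bit until none remain. */
static int popcount_portable(unsigned int x)
{
    int count = 0;

    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Hypothetical helper: nonzero if x has exactly one 1-bit,
   i.e. x is a power of two. */
static int is_power_of_two(unsigned int x)
{
    return __builtin_popcount(x) == 1;
}
```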

— Built-in Function: int __builtin_parity (unsigned int x)

　　Returns the parity of x, i.e. the number of 1-bits in x modulo 2.

　　In other words: whether the number of 1s in x is odd (1) or even (0).
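
As an illustration, parity can serve as a simple error-detection bit. The helper `even_parity_bit` below is hypothetical and only sketches the idea.

```c
/* Hypothetical helper: returns the bit to append to data so that the
   total number of 1-bits (data plus the parity bit) is even. */
static unsigned int even_parity_bit(unsigned int data)
{
    return (unsigned int)__builtin_parity(data);
}
```

For instance, 0x0B is binary 1011 with three 1-bits, so the parity bit is 1.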

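All of the functions above also have long and long long variants (suffixes l and ll). Judging from the table above, there are imax variants as well.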
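
Picking the right suffix matters for 64-bit values: on platforms where `unsigned int` is 32 bits, passing a 64-bit value to the plain variant silently drops the upper half. A minimal sketch with hypothetical wrapper names, assuming `<stdint.h>` is available:

```c
#include <stdint.h>

/* Hypothetical wrapper: number of 1-bits in a 64-bit value.
   The ll variant examines the full unsigned long long argument. */
static int popcount_u64(uint64_t x)
{
    return __builtin_popcountll(x);
}

/* Hypothetical wrapper: 0-based index of the lowest set bit, or -1 for 0.
   The zero check is needed because __builtin_ctzll(0) is undefined. */
static int lowest_set_bit_u64(uint64_t x)
{
    return x != 0 ? __builtin_ctzll(x) : -1;
}
```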

— Built-in Function: int32_t __builtin_bswap32 (int32_t x)

Reverses the byte order of a 32-bit value.

Returns x with the order of the bytes reversed; for example, 0xaabbccdd becomes 0xddccbbaa. Byte here always means exactly 8 bits.
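
To make the byte reversal concrete, here is a portable shift-and-mask version that should agree with the built-in. `bswap32_portable` is a hypothetical name, not a GCC function.

```c
#include <stdint.h>

/* Portable reference: moves each of the four bytes to its mirrored position. */
static uint32_t bswap32_portable(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}
```

On targets with a byte-swap instruction the built-in can typically be compiled to that single instruction, which is the main reason to prefer it over the open-coded form.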

— Built-in Function: int64_t __builtin_bswap64 (int64_t x)

Reverses the byte order of a 64-bit value.

Similar to __builtin_bswap32, except the argument and return types are 64-bit.
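
A typical use is converting between host and big-endian (network) byte order. The sketch below is hedged on one assumption: it relies on the `__BYTE_ORDER__` and `__ORDER_LITTLE_ENDIAN__` macros, which newer GCC and Clang releases predefine but older compilers may not. `to_big_endian64` is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical helper: returns host laid out in big-endian byte order.
   Assumption: __BYTE_ORDER__ is predefined by the compiler. If it is not,
   this falls through to the "already big-endian" branch. */
static uint64_t to_big_endian64(uint64_t host)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return __builtin_bswap64(host);   /* little-endian host: reverse bytes */
#else
    return host;                      /* assumed big-endian: nothing to do */
#endif
}
```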

— Built-in Function: long __builtin_expect (long exp, long c)

Branch prediction hint.

You may use __builtin_expect to provide the compiler with branch prediction information. In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform. However, there are applications in which this data is hard to collect.

The return value is the value of exp, which should be an integral expression. The semantics of the built-in are that it is expected that exp == c. For example:
          if (__builtin_expect (x, 0))
            foo ();
     
would indicate that we do not expect to call foo, since we expect x to be zero. Since you are limited to integral expressions for exp, you should use constructions such as
          if (__builtin_expect (ptr != NULL, 1))
            foo (*ptr);
     
when testing pointer or floating-point values.
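
In practice the built-in is often wrapped in a pair of macros. The names `likely` and `unlikely` below are a widespread convention rather than anything GCC defines, and `sum_array` is a hypothetical function used only to show the pattern.

```c
#include <stddef.h>

/* Conventional wrappers; !!(x) normalizes any truthy value to 1 so the
   comparison against the expected constant is meaningful. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: sums n ints into *out.
   Returns 0 on success, -1 if either pointer is NULL. */
static int sum_array(const int *a, size_t n, long *out)
{
    long total = 0;
    size_t i;

    if (unlikely(a == NULL || out == NULL))
        return -1;                  /* error path, hinted as rare */

    for (i = 0; i < n; i++)
        total += a[i];
    *out = total;
    return 0;
}
```

The hint changes only how the compiler lays out and optimizes the branches; the program's result is the same either way.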

— Built-in Function: void __builtin_prefetch (const void *addr, ...)

Prefetch.

This function is used to minimize cache-miss latency by moving data into a cache before it is accessed. You can insert calls to __builtin_prefetch into code for which you know addresses of data in memory that is likely to be accessed soon. If the target supports them, data prefetch instructions will be generated. If the prefetch is done early enough before the access then the data will be in the cache by the time it is accessed.

The value of addr is the address of the memory to prefetch. There are two optional arguments, rw and locality. The value of rw is a compile-time constant one or zero; one means that the prefetch is preparing for a write to the memory address and zero, the default, means that the prefetch is preparing for a read. The value locality must be a compile-time constant integer between zero and three. A value of zero means that the data has no temporal locality, so it need not be left in the cache after the access. A value of three means that the data has a high degree of temporal locality and should be left in all levels of cache possible. Values of one and two mean, respectively, a low or moderate degree of temporal locality. The default is three.
          for (i = 0; i < n; i++)
            {
              a[i] = a[i] + b[i];
              __builtin_prefetch (&a[i+j], 1, 1);
              __builtin_prefetch (&b[i+j], 0, 1);
              /* ... */
            }
     
Data prefetch does not generate faults if addr is invalid, but the address expression itself must be valid. For example, a prefetch of p->next will not fault if p->next is not a valid address, but evaluation will fault if p is not a valid address.

If the target does not support data prefetch, the address expression is evaluated if it includes side effects but no other code is generated and GCC does not issue a warning.
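
The p->next case mentioned above can be sketched as a linked-list traversal. The `struct node` layout and the `sum_list` function are hypothetical. The locality value of 1 is an arbitrary choice for illustration, and whether the prefetch helps at all depends on the target and the access pattern.

```c
#include <stddef.h>

/* Hypothetical list node used only for this illustration. */
struct node {
    int value;
    struct node *next;
};

/* Sums the values in the list, prefetching the next node for reading
   (rw = 0) with low temporal locality (locality = 1). */
static long sum_list(const struct node *p)
{
    long total = 0;

    while (p != NULL) {
        /* p is valid here, so evaluating p->next is safe; the prefetch
           itself does not fault even when p->next is NULL. */
        __builtin_prefetch(p->next, 0, 1);
        total += p->value;
        p = p->next;
    }
    return total;
}
```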


Original article: http://www.cnblogs.com/justinh/p/7754655.html
