Table 1 Integer half-word/quad-byte SIMD video instructions

From: CUDAMPF: a multi-tiered parallel framework for accelerating protein sequence search in HMMER on CUDA-enabled GPU

| Intrinsic PTX assembly | Semantics | Operands and optional operations |
| --- | --- | --- |
| vadd2, vsub2, vadd4, vsub4 | Addition/Subtraction | .u32 .s32 .sat .add |
| vmax2, vmin2, vmax4, vmin4 | Maximum/Minimum | .u32 .s32 .sat .add |
| vset2, vset4 | Comparison | .u32 .s32 .cmp .add |
| vavrg2, vavrg4 | Average | .u32 .s32 .sat .add |
| vabsdiff2, vabsdiff4 | Absolute value of difference | .u32 .s32 .sat .add |

  1. u32 and s32 denote unsigned and signed 32-bit values, respectively; sat clamps (saturates) a value to the range allowed by the operand's bit width; add enables accumulation; cmp stands for one of six comparison operators: eq, ne, lt, le, gt, ge.
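
To make the lane-wise semantics in Table 1 concrete, the following is a minimal scalar reference model in C++ of four quad-byte operations on unsigned 8-bit lanes. It is a sketch under stated assumptions, not the GPU instructions themselves: the `model_*` function names and the `lane` helper are hypothetical, lane masks and selectors are ignored, only the unsigned case is covered, and the comparison model assumes each lane yields 1 when the predicate holds and 0 otherwise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: extract byte lane i (0 = least significant) from a packed 32-bit word.
static inline uint32_t lane(uint32_t x, int i) { return (x >> (8 * i)) & 0xFFu; }

// Model of a quad-byte unsigned addition with sat:
// each lane sum is clamped to the 8-bit range [0, 255] instead of wrapping.
uint32_t model_vadd4_u8_sat(uint32_t a, uint32_t b) {
    uint32_t d = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t s = std::min<uint32_t>(lane(a, i) + lane(b, i), 255u);
        d |= s << (8 * i);
    }
    return d;
}

// Model of a quad-byte unsigned maximum: each lane keeps the larger of the two bytes.
uint32_t model_vmax4_u8(uint32_t a, uint32_t b) {
    uint32_t d = 0;
    for (int i = 0; i < 4; ++i) {
        d |= std::max(lane(a, i), lane(b, i)) << (8 * i);
    }
    return d;
}

// Model of a quad-byte absolute difference with add:
// the four lane results are summed and accumulated onto c.
uint32_t model_vabsdiff4_u8_add(uint32_t a, uint32_t b, uint32_t c) {
    uint32_t acc = c;
    for (int i = 0; i < 4; ++i) {
        uint32_t x = lane(a, i), y = lane(b, i);
        acc += (x > y) ? (x - y) : (y - x);
    }
    return acc;
}

// Model of a quad-byte comparison with cmp = gt
// (assumption: a lane is set to 1 if a > b in that lane, else 0).
uint32_t model_vset4_u8_gt(uint32_t a, uint32_t b) {
    uint32_t d = 0;
    for (int i = 0; i < 4; ++i) {
        d |= (lane(a, i) > lane(b, i) ? 1u : 0u) << (8 * i);
    }
    return d;
}
```

For example, `model_vadd4_u8_sat(0xF0102030u, 0x20010203u)` gives `0xFF112233`: the top lane would overflow to 0x110 and is clamped to 0xFF, while the other three lanes add normally. Likewise, `model_vabsdiff4_u8_add(0x0A141E28u, 0x0F0A2828u, 100u)` gives 125, the accumulator 100 plus lane differences 5, 10, 10, and 0.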