SHORT Half-precision floating point

EVENTSET
GPU0 SMSP_SASS_THREAD_INST_EXECUTED_OP_HADD_PRED_ON_SUM
GPU1 SMSP_SASS_THREAD_INST_EXECUTED_OP_HMUL_PRED_ON_SUM
GPU2 SMSP_SASS_THREAD_INST_EXECUTED_OP_HFMA_PRED_ON_SUM


METRICS
Runtime (RDTSC) [s] time
HP [MFLOP/s] 1E-6*(GPU0+GPU1+(GPU2*2))/time


LONG
Formulas:
HP [MFLOP/s] = 1E-6*(SMSP_SASS_THREAD_INST_EXECUTED_OP_HADD_PRED_ON_SUM+SMSP_SASS_THREAD_INST_EXECUTED_OP_HMUL_PRED_ON_SUM+2*SMSP_SASS_THREAD_INST_EXECUTED_OP_HFMA_PRED_ON_SUM)/time
--
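This group measures the half-precision floating-point rate in MFLOP/s using the events
SMSP_SASS_THREAD_INST_EXECUTED_OP_H{ADD, MUL, FMA}_PRED_ON_SUM.
Each HFMA instruction is counted as two floating-point operations (one multiply
and one add), while HADD and HMUL instructions count as one operation each.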
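
The HP formula above can be reproduced by hand from raw event counts. The
following is a minimal Python sketch, not part of LIKWID; the function name
hp_mflops and the sample counts are hypothetical and only illustrate the
arithmetic.

```python
def hp_mflops(hadd: int, hmul: int, hfma: int, time_s: float) -> float:
    """Half-precision MFLOP/s from the three event counts and the runtime.

    hadd, hmul, hfma stand for the counts of the HADD, HMUL and HFMA events
    (GPU0, GPU1, GPU2 in the EVENTSET); time_s is the runtime in seconds.
    """
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    # An FMA performs a multiply and an add, so it contributes two operations.
    flops = hadd + hmul + 2 * hfma
    # Dividing by 1e6 matches the 1E-6 factor in the group formula.
    return flops / time_s / 1e6


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    print(hp_mflops(1_000_000, 2_000_000, 3_000_000, 0.5))
```

With these sample counts the operation total is 9,000,000 over 0.5 s, which
gives 18 MFLOP/s.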
