Running benchmarks...
  Threads: 2
  QoS: Background
Determining FP32 Neon performance...
  Repetitions:  1000000000
  Total time:  16.580476
  GFLOPS: 28.949712
Determining FP32 SSVE performance...
  Repetitions:  100000000
  Total time:  26.366517
  GFLOPS: 7.281963
Determining FP32 AMX performance...
  Repetitions:  100000000
  Total time:  17.603374
  GFLOPS: 116.341333
Determining FP32 SME FMOPA performance (1 tile)...
  Repetitions:  250000000
  Total time:  70.240158
  GFLOPS: 116.628439
Determining FP32 SME FMOPA performance (2 tiles)...
  Repetitions:  250000000
  Total time:  70.212615
  GFLOPS: 116.674190
Determining FP32 SME FMOPA performance (4 tiles)...
  Repetitions:  250000000
  Total time:  70.348192
  GFLOPS: 116.449332
Determining FP32 SME FMOPA performance (4 tiles, reordering)...
  Repetitions:  250000000
  Total time:  69.982326
  GFLOPS: 117.058127
Determining FP32 SME SMSTART-SMSTOP performance (8 instructions per block)...
  Repetitions:  250000000
  Total time:  21.966044
  GFLOPS: 93.234813
Determining FP32 SME SMSTART-SMSTOP performance (16 instructions per block)...
  Repetitions:  250000000
  Total time:  43.992361
  GFLOPS: 93.107074
Determining FP32 SME SMSTART-SMSTOP performance (32 instructions per block)...
  Repetitions:  250000000
  Total time:  79.224191
  GFLOPS: 103.402760
Determining FP32 SME SMSTART-SMSTOP performance (64 instructions per block)...
  Repetitions:  250000000
  Total time:  149.597182
  GFLOPS: 109.520780
Determining FP32 SME SMSTART-SMSTOP performance (128 instructions per block)...
  Repetitions:  250000000
  Total time:  290.207172
  GFLOPS: 112.912440
Determining FP32 SME BFMOPA performance (widening)...
  Repetitions:  250000000
  Total time:  140.719626
  GFLOPS: 116.430099
Determining FP32 SME BFMOPA performance (widening)...
  Repetitions:  250000000
  Total time:  141.775632
  GFLOPS: 115.562878