Running benchmarks...
  Threads: 1
  QoS: Background
Determining FP32 Neon performance...
  Repetitions:  1000000000
  Total time:  16.562051
  GFLOPS: 14.490959
Determining FP32 SSVE performance...
  Repetitions:  100000000
  Total time:  13.285433
  GFLOPS: 7.225959
Determining FP32 AMX performance...
  Repetitions:  100000000
  Total time:  8.875444
  GFLOPS: 115.374510
Determining FP32 SME FMOPA performance (1 tile)...
  Repetitions:  250000000
  Total time:  35.208206
  GFLOPS: 116.336515
Determining FP32 SME FMOPA performance (2 tiles)...
  Repetitions:  250000000
  Total time:  35.296698
  GFLOPS: 116.044849
Determining FP32 SME FMOPA performance (4 tiles)...
  Repetitions:  250000000
  Total time:  35.287484
  GFLOPS: 116.075150
Determining FP32 SME FMOPA performance (4 tiles, reordering)...
  Repetitions:  250000000
  Total time:  35.374569
  GFLOPS: 115.789397
Determining FP32 SME SMSTART-SMSTOP performance (8 instructions per block)...
  Repetitions:  250000000
  Total time:  13.127246
  GFLOPS: 78.005699
Determining FP32 SME SMSTART-SMSTOP performance (16 instructions per block)...
  Repetitions:  250000000
  Total time:  22.049020
  GFLOPS: 92.883947
Determining FP32 SME SMSTART-SMSTOP performance (32 instructions per block)...
  Repetitions:  250000000
  Total time:  39.568879
  GFLOPS: 103.515695
Determining FP32 SME SMSTART-SMSTOP performance (64 instructions per block)...
  Repetitions:  250000000
  Total time:  74.616996
  GFLOPS: 109.787320
Determining FP32 SME SMSTART-SMSTOP performance (128 instructions per block)...
  Repetitions:  250000000
  Total time:  145.710790
  GFLOPS: 112.441913
Determining FP32 SME BFMOPA performance (widening)...
  Repetitions:  250000000
  Total time:  70.615022
  GFLOPS: 116.009310
Determining FP32 SME BFMOPA performance (widening)...
  Repetitions:  250000000
  Total time:  70.619728
  GFLOPS: 116.001580