Running benchmarks...
  Threads: 4
  QoS: User Interactive
Determining FP32 Neon performance...
  Repetitions:  1000000000
  Total time:  2.454348
  GFLOPS: 391.142576
Determining FP32 SSVE performance...
  Repetitions:  100000000
  Total time:  12.387171
  GFLOPS: 30.999814
Determining FP32 AMX performance...
  Repetitions:  100000000
  Total time:  2.064732
  GFLOPS: 1983.792570
Determining FP32 SME FMOPA performance (1 tile)...
  Repetitions:  250000000
  Total time:  9.237824
  GFLOPS: 1773.577847
Determining FP32 SME FMOPA performance (2 tiles)...
  Repetitions:  250000000
  Total time:  8.266204
  GFLOPS: 1982.046415
Determining FP32 SME FMOPA performance (4 tiles)...
  Repetitions:  250000000
  Total time:  8.263533
  GFLOPS: 1982.687066
Determining FP32 SME FMOPA performance (4 tiles, reordering)...
  Repetitions:  250000000
  Total time:  8.262634
  GFLOPS: 1982.902789
Determining FP32 SME SMSTART-SMSTOP performance (8 instructions per block)...
  Repetitions:  250000000
  Total time:  3.100762
  GFLOPS: 1320.965621
Determining FP32 SME SMSTART-SMSTOP performance (16 instructions per block)...
  Repetitions:  250000000
  Total time:  5.165702
  GFLOPS: 1585.844480
Determining FP32 SME SMSTART-SMSTOP performance (32 instructions per block)...
  Repetitions:  250000000
  Total time:  9.290259
  GFLOPS: 1763.567625
Determining FP32 SME SMSTART-SMSTOP performance (64 instructions per block)...
  Repetitions:  250000000
  Total time:  17.560335
  GFLOPS: 1866.023627
Determining FP32 SME SMSTART-SMSTOP performance (128 instructions per block)...
  Repetitions:  250000000
  Total time:  34.077260
  GFLOPS: 1923.159315
Determining FP32 SME BFMOPA performance (widening)...
  Repetitions:  250000000
  Total time:  16.521966
  GFLOPS: 1983.299082
Determining FP32 SME BFMOPA performance (widening)...
  Repetitions:  250000000
  Total time:  16.521973
  GFLOPS: 1983.298242