Running benchmarks...
  Threads: 6
  QoS: Background
Determining FP32 Neon performance...
  Repetitions:  1000000000
  Total time:  17.174461
  GFLOPS: 83.845426
Determining FP32 SSVE performance...
  Repetitions:  100000000
  Total time:  79.021135
  GFLOPS: 7.289189
Determining FP32 AMX performance...
  Repetitions:  100000000
  Total time:  52.714006
  GFLOPS: 116.553464
Determining FP32 SME FMOPA performance (1 tile)...
  Repetitions:  250000000
  Total time:  210.274185
  GFLOPS: 116.875973
Determining FP32 SME FMOPA performance (2 tiles)...
  Repetitions:  250000000
  Total time:  211.011456
  GFLOPS: 116.467610
Determining FP32 SME FMOPA performance (4 tiles)...
  Repetitions:  250000000
  Total time:  210.860038
  GFLOPS: 116.551245
Determining FP32 SME FMOPA performance (4 tiles, reordering)...
  Repetitions:  250000000
  Total time:  210.948922
  GFLOPS: 116.502136
Determining FP32 SME SMSTART-SMSTOP performance (8 instructions per block)...
  Repetitions:  250000000
  Total time:  78.923535
  GFLOPS: 77.847501
Determining FP32 SME SMSTART-SMSTOP performance (16 instructions per block)...
  Repetitions:  250000000
  Total time:  131.852318
  GFLOPS: 93.195176
Determining FP32 SME SMSTART-SMSTOP performance (32 instructions per block)...
  Repetitions:  250000000
  Total time:  237.246153
  GFLOPS: 103.588613
Determining FP32 SME SMSTART-SMSTOP performance (64 instructions per block)...
  Repetitions:  250000000
  Total time:  463.033406
  GFLOPS: 106.152168
Determining FP32 SME SMSTART-SMSTOP performance (128 instructions per block)...
  Repetitions:  250000000
  Total time:  988.457442
  GFLOPS: 99.451930
Determining FP32 SME BFMOPA performance (widening)...
  Repetitions:  250000000
  Total time:  596.284022
  GFLOPS: 82.430517
Determining FP32 SME BFMOPA performance (widening)...
  Repetitions:  250000000
  Total time:  498.357598
  GFLOPS: 98.627974