Running benchmarks...
  Threads: 8
  QoS: User Interactive
Determining FP64 Neon FMLA performance...
  Repetitions:  500000000
  Duration (s): 1.69049
  GOPS:         283.942
Determining FP32 Neon FMLA performance...
  Repetitions:  500000000
  Duration (s): 1.67869
  GOPS:         571.875
Determining FP16 Neon FMLA performance...
  Repetitions:  500000000
  Duration (s): 1.67555
  GOPS:         1145.89
Determining BF16-BF16-FP32 Neon BFMMLA performance...
  Repetitions:  100000000
  Duration (s): 2.10325
  GOPS:         365.148
Determining FP32 SSVE FMLA (Z accumulation) performance...
  Repetitions:  50000000
  Duration (s): 7.18411
  GOPS:         53.4513
Determining FP64 SSVE FMLA (Z accumulation) performance...
  Repetitions:  50000000
  Duration (s): 7.18517
  GOPS:         26.7217
Determining FP32 AMX performance...
  Repetitions:  300000000
  Duration (s): 10.4942
  GOPS:         2341.87
Determining FP32 SME FMOPA performance (1 tile)...
  Repetitions:  35000000
  Duration (s): 2.15869
  GOPS:         2125.14
Determining FP32 SME FMOPA performance (2 tiles)...
  Repetitions:  75000000
  Duration (s): 4.1984
  GOPS:         2341.47
Determining FP32 SME FMOPA performance (4 tiles)...
  Repetitions:  125000000
  Duration (s): 6.99339
  GOPS:         2342.78
Determining FP32 SME predicated (8/16) FMOPA performance (4 tiles)...
  Repetitions:  125000000
  Duration (s): 6.99614
  GOPS:         1170.93
Determining FP32 SME predicated (15/16) FMOPA performance (4 tiles)...
  Repetitions:  125000000
  Duration (s): 7.03095
  GOPS:         2184.63
Determining FP32 SME FMOPA performance (4 tiles, reordering)...
  Repetitions:  125000000
  Duration (s): 7.70738
  GOPS:         2125.75
Determining FP32 SME SMSTART-SMSTOP performance (8 instructions per block)...
  Repetitions:  125000000
  Duration (s): 3.32277
  GOPS:         1232.71
Determining FP32 SME SMSTART-SMSTOP performance (16 instructions per block)...
  Repetitions:  125000000
  Duration (s): 5.56872
  GOPS:         1471.07
Determining FP32 SME SMSTART-SMSTOP performance (32 instructions per block)...
  Repetitions:  125000000
  Duration (s): 9.80014
  GOPS:         1671.81
Determining FP32 SME SMSTART-SMSTOP performance (64 instructions per block)...
  Repetitions:  100000000
  Duration (s): 14.7087
  GOPS:         1782.24
Determining FP32 SME SMSTART-SMSTOP performance (128 instructions per block)...
  Repetitions:  50000000
  Duration (s): 13.9826
  GOPS:         1874.78
Determining FP16-FP16-FP32 SME FMOPA performance...
  Repetitions:  75000000
  Duration (s): 8.79525
  GOPS:         2235.39
Determining BF16-BF16-FP32 SME BFMOPA performance...
  Repetitions:  75000000
  Duration (s): 8.79423
  GOPS:         2235.65
Determining FP64 SME FMOPA performance...
  Repetitions:  125000000
  Duration (s): 8.39437
  GOPS:         487.946
Determining I8-I8-I32 SME SMOPA performance...
  Repetitions:  75000000
  Duration (s): 8.44221
  GOPS:         4657.74
Determining I16-I16-I32 SME SMOPA performance...
  Repetitions:  75000000
  Duration (s): 8.45731
  GOPS:         2324.71
Determining FP32 SME FMLA performance...
  Repetitions:  125000000
  Duration (s): 6.97167
  GOPS:         587.52
Determining FP64 SME FMLA performance...
  Repetitions:  150000000
  Duration (s): 8.36704
  GOPS:         293.724
Determining BF16-BF16-FP32 SME BFDOT performance...
  Repetitions:  100000000
  Duration (s): 8.63777
  GOPS:         758.714