Running benchmarks...
  Threads: 4
  QoS: User Interactive
Determining FP64 Neon FMLA performance...
  Repetitions:  500000000
  Duration (s): 1.24127
  GOPS:         193.35
Determining FP32 Neon FMLA performance...
  Repetitions:  500000000
  Duration (s): 1.21153
  GOPS:         396.192
Determining FP16 Neon FMLA performance...
  Repetitions:  500000000
  Duration (s): 1.21082
  GOPS:         792.852
Determining BF16-BF16-FP32 BFMMLA Neon performance...
  Repetitions:  100000000
  Duration (s): 1.52879
  GOPS:         251.178
Determining FP32 SSVE FMLA (Z accumulation) performance...
  Repetitions:  50000000
  Duration (s): 6.19417
  GOPS:         30.9969
Determining FP64 SSVE FMLA (Z accumulation) performance...
  Repetitions:  50000000
  Duration (s): 6.1948
  GOPS:         15.4969
Determining FP32 AMX performance...
  Repetitions:  300000000
  Duration (s): 6.19442
  GOPS:         1983.72
Determining FP32 SME FMOPA performance (1 tile)...
  Repetitions:  35000000
  Duration (s): 1.29377
  GOPS:         1772.93
Determining FP32 SME FMOPA performance (2 tiles)...
  Repetitions:  75000000
  Duration (s): 2.48
  GOPS:         1981.93
Determining FP32 SME FMOPA performance (4 tiles)...
  Repetitions:  125000000
  Duration (s): 4.1316
  GOPS:         1982.77
Determining FP32 SME predicated (8/16) FMOPA performance (4 tiles)...
  Repetitions:  125000000
  Duration (s): 4.13147
  GOPS:         991.414
Determining FP32 SME predicated (15/16) FMOPA performance (4 tiles)...
  Repetitions:  125000000
  Duration (s): 4.13204
  GOPS:         1858.65
Determining FP32 SME FMOPA performance (4 tiles, reordering)...
  Repetitions:  125000000
  Duration (s): 4.13164
  GOPS:         1982.75
Determining FP32 SME SMSTART-SMSTOP performance (8 instructions per block)...
  Repetitions:  125000000
  Duration (s): 1.54985
  GOPS:         1321.42
Determining FP32 SME SMSTART-SMSTOP performance (16 instructions per block)...
  Repetitions:  125000000
  Duration (s): 2.58169
  GOPS:         1586.56
Determining FP32 SME SMSTART-SMSTOP performance (32 instructions per block)...
  Repetitions:  125000000
  Duration (s): 4.64855
  GOPS:         1762.27
Determining FP32 SME SMSTART-SMSTOP performance (64 instructions per block)...
  Repetitions:  100000000
  Duration (s): 7.0236
  GOPS:         1866.17
Determining FP32 SME SMSTART-SMSTOP performance (128 instructions per block)...
  Repetitions:  50000000
  Duration (s): 6.81789
  GOPS:         1922.47
Determining FP16-FP16-FP32 SME FMOPA performance...
  Repetitions:  75000000
  Duration (s): 4.95678
  GOPS:         1983.22
Determining BF16-BF16-FP32 SME BFMOPA performance...
  Repetitions:  75000000
  Duration (s): 4.95669
  GOPS:         1983.26
Determining FP64 SME FMOPA performance...
  Repetitions:  125000000
  Duration (s): 4.13162
  GOPS:         495.689
Determining I8-I8-I32 SME SMOPA performance...
  Repetitions:  75000000
  Duration (s): 4.95681
  GOPS:         3966.42
Determining I16-I16-I32 SME SMOPA performance...
  Repetitions:  75000000
  Duration (s): 4.95666
  GOPS:         1983.27
Determining FP32 SME FMLA performance...
  Repetitions:  125000000
  Duration (s): 4.13188
  GOPS:         495.658
Determining FP64 SME FMLA performance...
  Repetitions:  150000000
  Duration (s): 4.95805
  GOPS:         247.839
Determining BF16-BF16-FP32 SME BFDOT performance...
  Repetitions:  100000000
  Duration (s): 5.3704
  GOPS:         610.159