Stefan Pohl Computer Chess

Private website for chess engine tests


 

Here you will find experimental test runs that are not part of my regular testing work.

 

2017/05/19: At the end of 2016, I released version 2.0 of my SALC opening book. The idea was to create a book that lowers the draw rate in computer chess, because the draw rate keeps increasing as the engines get stronger and the hardware gets faster. In online engine tournaments and in the TCEC tournament, the draw rates are already around 85%, so the “draw death” of computer chess is coming closer and closer. As you can see below (experimental test runs of 2016/12/09), my SALC V2.0 book lowered the draw rate a lot in a Stockfish 8 self-play test run, from 83% with a classic opening book/position set to 68.2% (!). But in the last months, some people criticized that the openings in the SALC book just give a huge advantage to one color, which lowers the number of draws. It is clear that this way of creating a book would work: if all lines of a book gave one color an advantage of +9, the draw rate would (of course...) be 0%. But on the other hand, the scores in an engine tournament using such a book would be 50% for all engines, because the advantages of the opening lines would be randomly distributed over the games, if the number of played games is high enough.
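That 50%-score argument can be illustrated with a toy simulation (a minimal sketch with made-up numbers, not data from any real test run): if every line were completely decisive and the advantaged color were distributed at random over the games, each engine would still converge to a 50% score.

```python
import random

# Toy check: engine A scores a point exactly when it happens to sit on
# the advantaged side of a completely decisive opening line, and that
# side is assigned at random for each game.
def simulated_score(games, rng):
    points = sum(1 for _ in range(games) if rng.random() < 0.5)
    return points / games

score = simulated_score(100_000, random.Random(42))   # close to 0.5
```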
But this was not the idea of the SALC book. The idea was that in all book lines, White and Black castle to opposite sides with both queens still on the board, which should lead to more attacks on the king and to more tactical, more thrilling computer chess. All book lines were checked with Komodo 10.2 (20'' per position, running on 3 cores), with evaluations inside [-0.6,+0.6]. So there are no lines with a huge advantage for White or Black in the SALC book.
If the critics were right that the SALC book lines lead to too big an advantage for one color, then using the SALC book should bring the engine scores in a tournament closer to 50%, compared to a classical opening book.
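The eval-window check described above can be sketched in a few lines (the real filtering was done with Komodo 10.2 at 20'' per position; the line names and evaluations here are invented for illustration):

```python
# Keep only candidate lines whose engine evaluation (in pawns)
# stays inside the [-0.6, +0.6] window.
def inside_window(eval_pawns, lo=-0.6, hi=0.6):
    return lo <= eval_pawns <= hi

# Hypothetical (line, evaluation) pairs, for illustration only:
candidates = [("line A", +0.25), ("line B", -1.40), ("line C", +0.58)]
kept = [name for name, ev in candidates if inside_window(ev)]
# kept -> ["line A", "line C"]
```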
To verify that this does NOT happen, I did 3 test runs with 3 different opening sets:
1) SALC V2
2) Frank Quisinsky's FEOBOS 3.0 book (beta), a new, very well engine-analyzed and balanced opening book (more information on his website www.amateurschach.de). You can download that interesting book from my website (Download & Links section), with kind permission of Frank Quisinsky.
3) the 8-move openings collection that is used in the Stockfish framework.

asmFish played 1000 games against Komodo 10.4 with each of the 3 books/opening sets (= 3000 games). Not bullet speed, but 5'+3'' (!), single core, 256 MB hash, no pondering, both engines with Contempt=+15, in the LittleBlitzerGUI (RoundRobin play mode, in which one opening position is chosen at random from an EPD openings file for each game). It took around 12 days to complete these three long test runs.
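The RoundRobin opening selection can be sketched in miniature (an illustrative sketch, not LittleBlitzer's actual code): one position is drawn at random, with replacement, from the EPD set for each game, so positions can repeat.

```python
import random
from collections import Counter

# Draw one opening at random from the EPD set for every game played.
def draw_openings(positions, games, rng):
    return [rng.choice(positions) for _ in range(games)]

rng = random.Random(0)
positions = ["pos%02d" % i for i in range(10)]   # stand-in for EPD lines
picks = draw_openings(positions, 30, rng)
usage = Counter(picks)   # some positions repeat, some may stay unused
```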

Games Completed = 1000 of 1000 (Avg game length = 944.640 sec)

Settings = RR/256MB/300000ms+3000ms/M 450cp for 4 moves, D 120 moves/EPD:C:\LittleBlitzer\SALC_V2_10moves.epd(10000)

Time = 945199 sec elapsed, 0 sec remaining

1. asmFish 170426 x64 620.5/1000 351-110-draws: 539 (L: m=0 t=0 i=0 a=110) (D: r=149 i=231 f=38 s=0 a=121) (tpm=6659.0 d=30.93 nps=2552099)

2. Komodo 10.4 x64 379.5/1000 110-351-539 (L: m=0 t=0 i=0 a=351) (D: r=149 i=231 f=38 s=0 a=121) (tpm=6920.9 d=26.71 nps=1619591)

 

Games Completed = 1000 of 1000 (Avg game length = 1049.395 sec)

Settings = RR/256MB/300000ms+3000ms/M 450cp for 4 moves, D 120 moves/EPD:C:\LittleBlitzer2\FEOBOS_v03+.epd(24085)

Time = 1039157 sec elapsed, 0 sec remaining

1. asmFish 170426 x64 601.5/1000 293-90-draws: 617 (L: m=0 t=0 i=0 a=90) (D: r=132 i=221 f=38 s=1 a=225) (tpm=6315.9 d=30.83 nps=2477078)

2. Komodo 10.4 x64 398.5/1000 90-293-617 (L: m=0 t=0 i=0 a=293) (D: r=132 i=221 f=38 s=1 a=225) (tpm=6424.5 d=26.49 nps=1583220)

 

Games Completed = 1000 of 1000 (Avg game length = 1036.164 sec)

Settings = RR/256MB/300000ms+3000ms/M 450cp for 4 moves, D 120 moves/EPD:C:\LittleBlitzer3\34700_ok.epd(32000)

Time = 1036719 sec elapsed, 0 sec remaining

1. asmFish 170426 x64 603.0/1000 286-80-draws: 634 (L: m=0 t=0 i=0 a=80) (D: r=148 i=232 f=39 s=1 a=214) (tpm=6334.2 d=31.54 nps=2570164)

2. Komodo 10.4 x64 397.0/1000 80-286-634 (L: m=0 t=2 i=0 a=284) (D: r=148 i=232 f=39 s=1 a=214) (tpm=6473.6 d=27.00 nps=1614400)


Conclusions:
1) The SALC book lowers the draw rate a lot (53.9%), compared to the FEOBOS book (61.7%) and the Stockfish framework opening set (63.4%), although the engines played with Contempt=+15.
2) With the SALC book, the engine scores do not get closer to 50%. The Elo differences do not get smaller (in fact, they get bigger!), which proves that, compared to both other books, the SALC book does not contain many lines that lead to a clear advantage (and easy wins) for White or Black.
3) The SALC book lowers the average game duration by around 10% compared to the other books. That means around 10% more games can be played in the same time, which leads to statistically more valuable results (for example, the test run using SALC ended more than one day before the FEOBOS and the Stockfish-openings test runs).
4) Although there is no doubt that the FEOBOS book is very well balanced and analyzed, and this beta version contains only lines with both queens on the board, its draw rate is only a little lower than that of the Stockfish framework opening set. The number of 3fold draws is a little lower with FEOBOS (compared to both other books), but 17 fewer 3fold draws out of 1000 games is not much (1.7%).
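Conclusions 1) and 2) can be recomputed from the raw results above. The Elo conversion uses the standard logistic model, which may differ slightly from the exact formula a given rating tool applies.

```python
import math

# Draw rate (%) and Elo difference of asmFish, per 1000-game run.
def summarize(wins, losses, draws):
    games = wins + losses + draws
    score = (wins + draws / 2) / games
    elo = 400 * math.log10(score / (1 - score))   # logistic Elo model
    return round(100 * draws / games, 1), round(elo, 1)

salc   = summarize(351, 110, 539)   # (53.9, 85.4)
feobos = summarize(293,  90, 617)   # (61.7, 71.5)
sf_set = summarize(286,  80, 634)   # (63.4, 72.6)
```

The SALC run shows both the lowest draw rate and the biggest Elo distance between the two engines.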

All three books/opening sets were created solely for playing engine tournaments/competitions.
But only the SALC V2.0 book brings clearly measurable benefits: a clearly lower draw rate, around 10% shorter games (10% more games in the same time) and the biggest Elo differences/distances in the engine results/scores. So the SALC book helps to avoid the “draw death” of computer chess in the near future, and engine tournament results using SALC are statistically more valuable than those using other opening books/sets, because the “resolution” of the Elo results is higher and more games can be played in the same time. Feel free to download the SALC V2 book/openings set and make your own tests. If the number of games is high enough (300+), I have no doubt that the results will confirm my findings, and you will see how thrilling watching modern computer chess can still be.


2016/12/09: Some weeks ago, I created my SALC opening book for engine-engine matches. In all lines (created from 10000 human games, all lines 20 plies deep, all checked with Komodo 10.2 (20'' per position, running on 3 cores), evaluations inside [-0.6,+0.6]), White and Black castled to opposite sides, with both queens still on the board. The idea is to get more attacks on the king and a lower draw rate, because the draw rate in computer chess keeps increasing as the engines get stronger and the hardware gets faster. For my Stockfish bullet test runs, I have used 500 SALC positions since 2014, which lowered the draw rate a lot.

To verify how much the draw rate is lowered by this new book/opening-positions set, I did two test runs: 3000 games each (= 6000 games), Stockfish 8 in self-play, 70''+700ms thinking time, single core, LittleBlitzerGUI (using the 10000-position EPD files, playing in RoundRobin mode, in which one EPD position is chosen at random for each game).


Test 1: 34700 standard 8-move opening epd. Draw rate: 83.0%
Test 2: 10000 SALC V2 epd. Draw rate: 68.2%

 

I think the result is really impressive...


2016/03/12: Test run of 3 new Stockfish clones. Stockfish played 1000 games against each of them (LittleBlitzerGUI, single core, 70''+700ms, 128 MB hash, no ponder, no bases, no large pages, 500 SALC openings). None of the clones is stronger (no surprise), so don't waste your time with these "engines". The new popcount versions of DON do not run on my system (and the LittleBlitzerGUI), so I could not test DON.

 

     Program                   Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 160302 x64    : 3300    7    7  3000    51.9 %   3287   65.0 %
   2 Venom 3 x64             : 3293   12   12  1000    48.8 %   3300   66.6 %
   3 Anoa 1.1 x64            : 3285   12   12  1000    47.9 %   3300   63.6 %
   4 Sanjib 3 x64            : 3283   13   13  1000    47.8 %   3300   64.7 %

 

 


 

2015/06/05: Test run of Stockfish 150510 repeated with Contempt=+50 and with Contempt=+15, 7000 games each. The result with Contempt=+15 was 5 Elo weaker and the result with Contempt=+50 was 10 Elo weaker, but all 7 opponents are very strong engines - against weaker opponents, the scores should get a little better.

The overall draw rate was lowered from 38.1% (default) to 37.3% (Contempt=+15) and to 36.4% (Contempt=+50), which is a little disappointing. But keep in mind that I use the SALC opening positions. These opening positions already lower the draw rate a lot compared to "normal" opening positions, so contempt can hardly lower the draw rate any further.

But the number of draws up to move 50 (10-move opening position + 40 played moves) was lowered from 457 to 212 (C=+15) and to 68 (C=+50), and the number of 3fold draws was lowered from 1356 to 597 (C=+15) and to 266 (C=+50) (!!!)

And the average game duration rose from 199 seconds to 206 and 212 seconds.
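Expressed as relative reductions (a quick recomputation of the counts above from the three 7000-game runs):

```python
# Percentage drop from the default-contempt run to each contempt run.
def pct_drop(before, after):
    return round(100 * (before - after) / before, 1)

early_c15 = pct_drop(457, 212)    # 53.6  (% fewer early draws, C=+15)
early_c50 = pct_drop(457, 68)     # 85.1  (% fewer early draws, C=+50)
tf_c15    = pct_drop(1356, 597)   # 56.0  (% fewer 3fold draws, C=+15)
tf_c50    = pct_drop(1356, 266)   # 80.4  (% fewer 3fold draws, C=+50)
```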

 

 

Games Completed = 7000 of 7000 (Avg game length = 199.562 sec)
Settings = Gauntlet/128MB/70000ms+700ms
Time = 467274 sec elapsed, 0 sec remaining
 1.  Stockfish 150510 x64 4787.0/7000 3455-881-2664 (D: r=1356 i=428 f=130 s=36 a=714)
 2.  Komodo 9 x64         453.0/1000   223-317-460  (D: r=196 i=99 f=23 s=8 a=134)
 3.  Houdini 4 x64        371.0/1000   182-440-378  (D: r=123 i=96 f=22 s=9 a=128)
 4.  Gull 3 x64           307.5/1000   103-488-409  (D: r=214 i=62 f=20 s=6 a=107)
 5.  Fire 4 x64           281.0/1000   93-531-376   (D: r=210 i=48 f=9 s=6 a=103)
 6.  Equinox 3.3 x64      261.0/1000   85-563-352   (D: r=212 i=40 f=19 s=1 a=80)
 7.  Mars 3.35 x64        267.0/1000   91-557-352   (D: r=212 i=35 f=15 s=0 a=90)
 8.  Critter 1.6a x64     272.5/1000   104-559-337  (D: r=189 i=48 f=22 s=6 a=72)

 

Draw games not longer than 40 moves (+10 moves opening-position (=50 moves)): 457

 

 


Games Completed = 7000 of 7000 (Avg game length = 206.670 sec)
Settings = Gauntlet/128MB/70000ms+700ms
Time = 483831 sec elapsed, 0 sec remaining
 1.  Stockfish 150510 C=15 4746.5/7000 3441-948-2611 (D: r=597 i=810 f=163 s=12 a=1029)
 2.  Komodo 9 x64          465.5/1000    233-302-465  (D: r=91 i=194 f=24 s=5 a=151)
 3.  Houdini 4 x64         403.0/1000    216-410-374  (D: r=56 i=122 f=25 s=1 a=170)
 4.  Gull 3 x64            307.5/1000    110-495-395  (D: r=94 i=107 f=30 s=2 a=162)
 5.  Fire 4 x64            292.0/1000    101-517-382  (D: r=83 i=109 f=21 s=1 a=168)
 6.  Equinox 3.3 x64       252.5/1000    87-582-331   (D: r=99 i=90 f=18 s=0 a=124)
 7.  Mars 3.35 x64         270.0/1000    98-558-344   (D: r=96 i=84 f=26 s=2 a=136)
 8.  Critter 1.6a x64      263.0/1000    103-577-320  (D: r=78 i=104 f=19 s=1 a=118)

 

Draw games not longer than 40 moves (+10 moves opening-position (=50 moves)): 212

 

 


Games Completed = 7000 of 7000 (Avg game length = 212.451 sec)
Settings = Gauntlet/128MB/70000ms+700ms
Time = 497331 sec elapsed, 0 sec remaining
 1.  Stockfish 150510 C=50 4704.5/7000 3432-1023-2545 (D: r=266 i=885 f=152 s=12 a=1230)
 2.  Komodo 9 x64          470.0/1000    264-324-412   (D: r=27 i=176 f=31 s=3 a=175)
 3.  Houdini 4 x64         375.5/1000    203-452-345   (D: r=22 i=118 f=20 s=0 a=185)
 4.  Gull 3 x64            301.0/1000    119-517-364   (D: r=40 i=112 f=19 s=1 a=192)
 5.  Fire 4 x64            301.5/1000    104-501-395   (D: r=50 i=139 f=22 s=3 a=181)
 6.  Equinox 3.3 x64       299.0/1000    116-518-366   (D: r=43 i=124 f=20 s=0 a=179)
 7.  Mars 3.35 x64         260.5/1000    94-573-333    (D: r=42 i=108 f=19 s=3 a=161)
 8.  Critter 1.6a x64      288.0/1000    123-547-330   (D: r=42 i=108 f=21 s=2 a=157)

 

Draw games not longer than 40 moves (+10 moves opening-position (=50 moves)): 68


Draws (D:) (r=3fold draw, i=insufficient material, f=fifty-move rule, s=stalemate,
a=adjudicated by GUI (120 moves played))

 


 

2015/03/09: Test run of Stockfish 6 with 5 different contempt settings (0 (= default), +15, +25, +35, +50) against Komodo 8 (1000 games each, 70''+700ms, single core, my 500 old LS-ratinglist opening positions, because they are more drawish than the new SALC positions). Let's see if the contempt setting reduces the draw rate and/or the 3fold draws.

As you can see, the 3fold draws and early draws (up to move 40/60) are reduced a lot by a higher contempt. And the overall draw rate is lowered, too (but not as much as I expected).

The overall score of Stockfish against Komodo was not measurably affected by the contempt. All overall results are inside a +/-8 Elo interval, which is clearly within the error bar.

A really interesting experiment... My conclusion is that Contempt=+50 (which seems really "radical") is a good choice for tournaments and match play. Contempt=+15 is not bad either, and a quite "normal" setting.

 

 

 1.  Komodo 8 x64        2129.5/5000    862-1603-2535
 2.  Stockfish 6 C=0     570.0/1000    299-159-542 (=54.2% draws)
 3.  Stockfish 6 C=+15   576.0/1000    319-167-514 (=51.4% draws)
 4.  Stockfish 6 C=+25   580.5/1000    328-167-505 (=50.5% draws)
 5.  Stockfish 6 C=+35   565.0/1000    311-181-508 (=50.8% draws)  
 6.  Stockfish 6 C=+50   579.0/1000    346-188-466 (=46.6% draws)  

 

Draw-stats:       
 2.  Sf 6 C=0     Draws: 3fold=274 +(i=176 f=77 s=15 a=0)[early draw m40=29,m60=109]
 3.  Sf 6 C=+15   Draws: 3fold=106 +(i=328 f=71 s=9 a=0) [early draw m40=12,m60=52]
 4.  Sf 6 C=+25   Draws: 3fold=77  +(i=338 f=83 s=7 a=0) [early draw m40=3, m60=33]
 5.  Sf 6 C=+35   Draws: 3fold=69  +(i=344 f=82 s=11 a=2)[early draw m40=2, m60=35]
 6.  Sf 6 C=+50   Draws: 3fold=51  +(i=333 f=75 s=7 a=0) [early draw m40=1, m60=24]

 

(i=insufficient material, f=fifty-move rule, s=stalemate, a=adjudicated by GUI (>300 moves))
m40 = draw games not longer than 40 moves (including 8 moves of the opening PGN)
m60 = draw games not longer than 60 moves (including 8 moves of the opening PGN)
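As a cross-check of the "+/-8 Elo" statement, the five Stockfish scores can be converted to Elo differences with the standard logistic formula (which may differ slightly from the tool used for the original rating list):

```python
import math

# Elo difference implied by a match score under the logistic model.
def elo_diff(score):
    return 400 * math.log10(score / (1 - score))

scores = {"C=0": 0.570, "C=+15": 0.576, "C=+25": 0.5805,
          "C=+35": 0.565, "C=+50": 0.579}
elos = {name: round(elo_diff(s), 1) for name, s in scores.items()}
spread = max(elos.values()) - min(elos.values())   # about 11 Elo total
```

The full spread of about 11 Elo is indeed consistent with all results lying inside a +/-8 Elo interval.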

 


2015/02/20: A little "Clone Wars" test run of Stockfish 6 against 5 of its clone engines (70''+700ms, single core, SALC openings, 1000-game gauntlet). As you can see, none of the 5 clones is measurably stronger (all results within a +/-1% score interval and clearly inside the error bar).

 

 

     Program                 Elo    +    -   Games   Score   Av.Op.  Draws

   1 Pepper 150213 x64s    : 3251   13   13  1000    51.0 %   3243   60.9 %
   2 Sugar 5 x64s          : 3248   13   13  1000    50.7 %   3243   57.0 %
   3 Orka 150213 x64s      : 3248   14   14  1000    50.7 %   3243   59.2 %
   4 Salt 5 x64s           : 3247   13   13  1000    50.5 %   3243   60.1 %
   5 Stockfish 6 150128    : 3243    6    6  5000    49.4 %   3247   59.7 %
   6 Shark 150209 x64s     : 3241   13   13  1000    50.0 %   3243   61.3 %