Stefan Pohl Computer Chess

private website for chess engine tests


Latest Website-News (2017/08/14): Thomas Zipproth has built a new 500-openings testset. Because of his stellar work on the Cerebellum library (Brainfish, www.zipproth.de), he is definitely one of the best experts for engine chess openings on the planet. The new testset is called HERT (Human and Engine Relevant Theory) and contains openings played in high-level engine (online) chess and by human GMs. So it does not contain all ECO codes (unlike the FEOBOS project, for example), but only those openings that strong humans and engines really play on the board in tournaments. This testset therefore probably has the highest practical relevance an opening set has ever had, and it has a very low draw-rate (higher than with SALC openings, of course, but measurably lower than with other "normal" opening sets).

Because of this, it is strongly recommended to use it for engine testing and for engine development. Download the HERT openings-set here and check out the ReadMe file for further information.

The replay of my gamebase for Stockfish testing, using the new HERT set, is now completed. The first testrun, of Stockfish 170810, has been started; results not before Saturday. When it is finished, the latest asmFish testrun will follow immediately.

 

Stay tuned.

 

 

The long thinking-time tournament is still running - I will update it from time to time. Latest update: 2017/08/11. My new SALC_V3_10moves opening book is ready. Download it here.

The new 10-moves book is bigger (nearly 14000 end positions, +40%) and better: all end positions were analyzed with Komodo 11.01 at 100 seconds (!!) per move on 3 threads (5x more thinking time than for the older SALC books), keeping only positions with a Komodo evaluation inside [-0.6,+0.6]. I use it for my long thinking-time tournament from now on.
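
For readers who want to build something similar themselves, the filtering step can be sketched in a few lines of Python with the python-chess library. This is only an illustrative sketch (assuming a local UCI engine binary and an EPD file of candidate end positions; the paths and file names are placeholders), not the actual tool used to build the SALC books:

import chess
import chess.engine

ENGINE_PATH = "./komodo"        # placeholder path to a local UCI engine
EVAL_WINDOW_CP = 60             # +/- 0.60 pawns, in centipawns
SECONDS_PER_MOVE = 100          # analysis time per position

kept = []
engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
try:
    engine.configure({"Threads": 3})
    with open("positions.epd") as f:              # candidate end positions
        for line in f:
            board, _ops = chess.Board.from_epd(line.strip())
            info = engine.analyse(board, chess.engine.Limit(time=SECONDS_PER_MOVE))
            cp = info["score"].white().score(mate_score=100000)
            if abs(cp) <= EVAL_WINDOW_CP:         # keep only balanced positions
                kept.append(board.epd())
finally:
    engine.quit()

with open("salc_filtered.epd", "w") as out:
    out.write("\n".join(kept) + "\n")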

 

If you speak German, you will find a very interesting article about my SALC books in the renowned online magazine "Glarean" (Switzerland), written by Walter Eigenmann. Just click here to read it.


Stockfish testing

 

Playing conditions:

 

Hardware: i7-6700HQ 2.6GHz Notebook (Skylake CPU), Windows 10 64bit, 8GB RAM

Fritzmark: singlecore: 5.3 / 2521 (all engines running on one core only); average meganodes/s displayed by LittleBlitzerGUI: Houdini: 2.6 mn/s, Stockfish: 2.2 mn/s, Komodo: 2.0 mn/s

Hash: 512MB per engine

GUI: LittleBlitzerGUI (draw adjudication at 130 moves, resign at 400cp held for 4 moves; see the sketch after this list)

Tablebases: None

Openings: HERT testset (by Thomas Zipproth; download the file in the "Download & Links" section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 5000-games testrun takes about 7 days. The version numbers of the Stockfish development engines are the release date, written backwards as year, month, day (example: 170526 = May 26, 2017); the engines are downloaded at chess.ultimaiq.net. If more than one version is released on a single day, I always use the latest one. And I use the "Haswell+" (= bmi2) version.
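
As a side note on the adjudication settings listed above: the draw/resign rules can be written out as a small Python sketch. This only illustrates the logic; it is not LittleBlitzerGUI's actual code, and the function name and score bookkeeping are assumptions:

DRAW_MOVE_LIMIT = 130      # adjudicate a draw once move 130 is reached
RESIGN_THRESHOLD = 400     # centipawns the engine must be behind...
RESIGN_STREAK = 4          # ...for this many consecutive moves

def adjudicate(move_number, own_scores_cp):
    """Return "draw", "resign" or None (play on).

    own_scores_cp: the engine's evaluations of its own position, in
    centipawns, for its most recent moves (newest last).
    """
    if move_number >= DRAW_MOVE_LIMIT:
        return "draw"
    tail = own_scores_cp[-RESIGN_STREAK:]
    if len(tail) == RESIGN_STREAK and all(s <= -RESIGN_THRESHOLD for s in tail):
        return "resign"
    return None

# Example: four moves in a row at around -4.5 pawns trigger a resignation.
assert adjudicate(60, [-410, -430, -440, -450]) == "resign"
assert adjudicate(130, []) == "draw"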

 

Each Stockfish version plays 1000 games against each of Komodo 11.2.2, Houdini 5, Shredder 13, Fizbo 1.9 and Andscacs 0.91b.

To avoid distortions in the Ordo Elo calculation, from now on only two Stockfish versions (the latest official release plus the latest development version), one asmFish and one Brainfish are stored in the gamebase; the games of all older engine versions are deleted whenever a new version has been tested. (Because of my new test settings, Stockfish 170526 currently counts as the latest "official release": it was the last version tested with the old settings and the first one tested with the new settings.) Stockfish and asmFish Elo results can still be seen in the Elo diagrams below.

 

Latest update: 2017/08/14: The replay of the gamebase (using the new HERT opening set) is completed. The first testrun, of Stockfish 170810, has been started.

 

(The Ordo calculation is anchored to Stockfish 170526 = 3420 Elo, which was the final result of Stockfish 170526 in the old gamebase. So the Elo development of Stockfish has no "break" and can continue from the last point of the old gamebase.)
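
For anyone who wants to reproduce the rating list from the downloaded games: Ordo can anchor one player to a fixed rating. A call along the following lines should match the setup described above (the PGN and output file names are placeholders; check Ordo's readme for the exact options of your version):

import subprocess

# Anchor Stockfish 170526 to 3420 Elo and rate everyone else relative to it.
subprocess.run([
    "ordo",
    "-p", "gamebase.pgn",            # all stored games (placeholder name)
    "-A", "Stockfish 170526 bmi2",   # anchor player, as named in the PGN tags
    "-a", "3420",                    # rating assigned to the anchor
    "-o", "ratings.txt",             # plain-text rating list (placeholder)
], check=True)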

 

See the individual statistics of engine-results here

Download all played games here

 

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 170526 bmi2    : 3420    7    7  5000    71.3 %   3245   45.6 %
   2 Komodo 11.2.2 x64        : 3388    6    6  5000    66.9 %   3251   45.8 %
   3 Houdini 5 pext           : 3373    6    6  5000    64.7 %   3254   48.5 %
   4 Shredder 13 x64          : 3193    6    6  5000    37.8 %   3290   43.7 %
   5 Fizbo 1.9 bmi2           : 3169    6    6  5000    34.4 %   3295   38.2 %
   6 Andscacs 0.91b bmi2      : 3100    7    7  5000    24.9 %   3309   34.9 %
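
As a rough plausibility check of the table: under the standard logistic Elo model, a score s against opponents of average rating R corresponds to about R + 400*log10(s/(1-s)). Ordo fits all ratings simultaneously (and models draws and the white advantage), so its results differ somewhat, but the relation is easy to verify:

import math

def elo_from_score(score, avg_opponent):
    # Logistic Elo model: expected score s corresponds to a rating
    # difference of 400*log10(s/(1-s)) against the (average) opponent.
    return avg_opponent + 400 * math.log10(score / (1 - score))

# Stockfish 170526: 71.3 % score against an average opponent of 3245 Elo
print(round(elo_from_score(0.713, 3245)))   # ~3403, same ballpark as Ordo's 3420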

Below you will find a diagram of the progress of Stockfish in my tests since the end of 2016, and below that diagram, the older diagrams.

 

You can save the diagrams (as JPG pictures, in original size) on your PC with a right mouse click and then choosing "save image"...

The Elo ratings of older Stockfish dev versions in the Ordo calculation can differ a little from the Elo "dots" in the diagram. When the games of a new Stockfish dev version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, which in turn can change the Elo ratings of older Stockfish dev versions in the Ordo rating list. The diagram is not affected: each Elo "dot" shows the rating of one Stockfish dev version at the moment its testrun was finished.

