<!DOCTYPE html><HTML lang="en"> <head><meta charset="utf-8"> <title>Silver (David), Etc. - Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (Theo Todman's Book Collection - Paper Abstracts) </title> <link href="../../TheosStyle.css" rel="stylesheet" type="text/css"><link rel="shortcut icon" href="../../TT_ICO.png" /></head> <BODY> <CENTER> <div id="header"><HR><h1>Theo Todman's Web Page - Paper Abstracts</h1><HR></div><A name="Top"></A> <TABLE class = "Bridge" WIDTH=950> <tr><th><A HREF = "../../PaperSummaries/PaperSummary_23/PaperSummary_23084.htm">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</A></th></tr> <tr><th><A HREF = "../../Authors/S/Author_Silver (David).htm">Silver (David)</a>, Etc.</th></tr> <tr><th>Source: arxiv.org, 05 December, 2017</th></tr> <tr><th>Paper - Abstract</th></tr> </TABLE> </CENTER> <P><CENTER><TABLE class = "Bridge" WIDTH=400><tr><td><A HREF = "../../PaperSummaries/PaperSummary_23/PaperSummary_23084.htm">Paper Summary</A></td><td><A HREF = "../../PaperSummaries/PaperSummary_23/PapersToNotes_23084.htm">Notes Citing this Paper</A></td></tr></TABLE></CENTER></P> <hr><P><FONT COLOR = "0000FF"><U>Authors' Abstract</U><FONT COLOR = "800080"><ol type="1"><li>The game of chess is the most widely-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. </li><li>In contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go, by tabula rasa reinforcement learning from games of self-play. In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains. 
</li><li>Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case. </li></ol></FONT><hr><FONT COLOR = "0000FF"><B>Comment: </B><ul type="disc"><li>See <a name="W5962W"></a><A HREF = "https://arxiv.org/pdf/1712.01815.pdf" TARGET = "_top">Link</A>. </li><li>Annotated printout filed with <a name="1"></a>"<A HREF = "../../BookSummaries/BookSummary_06/BookPaperAbstracts/BookPaperAbstracts_6527.htm">Hains (Brigid) &amp; Hains (Paul) - Aeon: Q-S</A>" for want of a better home. </li></ul><BR><FONT COLOR = "0000FF"><HR></P><a name="ColourConventions"></a><p><b>Text Colour Conventions (see <A HREF="../../Notes/Notes_10/Notes_1025.htm">disclaimer</a>)</b></p><OL TYPE="1"><LI><FONT COLOR = "0000FF">Blue</FONT>: Text by me; &copy; Theo Todman, 2018</li><LI><FONT COLOR = "800080">Mauve</FONT>: Text by correspondent(s) or other author(s); &copy; the author(s)</li></OL> <BR><HR><BR><CENTER> <TABLE class = "Bridge" WIDTH=950> <TR><TD WIDTH="30%">&copy; Theo Todman, June 2007 - August 2018.</TD> <TD WIDTH="40%">Please address any comments on this page to <A HREF="mailto:theo@theotodman.com">theo@theotodman.com</A>.</TD> <TD WIDTH="30%">File output: <time datetime="2018-08-02T10:02" pubdate>02/08/2018 10:02:34</time> <br><A HREF="../../Notes/Notes_10/Notes_1010.htm">Website Maintenance Dashboard</A></TD></TR> <TR><TD WIDTH="30%"><A HREF="#Top">Return to Top of this Page</A></TD> <TD WIDTH="40%"><A HREF="../../Notes/Notes_11/Notes_1140.htm">Return to Theo Todman's Philosophy Page</A></TD> <TD WIDTH="30%"><A HREF="../../index.htm">Return to Theo Todman's Home Page</A></TD> </TR></TABLE></CENTER><HR> </BODY> </HTML>