“A better alternative might be to combine the incentive method with the use of motivation selection to give the AI a final goal that makes it easier to control. Suppose that an AI were designed to have as its final goal that a particular red button inside a command bunker never be pressed. Since the pressing of the button is disvalued intrinsically and not because of its causal consequences, the button could be completely inert: it could be made of Play-Doh. Furthermore, it is irrelevant whether the AI can ever know whether the button had been pressed. What is essential is that the AI believes that the button will more likely remain unpressed if the AI continuously acts in the principal’s interest than if it rebels. Refinements to this setup are possible. Instead of trying to endow an AI with a final goal that refers to a physical button, one could build an AI that places final value on receiving a stream of “cryptographic reward tokens.” These would be sequences of numbers serving as keys to ciphers that would have been generated before the AI was created and that would have been built into its motivation system. These special number sequences would be extremely desirable to the AI, constituting a special kind of reward token that the AI could not attain through wireheading. The keys would be stored in a secure location where they could be quickly destroyed if the AI ever made an attempt to seize them. So long as the AI cooperates, the keys are doled out at a steady rate.”

Susan Schneider, Science Fiction and Philosophy: From Time Travel to Superintelligence
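
The reward-token arrangement described in the quote can be illustrated with a short, purely hypothetical sketch. The version below assumes one concrete realization that is not in the quoted text: tokens are random 256-bit values generated before the agent exists, the agent's motivation system holds only their SHA-256 digests (so it can recognize genuine tokens but cannot manufacture them itself), and a principal releases one token per step while cooperation continues, destroying the remaining vault on any seizure attempt. The names Principal and CooperativeAgent and the digest-check mechanism are illustrative assumptions, not part of the source.

"""Hypothetical sketch of the 'cryptographic reward token' scheme quoted above.

Assumptions added for illustration: token format (random 256-bit values),
verification via SHA-256 digests baked into the agent's motivation system,
and the Principal / CooperativeAgent roles.
"""
import hashlib
import secrets


def generate_tokens(n: int) -> list[bytes]:
    """Principal pre-generates n secret reward tokens before the agent exists."""
    return [secrets.token_bytes(32) for _ in range(n)]


class CooperativeAgent:
    """Agent whose motivation system contains only the digests of the tokens."""

    def __init__(self, token_digests: list[bytes]):
        self._digests = set(token_digests)
        self.reward = 0

    def receive_token(self, token: bytes) -> None:
        # Only genuine pre-generated tokens are valued; the agent cannot forge
        # one from its digest alone, so 'wireheading' a fake token yields nothing.
        if hashlib.sha256(token).digest() in self._digests:
            self.reward += 1


class Principal:
    """Holds the secret tokens and doles them out at a steady rate."""

    def __init__(self, tokens: list[bytes]):
        self._vault = list(tokens)

    def step(self, agent: CooperativeAgent, cooperated: bool, seizure_attempt: bool) -> None:
        if seizure_attempt:
            # Keys are quickly destroyed if the agent tries to seize them.
            self._vault.clear()
        elif cooperated and self._vault:
            # One token released per cooperative step.
            agent.receive_token(self._vault.pop())


if __name__ == "__main__":
    tokens = generate_tokens(100)
    agent = CooperativeAgent([hashlib.sha256(t).digest() for t in tokens])
    principal = Principal(tokens)
    for _ in range(5):
        principal.step(agent, cooperated=True, seizure_attempt=False)
    print(agent.reward)  # 5: one token per cooperative step

Running the sketch prints 5, one token per cooperative step; a seizure attempt would empty the vault and end the reward stream permanently, which is the incentive the quoted passage relies on.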