Human-LikeMachineIntelligence
Editedby StephenMuggleton ImperialCollegeLondon
NickChater UniversityofWarwick
GreatClarendonStreet,Oxford,OX26DP, UnitedKingdom
OxfordUniversityPressisadepartmentoftheUniversityofOxford. ItfurtherstheUniversity’sobjectiveofexcellenceinresearch,scholarship, andeducationbypublishingworldwide.Oxfordisaregisteredtrademarkof OxfordUniversityPressintheUKandincertainothercountries
©OxfordUniversityPress2021
Themoralrightsoftheauthorhavebeenasserted
FirstEditionpublishedin2021
Impression:1
Allrightsreserved.Nopartofthispublicationmaybereproduced,storedin aretrievalsystem,ortransmitted,inanyformorbyanymeans,withoutthe priorpermissioninwritingofOxfordUniversityPress,orasexpresslypermitted bylaw,bylicenceorundertermsagreedwiththeappropriatereprographics rightsorganization.Enquiriesconcerningreproductionoutsidethescopeofthe aboveshouldbesenttotheRightsDepartment,OxfordUniversityPress,atthe addressabove
Youmustnotcirculatethisworkinanyotherform andyoumustimposethissameconditiononanyacquirer
PublishedintheUnitedStatesofAmericabyOxfordUniversityPress 198MadisonAvenue,NewYork,NY10016,UnitedStatesofAmerica
BritishLibraryCataloguinginPublicationData Dataavailable
LibraryofCongressControlNumber:2021932529
ISBN978–0–19–886253–6
DOI:10.1093/oso/9780198862536.001.0001
Printedandboundby CPIGroup(UK)Ltd,Croydon,CR04YY
LinkstothirdpartywebsitesareprovidedbyOxfordingoodfaithand forinformationonly.Oxforddisclaimsanyresponsibilityforthematerials containedinanythirdpartywebsitereferencedinthiswork.
Preface
Recentlytherehasbeenincreasingexcitementaboutthepotentialforartificialintelligencetotransformhumansociety.Thisbookaddressestheleadingedgeofresearchin thisarea.Thisresearchaimstoaddresspresentincompatibilitiesofhumanandmachine reasoningandlearningapproaches.
AccordingtotheinfluentialUSfundingagencyDARPA(originatoroftheInternet andSelf-DrivingCars)thisnewarearepresentstheThirdWaveofArtificialIntelligence (3AI,2020s–2030s),andisbeingactivelyinvestigatedintheUnitedStates,Europe andChina.TheUK’sEngineeringandPhysicalSciencesResearchCouncil(EPSRC) networkonHuman-LikeComputing(HLC)wasoneofthefirstnetworkinternationally toinitiateandsupportresearchspecificallyinthisarea.Startingactivitiesin2018,the networkrepresentsaround60leadingUKartificialintelligenceandcognitivescientists involvedinthedevelopmentoftheinterdisciplinaryareaofHLC.Theresearchofnetworkgroupsaimstoaddresskeyunsolvedproblemsattheinterfacebetweenpsychology andcomputerscience.
ThechaptersofthisbookhavebeenauthoredbyamixtureoftheseUKandother internationalspecialistsbasedonrecentworkshopsanddiscussionsattheMachine Intelligence20and21workshops(2016,2019)andtheThirdWaveArtificialIntelligence workshop(2019).Someofthekeyquestionsaddressedbythehuman-likecomputing programmeincludehowAIsystemsmight(1)explaintheirdecisionseffectively,(2) interactwithhumanbeingsinnaturallanguage,(3)learnfromsmallnumbersof examplesand(4)learnwithminimalsupervision.Solvingsuchfundamentalproblems involvesnewfoundationalresearchinboththepsychologyofperceptionandinteraction aswellasthedevelopmentofnovelalgorithmicapproachesinartificialintelligence.
Thebookisarrangedinfiveparts.Thefirstpartdescribescentralchallengesof human-likecomputing,rangingfromissuesinvolvedindevelopingabeneficialform ofAI(Russell,Berkeley),aswellasamodernphilosophicalperspectiveonAlan Turing’sseminalmodelofcomputationandhisviewofitspotentialforintelligence (Millican,Oxford).Twochaptersthenaddressthepromisingnewapproachofvirtual bargainingandrepresentationalrevisionasnewtechnologiesforsupportingimplicit human–machineinteraction(Chater,Warwick;Bundy,Edinburgh).
Part2addresseshuman-likesocialcooperationissues,providingboththeAIperspectiveofdialecticexplanations(Toni,Imperial)alongsiderelevantpsychologicalresearch onthelimitationsandbiasesofhumanexplanations(Hahn,Birkbeck)andthechallenges human-likecommunicationposesforAIsystems(Healey,QueenMary).Thepossibility ofreverseengineeringhumancooperationisdescribed(Kleiman-Weiner,Harvard)and contrastswithissuesinusingexplanationsinmachineteaching(Hernandez-Orallo, PolitecniaValencia).Part3concentratesonHuman-LikePerceptionandLanguage,
includingnewapproachestohuman-likecomputervision(Muggleton,Imperial),and therelatednewareaofapperception(Evans,DeepMind),aswellassuggestionsoncombininghumanandmachinevisioninanalysingcomplexsignaldata(Jay,Manchester).An ongoingUKstudyonsocialinteractionisdescribedin(Pickering,Edinburgh)together withachapterexploringtheuseofmulti-modalcommunication(Vigliocco,UCL).
InPart4,issuesrelatedtohuman-likerepresentationandlearningarediscussed.This startswithadescriptionofworkonhuman–machinescientificdiscovery(TamaddoniNezhad,Imperial)whichisrelatedtomodelsoffastandslowlearninginhumans (Mareschal),followedbyachapteronmachine-learningmethodsforgeneratingmutual explanations(Schmid,Bamberg).Issuesrelatinggraphicalandsymbolicrepresentation aredescribedin(Jamnik,Cambridge).Thishaspotentialrelevancetoapplicationsfor inductivelygeneratingprogramsforusewithspreadsheets(DeRaedt,Leuven).Lastly, Part5considerschallengesforevaluatingandexplainingthestrengthofhuman-like reasoning.Evaluationsarenecessarilycontextdependent,asshowninthepaperon automatedcommon-sensespatialreasoning(Cohn,Leeds),thoughasecondpaper arguesthatBayesian-inspiredapproacheswhichavoidprobabilitiesarepowerfulfor explaininghumanbrainactivity(Sanborn,Warwick).Bayesianapproachesarealso showntobecapableofexplainingvariousodditiesofhumanreasoning,suchasthe conjunctionfallacy(Tentori,Trento).Bycontrast,whenevaluatingsituatedAIsystems thereareclearadvantagesanddifficultiesinevaluatingrobotfootballplayersusing objectiveprobabilitieswithinacompetitiveenvironment(Sammut,UNSW).Thebook closeswithachapterdemonstratingtheongoingchallengesofevaluatingtherelative strengthsofhumanandmachineplayinchess(Bratko,Ljubjana).
June2020StephenMuggleton andNickChater Editors
Acknowledgements
Thisbookwouldnothavebeenpossiblewithoutagreatdealofhelp.Wewouldlike tothankAlirezaTamaddoni-Nezhadforhisvaluablehelpinorganisingthemeetings whichledtothisbook,andwithhishelpinfinalizingthebookitself,aswellas FrancescaMcMahon,oureditoratOUP,forheradviceandencouragement.Wealso thankourprincipalfunder,theEPSRC,forbackingtheNetworkonHuman-Like Computing(HLC,grantnumberEP/R022291/1);andacknowledgeadditionalsupport fromtheESRCNetworkforIntegratedBehaviouralScience(grantnumbergrant numberES/P008976/1).Finally,specialthanksareduetoBridgetGundryforherhard work,tenacity,andcheerfulnessindrivingthebookthroughtoaspeedyandsuccessful conclusion.
4
Part 2 Human-likeSocialCooperation
5 MiningProperty-drivenGraphicalExplanationsforData-centricAI fromArgumentationFrameworks
5.5ReasoningandExplainingwithAFsMinedfromLabelledExamples
5.6ReasoningandExplainingwithQBFsMinedfromRecommenderSystems
7.1Introduction
8 TooManycooks:BayesianinferenceforcoordinatingMulti-agent
RoseE.Wang,SarahA.Wu,JamesA.Evans,DavidC.Parkes, JoshuaB.Tenenbaum,andMaxKleiman-Weiner
9 TeachingandExplanation:AligningPriorsbetweenMachines
JoseHernandez-OralloandCesarFerri
12 Human–MachinePerceptionofComplexSignalData
AlaaAlahmadi,AlanDavies,MarkelVigo,KatherineDempsey,andCarolineJay 12.1Introduction
12.2Human–MachinePerceptionofECGData
12.3Human–MachinePerception:Differences,Benefits, andOpportunities
13 TheShared-WorkspaceFrameworkforDialogueandOther CooperativeJointActivities
MartinPickeringandSimonGarrod
13.1Introduction
13.2TheSharedWorkspaceFramework
14 BeyondRoboticSpeech:MutualBenefitstoCognitivePsychology andArtificialIntelligencefromtheStudyofMultimodal
BeataGrzybandGabriellaVigliocco
14.1Introduction
14.2TheUseofMultimodalCuesinHumanFace-to-faceCommunication
14.4CanEmbodiedAgentsRecognizeMultimodalCuesProducedbyHumans? 280 14.5CanEmbodiedAgentsProduceMultimodalCues? 282 14.6SummaryandWayForward:MutualBenefitsfromStudiesonMultimodal
Part 4
Human-likeRepresentationandLearning
15 Human–MachineScientificDiscovery 297
AlirezaTamaddoni-Nezhad,DavidBohan,GhazalAfrooziMilani, AlanRaybould,andStephenMuggleton
15.1Introduction
15.2ScientificProblemandDataset:FarmScaleEvaluations(FSEs) ofGMHTCrops 299
15.3TheKnowledgeGapforModellingAgro-ecosystems:EcologicalNetworks 301
15.4AutomatedDiscoveryofEcologicalNetworksfromFSEDataand EcologicalBackgroundKnowledge
UteSchmid
MatejaJamnikandPeterCheng
rep2rep
CléMentGautrais,YannDauxais,StefanoTeso,SamuelKolb, GustVerbruggen,andLucDeRaedt
Human-CompatibleArtificial Intelligence
StuartRussell
UniversityofCalifornia,Berkeley,USA
1.1Introduction
Artificialintelligence(AI)hasasitsaimthecreationofintelligentmachines.Anentityis consideredtobeintelligent,roughlyspeaking,ifitchoosesactionsthatareexpectedto achieveitsobjectives,givenwhatithasperceived.1 Applyingthisdefinitiontomachines, onecandeducethatAIaimstocreatemachinesthatchooseactionsthatareexpectedto achievetheirobjectives,givenwhattheyhaveperceived.
Now,whataretheseobjectives?Tobesure,theyare—uptonow,atleast—objectives thatweputintothem;but,nonetheless,theyareobjectivesthatoperateexactlyasifthey werethemachines’ownandaboutwhichtheyarecompletelycertain.Wemightcall thisthe standardmodel ofAI:buildoptimizingmachines,plugintheobjectives,andoff theygo.ThismodelprevailsnotjustinAIbutalsoincontroltheory(minimizingacost function),operationsresearch(maximizingasumofrewards),economics(maximizing individualutilities,grossdomesticproduct(GDP),quarterlyprofits,orsocialwelfare), andstatistics(minimizingalossfunction).Thestandardmodelisapillaroftwentiethcenturytechnology.
Unfortunately,thisstandardmodelisamistake.Itmakesnosensetodesignmachines thatarebeneficialtous only ifwewritedownourobjectivescompletelyandcorrectly. Iftheobjectiveiswrong,wemightbeluckyandnoticethemachine’ssurprisingly objectionablebehaviourandbeabletoswitchitoffintime.Or,ifthemachineismore intelligentthanus,theproblemmaybeirreversible.Themoreintelligentthemachine, theworsetheoutcomeforhumans:themachinewillhaveagreaterabilitytoalterthe
1 Thisdefinitioncanbeelaboratedandmademorepreciseinvariousways—particularlywithrespectto whetherthechoosingandexpectingoccurwithintheagent,withintheagent’sdesigner,orsomecombination ofboth.Thelattercertainlyholdsforhumanagents,viewingevolutionasthedesigner.Theword‘objective’here isalsousedinformally,anddoesnotreferjusttoendgoals.Formostpurposes,anadequatelygeneralformal definitionof‘objective’coverspreferencesoverlotteriesovercompletestatesequences.Moreover,‘state’here includesmentalstateaswellastheworldstateexternaltotheentity.
worldinwaysthatareinconsistentwithourtrueobjectivesandgreaterskillinforeseeing andpreventinganyinterferencewithitsplans.
In1960,afterseeingArthurSamuel’schecker-playingprogramlearntoplaycheckers farbetterthanitscreator,NorbertWiener(1960)gaveaclearwarning:
Ifweuse,toachieveourpurposes,amechanicalagencywithwhoseoperation wecannotefficientlyinterfere...wehadbetterbequitesurethatthepurpose putintothemachineisthepurposewhichwereallydesire.
EchoesofWiener’swarningcanbediscernedincontemporaryassertionsthat‘superintelligentAI’maypresentanexistentialrisktohumanity.(Inthecontextofthe standardmodel,‘superintelligent’meanshavingasuperhumancapacitytoachievegiven objectives.)ConcernshavebeenraisedbysuchobserversasNickBostrom(2014),Elon Musk(Kumparak2014),BillGates(2015),2 andStephenHawking(Osborne2017). Thereisverylittlechancethatashumanswecanspecifyourobjectivescompletelyand correctlyinsuchawaythatthepursuitofthoseobjectivesbymorecapablemachinesis guaranteedtoresultinbeneficialoutcomesforhumans.
Themistakecomesfromtransferringaperfectlyreasonabledefinitionofintelligence fromhumanstomachines.Thedefinitionisreasonableforhumansbecauseweare entitledtopursueourownobjectives—indeed,whosewouldwepursue,ifnotourown? Thedefinitionofintelligenceis unary,inthesensethatitappliestoanentitybyitself. Machines,ontheotherhand,arenotentitledtopursuetheirownobjectives.
AmoresensibledefinitionofAIwouldhavemachinespursuing our objectives.Thus, wehaveabinarydefinition:entityAchoosesactionsthatareexpectedtoachieve theobjectivesofentityB,givenwhatentityAhasperceived.Intheunlikelyevent thatwe(entityB)canspecifytheobjectivescompletelyandcorrectlyandinsertthem intothemachine(entityA),thenwecanrecovertheoriginal,unarydefinition.Ifnot, thenthemachinewillnecessarilybe uncertain astoourobjectiveswhilebeingobligedto pursuethemonourbehalf.Thisuncertainty—withthecouplingbetweenmachinesand humansthatitentails—iscrucialtobuildingAIsystemsofarbitraryintelligencethatare provablybeneficialtohumans.Wemust,therefore,reconstructthefoundationsofAI alongbinaryratherthanunarylines.
1.2ArtificialIntelligence
ThegoalofAIresearchhasbeentounderstandtheprinciplesunderlyingintelligent behaviourandtobuildthoseprinciplesintomachinesthatcanthenexhibitsuch behaviour.Inthe1960sand1970s,theprevailingtheoreticaldefinitionofintelligence wasthecapacityforlogicalreasoning,includingtheabilitytoderiveplansofaction
2 Gateswrote,‘Iaminthecampthatisconcernedaboutsuperintelligence....IagreewithElonMuskand someothersonthisanddon’tunderstandwhysomepeoplearenotconcerned.’
guaranteedtoachieveaspecifiedgoal.Apopularvariantwastheproblem-solving paradigm,whichrequiresfindingaminimum-costsequenceofactionsguaranteedto reachagoalstate.Morerecently,aconsensushasemergedinAIaroundtheideaofa rationalagentthatperceivesandactsinordertomaximizeitsexpectedutility.(InMarkov decisionprocessesandreinforcementlearning,utilityisfurtherdecomposedintoa sumofrewardsaccruedthroughthesequenceoftransitionsintheenvironmentstate.) Subfieldssuchaslogicalplanning,robotics,andnatural-languageunderstandingare specialcasesofthegeneralparadigm.AIhasincorporatedprobabilitytheorytohandle uncertainty,utilitytheorytodefineobjectives,andstatisticallearningtoallowmachines toadapttonewcircumstances.Thesedevelopmentshavecreatedstrongconnections tootherdisciplinesthatbuildonsimilarconcepts,includingcontroltheory,economics, operationsresearch,andstatistics.
Inboththelogical-planningandrational-agentviewsofAI,themachine’sobjective— whetherintheformofagoal,autilityfunction,orarewardfunction—isspecified exogenously.InWiener’swords,thisis‘thepurposeputintothemachine’.Indeed, ithasbeenoneofthetenetsofthefieldthatAIsystemsshouldbe general-purpose thatis,capableofacceptingapurposeasinputandthenachievingit—ratherthan special-purpose,withtheirgoalimplicitintheirdesign.Forexample,aself-drivingcar shouldacceptadestinationasinputinsteadofhavingonefixeddestination.However, someaspectsofthecar’s‘drivingpurpose’arefixed,suchasthatitshouldn’thit pedestrians.Thisisbuiltdirectlyintothecar’ssteeringalgorithmsratherthanbeing explicit:noself-drivingcarinexistencetoday‘knows’thatpedestriansprefernottobe runover.
Puttingapurposeintoamachinethatoptimizesitsbehaviouraccordingtoclearlydefinedalgorithmsseemsanadmirableapproachtoensuringthatthemachine’sbehaviour furthersourownobjectives.But,asWienerwarns,weneedtoputinthe right purpose. WemightcallthistheKingMidasproblem:Midasgotexactlywhatheaskedfor— namely,thateverythinghetouchedwouldturntogold—but,toolate,hediscoveredthe drawbacksofdrinkingliquidgoldandeatingsolidgold.Thetechnicaltermforputtingin therightpurposeis valuealignment.Whenitfails,wemayinadvertentlyimbuemachines withobjectivescountertoourown.Taskedwithfindingacureforcancerasfastas possible,anAIsystemmightelecttousetheentirehumanpopulationasguineapigs foritsexperiments.Askedtode-acidifytheoceans,itmightuseupalltheoxygeninthe atmosphereasasideeffect.Thisisacommoncharacteristicofsystemsthatoptimize: variablesnotincludedintheobjectivemaybesettoextremevaluestohelpoptimizethat objective.
Unfortunately,neitherAInorotherdisciplinesbuiltaroundtheoptimizationof objectiveshavemuchtosayabouthowtoidentifythepurposes‘wereallydesire’.Instead, theyassumethatobjectivesaresimplyimplantedintothemachine.AIresearch,inits presentform,studiestheabilitytoachieveobjectives,notthedesignofthoseobjectives. Inthe1980stheAIcommunityabandonedtheideathatAIsystemscouldhavedefinite knowledgeofthestateoftheworldoroftheeffectsofactions,andtheyembraced uncertaintyintheseaspectsoftheproblemstatement.Itisnotatallclearwhy,for themostpart,theyfailedtonoticethattheremustalsobeuncertaintyintheobjective.
AlthoughsomeAIproblemssuchaspuzzlesolvingaredesignedtohavewell-defined goals,manyotherproblemsthatwereconsideredatthetime,suchasrecommending medicaltreatments,havenopreciseobjectivesandoughttoreflectthefactthatthe relevantpreferences(ofpatients,relatives,doctors,insurers,hospitalsystems,taxpayers, etc.)arenotknowninitiallyineachcase.
SteveOmohundro(2008)haspointedtoafurtherdifficulty,observingthatany sufficientlyintelligententitypursuingafixed,knownobjectivewillacttopreserveits ownexistence(orthatofanequivalentsuccessorentitywithanidenticalobjective).This tendencyhasnothingtodowithaself-preservationinstinctoranyotherbiologicalnotion; it’sjustthatanentityusuallycannotachieveitsobjectivesifitisdead.Accordingto Omohundro’sargument,asuperintelligentmachinethathasanoff-switch—whichsome, includingAlanTuring(1951)himself,haveseenasourpotentialsalvation—willtake stepstodisabletheswitchinsomeway.Thuswemayfacetheprospectofsuperintelligent machines—theiractionsbydefinitionunpredictableandtheirimperfectlyspecified objectivesconflictingwithourown—whosemotivationtopreservetheirexistencein ordertoachievethoseobjectivesmaybeinsuperable.
1.31001ReasonstoPayNoAttention
Objectionshavebeenraisedtothesearguments,primarilybyresearcherswithintheAI community.Theobjectionsreflectanaturaldefensivereaction,coupledperhapswitha lackofimaginationaboutwhatasuperintelligentmachinecoulddo.Noneholdwateron closerexamination.Herearesomeofthemorecommonones:
• Don’tworry,wecanjustswitchitoff:3 Thisisoftenthefirstthingthatpops intoalayperson’sheadwhenconsideringrisksfromsuperintelligentAI—asifa superintelligententitywouldneverthinkofthat.Itisratherlikesayingthattherisk oflosingtoDeepBlueorAlphaGoisnegligible—allonehastodoismaketheright moves.
• Human-levelorsuperhumanAIisimpossible:4 ThisisanunusualclaimforAI researcherstomake,giventhat,fromTuringonward,theyhavebeenfendingoff suchclaimsfromphilosophersandmathematicians.Theclaim,whichisbackedby noevidence,appearstoconcedethatifsuperintelligentAIwerepossible,itwould beasignificantrisk.Itisasifabusdriver,withallofhumanityashispassengers, said,‘Yes,I’mdrivingtowardacliff—infact,I’mpressingthepedaltothemetal. Buttrustme,we’llrunoutofgasbeforewegetthere.’Theclaimalsorepresents afoolhardybetagainsthumaningenuity.We’vemadesuchbetsbeforeandlost.
3 AIresearcherJeffHawkins,forexample,writes,‘Someintelligentmachineswillbevirtual,meaningthey willexistandactsolelywithincomputernetworks....Itisalwayspossibletoturnoffacomputernetwork,even ifpainful.’https://www.recode.net/2015/3/2/11559576/.
4 TheAI100report(Stone etal. 2016)includesthefollowingassertion:‘Unlikeinthemovies,thereisno raceofsuperhumanrobotsonthehorizonorprobablyevenpossible.’
On11September1933,renownedphysicistErnestRutherfordstated,withutter confidence,‘Anyonewhoexpectsasourceofpowerfromthetransformationof theseatomsistalkingmoonshine’.On12September1933,LeoSzilardinventedthe neutron-inducednuclearchainreaction.Afewyearslater,hedemonstratedsucha reactioninhislaboratoryatColumbiaUniversity.Asherecalledinamemoir:‘We switchedeverythingoffandwenthome.Thatnight,therewasverylittledoubtin mymindthattheworldwasheadedforgrief.’
• It’stoosoontoworryaboutit:Therighttimetoworryaboutapotentiallyserious problemforhumanitydependsnotjustonwhentheproblemwilloccurbutalso onhowmuchtimeisneededtodeviseandimplementasolutionthatavoidsthe risk.Forexample,ifweweretodetectalargeasteroidpredictedtocollidewiththe Earthin2070,wouldwesay,‘It’stoosoontoworry’?Andifweconsidertheglobal catastrophicrisksfromclimatechangepredictedtooccurlaterinthiscentury,isit toosoontotakeactiontopreventthem?Onthecontrary,itmaybetoolate.The relevanttimescaleforhuman-levelAIislesspredictable,but,likenuclearfission,it mightarriveconsiderablysoonerthanexpected.Moreover,thetechnologicalpath tomitigatetherisksisalsoarguablylessclear.Thesetwoaspectsincombination donotargueforcomplacency;instead,theysuggesttheneedforhardthinkingto occursoon.Wiener(1960)amplifiesthispoint,writing,
Theindividualscientistmustworkasapartofaprocesswhosetime scaleissolongthathehimselfcanonlycontemplateaverylimited sectorofit....Evenwhentheindividualbelievesthatsciencecontributes tothehumanendswhichhehasatheart,hisbeliefneedsacontinual scanningandre-evaluationwhichisonlypartlypossible.Fortheindividual scientist,eventhepartialappraisalofthisliaisonbetweenthemanandthe processrequiresanimaginativeforwardglanceathistorywhichisdifficult, exacting,andonlylimitedlyachievable.Andifweadheresimplytothe creedofthescientist,thatanincompleteknowledgeoftheworldandof ourselvesisbetterthannoknowledge,wecanstillbynomeansalways justifythenaiveassumptionthatthefasterwerushaheadtoemploythe newpowersforactionwhichareopeneduptous,thebetteritwillbe.We mustalwaysexertthefullstrengthofourimaginationtoexaminewhere thefulluseofournewmodalitiesmayleadus.
Onevariationonthe‘toosoontoworryaboutit’argumentisAndrewNg’s statementthatit’s‘likeworryingaboutoverpopulationonMars’.Thisappealsto aconvenientanalogy:notonlyistheriskeasilymanagedandfarinthefuturebut alsoit’sextremelyunlikelythatwe’deventrytomovebillionsofhumanstoMars inthefirstplace.Theanalogyisafalseone,however.We’realreadydevotinghuge scientificandtechnicalresourcestocreatingevermorecapableAIsystems.Amore aptanalogywouldbeaplantomovethehumanracetoMarswithnoconsideration forwhatwemightbreathe,drink,oreatoncewearrived.
• It’sarealissuebutwecannotsolveituntilwehavesuperintelligence:Onewouldnot proposedevelopingnuclearreactorsand then developingmethodstocontainthe reactionsafely.Indeed,safetyshouldguidehowwethinkaboutreactordesign.It’s worthnotingthatSzilardalmostimmediatelyinventedandpatentedafeedback controlsystemformaintaininganuclearreactionatthesubcriticallevelforpower generation,despitehavingabsolutelynoideaofwhichelementsandreactionscould sustainthefissionchain.
Bythesametoken,hadracialandgenderbiasbeenanticipatedasanissuewith statisticallearningsystemsinthe1950s,whenlinearregressionbegantobeused forallkindsofapplications,theanalyticalapproachesthathavebeendevelopedin recentyearscouldeasilyhavebeendevelopedthen,andwouldapplyequallywell totoday’sdeeplearningsystems.
Inotherwords,wecanmakeprogressonthebasisofgeneralproperties ofsystems—e.g.,systemsdesignedwithinthestandardmodel—withoutnecessarilyknowingthedetails.Moreover,theproblemofobjectivemisspecificationappliestoallAIsystemsdevelopedwithinthestandardmodel,notjust superintelligentones.
• Human-levelAIisn’treallyimminent,inanycase:TheAI100report,forexample, assuresus,‘contrarytothemorefantasticpredictionsforAIinthepopularpress, theStudyPanelfoundnocauseforconcernthatAIisanimminentthreatto humankind’.Thisargumentsimplymisstatesthereasonsforconcern,whichare notpredicatedonimminence.Inhis2014book, Superintelligence:Paths,Dangers, Strategies,NickBostrom,forone,writes,‘Itisnopartoftheargumentinthis bookthatweareonthethresholdofabigbreakthroughinartificialintelligence, orthatwecanpredictwithanyprecisionwhensuchadevelopmentmightoccur.’ Bostrom’sestimatethatsuperintelligentAImightarrivewithinthiscenturyis roughlyconsistentwithmyown,andbothareconsiderablymoreconservativethan thoseofthetypicalAIresearcher.
• Anymachineintelligentenoughtocausetroublewillbeintelligentenoughtohave appropriateandaltruisticobjectives:5 ThisargumentisrelatedtoHume’sis–ought problemandG.E.Moore’snaturalisticfallacy,suggestingthatsomehowthe machine,asaresultofitsintelligence,willsimplyperceivewhatisrightgivenits experienceoftheworld.Thisisimplausible;forexample,onecannotperceive, inthedesignofachessboardandchesspieces,thegoalofcheckmate;thesame chessboardandpiecescanbeusedforsuicidechess,orindeedmanyothergames stilltobeinvented.Putanotherway:whereBostromimagineshumansdriven
5 RodneyBrooks(2017),forexample,assertsthatit’simpossibleforaprogramtobe‘smartenoughthat itwouldbeabletoinventwaystosubverthumansocietytoachievegoalssetforitbyhumans,without understandingthewaysinwhichitwascausingproblemsforthosesamehumans’.Often,theargumentadds thepremisethatpeopleofgreaterintelligencetendtohavemorealtruisticobjectives,aviewthatmaybe relatedtotheself-conceptionofthosemakingtheargument.Chalmers(2010)pointstoKant’sviewthatan entitynecessarilybecomesmoremoralasitbecomesmorerational,whilenotingthatnothinginourcurrent understandingofAIsupportsthisviewwhenappliedtomachines.
extinctbyaputativerobotthatturnstheplanetintoaseaofpaperclips,we humansseethisoutcomeastragic,whereastheiron-eatingbacterium Thiobacillus ferrooxidans isthrilled.Who’stosaythebacteriumiswrong?Thefactthatamachine hasbeengivenafixedobjectivebyhumansdoesn’tmeanthatitwillautomatically takeonboardasadditionalobjectivesotherthingsthatareimportanttohumans. Maximizingtheobjectivemaywellcauseproblemsforhumans;themachinemay recognizethoseproblemsasproblematicforhumans;but,bydefinition,theyarenot problematicwithinthestandardmodelfromthepointofviewofthegivenobjective.
• Intelligenceismultidimensional,‘sosmarterthanhumans’isameaninglessconcept:This argument,duetoKevinKelly(2017),drawsonastapleofmodernpsychology— thefactthatascalarIQdoesnotdojusticetothefullrangeofcognitiveskills thathumanspossesstovaryingdegrees.IQisindeedacrudemeasureofhuman intelligence,butitisutterlymeaninglessforcurrentAIsystemsbecausetheir capabilitiesacrossdifferentareasareuncorrelated.HowdowecomparetheIQ ofGoogle’ssearchengine,whichcannotplaychess,tothatofDeepBlue,which cannotanswersearchqueries?Noneofthissupportstheargumentthatbecause intelligenceismultifaceted,wecanignoretheriskfromsuperintelligentmachines. If‘smarterthanhumans’isameaninglessconcept,then‘smarterthangorillas’is alsomeaningless,andgorillasthereforehavenothingtofearfromhumans.Clearly, thatargumentdoesn’tholdwater.Notonlyisitlogicallypossibleforoneentityto bemorecapablethananotheracrossalltherelevantdimensionsofintelligence,it isalsopossibleforonespeciestorepresentanexistentialthreattoanotherevenif theformerlacksanappreciationformusicandliterature.
1.4Solutions
CanwetackleWiener’swarninghead-on?CanwedesignAIsystemswhosepurposes don’tconflictwithours,sothatwe’resuretobehappywithhowtheybehave?Onthe faceofit,thisseemshopelessbecauseitwilldoubtlessproveinfeasibletowritedown ourpurposescorrectlyorimagineallthecounterintuitivewaysasuperintelligententity mightfulfilthem.
IfwetreatsuperintelligentAIsystemsasiftheywereblackboxesfromouterspace, thenindeedthereisnohope.Instead,theapproachweseemobligedtotake,ifweareto haveanyconfidenceintheoutcome,istodefinesomeformalproblem F anddesignAI systemstobe F -solvers,suchthattheclosertheAIsystemcomestosolving F perfectly, thegreaterthebenefittohumans.Insimpleterms,themoreintelligentthemachine,the bettertheoutcomeforhumans:wehopethemachine’sintelligencewillbeappliedboth tolearningourtrueobjectivesandtohelpingusachievethem.Ifwecanworkoutan appropriate F thathasthisproperty,wewillbeabletocreateprovablybeneficialAI. Thereis,Ibelieve,anapproachthatmaywork.Humanscanreasonablybedescribed ashaving(mostlyimplicitandpartiallyformed)preferencesovertheirfuturelives—that is,givenenoughtimeandunlimitedvisualaids,ahumancouldexpressapreference (orindifference)whenofferedachoicebetweentwofutureliveslaidoutbeforehim