Introduction
1.1 Introduction
The subject of random or stochastic process analysis is a very important part of scientific inquiry. The terms stochastic and random process are used interchangeably. Random processes are used as mathematical models for a large number of phenomena in physics, chemistry, biology, computer science, information theory, economics, environmental science and others. Many books about random processes have been published over the years. Over time, it has become more and more important to provide not only the theory and examples regarding specific processes, but also the computer code and example data. Therefore, this book is intended to present concepts, theory and computer code written in R, which help readers with limited initial knowledge of random processes to become operational with the material. Each subject is described and problems are implemented in R code, with real data collected in experiments performed by the authors or taken from the literature. With this intent, the reader can promptly apply the analysis to her or his own data, making the subject operational. Consistent with modern trends in university instruction, this book makes readers active learners, with hands-on computer experiments directing readers through applications of random process analysis (RPA). Each chapter is also introduced with a brief historical background, with specific references for further reading about each subject.
Chapter 2 provides a brief historical background about the origin of random process theory. In Chapter 3, the reader will find an in-depth description of the fundamental theory of stochastic processes. The chapter introduces the concepts of stationarity, ergodicity, Markov processes and Markov chains. Examples from mathematics and physics, such as Buffon's needle and the Ehrenfest urn model, are presented to illustrate random processes. In Chapter 4, Poisson processes are presented. The well-known distribution is derived, and homogeneous and non-homogeneous Poisson processes are discussed. One of the cornerstones of random processes, the random walk, is described in Chapter 5. The concepts of absorbing and reflecting barriers are presented along with the gambler's ruin example, as well as a two-dimensional random walk code and a discussion of the random walk applied to the process of Brownian motion.
Chapter 6 enters into stochastic time series analysis with the description of moving average, autoregressive and autoregressive moving average processes. Seasonal time series analysis is introduced with examples applied to measures of temperature and water budget. The analysis of random processes requires a thorough understanding of spectrum and noise analysis. In Chapter 7, Fourier transforms for deterministic and stochastic time series are presented, with application to spectrum analysis. The singular spectrum analysis technique is also presented, for analysis and removal of trends.
Chapter 8 presents the Markov Chain Monte Carlo method with a description of probably the most famous algorithm in stochastic theory, the Metropolis algorithm. After a thorough description of the theory and code, the travelling salesman problem is presented, with the simulated annealing approach. The concept of Bayesian analysis is briefly introduced here, leading into the next chapter. Chapter 9 focuses on a cornerstone of modern statistics, Bayesian inference, which is applied to a description of autoregressive processes. After introducing the main concepts, examples applied to real data of temperature and CO2 concentration in Antarctica, as well as radar detection, are presented. Bayesian analysis of the Poisson process is presented with the waiting-time paradox. The chapter ends with an application to lighthouse detection as a remarkable example of Bayesian inference.
Random processes are used as tools for random search in minimization algorithms, as an alternative to gradient-based search algorithms used for instance in least-squares optimization. Genetic algorithms are presented in Chapter 10, with application to non-linear fitting and autoregressive moving average models. As an example of improved optimization with respect to other approaches, the travelling salesman problem is here solved with genetic algorithms. The modelling of stochastic processes depends on the accuracy of the estimators derived in the process analysis. The problem of accuracy is discussed in Chapter 11, with examples on averaging of time series, batch means methods, moving bootstrap and other techniques to improve accuracy in random process modelling.
Chapter 12 addresses a topic that is not traditionally described in books about random processes: spatial analysis. It is nevertheless an important subject dealing with the application of statistical concepts to properties varying in space. The chapter provides an introduction to geostatistical concepts and then presents a novel approach, where spatial and temporal analysis are combined into a stochastic analysis of spatiotemporal processes. At the end of the chapter, the optimization procedure for spatial parameters is also computed with a genetic algorithm, showing the possibility of connecting and applying various techniques presented in the book.
The book ends with Chapter 13, which discusses the very definition of a random process, the mathematical definition of randomness and a discussion of the definition of entropies. This discussion is developed into a general framework and its implications for scientific inquiry. The book also has two appendices providing additional tools presented in the main part of the book.
The codes presented in this book are written using the RStudio integrated development environment (IDE). RStudio includes a console, an editor that supports direct code execution, as well as tools for plotting, debugging and workspace management. There are many books about programming in R that can be used as references, in particular the publications and links presented on the official Comprehensive R Archive Network (CRAN), available at https://cran.r-project.org/. The codes and example data used in this book can be downloaded from the website http://www.marcobittelli.it under the section Computer codes for books. Exercises are presented at the end of each chapter and solutions are downloadable on the book's website.
Open source languages and related libraries are subject to changes, updates and modifications; therefore the packages presented here may undergo changes in the future. To obtain specific information and documentation about a library, the following instruction should be used: library(help=GA), where for example the library (GA) for genetic algorithms can be explored. Here we list the libraries necessary to run the examples in different chapters:
Chapter 7 requires the library lubridate; Chapter 9 the library rjags, described in detail in Appendix B; Chapter 12 requires ggplot2, gstat, lattice, mapview, GA, quantmod, reshape, sf, sp, stars, tidyverse, xts and zoo; and Chapter 13 requires entropy and tseriesEntropy.
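As a convenience, the list above can be checked against the local R installation before running the examples. The following is a minimal sketch, not part of the book's code: the variable names pkgs and missing_pkgs are our own, and the installation line is left commented out so that nothing is installed without the reader's consent.

```r
# Packages named in the text for Chapters 7, 9, 12 and 13
pkgs <- c("lubridate", "rjags", "ggplot2", "gstat", "lattice", "mapview",
          "GA", "quantmod", "reshape", "sf", "sp", "stars", "tidyverse",
          "xts", "zoo", "entropy", "tseriesEntropy")

# Which of them are not yet installed in the current library paths?
installed    <- pkgs %in% rownames(installed.packages())
missing_pkgs <- pkgs[!installed]
missing_pkgs

# Uncomment to install what is missing (rjags also needs the external
# JAGS program to be installed on the system):
# install.packages(missing_pkgs)
```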
Historical Background
It is not certain that everything is uncertain.
Blaise Pascal
2.1 The Philosopher and the Gambler
To introduce the role of computer studies in stochastic process analysis, we will go back a few centuries to the invention of probability theory. It is the year 1654, according to a familiar story (Hacking, 1975). Antoine Gombaud, Chevalier de Méré, Sieur de Baussay (1607–1684), asks Blaise Pascal (1623–1662) some questions on a game of chance. Later, Siméon Denis Poisson (1781–1840) calls Antoine Gombaud a ‘man of the world’ and Blaise Pascal an ‘austere Jansenist’: Un problème relatif aux jeux de hasard, proposé à un austère janséniste par un homme du monde, a été l’origine du calcul des probabilités (A problem about games of chance proposed to an austere Jansenist by a man of the world was the origin of the calculus of probabilities) (Poisson, 1837). We know that Pascal was not only a philosopher, but also a physicist, a mathematician, a writer and a theologian. Antoine Gombaud was a writer and a philosopher, not only a gambler.
We now discuss one of the questions that our Chevalier asked of Pascal concerning the throws of two dice. We throw two dice and bet on the double six. How many throws do we need to have a chance of winning?
Antoine Gombaud said that a gambling rule, based on the mathematical analogy between the probabilities of obtaining six with a single die or double six with a couple of dice, indicates that you need at least 24 throws, but from his personal gambling experience the throws must be at least 25. Pascal, after discussing the problem with Pierre de Fermat (1601–1665), answered that mathematics is not contrary to experience. Let us briefly discuss the topic.
Let $A_1$ be the event {6, 6} at the first throw, so $P\{A_1\} = 1/36$ (the symbol $P\{\cdot\}$ means ‘probability’). Then, the probability of not obtaining two sixes is that of the complementary event $\bar{A}_1$: $P\{\bar{A}_1\} = 1 - 1/36 = 35/36$. At the second throw, the event $\bar{A}$: no double six at the first throw and no double six at the second throw has probability $P\{\bar{A}\} = P\{\bar{A}_1\}\,P\{\bar{A}_2\} = (35/36)^2$, and so on. The probability of not winning in 24 throws is:

$$P\{\bar{A}\}_{[24]} = (35/36)^{24} = 0.5086,$$

so the probability of winning is $P\{A\}_{[24]} = 1 - 0.5086 = 0.4914$. In 25 throws, instead, $P\{\bar{A}\}_{[25]} = 0.4945$ and $P\{A\}_{[25]} = 0.5055$. Notice that the difference is very small, and this honours the power of observation of our Chevalier. But there are doubts about the truth of this story (Ore, 1960).
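The arithmetic above can be checked directly in R. The short sketch below is our own illustration (the function name p_win is not from the book); it evaluates the probability of at least one double six in n throws and finds the smallest n for which the bet becomes favourable.

```r
# Probability of at least one double six in n throws of two fair dice
p_win <- function(n) 1 - (35 / 36)^n

round(p_win(24), 4)   # 0.4914
round(p_win(25), 4)   # 0.5055

# Smallest number of throws with a better-than-even chance of winning
n_min <- min(which(p_win(1:100) > 0.5))
n_min                 # 25
```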
Let us imagine being the Chevalier de Méré who, for 30 nights, goes to the game table to throw two dice. Every night, we play 20 games, with 25 or 24 throws each. If two sixes appear in a game, we win the game. At the end of the night, that is after 20 games, if the victories are more than 10, we had a lucky night.
We can describe the throw of a die as a stochastic process. In the next chapter we will rigorously define ‘stochastic process’, but here we simply say that stochastic processes are mathematical models of dynamical systems that evolve over time or space in a probabilistic manner.
In our case, the dynamical system is the die, which at each throw shows a face with probability 1/6. The code below is the ‘transcription’ in R of the dice game above.
## Code_2_1.R Throw of two dice
# 25 throws
# p_25 <- 0.5055: probability of getting two sixes in 25 throws
n.nights <- 30            # number of nights
n.games  <- 20            # number of games
n.throws <- 25            # number of throws
spot   <- c(1:6)          # spots of a 6-sided die
p_fair <- rep(1/6, 6)     # probabilities of a "fair" die
d6  <- numeric()
d6T <- numeric()
nseed <- 50
for (j in 1:n.nights) {   # loop on the nights
  nseed <- nseed + 1
  set.seed(nseed)
  for (l in 1:n.games) {  # loop on the games
    d6[l] <- 0
    for (i in 1:n.throws) {  # loop on the throws
      die.1 <- sample(spot, 1, prob = p_fair, replace = TRUE)  # i-th throw with die 1
      die.2 <- sample(spot, 1, prob = p_fair, replace = TRUE)  # i-th throw with die 2
      s.points <- die.1 + die.2
      if (s.points == 12) d6[l] <- 1
    }  # end loop on the throws
  }    # end loop on the games
  d6T[j] <- sum(d6)
}      # end loop on the nights
d6T
### 24 throws
## p_24 <- 0.4914: probability of getting two sixes in 24 throws
n.nights <- 30            # number of nights
n.games  <- 20            # number of games
n.throws <- 24            # number of throws
spot   <- c(1:6)          # spots of a 6-sided die
p_fair <- rep(1/6, 6)     # probabilities of a "fair" die
d6  <- numeric()
d6T <- numeric()
nseed <- 500
for (j in 1:n.nights) {   # loop on the nights
  nseed <- nseed + 1
  set.seed(nseed)
  for (l in 1:n.games) {  # loop on the games
    d6[l] <- 0
    for (i in 1:n.throws) {  # loop on the throws
      die.1 <- sample(spot, 1, prob = p_fair, replace = TRUE)  # i-th throw with die 1
      die.2 <- sample(spot, 1, prob = p_fair, replace = TRUE)  # i-th throw with die 2
      s.points <- die.1 + die.2
      if (s.points == 12) d6[l] <- 1
    }  # end loop on the throws
  }    # end loop on the games
  d6T[j] <- sum(d6)
}      # end loop on the nights
d6T
For the sake of clarity, we repeat for 24 throws the instructions for 25 throws, changing only the lines n.throws <- 24 and nseed <- 500. The component of the vector d6 for each game is 0 at the beginning of the game; if a double six is obtained its value becomes 1. By summing the n.games components of d6, we know how many games we won that night, and hence whether we won or lost. Notice that the R function set.seed(.) is used to set the initial seed of the (pseudo) random number generator (RNG). RNGs are in fact fully deterministic algorithms, so the same seed generates the same sequence; changing the seed, we get different sequences. The result of the code above for 25 throws is:
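The deterministic behaviour of the RNG can be verified with a few lines. This is a minimal sketch with arbitrary names (a, b) and an arbitrary seed value; it only shows that resetting the same seed reproduces the same throws.

```r
# Same seed, same sequence of throws
set.seed(51)
a <- sample(1:6, 10, replace = TRUE)
set.seed(51)
b <- sample(1:6, 10, replace = TRUE)
identical(a, b)   # TRUE

# A different seed starts the generator from a different state,
# so the sequence will in general differ
set.seed(52)
sample(1:6, 10, replace = TRUE)
```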
d6T: 10 10 12 9 11 13 12 10 12 10 13 9 9 8 13 9 12 11 9 11 11 11 15 12 12 11 10 8 11 13
We see that the first two nights were tied, we won the third and lost the fourth, and so on. We won 18 nights out of 30, lost 7 and tied 5. The result of the code above for 24 throws is:
d6T: 11 7 6 11 8 17 11 9 12 10 13 10 7 7 11 16 6 12 9 7 9 13 12 11 13 13 9 7 14 11
In this case, we won 16 nights, so that after 30 nights we again come out ahead. These results appear not to support the Chevalier’s claim that 24 throws are not enough to hope to win. However, such a conclusion is not correct. For instance, if we put nseed <- 100 with n.throws <- 25 we have:
d6T: 12 13 10 10 10 12 9 10 8 12 12 9 9 10 11 9 12 10 9 9 8 9 6 9 12 13 9 9 11 10
We won only 10 nights out of 30. The reason for this variability is that the sample size is too small, that is the number of games is not enough to give reliable results. Let us increase the number of games. Each night we play 100 games, and the nights are 180. In Code_2_1.R, we now set n.nights <- 180 and n.games <- 100. The result for 25 throws is:
d6T:
[1] 59 44 57 53 54 49 51 54 50 45 43 50 50 45 41 ...
[176] 59 55 50 58 45
that is, the first night we won 59 games out of 100, the second only 44, and so on. The result for 24 throws is:
d6T:
[1] 46 49 52 48 52 41 47 53 40 46 55 50 51 42 55 ...
[176] 53 48 52 49 54
We can show the results for both 25 and 24 throws as in Fig. 2.1. The figure is obtained by adding the following lines after d6T, for 25 and for 24 throws respectively.

Fig. 2.1 Number of nights (ordinate) on which the number of double sixes in 100 games takes the value on the abscissa. Solid line: 25 throws, dashed line: 24 throws.

For n.throws <- 25 and nseed <- 100:
lbin <- 2
par(lwd = 3)
hist(d6T, main = "", freq = TRUE, xlab = "N{6-6}", ylab = "counts", cex.lab = 1.3, lty = 1,
     border = "black", ylim = c(0, 40), breaks = seq(36, 64, by = lbin), font.lab = 3)

For n.throws <- 24 and nseed <- 301:
hist(d6T, lty = 2, breaks = seq(36, 64, by = lbin), add = TRUE)
Note that freq=T means that the histogram reports the counts component of the result; if freq=F the histogram reports probability densities, in which case the total area of the plot is 1.
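The relation between counts and densities can be inspected without drawing anything, since hist() returns both components. The sketch below uses simulated stand-in data (the vector x, drawn with rbinom, is our assumption and is not the d6T produced by Code_2_1.R).

```r
set.seed(1)                                  # arbitrary seed
x <- rbinom(180, size = 100, prob = 0.5)     # stand-in for 180 nightly counts

h <- hist(x, breaks = seq(20, 80, by = 2), plot = FALSE)

sum(h$counts)                      # 180: counts add up to the sample size
sum(h$density * diff(h$breaks))    # 1: densities integrate to one
```

With freq = TRUE the bar heights are h$counts; with freq = FALSE they are h$density, so only the vertical scale changes.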
The abscissa reports the number of times the double six was obtained in 100 games. The symbol N{6-6} indicates the double six. The ordinate reports the number of times this event occurred in 180 nights. For instance, the count in the bin (50, 52] is 38 for the games with 25 throws, meaning that on 38 nights out of 180 the double six occurred 51 or 52 times in 100 games.
In the histogram each bin is closed on the right and open on the left; therefore a night with exactly 50 occurrences of the double six is not counted in the (50, 52] bin, but rather in the (48, 50] bin. The results show that in the total number of games, which is 180 × 100 = 18000, the double six occurred in 9204 games, so the probability of a double six estimated in 18000 games is $\hat{P}\{A\}_{[25]} = 9204/18000 \approx 0.5113$ (the hat stands for estimate), very close to the ‘theoretical’ probability. Here ‘theoretical’ means the probability of perfect dice, that is the one expected assuming equiprobability for each face. Rigorously speaking we should not define probability by counting on ‘equiprobable’ events, because that makes the definition recursive. However, putting aside philosophy, if the die is fair, its faces are ‘equally’ likely to occur and the probability of outcomes can be computed as we did.
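The right-closed convention mentioned at the start of this paragraph is the default of both hist() and cut(). A small sketch, using the same breaks as in the figure code and three hypothetical values, makes the assignment explicit.

```r
breaks <- seq(36, 64, by = 2)

# Bin labels assigned to the values 50, 51 and 52
bins <- as.character(cut(c(50, 51, 52), breaks = breaks))
bins   # "(48,50]" "(50,52]" "(50,52]"

# The same convention is used by hist()
h <- hist(c(50, 51, 52), breaks = breaks, plot = FALSE)
h$counts[h$breaks[-1] == 50]   # 1: the value 50 falls in (48, 50]
h$counts[h$breaks[-1] == 52]   # 2: the values 51 and 52 fall in (50, 52]
```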
For games with 24 throws, a double six occurred in 8797 games, so the estimated probability of the double six is $\hat{P}\{A\}_{[24]} \approx 0.4887$, in agreement with the ‘theoretical’ one $P\{A\}_{[24]} = 0.4914$.
Let us consider the 18000 games as $N$ independent trials, each with probability $p$ of success, and let $n$ be the number of successes. The standard error of the estimate of the proportion $p$ of 1’s in the $N$-long sequence is $\hat{\sigma} = \sqrt{\hat{p}(1 - \hat{p})/N}$, where $\hat{p}$ is an estimate of $p$, denoted above as $\hat{P}\{A\}_{[25]}$. In our case $\hat{\sigma}_{[25]} = 0.0037$, practically equal to the theoretical one. Obviously $\hat{\sigma}_{[24]}$ also gives the same result.
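The estimates and their standard errors follow from the win counts quoted in the text. The lines below are a small sketch with names of our own choosing (p_hat_25, p_hat_24, se); only the numbers 9204, 8797 and 18000 come from the text.

```r
N <- 18000                  # total number of games
p_hat_25 <- 9204 / N        # estimated probability, 25 throws
p_hat_24 <- 8797 / N        # estimated probability, 24 throws

# Standard error of an estimated proportion
se <- function(p, N) sqrt(p * (1 - p) / N)

round(c(p_hat_25, p_hat_24), 4)          # 0.5113 0.4887
round(se(p_hat_25, N), 4)                # 0.0037
round(se(p_hat_24, N), 4)                # 0.0037
round(se(1 - (35 / 36)^25, N), 4)        # 0.0037, using the theoretical p
```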
We have seen that $\hat{P}\{A\}_{[25]} = 0.5113$ and $\hat{P}\{A\}_{[24]} = 0.4887$; we could ask ourselves if the difference between the two means $\hat{d} = 0.5113 - 0.4887 = 0.0226$ is significant. We can test for the significance of the difference between two population means using Student’s t, which can be done in R by the line:
t.test(z,y,alt="greater",var.equal=TRUE)
where z are the winnings in the 180 nights with 25 throws in each game (59, 44, ..., 58, 45) and y those with 24 throws in each game (46, 49, ..., 49, 54). The option alt="greater" specifies a one-tailed test and var.equal=TRUE specifies equal variances. The significance level (p value) is p-value = 5.159e-06, that is highly statistically significant. In passing, 20 games per night for 30 nights yield no significant difference, confirming what we said above about the small number of games.
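For readers who want to try the test without first running Code_2_1.R, the sketch below generates stand-in samples. This is an assumption of ours: z and y are drawn with rbinom from the theoretical probabilities, with an arbitrary seed, so the p-value will differ from the one reported above.

```r
set.seed(123)                     # arbitrary seed, not from the book
p25 <- 1 - (35 / 36)^25           # theoretical probability, 25 throws
p24 <- 1 - (35 / 36)^24           # theoretical probability, 24 throws

# Stand-in data: games won out of 100 on each of 180 nights
z <- rbinom(180, size = 100, prob = p25)
y <- rbinom(180, size = 100, prob = p24)

# One-tailed two-sample t test with pooled variance
res <- t.test(z, y, alternative = "greater", var.equal = TRUE)
res$parameter    # degrees of freedom: 180 + 180 - 2 = 358
res$p.value
```

The argument alt="greater" used in the text is matched by R to alternative through partial argument matching; the full name is spelled out here for readability.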
We can also test the difference of the means by the bootstrap method. We will discuss this method in Appendix A; here we limit ourselves to showing the result in Fig. 2.2, which is also presented as Exercise 2.1. As can be seen in Fig. 2.2, the difference $\hat{d}$ is significant, since out of $B = 1000$ replications $\hat{d}^*$, none of them is less than 0.
2.2 Comments
It is difficult to believe that a real gentleman such as Antoine Gombaud went to play with dice for about three months, playing 100 games each night. Regardless of whether the story is true or false, it teaches us something. In the doubt expressed by the Chevalier, different concepts of probability are involved. Doubtless, the term ‘probability’ had not yet appeared, but on the one hand he speaks about a theoretical mathematical argument to calculate the number of chances to get two sixes. On the other hand he relies on his experience as a gambler to evaluate the frequencies of the results. The tension between theory and experience is already recognizable: between probability as the subject of study of a purely mathematical discipline and probability as a property of a real physical random process evolving over time.

Fig. 2.2 Distribution of 1000 bootstrap replications $\hat{d}^*$ of the difference of the means of the two samples obtained with 25 and 24 throws (sample size = 180). The dotted line locates the observed difference $\hat{d} = 0.0226$. There are no replications less than 0.
We could ask ourselves why the Chevalier affirmed that mathematics was wrong in alleging 24 throws. According to some historians, perhaps he believed that if the probability of success in one throw is 1/n, in m throws it is m/n, that is 24/36 = 0.667. Of course, this reasoning is wrong: the probabilities of not winning have to be multiplied, rather than the probabilities of winning summed.
Historians say that similar problems about games of chance were present before Pascal and Fermat. Gerolamo Cardano, for instance, wrote Liber de Ludo Aleae (‘The Book on Games of Chance’), written in about 1560 and published posthumously in 1663, in which many results in various games with dice are discussed. In particular, the chances of various combinations of points in games with three dice are presented. The same problems about three dice were also studied by Galileo Galilei (Todhunter, 1865), in about 1610–1620 in his Sopra le scoperte dei dadi, translated in different ways, for instance, ‘Analysis of Dice Games’, ‘On a Discovery Concerning Dice’, ‘Concerning an Investigation on Dice’, and so on. Actually the word ‘scoperte’ means the faces of the dice that appear, so we could translate it simply as ‘On the Outcomes of Dice’.
Galileo was asked why, playing with three dice, the sums of points 10 or 11 are observed more frequently than the sums of points 9 or 12. Galileo’s answer was:
Che nel giuoco de dadi alcuni punti sieno più vantaggiosi di altri, vi ha la sua ragione assai manifesta, la quale è il poter quelli più facilmente e più frequentemente scoprirsi che questi (The fact that in dice games certain outcomes are more advantageous than others has a very clear reason, which is that certain outcomes can appear more easily and more frequently than others.)
For instance, the sum 9 is obtained with the following six triples (triplicità), that is the scoperte of the three dice:
1.2.6.; 1.3.5.; 1.4.4.; 2.2.5.; 2.3.4.; 3.3.3.
Six triples are also necessary to get the sum 10:
1.3.6.; 1.4.5.; 2.2.6.; 2.3.5.; 2.4.4.; 3.3.4.
However the triple 3.3.3., for instance, can be produced by only one throw, while the triple 3.3.4. can be produced by three throws: 3.3.4., 3.4.3., 4.3.3. In conclusion, the sum of points 10 can be produced by 27 different throws, while the sum of points 9 by 25 only.
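Galileo’s count can be reproduced by listing all 6 × 6 × 6 = 216 ordered throws of three dice and tabulating their sums. The sketch below is our own illustration; the names throws and counts are arbitrary.

```r
# All ordered outcomes of three six-sided dice
throws <- expand.grid(d1 = 1:6, d2 = 1:6, d3 = 1:6)
nrow(throws)                         # 216

# Number of ordered throws giving each sum
counts <- table(rowSums(throws))
counts[c("9", "10", "11", "12")]     # 25 27 27 25

# Corresponding probabilities for a fair die
round(counts[c("9", "10")] / nrow(throws), 4)   # 0.1157 0.1250
```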
Let us jump forward in time and read the following quotation:
From an urn, in which many black and an equal number of white but otherwise identical spheres are placed, let 20 purely random drawings be made. The case that only black balls are drawn is not a hair less probable than the case that on the first draw one gets a black sphere, on the second a white, on the third a black, etc. The fact that one is more likely to get 10 black spheres and 10 white spheres in 20 drawings than one is to get 20 black spheres is due to the fact that the former event can come about in many more ways than the latter. The relative probability of the former event as compared to the latter is the number 20!/10!10!, which indicates how many permutations one can make of the terms in the series of 10 white and 10 black spheres [...]. Each one of these permutations represents an event that has the same probability as the event of all black spheres.
That is what Ludwig Boltzmann wrote in 1896 in his Vorlesungen über Gastheorie, translated by Brush (1964). It is not impossible to draw 20 black balls, since this sequence of draws has the same probability as any other one, but the number of ways to draw 10 black balls and 10 white balls is far greater than that of all black balls.
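The number 20!/10!10! in Boltzmann’s passage is a binomial coefficient and can be evaluated in one line. In this small sketch the name n_ways is ours.

```r
# Number of distinct sequences with 10 black and 10 white spheres
n_ways <- choose(20, 10)
n_ways                                              # 184756

# Same value from the factorial form used in the quotation
factorial(20) / (factorial(10) * factorial(10))     # 184756
```

There is only one sequence of 20 black spheres, so the macrostate ‘10 black and 10 white’ is 184756 times more probable than the macrostate ‘all black’, although each individual sequence is equally likely.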
In his Foundations of Statistical Mechanics, Boltzmann explicitly introduces the postulate of equal a priori probability of microstates being compatible with a given macroscopic state. On this Ansatz, Boltzmann explains why the ‘arrow of time’ points to the more probable macrostate. We will, so to speak, hold these concepts in our hands by studying the stochastic process ‘Ehrenfest’s urn model’ in the next chapter.
Both Cardano and Galileo found the solution of the problems of three dice by assuming that the possible outcomes are equally possible and counting the chances of compound events. In statistical mechanics, each microstate describes the position and velocity of each molecule. A macrostate is a state description of the macroscopic properties of the system: for instance its pressure, volume and such. Each macrostate is made up of many microstates. To have an idea of microstates and macrostates, let us think of macrostates as the sum of the points of the three dice and of microstates as the favourable outcomes, that is the individual throws giving that sum. So we say that the ‘system’ (the system is formed by the three dice) is in the macrostate 10 (that is, the sum of the points is 10), which is realized by 27 microstates. In Boltzmann’s example of the 20 balls, each extracted sequence is a microstate, while the number of white (or black) balls is a macrostate.
The hypothesis of equal a priori probability (explicitly stated or implicitly assumed), in its turn, rests on the principle of indifference: equal probabilities have to be assigned to each occurrence if there is no reason to think otherwise. With a small time leap, we learn from Einstein (1925) that, in what will be called ‘Bose-Einstein Statistics’, the microstates are not equally possible, even though there is no reason to consider any one of these microstates either more or less likely to occur than any other (on this subject, see Rosa (1993)).
To conclude this introductory chapter, we notice that in the computer experiments performed with two dice (Code_2_1.R) it is possible to experience the notion of probability (the term most used for such experiments is simulation, and more exactly Monte Carlo simulation). In other words, the computer is regarded as something like a ‘statistical laboratory’ with which probabilistic experiments can be performed, experiments not quite feasible in practice. We will encounter expressions like ‘measurements’, ‘statistical and systematic errors’, ‘error propagation’, and so on, just as in a laboratory experiment.
2.3 Exercises
Exercise 2.1 We have seen in Code_2_1.R, relative to the throw of two dice, that the estimated probability of the double six in 18000 games with 24 throws is $\hat{P}\{A\}_{[24]} \approx 0.4887$, while with 25 throws it is $\hat{P}\{A\}_{[25]} = 0.5113$, a result practically equal to the theoretical one. The difference $\hat{d} = 0.5113 - 0.4887 = 0.0226$ turned out to be significant. Write a code to obtain Fig. 2.2, showing whether the difference of the means is significant with the bootstrap method. Read Appendix A first if you are not familiar with the bootstrap method.
Exercise 2.2 In the dice game Unders and Overs (U&O), two dice are rolled. Players bet on one of the following alternatives: (1) the result (sum of the dice faces) is below 7, (2) the result is 7, (3) the result is above 7.
In cases (1) and (3) the payoff odds are 1:1, i.e. if you bet £1 the house gives you back your money plus an additional £1. In case (2) the odds are 4:1, i.e. betting £1 you gain £4 (you get £5).
Suppose you bet £1 on outcome (1). What is your expected average win/loss (i.e. in an infinite number of throws)?
Exercise 2.3 Referring to the previous exercise, write a code to simulate a finite game consisting of 10, 100 or 1000 throws. Discuss the result of the simulations, compared to the theoretical win/loss expectation.
Hint: Use the R function sample for sampling a die face, i.e. an integer number from 1:6.
Exercise 2.4 A variant of the U&O game allows the player to bet on up to two alternatives (placing £1 on each one). How does the win/loss expectation change?
Exercise 2.5 Justify the following assertion: ‘the house always wins’. Hint: If you can bet £1 on each of the three alternatives, what is the expected outcome?
Exercise 2.6 With reference to Exercise 2.5, compare the theoretical result for an infinite number of throws with those obtained in a small number of them, e.g. 10. Simulating the problem in R, in 100 repetitions how many times do you win and how many times do you lose your money?
Introduction to Stochastic Processes
Noi corriamo sempre in una direzione, ma qual sia e che senso abbia chi lo sa...
We are all headed in one direction, but which it is and what sense it makes, who knows...
Francesco Guccini, Incontro
Probability theory is essential to the understanding of many processes (physical, chemical, biological, economic, etc.). By means of random variables, we build models of such processes, that is of systems that evolve over time. We are interested in what happens in the future. If we know the probability distribution until now, how will it be modified if carried forwards through time? The answer is a matter of stochastic processes. For further reading many books are available on the subject (Feller, 1970; Lawler, 2006; Yates and Goodman, 2015; Jones and Smith, 2018; Grimmett, 2018).
3.1 Basic notion
In dictionaries of classical Greek, the word στοχάζεσθαι (stochazesthai) means ‘to aim at something’, ‘to aim at a target, at a goal’, at a στόχος (stóchos). Later, figuratively, ‘to aim at something’ becomes ‘to have something in view’, or ‘to conjecture’, from which στοχαστικός (stochastikós), ‘skilful in aiming at’, ‘able to conjecture’. So the ‘target’ becomes the ‘conjecture’. Conjecture of what? Of something below the apparent chance? Of undisclosed causes? Is there a hidden ‘determinism’ even in (seemingly) random phenomena? That is the question. The interested reader can refer to Chapter 2, where some historical answers were recalled succinctly.
A stochastic (or random) process is defined as a family of random variables:

$$X_1, X_2, \ldots, X_t, \ldots,$$

indexed by a parameter $t$, and defined on the same probability space $(\Omega, \mathcal{F}, P)$, formally defined as follows: $\Omega$ is the sample space, i.e. the space of all possible outcomes; $\mathcal{F}$ is a family of subsets of $\Omega$, mathematically defined as a $\sigma$-algebra, with particular properties (for example, that of including the whole sample space and all countable unions of its subsets) that make $(\Omega, \mathcal{F})$ a measurable space. $P$ is a probability measure function operating on $\mathcal{F}$, such that $P(\Omega) = 1$ and $P(\emptyset) = 0$, $\emptyset$ being the empty set. The index $t$ often, but not always, stands for a time (days, years, seconds, nanoseconds, etc.). The ‘time’ can also be a non-physical time, as for instance ‘Monte Carlo steps’.
A stochastic process is written as $\{X_t; t \in T\}$. The set $T$ is the parametric space; it can be a subset of the natural numbers or integers, that is $T = \{0, 1, 2, \ldots\}$, or $T = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$, or $T = \{0, 1, 2, \ldots, n\}$. In these cases $\{X_t\}$ is said to be a discrete-time stochastic process. If $T$ is the real line $\mathbb{R}$ or a subset of it, for instance $T = (-\infty, \infty)$, or $T = [0, \infty)$, or $T = [a, b)$, or $T = [a, b]$, $\{X_t\}$ is said to be a continuous-time stochastic process.
Discrete-time and continuous-time processes essentially differ in the time scale: in the former case events occur in a predetermined succession of time points $t_1, t_2, \ldots$, in the latter events can occur at each time point $t$ of a continuous range of possible values.
The name stochastic process refers therefore to two inherent aspects: the term ‘process’ refers to a time function; the adjective ‘stochastic’ refers to randomness, in the sense that a random variable is associated to each event in the time scale. In some cases the stochastic process can also be associated to space and not just time.
Time has an arrow. The process has therefore a before and an after, a past and a future. The realization $x_t$ at time $t$ of the random variable $X_t$ is supposed to be closer to the observations $x_{t-1}$ and $x_{t+1}$ than to those farther in time. This means that the chronological order of observations plays an essential role.
We said that the $X_t$’s are random variables. They are defined on the same probability space $(\Omega, \mathcal{F}, P)$. They take values in a measurable space, whose values are called states. We say that ‘the process at time $t$ is in the state $x_i$’, or more simply, ‘the process at $t$ is in $i$’, to mean that the random variable $X_t$ has taken the value $x_i$. The set of all values taken by the variables of the process is called the state space and it will be denoted as $\mathcal{S}$. We can say that ‘the system at time $t$ is in the state $i$’, or that it ‘occupies’ or ‘visits’ the state $i$, if $X_t = x_i$ for $x_i \in \mathcal{S}$.
A process is discrete, or is in discrete values, if $\mathcal{S}$ is discrete, that is if $\mathcal{S}$ is countable (finite or infinite): $\mathcal{S} \subseteq \mathbb{N}$ or $\mathcal{S} \subseteq \mathbb{Z}$. The process is continuous, or is in continuous values, if $\mathcal{S} \subseteq \mathbb{R}$. Stochastic processes may thus be classified into four types:
1. discrete-time and discrete state space,
2. discrete-time and continuous state space,
3. continuous-time and discrete state space,
4. continuous-time and continuous state space.
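To make the classification concrete, the following sketch simulates one simple example for each of the first three types. The choice of examples (a simple random walk, Gaussian white noise and a Poisson counting process) and all names and parameter values are our own assumptions for illustration; type 4 can only be approximated on a computer by sampling on a fine time grid, so it is not shown.

```r
set.seed(42)   # arbitrary seed
n <- 100

# Type 1: discrete time, discrete states (simple random walk on the integers)
steps <- sample(c(-1L, 1L), n, replace = TRUE)
walk  <- cumsum(steps)

# Type 2: discrete time, continuous states (Gaussian white noise)
noise <- rnorm(n)

# Type 3: continuous time, discrete states (Poisson counting process):
# events occur at real-valued times, the state is the number of events so far
arrivals <- cumsum(rexp(20, rate = 2))     # event times
N_t <- function(t) sum(arrivals <= t)      # state of the process at time t

walk[1:10]
N_t(0)               # 0: no event has occurred at time 0
N_t(max(arrivals))   # 20: all simulated events have occurred
```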
In other words, discrete or continuous time concerns the domain of the time variable $t$, while discrete or continuous state concerns the domain of $X_t$ for a given $t$.
Let us take a practical example. We are interested in continuously recording the temporal variations of the temperature of a device. For technical reasons, the temperature does not remain constant, but fluctuates within a certain range. Suppose the measurements are read on an analogue scale and, to be specific, suppose also that measurements are resolved down to thousandths of a degree, with precision of the order of 1%. We can consider the measurements expressed in real numbers, even though, obviously, any measurement has a finite number of digits. With such premises, the sequence in time of the random variable ‘temperature’ can be represented as a continuous-time stochastic process in continuous values. If we decide to group the measurements within a range, say, of tenths of a degree, the process is still a continuous-type process, but in