

Applied Statistics with R

A Practical Guide for the Life Sciences

JUSTIN C. TOUCHON

Department of Biology, Vassar College, USA

Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

© Justin C. Touchon 2021

The moral rights of the author have been asserted

First Edition published in 2021

Impression: 1

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this work in any other form and you must impose this same condition on any acquirer

Published in the United States of America by Oxford University Press, 198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data: Data available

Library of Congress Control Number: 2021934831

ISBN 978–0–19–886997–9 (hbk.)
ISBN 978–0–19–886933–7 (pbk.)

DOI: 10.1093/oso/9780198869979.001.0001

Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.

For Myra

Preface

Welcome!

The statistical analyses that life scientists are being expected to perform are increasingly advanced and yet most graduate programs in the United States do not even offer a statistics course that teaches beyond Analysis of Variance (ANOVA) and linear regression. Undergraduate and graduate students are thus rarely provided with the opportunity to learn the types of analyses they need to know in order to publish and compete on the job market, much less simply analyze their data appropriately. Part of the reason for this is that the way statistics are traditionally taught can be frustratingly slow and tedious. When I was a graduate student, I remember excitedly enrolling in a statistics class with the hope of learning how to analyze the data I was collecting each summer in the field. Unfortunately, we spent the entire semester learning how to perform an analysis of variance and a linear regression, by hand. There has to be a better way!

This book is written with the belief that a comprehensive understanding of practical data analyses is not as daunting as it might seem. I have been teaching an annual statistics workshop at the Smithsonian Tropical Research Institute for more than 10 years and I know that my approach works. My teaching perspective is rooted in the idea that instead of spending time mired in statistical theory and learning data analysis by hand, the most important thing to understand is what kind of data you have. Once you know your data, you can then figure out how to analyze them effectively. Whether at the undergraduate, graduate, or post-graduate level, this book will provide the tools needed to properly analyze your data in an efficient, accessible, plainspoken, frank, and (hopefully) humorous manner, ensuring that readers come away with the knowledge of which analyses they should use and when they should use them.

This book uses the statistical language R, which is the choice of ecologists worldwide and is rapidly becoming the "go-to" stats program throughout the life sciences. The examples in the book are rooted in a single, real dataset (published in the journal Ecology in 2013) and use actual analyses that I have conducted in my professional career as an ecologist. The dataset is admittedly somewhat messy, and early chapters are designed so that students "clean" the raw data as a way of learning basic data manipulation skills and building good habits. Moreover, using a single relatively large dataset (~2500 observations) allows students to get a good understanding of what they are analyzing from chapter to chapter, instead of jumping from one small pre-cleaned dataset to another throughout the book. It also allows readers to see how they can view the same data through different lenses and allows an easy and natural progression from linear and generalized linear models to mixed effects versions of those same analyses, given the hierarchically nested design of the example experiment.

Goals for the book

It is my sincere hope that you find this book useful and instructive. I have tried my hardest to distill down everything I know and think about data analysis into these pages. You will undoubtedly find that some of what I suggest may differ from what you read elsewhere, either on the web or in other books. Just about everyone these days happens to be rather opinionated, and statisticians and R users are certainly no different. Wherever possible, I have tried to include the rationale behind my thinking.

Since you are reading this book, you evidently want to learn about data analysis. I applaud your initiative and hope to reward you by teaching you how to do just that, efficiently and effectively. Here are the goals of this book.

• I hope to build your familiarity with R from the ground up via the chapters and assignments. Even if you have some experience with R, you will likely learn new ways to approach your data. If you are relatively new to R, I hope the hands-on experience of typing along with the instructions will help you overcome "fear of the R prompt."

• I want to empower you to not only follow instructions carefully and analyze the data presented in these chapters, but hopefully to be able to analyze your own data and to think critically about data when you see them presented in research and in the public realm. As you may already know, science literacy is seriously lacking in the public sphere and increasing the number of people who can think critically about data presented in the news or elsewhere is extremely important.

• Lastly, I hope you can become a part of the global R community. R is so big there is no single repository of information about it nor is there a single manual that contains all the possible instructions you might need to execute. Thus, in addition to books like this one, you will need to become familiar with using the web to find answers to questions. I will provide examples in the later chapters of how you might seek out information to help yourself when (not if, mind you, but when) you get stuck or encounter an error.

Basic layout of the book

The materials presented in these chapters are set up as follows. There are ten topics, each an explanatory chapter which will allow you to teach yourself the code. I cannot stress enough that you really do want to type things in and you need to think about what the code means and what it is doing if you want to learn this stuff. If you have an electronic copy of the book, avoid any temptation to cut and paste. If you are reading this, you are interested in learning R, right? Trust me, if you cut and paste code you will not learn as well as if you type it in by hand.

Just a note about how each of the chapters will be formatted. Bits of code that you can/should type in are displayed in light grey boxes, and the output from that code is generally displayed directly below it. For example, check out the code below. What is shown in the grey box "(2+2)" is what you would type at the R prompt, and the bit of code below it is the output from executing that command.

(2+2)

## [1] 4

In general, if you type in exactly what is in the grey boxes you will get what is shown after it! Amazing, I know. Your mind is already blown, right?

The code that will be presented in this book is often written in a relatively "long" format in order to make it more readable. This might not exactly be how you type it to your computer though, which is perfectly fine.

At the end of each chapter is a short set of assignments to give you the opportunity to practice what you have just learned. You can find solutions to the assignments at the GitHub page for the book (https://github.com/jtouchon/Applied-Statistics-with-R) as well as other important information. Since R is an open source language it is likely that some of the code needed to run the examples in this book may change over time, and I will post code updates on that site.

A little background about R

R is a statistical programming package and a powerful graphics engine. R is considered to be a dialect of the S and S+ language that was created by AT&T Bell Labs. S is commercially available while R is open source and freely available through the Comprehensive R Archive Network (https://cran.r-project.org). R has many advantages besides being freely available. For example, a user might program loops to conduct many repetitive statistical analyses or simulate thousands of datasets with known parameters. In addition, in the fields of Ecology and Evolutionary Biology at least, R is now by far the most commonly used statistical program (see Touchon and McCoy 2016, Ecosphere). There is substantial evidence that similar shifts are occurring in Psychology and Neuroscience as well.
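As a rough illustration of what "simulating thousands of datasets with known parameters" can look like, here is a minimal sketch. The sample size, mean, and standard deviation are invented for the example and do not come from the book's dataset.

```r
# Illustrative sketch only: all numbers below are made up for this example.
set.seed(1)                 # make the random numbers reproducible

n_datasets <- 1000          # how many fake datasets to simulate
true_mean  <- 10            # the "known" parameters we simulate from
true_sd    <- 2

# Pre-allocate a vector to hold one sample mean per simulated dataset
sample_means <- numeric(n_datasets)

for (i in seq_len(n_datasets)) {
  # Each simulated dataset has 20 observations drawn from a normal distribution
  fake_data <- rnorm(20, mean = true_mean, sd = true_sd)
  sample_means[i] <- mean(fake_data)
}

# The average of the simulated means should sit close to the true mean
mean(sample_means)
```

Because the parameters are known in advance, a loop like this lets you check whether an analysis recovers what you put in.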

A little about how R works

Because R creates objects from analyses that are stored in its memory, new users often are surprised by the fact that the results of their analyses are not immediately displayed on the screen. When you run something successfully, all you generally see is the prompt, which is denoted by the '>' sign.

There are several reasons for this. First, R does exactly what you tell it to do. Thus, if you tell it to run an ANOVA and store that output as an object, it does that, but you have to call a separate function to show you the object you created. Second, printing stuff on the screen takes time and computer power. By not showing everything that is going on, R is being very efficient. For example, if you wanted to do 100 regressions on different data sets, R can do this without opening 100 separate windows. One can store only the regression coefficients and display all of them in a single line for comparison. It is this flexibility that makes R a fantastic statistical program. Also, it's free. Did I mention that it is free yet?
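The two ideas above (storing an analysis as an object, and running many regressions while keeping only the coefficients) might look something like the following sketch. The data are simulated purely for illustration, and the object names (toy, fit, slopes) are arbitrary choices, not names used in the book.

```r
# Illustrative sketch with simulated data; nothing here is from the book's dataset.
set.seed(42)

# A small fake experiment: three treatments, ten replicates each
toy <- data.frame(
  treatment = rep(c("A", "B", "C"), each = 10),
  response  = rnorm(30, mean = rep(c(5, 6, 7), each = 10))
)

# Run the ANOVA and store it as an object; R prints nothing at this point
fit <- aov(response ~ treatment, data = toy)

# A separate function call is needed to actually see the results
summary(fit)

# Run 100 regressions on 100 simulated data sets, keeping only the slopes
slopes <- numeric(100)
for (i in 1:100) {
  x <- runif(25, min = 0, max = 10)
  y <- 2 + 0.5 * x + rnorm(25)      # the true slope is 0.5
  slopes[i] <- coef(lm(y ~ x))["x"]
}

# Display all 100 slopes together, rounded for easy comparison
round(slopes, 2)
```

Only the final line prints the regression results; the 100 fitted models themselves are never shown on screen.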

This book provides an introduction to using R in data analyses with practical examples designed to be readily accessible to all life scientists. Although the example dataset I will use is ecological in nature, the parallels will hopefully be easy to see with other disciplines. A more explicit discussion of this is at the end of Chapter 2. R is also a very powerful graphing tool and I will get you started on your way to making publication quality figures.

This book is not a comprehensive overview of all available statistical approaches and methods or experimental design. No single book could do that. I will of course touch on many different topics, but there are over 16,000 packages available to use in R (as of July 2020), a number which is growing by the day, so such an overview is impossible.

Learning R is like learning any language. At times it will be difficult and frustrating, but it is worth it and if you stick with it you will have breakthroughs that feel amazing (I call these "R-gasms"). Over time, you may grow to love working in R!

There is a quote I love from the musician, actor, author, poet, and all around amazing human Henry Rollins, which encapsulates a lot of how I think about doing statistical analyses and using R.

Numbers are perfect, infallible and everlasting. You aren't. Numbers are always right in the end. You may see an incorrect figure, but that's not the fault of the number, the fault lies in the person doing the calculating.

– Henry Rollins, High Adventure in the Great Outdoors

Why do I like that quote so much? It's because when you get an error in R, it is almost certainly your fault. R didn't mess up, you did. Sorry, but that's the honest truth. So check your code! :)

Why learn R?

You might be thinking to yourself "Why do I need to learn R?" or "Seriously, I have to type everything in by hand?!" or "Can't I do this easier in another program?" There are many answers to these questions.

• If you are an undergraduate thinking of going to graduate school, it is useful for you to learn R because you will almost certainly use R as a graduate student. Thus, you will have a leg up on everyone else! Get started now and be the best you can be.

• Yes, you have to type everything in, but that also helps you learn what you are doing. It is very easy to click some buttons and get an answer that you don't really understand. If you have to type in the code for the statistics you are doing, you will have a better understanding of what you are doing.

• Having some basic familiarity with "coding" is increasingly useful across a variety of disciplines. You don't need to be a pro, but being comfortable with a computer and with typing code to achieve a result is very useful.

• Because it is free and extremely powerful, R is the only statistics program you will ever really need to know. If you go on to graduate school or into consulting or any field that deals with data, you will be able to use R. This book will teach you many of the basics you will need to know in R, but one of the best things about R is that it can be expanded to accomplish nearly any statistical (or, more generally, data analytic) needs you might have. The same cannot be said with other programs like JMP, SPSS, or SAS, which are very expensive and may not be available to you at another institution. Check out Figure 0.1 for evidence that R has become the program of choice (at least in Ecology, but the same is true in other fields as well).

Figure 0.1 This figure, from Touchon and McCoy (2016), demonstrates the rise in usage of R as compared to SAS, SPSS, and JMP, in the field of ecology. R really is the go-to program, so it is in your best interest to learn it. Touchon, J. C. and McCoy, M. W. (2016). "The mismatch between current statistical practice and doctoral training in ecology." Ecosphere. 7(8): e01394. Reproduced under Creative Commons Attribution License (CC-BY)

Okay, shall we get started?

Acknowledgments

This book owes a tremendous debt to many people. First and foremost, thank you to Andy Jones and Stuart Dennis. The three of us took a germ of an idea (a desire to teach folks the practical tools they would need to analyze their data in R) and created the initial workshop at the Smithsonian Tropical Research Institute (STRI) that this material evolved from. Thank you to Owen McMillan, Adriana Bilgray, and Paola Gomez at STRI for their continued support of me and my desire to teach people how to use R. More generally, thank you to the amazing community of scientists at STRI for providing such an incredible environment to learn and conduct research. Many thanks to James Vonesh and Mike McCoy, two invaluable mentors, colleagues, and friends over the years. Your knowledge of R certainly eclipses mine, and I hope I've done justice to all that you have taught me. Thank you to my doctoral and post-doctoral advisor Karen Warkentin. Karen and James wrote the National Science Foundation grant that generated the data used throughout this book. Many thanks to Tim Thurman for opening my eyes to the world of ggplot2 and dplyr. Thank you to the hundreds of interns, undergraduate and graduate students, postdocs, and professional scientists that I have had the pleasure of teaching over the past decade or so. The lessons in this book have been continually refined and improved based on your feedback, so thank you for making me a better teacher. In particular, thank you to the students in my 2020 Applied Biostatistics class at Vassar College for the countless typos they found in early drafts of these chapters. Lastly, thank you to my wife Myra Hughey for her patience, support, and editorial advice over the years. You are the best partner in research and life I could ever hope for.

3.2 Reading in the data file

4.1 Principles of effective figure making

4.2 Data exploration using ggplot2

4.3 Plotting your data

4.4 Assignment!

5.1 Determining what type of analysis to do

5.5 Introducing linear models

5.6 One-way analysis of variance (ANOVA)

5.7 Multiple comparisons

5.8 Assignment!

6.1 Getting started

6.2 Multi-way Analysis of Variance (ANOVA)

6.4 Analysis of covariance (ANCOVA)

6.5 The predict() function

6.6 Plotting with ggplot() instead of qplot()

7.1 Understanding non-normal data

7.2 GLMs

7.3 Understanding and interpreting the GLM

7.4 Calculating statistical significance with GLMs

7.5 Coding the data as a binomial GLM

7.6 Mixing GLMs and ANCOVAs together

7.7 Using the predict() function with a GLM

7.8 Making a much easier GLM/ANCOVA plot using ggplot2

8 Mixed Effects Models

8.1 Understanding mixed effects models

9 Advanced Data Wrangling and Plotting

9.1 The "tidyverse"

9.2 Basic data wrangling

9.3 Advanced data wrangling: Spreading and gathering your data

9.4 Even more advanced data wrangling! Using the do() function

9.5 Making better figures with ggplot2

9.6 Basics of ggplot2

9.7 Customizing your figure

9.8 Combining data wrangling with plotting with ggplot2

11.1 Understanding your data is the most important precursor to analyzing it

11.2 Knowing how to get help is essential

11.3 Your data analysis should be clear from the outset and you should avoid questionable techniques

11.4 Presenting your data in well-constructed figures is key

1 Introduction to R

1.1 Overview

The purpose of this first chapter is to introduce you to the basic workings of R and get you up to speed. Some of this material might be familiar to you if you've used R before, but the goal is to get anyone reading the book up to a basic level of familiarity. You will learn many of the basic and very important functions of R, such as:

• Creating objects

• Writing articulate R code

• Using functions

• Generating artificial data

• Entering data in a format that can be read and analyzed by R

This chapter does not intend to be an exhaustive introduction to all the basic workings of R. In other words, we'll move pretty quickly here. If you would like a greater introduction, I highly recommend checking out the excellent book Getting Started With R: An Introduction for Biologists by Andrew Beckerman, Dylan Childs, and Owen Petchey.

Applied Statistics with R: A Practical Guide for the Life Sciences. Justin C. Touchon, Oxford University Press (2021). © Justin C. Touchon. DOI: 10.1093/oso/9780198869979.003.0001

1.2 Getting started

1.2.1 Obtaining R

If you are brand new to this, you will have to download R in order to do anything. Just navigate your web browser of choice to http://cran.r-project.org to download the appropriate version of R for your operating system. There is another program you may have heard of and may want to use called RStudio, which can be found at http://www.rstudio.com.

Box 1.1 - RStudio

Please remember this: RStudio is a program that uses R. It helps keep things organized and has some nice autocomplete functions, but R is the actual program that does everything that we will cover in this book. RStudio has plenty of great features, don't get me wrong. It's really great for writing in R Markdown and LaTeX, if you choose to do that. But, like I said, RStudio is a program that uses R. R does all the heavy lifting. R is the statistics program. Personally, I use regular plain old R and not RStudio. To each their own...

1.2.2 Installing and loading packages

R is designed to be a small program (currently just about 80 MB) which makes it easy to download and install anywhere in the world. The base version of R contains a great number of functions for organizing and analyzing data, but the real strength comes in what are called packages. Packages are freely downloadable additions to R that provide new functions and datasets for particular analyses. For example, the base version of R can conduct linear models and generalized linear models (Chapters 5–7) but cannot conduct mixed effects models (Chapter 8). To do mixed effects models, you need to download a specific package (of which there are several).

The only important thing to remember about packages is that adding them to R is a two-step process. First, you have to install a package, which (perhaps counterintuitively) just downloads the package to your computer. Secondly, you have to load the package, which is when you actively place it in the current memory for use. You will generally obtain packages from the Comprehensive R Archive Network (CRAN; https://cran.r-project.org/) directly through R.

Box 1.2 - Install some packages

Assuming you have installed R on your computer, you should run the following code to install the various packages you will need to have in order to execute the commands presented throughout this book. If you are using RStudio you can click on the packages tab and search for these one at a time by using the little search window. Make sure to click the button to "Install Dependencies." We won't do anything with these right now, but they will be necessary later in the book.

install.packages(c("lme4", "multcomp", "car", "ggplot2", "gplots",
                   "MASS", "tidyr", "dplyr", "broom", "gridExtra",
                   "cowplot", "emmeans", "glmmTMB", "lattice"),
                 dependencies = TRUE,
                 repos = "http://cran.us.r-project.org")
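The box above handles the first step (installing). The second step, loading, is done with library() at the start of each R session. The sketch below shows both steps for a single package. MASS is used only as an example because it appears in the list above, and the requireNamespace() check is an optional convenience added here rather than something the book prescribes.

```r
# Sketch of the two-step process for one package (MASS chosen as an example).
pkg <- "MASS"

# Step 1: install (download) the package, but only if it is not already present.
# This needs an internet connection and only has to happen once per computer.
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg, repos = "https://cran.r-project.org")
}

# Step 2: load the package into the current session.
# This has to be repeated every time you start R.
library(MASS)

# Check that the package is now attached and ready to use
"package:MASS" %in% search()
```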

1.3 Working from the script window

The biggest mistake that most new R users make is to just type commands into the command prompt. The problem with this is that once you hit enter the command is gone. If you hit the up-arrow, R will scroll through the previously executed commands, but aside from this what you typed is gone and it cannot be edited! It is of course reasonable to run lines from the command line from time to time, but it is much better to work from a script window.

The script window allows you to easily save and edit your code, and to execute one or multiple lines of code at once. To open a blank script window, go to the File menu and click on New Document, or just hit command-N (Mac) or control-N (PC) on your keyboard.

In the script window you can type in your commands and then execute them by hitting command-enter (Mac) or control-R (PC). This means you type code into the script window and then the program sends the line of code to the command prompt for you. Do not cut and paste code from the script window to the command prompt; that is a waste of time. You can also highlight multiple lines of code and execute them all at once. To save your code simply go to the File menu and save as you would any other file (or just hit command-S or control-S on your keyboard).

A script allows you to edit, run, and tweak your code, save it, return to it later, send it to collaborators or mentors, and so on. Anything you think you will want to run more than once, or that you might want to edit, should be typed into a script window (which is pretty much everything).

1.4 Creating well-documented and annotated code

One of the most important things you can do is write orderly, well-annotated code that not only functions well but explains what is happening and why it is happening, and does so in language that is easy to read and understand. This idea was first introduced by computer scientist Donald Knuth and is known as "literate programming." Literate programming is the process of interspersing your computer code, in this case R code, with plain-language descriptions of what the code is doing. This allows a reader to have a fully formed idea of what is going on. In R, you do this with annotation, which is simply the process of leaving notes within the code that are not actually code themselves. It's like you are Hansel and Gretel getting dragged into the woods: you want to leave plenty of clues for your future self (or others) to be able to discern the trail you took.
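As a small sketch of what annotation can look like in practice: in R, anything following a # on a line is treated as a note and is not executed. The measurements below are invented for illustration and are not from the book's dataset.

```r
# Hypothetical example: six body length measurements (mm), invented for illustration
body_length <- c(10.2, 11.5, 9.8, 12.1, 10.9, 11.3)

# Calculate the mean so we can report a typical size
mean_length <- mean(body_length)

# Calculate the standard error: the standard deviation divided by
# the square root of the number of observations
se_length <- sd(body_length) / sqrt(length(body_length))

# Show both values together, rounded to two decimal places for reporting
round(c(mean = mean_length, se = se_length), 2)
```

Each note says what the next line does and why, which is the kind of trail a future reader can follow.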

Box 1.3 - Write good code, for yourself and others

For any bit of R code you write, you should consider that you are writing for three audiences:

1) Your future self

2) Your collaborators

3) Everyone else that might look at your code one day

1. Writing for yourself

Seldom (never?) will you have the opportunity to sit down with a dataset and analyze it start to finish in a single sitting. It is rare that you even will have the opportunity to work on it on consecutive days where what you did yesterday is still fresh in your mind today. What is more realistic is that you work on something for some period of time (hours, days, maybe even weeks if you are really lucky!) then have to put it down for some time because you are distracted by other tasks (teaching, other research demands, manuscript revisions, parenting, a pandemic, etc.). By the time you come back to your code even a week later, you will likely have to invest some substantial time getting back to where you were. Writing good, clear code will reduce that restart time considerably.

2. Writing for collaborators

If you are, or are planning to be, a professional scientist, you are unlikely to work exclusively by yourself. There will be times when you collaborate with others. Maybe it's your graduate advisor, maybe a colleague at another institution. Whatever the scenario, it means you might be responsible for analyzing or organizing some set of data, then sharing it with others. If that's the case, you want to make sure when you send your code it is clear what you did and why you did it. Imagine the embarrassment of your collaborator sending you question after question trying to figure out what your code means!

3. Writing for folks in the future who might want to see your code

Increasingly, it is necessary to post both the data that go into a scientific article and also the code that was used to analyze it. This is a tremendous step towards increasing transparency in science and is to be applauded for sure. But it also means that some stranger might look at your code a month or a year or more down the road, even after you thought you were long done with it all. Thus, just like writing for your future self or your collaborators,
