An overview of making rangeModelMetadata objects

Cory Merow, Brian Maitner, Hannah Owens, Jamie Kass, Brian Enquist, Rob Guralnick

2021-06-10

library(rangeModelMetadata)
library(sp)
library(spocc)
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.6.2

What is an rmm (rangeModelMetadata) object?

A simple rmm object is a list object that is structured to contain metadata pertaining to species range models. Here we make an empty rmm object containing only the obligate set of fields.

rmm1=rmmTemplate(family=c('base')) 
str(rmm1)
## List of 8
##  $ authorship    :List of 8
##   ..$ rmmName          : NULL
##   ..$ names            : NULL
##   ..$ license          : NULL
##   ..$ contact          : NULL
##   ..$ relatedReferences: NULL
##   ..$ authorNotes      : NULL
##   ..$ miscNotes        : NULL
##   ..$ doi              : NULL
##  $ studyObjective:List of 4
##   ..$ purpose  : NULL
##   ..$ rangeType: NULL
##   ..$ invasion : NULL
##   ..$ transfer : NULL
##  $ data          :List of 4
##   ..$ occurrence :List of 6
##   .. ..$ taxon          : NULL
##   .. ..$ dataType       : NULL
##   .. ..$ yearMin        : NULL
##   .. ..$ yearMax        : NULL
##   .. ..$ sources        : NULL
##   .. ..$ spatialAccuracy: NULL
##   ..$ environment:List of 9
##   .. ..$ variableNames: NULL
##   .. ..$ yearMin      : NULL
##   .. ..$ yearMax      : NULL
##   .. ..$ extentSet    : NULL
##   .. ..$ extentRule   : NULL
##   .. ..$ resolution   : NULL
##   .. ..$ projection   : NULL
##   .. ..$ sources      : NULL
##   .. ..$ notes        : NULL
##   ..$ observation:List of 3
##   .. ..$ variableNames: NULL
##   .. ..$ minVal       : NULL
##   .. ..$ maxVal       : NULL
##   ..$ dataNotes  : NULL
##  $ dataPrep      :List of 1
##   ..$ dataPrepNotes: NULL
##  $ model         :List of 8
##   ..$ algorithms        : NULL
##   ..$ algorithmCitation : NULL
##   ..$ speciesCount      : NULL
##   ..$ selectionRules    : NULL
##   ..$ finalModelSettings: NULL
##   ..$ notes             : NULL
##   ..$ partition         :List of 3
##   .. ..$ partitionSet : NULL
##   .. ..$ partitionRule: NULL
##   .. ..$ notes        : NULL
##   ..$ references        : NULL
##  $ prediction    :List of 4
##   ..$ binary       :List of 2
##   .. ..$ thresholdSet : NULL
##   .. ..$ thresholdRule: NULL
##   ..$ extrapolation: NULL
##   ..$ transfer     :List of 2
##   .. ..$ environment1:List of 1
##   .. .. ..$ extrapolation: NULL
##   .. ..$ notes       : NULL
##   ..$ uncertainty  :List of 4
##   .. ..$ units : NULL
##   .. ..$ minVal: NULL
##   .. ..$ maxVal: NULL
##   .. ..$ notes : NULL
##  $ assessment    :List of 2
##   ..$ references: NULL
##   ..$ notes     : NULL
##  $ code          :List of 8
##   ..$ software        :List of 2
##   .. ..$ platform: NULL
##   .. ..$ packages: NULL
##   ..$ demoCodeLink    : NULL
##   ..$ vignetteCodeLink: NULL
##   ..$ fullCodeLink    : NULL
##   ..$ demoDataLink    : NULL
##   ..$ vignetteDataLink: NULL
##   ..$ fullDataLink    : NULL
##   ..$ codeNotes       : NULL
##  - attr(*, "class")= chr [1:2] "list" "RMM"
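
Because an rmm object is an ordinary nested list, individual fields can be read and set with $. The short sketch below assumes the package is installed; the object name rmmDemo and the license and taxon values are illustrative only.

```r
library(rangeModelMetadata)

# Start from the base template shown above
rmmDemo <- rmmTemplate(family = c('base'))

# Top-level fields of the hierarchy
names(rmmDemo)

# Fill two entities by ordinary list assignment (example values)
rmmDemo$authorship$license <- 'CC BY'
rmmDemo$data$occurrence$taxon <- 'Bradypus variegatus'

# Inspect one branch only, rather than the whole object
str(rmmDemo$data$occurrence)
```

Viewing one branch at a time with str() is usually easier to read than printing the full object.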

The template can also include all predefined fields, which gives the more complex rmm object below. It may seem like a lot at first, but we’re trying to accommodate many different workflows, and many of these fields won’t be needed. For example, you’ll typically use only one modeling algorithm, so you’ll fill in only one of the sub-lists under $model$algorithm. Likewise, you probably won’t use five different methods for cleaning geographic issues with your data under $dataPrep$geographic, so you’ll end up omitting many of those fields. As we explore the hierarchy below, it’ll seem simpler…

rmm2=rmmTemplate(family=NULL)
str(rmm2)
## List of 8
##  $ authorship    :List of 8
##   ..$ rmmName          : NULL
##   ..$ names            : NULL
##   ..$ license          : NULL
##   ..$ contact          : NULL
##   ..$ relatedReferences: NULL
##   ..$ authorNotes      : NULL
##   ..$ miscNotes        : NULL
##   ..$ doi              : NULL
##  $ studyObjective:List of 7
##   ..$ purpose    : NULL
##   ..$ rangeType  : NULL
##   ..$ invasion   : NULL
##   ..$ transfer   : NULL
##   ..$ assumptions: NULL
##   ..$ hypotheses : NULL
##   ..$ workflow   : NULL
##  $ data          :List of 5
##   ..$ occurrence :List of 12
##   .. ..$ samplingDesign          : NULL
##   .. ..$ taxon                   : NULL
##   .. ..$ dataType                : NULL
##   .. ..$ occurrenceType          : NULL
##   .. ..$ yearMin                 : NULL
##   .. ..$ yearMax                 : NULL
##   .. ..$ sources                 : NULL
##   .. ..$ presenceSampleSize      : NULL
##   .. ..$ absenceSampleSize       : NULL
##   .. ..$ backgroundSampleSizeSet : NULL
##   .. ..$ backgroundSampleSizeRule: NULL
##   .. ..$ spatialAccuracy         : NULL
##   ..$ environment:List of 11
##   .. ..$ variableNames: NULL
##   .. ..$ minVal       : NULL
##   .. ..$ maxVal       : NULL
##   .. ..$ yearMin      : NULL
##   .. ..$ yearMax      : NULL
##   .. ..$ extentSet    : NULL
##   .. ..$ extentRule   : NULL
##   .. ..$ resolution   : NULL
##   .. ..$ projection   : NULL
##   .. ..$ sources      : NULL
##   .. ..$ notes        : NULL
##   ..$ observation:List of 3
##   .. ..$ variableNames: NULL
##   .. ..$ minVal       : NULL
##   .. ..$ maxVal       : NULL
##   ..$ transfer   :List of 1
##   .. ..$ environment1:List of 9
##   .. .. ..$ minVal    : NULL
##   .. .. ..$ maxVal    : NULL
##   .. .. ..$ yearMin   : NULL
##   .. .. ..$ yearMax   : NULL
##   .. .. ..$ sources   : NULL
##   .. .. ..$ extentSet : NULL
##   .. .. ..$ extentRule: NULL
##   .. .. ..$ resolution: NULL
##   .. .. ..$ notes     : NULL
##   ..$ dataNotes  : NULL
##  $ dataPrep      :List of 4
##   ..$ geographic   :List of 6
##   .. ..$ geographicStandardization :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ geographicalOutlierRemoval:List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ centroidRemoval           :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ pointInPolygon            :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ altitudeRemoval           :List of 1
##   .. .. ..$ rule: NULL
##   .. ..$ spatialThin               :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   ..$ biological   :List of 5
##   .. ..$ duplicateRemoval        :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ questionablePointRemoval:List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ taxonomicHarmonization  :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ cultivatedRemoval       :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ nonNativeRemoval        :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   ..$ environmental:List of 3
##   .. ..$ environmentalOutlierRemoval:List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ environmentalThin          :List of 2
##   .. .. ..$ rule : NULL
##   .. .. ..$ notes: NULL
##   .. ..$ notes                      : NULL
##   ..$ dataPrepNotes: NULL
##  $ model         :List of 15
##   ..$ algorithms         : NULL
##   ..$ algorithmCitation  : NULL
##   ..$ speciesCount       : NULL
##   ..$ covariateScaling   : NULL
##   ..$ occurrenceTreatedAs: NULL
##   ..$ selectionRules     : NULL
##   ..$ finalModelSettings : NULL
##   ..$ notes              : NULL
##   ..$ partition          :List of 5
##   .. ..$ occurrenceSubsampling: NULL
##   .. ..$ numberFolds          : NULL
##   .. ..$ partitionSet         : NULL
##   .. ..$ partitionRule        : NULL
##   .. ..$ notes                : NULL
##   ..$ resampling         :List of 2
##   .. ..$ resamplingRule: NULL
##   .. ..$ notes         : NULL
##   ..$ samplingBias       :List of 1
##   .. ..$ notes: NULL
##   ..$ algorithm          :List of 12
##   .. ..$ maxent      :List of 19
##   .. .. ..$ featureSet                 : NULL
##   .. .. ..$ featureRule                : NULL
##   .. .. ..$ regularizationMultiplierSet: NULL
##   .. .. ..$ regularizationRule         : NULL
##   .. .. ..$ convergenceThresholdSet    : NULL
##   .. .. ..$ samplingBiasRule           : NULL
##   .. .. ..$ samplingBiasNotes          : NULL
##   .. .. ..$ targetGroupSampleSize      : NULL
##   .. .. ..$ offsetSet                  : NULL
##   .. .. ..$ offsetRule                 : NULL
##   .. .. ..$ expertMapProbSet           : NULL
##   .. .. ..$ expertMapProbRule          : NULL
##   .. .. ..$ expertMapRateSet           : NULL
##   .. .. ..$ expertMapRateRule          : NULL
##   .. .. ..$ expertMapSkewSet           : NULL
##   .. .. ..$ expertMapSkewRule          : NULL
##   .. .. ..$ expertMapShiftSet          : NULL
##   .. .. ..$ expertMapShiftRule         : NULL
##   .. .. ..$ notes                      : NULL
##   .. ..$ ppm         :List of 3
##   .. .. ..$ formula: NULL
##   .. .. ..$ fitting: NULL
##   .. .. ..$ notes  : NULL
##   .. ..$ glm         :List of 4
##   .. .. ..$ family : NULL
##   .. .. ..$ formula: NULL
##   .. .. ..$ weights: NULL
##   .. .. ..$ notes  : NULL
##   .. ..$ mars        :List of 7
##   .. .. ..$ formula: NULL
##   .. .. ..$ degree : NULL
##   .. .. ..$ penalty: NULL
##   .. .. ..$ nk     : NULL
##   .. .. ..$ thresh : NULL
##   .. .. ..$ pmethod: NULL
##   .. .. ..$ notes  : NULL
##   .. ..$ brt         :List of 8
##   .. .. ..$ formula         : NULL
##   .. .. ..$ distribution    : NULL
##   .. .. ..$ nTrees          : NULL
##   .. .. ..$ interactionDepth: NULL
##   .. .. ..$ shrinkage       : NULL
##   .. .. ..$ bagFraction     : NULL
##   .. .. ..$ trainFraction   : NULL
##   .. .. ..$ notes           : NULL
##   .. ..$ bioclim     :List of 1
##   .. .. ..$ notes: NULL
##   .. ..$ ann         :List of 4
##   .. .. ..$ formula: NULL
##   .. .. ..$ size   : NULL
##   .. .. ..$ decay  : NULL
##   .. .. ..$ notes  : NULL
##   .. ..$ gam         :List of 8
##   .. .. ..$ family     : NULL
##   .. .. ..$ formula    : NULL
##   .. .. ..$ smoothTerms: NULL
##   .. .. ..$ weights    : NULL
##   .. .. ..$ offset     : NULL
##   .. .. ..$ method     : NULL
##   .. .. ..$ select     : NULL
##   .. .. ..$ notes      : NULL
##   .. ..$ fda         :List of 3
##   .. .. ..$ formula: NULL
##   .. .. ..$ method : NULL
##   .. .. ..$ notes  : NULL
##   .. ..$ randomForest:List of 5
##   .. .. ..$ ntree   : NULL
##   .. .. ..$ mtry    : NULL
##   .. .. ..$ maxnodes: NULL
##   .. .. ..$ try     : NULL
##   .. .. ..$ notes   : NULL
##   .. ..$ rangeBagging:List of 4
##   .. .. ..$ votes           : NULL
##   .. .. ..$ nDimensions     : NULL
##   .. .. ..$ proportionSubset: NULL
##   .. .. ..$ notes           : NULL
##   .. ..$ occupancy   :List of 3
##   .. .. ..$ formula           : NULL
##   .. .. ..$ observationFormula: NULL
##   .. .. ..$ notes             : NULL
##   ..$ ensemble           :List of 3
##   .. ..$ algorithms: NULL
##   .. ..$ weighting : NULL
##   .. ..$ notes     : NULL
##   ..$ references         : NULL
##   ..$ justification      :List of 1
##   .. ..$ modelComplexity: NULL
##  $ prediction    :List of 6
##   ..$ continuous   :List of 3
##   .. ..$ units : NULL
##   .. ..$ minVal: NULL
##   .. ..$ maxVal: NULL
##   ..$ binary       :List of 2
##   .. ..$ thresholdSet : NULL
##   .. ..$ thresholdRule: NULL
##   ..$ notes        : NULL
##   ..$ extrapolation: NULL
##   ..$ transfer     :List of 3
##   .. ..$ environment1 :List of 7
##   .. .. ..$ units        : NULL
##   .. .. ..$ minVal       : NULL
##   .. .. ..$ maxVal       : NULL
##   .. .. ..$ thresholdSet : NULL
##   .. .. ..$ thresholdRule: NULL
##   .. .. ..$ extrapolation: NULL
##   .. .. ..$ notes        : NULL
##   .. ..$ notes        : NULL
##   .. ..$ extrapolation: NULL
##   ..$ uncertainty  :List of 9
##   .. ..$ units        : NULL
##   .. ..$ minVal       : NULL
##   .. ..$ maxVal       : NULL
##   .. ..$ algorithmic  : NULL
##   .. ..$ parameter    : NULL
##   .. ..$ scenario     : NULL
##   .. ..$ inputData    : NULL
##   .. ..$ extrapolation: NULL
##   .. ..$ notes        : NULL
##  $ assessment    :List of 7
##   ..$ trainingDataStats  :List of 16
##   .. ..$ AUC               : NULL
##   .. ..$ pearsonCor        : NULL
##   .. ..$ cohensKappa       : NULL
##   .. ..$ trueSkillStatistic: NULL
##   .. ..$ truePositiveRate  : NULL
##   .. ..$ trueNegativeRate  : NULL
##   .. ..$ falsePositiveRate : NULL
##   .. ..$ falseNegativeRate : NULL
##   .. ..$ boyce             : NULL
##   .. ..$ pAUC              : NULL
##   .. ..$ pAUCLoThreshold   : NULL
##   .. ..$ pAUCHiThreshold   : NULL
##   .. ..$ AIC               : NULL
##   .. ..$ BIC               : NULL
##   .. ..$ DIC               : NULL
##   .. ..$ metrics           : NULL
##   ..$ testingDataStats   :List of 13
##   .. ..$ AUC               : NULL
##   .. ..$ AUCDiff           : NULL
##   .. ..$ pearsonCor        : NULL
##   .. ..$ cohensKappa       : NULL
##   .. ..$ trueSkillStatistic: NULL
##   .. ..$ truePositiveRate  : NULL
##   .. ..$ trueNegativeRate  : NULL
##   .. ..$ falsePositiveRate : NULL
##   .. ..$ falseNegativeRate : NULL
##   .. ..$ boyce             : NULL
##   .. ..$ omissionRate      : NULL
##   .. ..$ notes             : NULL
##   .. ..$ metrics           : NULL
##   ..$ evaluationDataStats:List of 10
##   .. ..$ AUC               : NULL
##   .. ..$ pearsonCor        : NULL
##   .. ..$ cohensKappa       : NULL
##   .. ..$ trueSkillStatistic: NULL
##   .. ..$ truePositiveRate  : NULL
##   .. ..$ trueNegativeRate  : NULL
##   .. ..$ falsePositiveRate : NULL
##   .. ..$ falseNegativeRate : NULL
##   .. ..$ boyce             : NULL
##   .. ..$ metrics           : NULL
##   ..$ expertJudgement    : NULL
##   ..$ responseCurves     : NULL
##   ..$ references         : NULL
##   ..$ notes              : NULL
##  $ code          :List of 9
##   ..$ wallace         :List of 9
##   .. ..$ occsNum           : NULL
##   .. ..$ userCSV           : NULL
##   .. ..$ removedIDs        : NULL
##   .. ..$ occsCellPolyCoords: NULL
##   .. ..$ userBgExt         : NULL
##   .. ..$ userBgPath        : NULL
##   .. ..$ userBgShpParams   : NULL
##   .. ..$ maxentEvalPlotCell: NULL
##   .. ..$ bcPlotSettings    : NULL
##   ..$ software        :List of 2
##   .. ..$ platform: NULL
##   .. ..$ packages: NULL
##   ..$ demoCodeLink    : NULL
##   ..$ vignetteCodeLink: NULL
##   ..$ fullCodeLink    : NULL
##   ..$ demoDataLink    : NULL
##   ..$ vignetteDataLink: NULL
##   ..$ fullDataLink    : NULL
##   ..$ codeNotes       : NULL
##  - attr(*, "class")= chr [1:2] "list" "RMM"
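
To get a feel for how much larger the full template is than the base one, the lowest-level entries can be counted. This is a minimal sketch: countEntities is a helper written for this example, not a function from the package.

```r
library(rangeModelMetadata)

# Count the lowest-level entries (entities) in a nested list.
# Anything that is not a list, including NULL, counts as one entry.
countEntities <- function(x) {
  if (!is.list(x)) return(1)
  sum(vapply(x, countEntities, numeric(1)))
}

rmmBase <- rmmTemplate(family = c('base'))
rmmFull <- rmmTemplate(family = NULL)

countEntities(rmmBase)
countEntities(rmmFull)
```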

Populating an rmm object

rmm objects can be populated manually by entering data directly into the fields (see vignette('rmm_Multispecies',package='rangeModelMetadata')), or through the use of several helper functions. Although the rmm object template already contains a number of fields that depend on the specified family (see options with rmmFamilies()), users can also add new fields as needed. We provide suggestions of both common fields to add and common values for many fields.

Not sure which fields are available or what values to enter? We can suggest options.

rmmSuggest('dataPrep',fullFieldDepth=FALSE)
## $type
## [1] "field2"
## 
## $suggestions
## [1] "dataPrep$geographic"    "dataPrep$biological"    "dataPrep$environmental"
## [4] "dataPrep$dataPrepNotes"
rmmSuggest('dataPrep',fullFieldDepth=TRUE) # for all fields below the specified one
## $type
## [1] "field2"
## 
## $suggestions
##  [1] "dataPrep$geographic$geographicStandardization$rule"      
##  [2] "dataPrep$geographic$geographicStandardization$notes"     
##  [3] "dataPrep$geographic$geographicalOutlierRemoval$rule"     
##  [4] "dataPrep$geographic$geographicalOutlierRemoval$notes"    
##  [5] "dataPrep$geographic$centroidRemoval$rule"                
##  [6] "dataPrep$geographic$centroidRemoval$notes"               
##  [7] "dataPrep$geographic$pointInPolygon$rule"                 
##  [8] "dataPrep$geographic$pointInPolygon$notes"                
##  [9] "dataPrep$geographic$altitudeRemoval$rule"                
## [10] "dataPrep$geographic$spatialThin$rule"                    
## [11] "dataPrep$geographic$spatialThin$notes"                   
## [12] "dataPrep$biological$duplicateRemoval$rule"               
## [13] "dataPrep$biological$duplicateRemoval$notes"              
## [14] "dataPrep$biological$questionablePointRemoval$rule"       
## [15] "dataPrep$biological$questionablePointRemoval$notes"      
## [16] "dataPrep$biological$taxonomicHarmonization$rule"         
## [17] "dataPrep$biological$taxonomicHarmonization$notes"        
## [18] "dataPrep$biological$cultivatedRemoval$rule"              
## [19] "dataPrep$biological$cultivatedRemoval$notes"             
## [20] "dataPrep$biological$nonNativeRemoval$rule"               
## [21] "dataPrep$biological$nonNativeRemoval$notes"              
## [22] "dataPrep$environmental$environmentalOutlierRemoval$rule" 
## [23] "dataPrep$environmental$environmentalOutlierRemoval$notes"
## [24] "dataPrep$environmental$environmentalThin$rule"           
## [25] "dataPrep$environmental$environmentalThin$notes"          
## [26] "dataPrep$environmental$notes"                            
## [27] "dataPrep$dataPrepNotes"
rmmSuggest('dataPrep$biological$duplicateRemoval')
## $type
## [1] "entity"
## 
## $suggestions
## [1] "dataPrep$biological$duplicateRemoval$rule" 
## [2] "dataPrep$biological$duplicateRemoval$notes"
rmmSuggest('dataPrep$biological$duplicateRemoval$rule')
## $suggestions
## [1] "environmental"            "coordinate"              
## [3] "other (specify in Notes)"

Here, it may help to learn the two pieces of unavoidable jargon we use:

* A field describes a level of the hierarchy. E.g., dataPrep is field 1, biological is field 2 and duplicateRemoval is field 3.
* An entity is described by a complete set of fields and has a particular value. In the example above, dataPrep$biological$duplicateRemoval$rule is an entity, and it can take one of the suggested values: “environmental”, “coordinate”, or “other (specify in Notes)”.

Note above that when you ask for suggestions for a field (the first three calls), you get suggestions of the relevant fields to consider. But the last call refers to the lowest level in the hierarchy, an entity, and so values are suggested.
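
The distinction between fields and an entity can be made concrete by splitting a path on $. The sketch below uses only base R and a small hand-built list (demo) standing in for one branch of an rmm object; the value 'coordinate' is taken from the suggestions above.

```r
# A small stand-in for one branch of an rmm object (plain base R list)
demo <- list(dataPrep = list(biological = list(
  duplicateRemoval = list(rule = NULL, notes = NULL))))

# Split an entity path into its parts
path  <- 'dataPrep$biological$duplicateRemoval$rule'
parts <- strsplit(path, '$', fixed = TRUE)[[1]]

fields <- head(parts, -1)  # field 1, field 2, field 3
entity <- tail(parts, 1)   # the lowest level, which holds a value

# Set the entity's value, then read it back by walking the path
demo$dataPrep$biological$duplicateRemoval$rule <- 'coordinate'
demo[[parts]]
```

Indexing with [[ and a character vector walks the list one level per element, which is equivalent to chaining $.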

Another more complex example:

rmmSuggest('model')
## $type
## [1] "field2"
## 
## $suggestions
##  [1] "model$algorithms"          "model$algorithmCitation"  
##  [3] "model$speciesCount"        "model$covariateScaling"   
##  [5] "model$occurrenceTreatedAs" "model$selectionRules"     
##  [7] "model$finalModelSettings"  "model$notes"              
##  [9] "model$partition"           "model$resampling"         
## [11] "model$samplingBias"        "model$algorithm"          
## [13] "model$ensemble"            "model$references"         
## [15] "model$justification"
rmmSuggest('model$algorithm$maxent')
## $type
## [1] "entity"
## 
## $suggestions
##  [1] "model$algorithm$maxent$featureSet"                 
##  [2] "model$algorithm$maxent$featureRule"                
##  [3] "model$algorithm$maxent$regularizationMultiplierSet"
##  [4] "model$algorithm$maxent$regularizationRule"         
##  [5] "model$algorithm$maxent$convergenceThresholdSet"    
##  [6] "model$algorithm$maxent$samplingBiasRule"           
##  [7] "model$algorithm$maxent$samplingBiasNotes"          
##  [8] "model$algorithm$maxent$targetGroupSampleSize"      
##  [9] "model$algorithm$maxent$offsetSet"                  
## [10] "model$algorithm$maxent$offsetRule"                 
## [11] "model$algorithm$maxent$expertMapProbSet"           
## [12] "model$algorithm$maxent$expertMapProbRule"          
## [13] "model$algorithm$maxent$expertMapRateSet"           
## [14] "model$algorithm$maxent$expertMapRateRule"          
## [15] "model$algorithm$maxent$expertMapSkewSet"           
## [16] "model$algorithm$maxent$expertMapSkewRule"          
## [17] "model$algorithm$maxent$expertMapShiftSet"          
## [18] "model$algorithm$maxent$expertMapShiftRule"         
## [19] "model$algorithm$maxent$notes"
rmmSuggest('$model$algorithm$maxent$featureSet')
## $suggestions
## [1] "L"     "LQ"    "LQP"   "LQPT"  "LQPTH" " H"    "HT"
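
The suggestions can also be used programmatically. As the printed output shows, rmmSuggest() returns a list with a $suggestions element, so a value can be compared against it before it is recorded. The names ruleOptions and myRule are made up for this sketch.

```r
library(rangeModelMetadata)

# Suggested values for one entity
ruleOptions <- rmmSuggest('dataPrep$biological$duplicateRemoval$rule')$suggestions

# A value we intend to record (example only)
myRule <- 'coordinate'

# Warn, rather than stop, because non-standard values are allowed
if (!myRule %in% ruleOptions) {
  warning('"', myRule, '" is not among the suggested values')
}
```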

To make it easier to fill some rmm fields, we provide autofill functions that extract relevant information from common R objects used in a range modeling workflow.

rmm=rmmTemplate()
rmm=rmmAutofillPackageCitation(rmm,c('raster','sp'))
# search GBIF for occurrence data to demonstrate the autofill function
bv=spocc::occ('Bradypus variegatus', 'gbif', limit=50, has_coords=TRUE)
## Registered S3 method overwritten by 'crul':
##   method                 from
##   as.character.form_file httr
## Warning: gbif: No records returned in GBIF for Bradypus variegatus
## Warning: gbif: Column 50 of item 1 is length 4 inconsistent with column 1 which
## is length 50. Only length-1 columns are recycled.
## Warning: `data_frame()` was deprecated in tibble 1.1.0.
## Please use `tibble()` instead.
rmm=rmmAutofillspocc(rmm,bv$gbif)
## Warning: `as_data_frame()` was deprecated in tibble 2.0.0.
## Please use `as_tibble()` instead.
## The signature and semantics have changed, see `?as_tibble`.
# get some env layers to demonstrate the autofill function
rasterFiles=list.files(path=system.file('ex', package='dismo'),
                       pattern='grd', full.names=TRUE)
# make a stack of the rasters
env=raster::stack(rasterFiles)
rmm=rmmAutofillEnvironment(rmm,env,transfer=0) # for fitting environment
# just using the same rasters for demonstration; in practice these are different
rmm=rmmAutofillEnvironment(rmm,env,transfer=1) # for transfer environment 1
rmm=rmmAutofillEnvironment(rmm,env,transfer=2) # for transfer environment 2 
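
If no autofill function fits your data, the same fields can be filled by hand. The sketch below is one way to do that with base R; the occ data frame and its column names are hypothetical, and only fields from the base template are set.

```r
library(rangeModelMetadata)

# Hypothetical occurrence records; the column names are assumptions
occ <- data.frame(
  name = rep('Bradypus variegatus', 4),
  year = c(1998, 2004, 2011, 2016),
  stringsAsFactors = FALSE
)

rmmManual <- rmmTemplate(family = c('base'))
rmmManual$data$occurrence$taxon    <- unique(occ$name)
rmmManual$data$occurrence$dataType <- 'presence only'
rmmManual$data$occurrence$yearMin  <- min(occ$year)
rmmManual$data$occurrence$yearMax  <- max(occ$year)

str(rmmManual$data$occurrence)
```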

To see what fields you might’ve missed…

empties=rmmCheckEmpty(rmm)
## ===================================
## There are  53 empty obligate fields:
## $authorship$rmmName
## $authorship$names
## $authorship$license
## $authorship$contact
## $authorship$relatedReferences
## $authorship$authorNotes
## $authorship$miscNotes
## $authorship$doi
## $studyObjective$purpose
## $studyObjective$rangeType
## $studyObjective$invasion
## $studyObjective$transfer
## $data$occurrence$spatialAccuracy
## $data$environment$yearMin
## $data$environment$yearMax
## $data$environment$extentRule
## $data$environment$projection
## $data$environment$sources
## $data$environment$notes
## $data$observation$variableNames
## $data$observation$minVal
## $data$observation$maxVal
## $data$dataNotes
## $dataPrep$dataPrepNotes
## $model$algorithms
## $model$algorithmCitation
## $model$speciesCount
## $model$selectionRules
## $model$finalModelSettings
## $model$notes
## $model$partition$partitionSet
## $model$partition$partitionRule
## $model$partition$notes
## $model$references
## $prediction$binary$thresholdSet
## $prediction$binary$thresholdRule
## $prediction$extrapolation
## $prediction$transfer$environment1$extrapolation
## $prediction$transfer$notes
## $prediction$uncertainty$units
## $prediction$uncertainty$minVal
## $prediction$uncertainty$maxVal
## $prediction$uncertainty$notes
## $assessment$references
## $assessment$notes
## $code$software$platform
## $code$demoCodeLink
## $code$vignetteCodeLink
## $code$fullCodeLink
## $code$demoDataLink
## $code$vignetteDataLink
## $code$fullDataLink
## $code$codeNotes
## 
## ===================================
## 
## ===================================
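
The idea behind this check can be sketched in a few lines of base R: walk the nested list and report every path whose value is still NULL. This is an illustration only, not the package's implementation, and unlike rmmCheckEmpty it does not restrict the search to obligate fields; emptyPaths and demo are names invented here.

```r
# Return the $-separated paths of all NULL entries in a nested list
emptyPaths <- function(x, prefix = '') {
  if (!is.list(x)) {
    # A non-list is a lowest-level entry: report it only if it is empty
    return(if (is.null(x)) prefix else character(0))
  }
  out <- character(0)
  for (nm in names(x)) {
    out <- c(out, emptyPaths(x[[nm]], paste0(prefix, '$', nm)))
  }
  out
}

# A small stand-in object with two empty entries
demo <- list(a = list(b = NULL, c = 'x'), d = NULL)
emptyPaths(demo)
```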

Checking an rmm object

To check the field names in your object, use rmmCheckName.

# Make an empty template
rmm1<-rmmTemplate() 
# Add a new, non-standard field
rmm1$dataPrep$biological$taxonomicHarmonization$taxonomy_source<-"The Plant List"
# Checking the names identifies the new, non-standard field we've added ("taxonomy_source")
rmm1=rmmCheckName(rmm1) 
## The following names are not similar to any suggested names, please verify that these are accurate:
## $dataPrep$biological$taxonomicHarmonization$taxonomy_source
## 
## 

This check identifies the entity $dataPrep$biological$taxonomicHarmonization$taxonomy_source as non-standard. It’s non-standard because it’s not an entity in the data dictionary; we just added it. That’s not a violation or anything bad; the check just lets you know it’s there. This check can also be useful for detecting misspellings of standardized field names.

To check the field values in your rmm object, use the function rmmCheckValue.

#First, we create an empty rmm template
rmm1<-rmmTemplate() 
#We add 3 of the bioclim layers, including a spelling error (an extra space) in bio2, and a word that is clearly not a climate layer, 'cromulent'.
rmm1$data$environment$variableNames<- c("bio1", "bio 2", "bio3", "cromulent") 
#Now, when we check the values, we see that bio1 and bio3 are reported as exact matches, while 'bio 2' is flagged as a partial match with a suggested value of 'bio2', and 'cromulent' is flagged as not matched at all.
rmmCheckValue(rmm = rmm1) 
## 
## ==========================================
## For the field rmm$data$environment$variableNames
## 
## 
## The following entries appear accurate:
## 
##  bio1; bio3
## 
## 
## The following entries are similar to suggested values, please verify:
## bio 2
## 
## 
## Suggested alternatives include: 
## bio2
## 
## 
## The following entries are not similar to any suggested values, please verify that these are accurate:
## cromulent
## 
## 
#If we'd like to return a dataframe containing this information in a perhaps more useful format:
rmmCheckValueOutput<-rmmCheckValue(rmm = rmm1,returnData = TRUE)
## 
## ==========================================
## For the field rmm$data$environment$variableNames
## 
## 
## The following entries appear accurate:
## 
##  bio1; bio3
## 
## 
## The following entries are similar to suggested values, please verify:
## bio 2
## 
## 
## Suggested alternatives include: 
## bio2
## 
## 
## The following entries are not similar to any suggested values, please verify that these are accurate:
## cromulent
## 
## 

These ‘check’ functions work by comparing the values or field names within an rmm object to those in a data dictionary. These functions are designed to check for non-standard values and names, and DO NOT necessarily identify correct vs. incorrect values/names. Non-standard values may be perfectly valid, or they may be erroneous, and the user will have to make this distinction.
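
To illustrate the kind of comparison involved, the sketch below sorts entries into exact, similar, and unmatched groups using edit distance (utils::adist). This is a stand-in technique chosen for the example; the matching rule the package actually uses may differ, and the cutoff of 2 edits is arbitrary.

```r
# Hypothetical dictionary values and user entries, mirroring the example above
suggested <- c('bio1', 'bio2', 'bio3')
entered   <- c('bio1', 'bio 2', 'bio3', 'cromulent')

# Edit distance from each entry (rows) to each suggested value (columns)
d <- utils::adist(entered, suggested)
minDist <- apply(d, 1, min)

# The cutoff of 2 edits is an arbitrary choice for this sketch
isSimilar <- minDist > 0 & minDist <= 2
exact     <- entered[minDist == 0]
similar   <- entered[isSimilar]
unmatched <- entered[minDist > 2]

# Closest suggested value for each similar entry
closest <- suggested[apply(d[isSimilar, , drop = FALSE], 1, which.min)]
```

As with the package checks, an unmatched entry here is only non-standard, not necessarily wrong.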

To run all the available checks at once, we’ll check the object that we filled in a few chunks back.

rmmCheckFinalize(rmm, family='base')
## 
## 
##  Element name '$data$transfer$environment2$resolution' not found in data dictionary!
##  Did you mean: '$data$transfer$environment1$resolution'?
## 
## 
##  Element name '$data$transfer$environment2$extentSet' not found in data dictionary!
##  Did you mean: '$data$transfer$environment1$extentSet'?
## The following names are similar to suggested names, please verify:
## $data$transfer$environment2$resolution
## $data$transfer$environment2$extentSet
## 
## Suggested alternatives include: 
## $data$transfer$environment1$resolution
## $data$transfer$environment1$extentSet
## 
## 
## 
## ==========================================
## For the field rmm$data$occurrence$dataType
## 
## 
## The following entries appear accurate:
## 
##  presence only
## 
## 
## 
## ==========================================
## For the field rmm$data$environment$variableNames
## 
## 
## The following entries appear accurate:
## 
##  bio1; bio12; bio16; bio17; bio5; bio6; bio7; bio8
## 
## 
## The following entries are similar to suggested values, please verify:
## biome
## 
## 
## Suggested alternatives include: 
## bio1
## 
## 
## ===================================
## There are  53 empty obligate fields:
## $authorship$rmmName
## $authorship$names
## $authorship$license
## $authorship$contact
## $authorship$relatedReferences
## $authorship$authorNotes
## $authorship$miscNotes
## $authorship$doi
## $studyObjective$purpose
## $studyObjective$rangeType
## $studyObjective$invasion
## $studyObjective$transfer
## $data$occurrence$spatialAccuracy
## $data$environment$yearMin
## $data$environment$yearMax
## $data$environment$extentRule
## $data$environment$projection
## $data$environment$sources
## $data$environment$notes
## $data$observation$variableNames
## $data$observation$minVal
## $data$observation$maxVal
## $data$dataNotes
## $dataPrep$dataPrepNotes
## $model$algorithms
## $model$algorithmCitation
## $model$speciesCount
## $model$selectionRules
## $model$finalModelSettings
## $model$notes
## $model$partition$partitionSet
## $model$partition$partitionRule
## $model$partition$notes
## $model$references
## $prediction$binary$thresholdSet
## $prediction$binary$thresholdRule
## $prediction$extrapolation
## $prediction$transfer$environment1$extrapolation
## $prediction$transfer$notes
## $prediction$uncertainty$units
## $prediction$uncertainty$minVal
## $prediction$uncertainty$maxVal
## $prediction$uncertainty$notes
## $assessment$references
## $assessment$notes
## $code$software$platform
## $code$demoCodeLink
## $code$vignetteCodeLink
## $code$fullCodeLink
## $code$demoDataLink
## $code$vignetteDataLink
## $code$fullDataLink
## $code$codeNotes
## 
## ===================================
## 
## ===================================

Outputting an rmm object

To make rmm objects portable to other interfaces, they can be written to CSV format.

outFile='~/Desktop/demo_rmmToCSV.csv'
rmmObj=rmmTemplate()
rmmToCSV(rmmObj,filename=outFile)
# open the file in Excel (this command works on macOS only)
system(paste0('open ', outFile, ' -a "Microsoft Excel"'))
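
A platform-independent variant is sketched below: it writes to a temporary file and reads the result back with base R instead of opening Excel. It assumes the file written by rmmToCSV() can be parsed by read.csv(); the name rmmTable is made up for this example.

```r
library(rangeModelMetadata)

# Write to a temporary file so the example does not depend on a Desktop folder
outFile <- tempfile(fileext = '.csv')
rmmToCSV(rmmTemplate(), filename = outFile)

# Read the file back with base R to inspect it on any platform
rmmTable <- utils::read.csv(outFile, stringsAsFactors = FALSE)
head(rmmTable)
```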

Miscellaneous

It can be helpful to simply view the data dictionary:

dd=rmmDataDictionary()
str(dd)
## 'data.frame':    254 obs. of  10 variables:
##  $ field1           : chr  "authorship" "authorship" "authorship" "authorship" ...
##  $ field2           : chr  NA NA NA NA ...
##  $ field3           : chr  NA NA NA NA ...
##  $ entity           : chr  "rmmName" "names" "license" "contact" ...
##  $ class            : chr  "character" "character vector" "character" "character" ...
##  $ taxonSpecific    : chr  "no" "no" "no" "no" ...
##  $ constrainedValues: chr  "NULL" "NULL" "CC; CC BY; CC BY-SA; CC BY-ND; CC BY-NC; CC BY-NC-SA; CC BY-NC-ND; other" "NULL" ...
##  $ family           : chr  "base" "base" "base" "base" ...
##  $ examples         : chr  "MerowMaitnerOwensKassEnquistGuralnick_2018_Acer_Maxent_b3" "Merow, Cory and Maitner, Brian and Owens, Hannah and Kass, Jamie and Enquist, Brian and Guralnick, Rob;" "CC; CC BY; CC BY-SA; CC BY-ND; CC BY-NC; CC BY-NC-SA; CC BY-NC-ND" "[email protected]" ...
##  $ description      : chr  "Use the format Author_Year_Taxa_Model_fw, where the last two characters (here, fw) are alphanumeric and random." "The names of those who created this model. Use the format Last, First and Last, First and Last, First, followin"| __truncated__ "The license under which this model has been produced.  See https://creativecommons.org/licenses/ for common options." "An email address for someone responsible for creating this model." ...
# rmmDataDictionary(excel=TRUE) # try this if you have excel
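
Since the data dictionary is a plain data frame, it can be filtered like any other. The sketch below relies only on the column names shown in the str() output above; baseEntities and constrained are names chosen for this example.

```r
library(rangeModelMetadata)

dd <- rmmDataDictionary()

# Entities belonging to the 'base' family (which() drops any NA comparisons)
baseRows <- which(dd$family == 'base')
baseEntities <- dd[baseRows, c('field1', 'field2', 'field3', 'entity')]
head(baseEntities)

# Entities with a constrained vocabulary; per the str() output,
# unconstrained entries are stored as the string "NULL"
constrainedRows <- which(dd$constrainedValues != 'NULL')
constrained <- dd[constrainedRows, c('entity', 'constrainedValues')]
head(constrained)
```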