Vessel Monitoring System (VMS) data come organized by year. The
vms_download() function automatically downloads them and stores them
in a VMS-data folder inside the working directory. Within that folder,
raw data are organized in monthly folders (with names in Spanish) that
contain several .csv files, each usually covering a biweekly interval.
The files differ in their number of rows, and some have different
column names. For this reason we highly recommend using the
preprocessing functions provided by the package, which correct several
inconsistencies within the raw data. If you have any suggestions or
spot any errors, we would be very pleased if you created an issue.
The function below downloads data from the year 2019.
library(dafishr)
vms_download(year = 2019, destination.folder = getwd())
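To illustrate why files with differing column names need harmonizing before they can be combined, here is a minimal base R sketch. It does not touch the real download: the folder, file names, and column names are all invented for the example, and lower-casing the headers is only one possible way to align them, not necessarily what the package does.

```r
# Build a throwaway folder that mimics the layout described above:
# monthly subfolders holding csv files with differing column names.
# All folder, file, and column names here are made up for illustration.
root <- file.path(tempdir(), "VMS-data-demo")
dir.create(file.path(root, "ENERO"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "FEBRERO"), recursive = TRUE, showWarnings = FALSE)

write.csv(data.frame(Nombre = "A", Latitud = 24.1, Longitud = -110.3),
          file.path(root, "ENERO", "01-15.csv"), row.names = FALSE)
write.csv(data.frame(nombre = "B", latitud = 23.9, longitud = -109.8),
          file.path(root, "FEBRERO", "01-15.csv"), row.names = FALSE)

# Find every csv file, whatever monthly folder it sits in.
files <- list.files(root, pattern = "\\.csv$", recursive = TRUE,
                    full.names = TRUE)

# Read one file and lower-case its column names so they line up.
read_one <- function(path) {
  x <- read.csv(path, stringsAsFactors = FALSE)
  names(x) <- tolower(names(x))
  x
}

# Stack all files into a single data.frame.
combined <- do.call(rbind, lapply(files, read_one))
```

Without the renaming step, rbind() would stop with an error because the column names of the two files do not match.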
The vms_clean() function works on the VMS data.frame. You can either
load the downloaded data or use the sample_dataset, which you can load
and clean like so:
library(dafishr)
data("sample_dataset")
vms_cleaned <- vms_clean(sample_dataset)
The vms_clean() function returns a message with the number of rows
that were removed because their coordinates contained NULL values.
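The idea of dropping records with NULL coordinates and reporting the count can be sketched in base R as follows. This is not the implementation of vms_clean(); the toy data, the column names, and the assumption that missing values arrive as the literal string "NULL" are all made up for the example.

```r
# Toy records; the "NULL" strings stand in for missing coordinates.
# Column names are assumptions made for this example.
raw <- data.frame(
  vessel    = c("A", "B", "C", "D"),
  latitude  = c("24.10", "NULL", "23.55", "22.90"),
  longitude = c("-110.30", "-109.80", "NULL", "-108.40"),
  stringsAsFactors = FALSE
)

# Turn the literal "NULL" strings into NA, then convert to numbers.
to_number <- function(x) {
  x[x == "NULL"] <- NA
  as.numeric(x)
}
raw$latitude  <- to_number(raw$latitude)
raw$longitude <- to_number(raw$longitude)

# Keep only rows where both coordinates are present.
keep <- !is.na(raw$latitude) & !is.na(raw$longitude)
cleaned <- raw[keep, ]

# Report how many rows were dropped.
n_dropped <- sum(!keep)
message("Removed ", n_dropped, " rows with NULL coordinates")
```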
Once the dataset is wrangled, there are some other preprocessing
steps to follow. First, all points that fall inland should be
eliminated. VMS data record vessel positions, so points falling
inland are errors in data registration. For that we will load the
mx_inland shapefile, which helps eliminate all the points
within a certain distance from the coastline.
data("mx_inland") # Shapefile of inland Mexico area
vms_cleaned_land <- clean_land_points(vms_cleaned, mx_inland)
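For readers who want to see the underlying idea, removing points that fall on a land polygon can be sketched with the sf package. This is only an illustration under stated assumptions, not the code behind clean_land_points(): the square polygon and the three positions are invented, and the sketch ignores any distance-to-coast tolerance.

```r
library(sf)

# A made-up square standing in for a land polygon (not the real mx_inland).
land <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
  crs = 4326
)

# Three made-up positions: only the second one falls inside the square.
pts <- st_as_sf(
  data.frame(id = 1:3,
             longitude = c(-0.5, 0.5, 2.0),
             latitude  = c(0.5, 0.5, 0.5)),
  coords = c("longitude", "latitude"),
  crs = 4326
)

# TRUE for every point that intersects the land polygon.
on_land <- lengths(st_intersects(pts, land)) > 0

# Keep only the points that are not on land.
at_sea_pts <- pts[!on_land, ]
```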
Once all land points are eliminated, we can use the
join_ports_locations() function to label all the points
where a vessel was inside a port or a marina. The function uses the
mx_ports shapefile to create a buffer
around each port or marina location. Each VMS point that falls
within these buffers is then labelled as being in port, and every
other point as at_sea, in a new column that is automatically called
location.
data("mx_ports")
# If you are just testing, it is a good idea to subsample...
# it takes a while on the full data!
vms_subset <- dplyr::sample_n(vms_cleaned, 1000)
with_ports <- join_ports_locations(vms_subset)
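The buffer-and-label logic can be sketched with sf on toy data. This is an illustration only, not the internals of join_ports_locations(): the port location, the vessel positions, the 5 km buffer distance, the choice of UTM zone 12N (EPSG:32612) as a metric projection, and the "port_visit" label are all assumptions made for the example.

```r
library(sf)

# One made-up port location (not taken from mx_ports).
port <- st_as_sf(
  data.frame(port = "demo", longitude = -110.0, latitude = 24.0),
  coords = c("longitude", "latitude"), crs = 4326
)

# Made-up vessel positions: the first sits at the port, the second is
# one degree of longitude (roughly 100 km at this latitude) away.
pts <- st_as_sf(
  data.frame(id = 1:2,
             longitude = c(-110.0, -109.0),
             latitude  = c(24.0, 24.0)),
  coords = c("longitude", "latitude"), crs = 4326
)

# Project to a metric CRS so the buffer distance is expressed in metres.
# UTM zone 12N is an assumption that suits these toy coordinates.
port_m <- st_transform(port, 32612)
pts_m  <- st_transform(pts, 32612)

# Hypothetical 5 km buffer around the port.
buffer <- st_buffer(port_m, dist = 5000)

# Label each point by whether it falls inside any buffer.
# "port_visit" is a placeholder label chosen for this sketch.
in_port <- lengths(st_intersects(pts_m, buffer)) > 0
pts$location <- ifelse(in_port, "port_visit", "at_sea")
```

Buffering in a projected CRS keeps the distance unit explicit; buffering directly in longitude/latitude depends on how sf is configured to handle geographic coordinates.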
Now we can check the results on a map:
with_ports_sf <- sf::st_as_sf(with_ports,
                              coords = c("longitude", "latitude"),
                              crs = 4326)
data("mx_shape")
library(ggplot2)
ggplot2::ggplot(mx_shape) +
  geom_sf(col = "gray90") +
  geom_sf(data = with_ports_sf, aes(col = location)) +
  facet_wrap(~ location) +
  theme_bw()