dplyr is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges, including mutate(), select(), filter(), summarise() and arrange().
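As a minimal sketch of how these verbs chain together, assuming the built-in mtcars data set stands in for your own data (the kpl column name and the rounded conversion factor are made up for illustration):

```r
library(dplyr)

# mtcars ships with base R; it is used here only as a stand-in data set.
cars <- mtcars %>%
  mutate(kpl = mpg * 0.425) %>%  # add a column computed from an existing one (roughly km per litre)
  filter(cyl == 4) %>%           # keep only the rows that match a condition
  select(mpg, kpl, cyl)          # keep only the named columns
```

Each verb takes a data frame as its first argument and returns a data frame, which is what makes the pipe work.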
dplyr is designed to abstract over how the data is stored. That means that, as well as working with local data frames, you can also work with remote database tables using exactly the same R code. Install the dbplyr package, then read vignette("databases", package = "dbplyr").
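A minimal sketch of the same verbs running against a database table, assuming the DBI, RSQLite and dbplyr packages are installed and using an in-memory SQLite database as a stand-in for a real remote source:

```r
library(dplyr)

# Assumption: DBI, RSQLite and dbplyr are installed.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a local data frame into the database; the result refers to the remote table.
mtcars_db <- copy_to(con, mtcars, "mtcars")

# The same verbs as for local data frames; the work is done by the database.
by_cyl <- mtcars_db %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg, na.rm = TRUE)) %>%
  arrange(cyl)

# collect() runs the query and returns the result as a local tibble.
result <- collect(by_cyl)

DBI::dbDisconnect(con)
```

Nothing is pulled into R until collect() is called, so the grouping and averaging happen inside the database.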
There are a number of helper functions you can use within select(), such as starts_with(), ends_with(), matches() and contains().
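A short sketch of a few of these helpers, starts_with(), ends_with(), contains() and matches(), assuming the built-in iris data set as example data:

```r
library(dplyr)

# iris ships with base R; its columns are Sepal.Length, Sepal.Width,
# Petal.Length, Petal.Width and Species.
petal_cols  <- select(iris, starts_with("Petal"))  # names beginning with "Petal"
width_cols  <- select(iris, ends_with("Width"))    # names ending with "Width"
length_cols <- select(iris, contains("Length"))    # names containing "Length"
dotted_cols <- select(iris, matches("\\."))        # names matching a regex (a literal dot)
```

The helpers match on column names, not on the values stored in the columns.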
library(dplyr)
library(nycflights13)  # assumed source of the flights data frame

delays <- flights %>%
  group_by(dest) %>%
  filter(distance < 5)
delays <- flights %>%
  group_by(dest) %>%
  summarise(
    count = n(),
    dist = mean(distance, na.rm = TRUE),
    delay = mean(arr_delay, na.rm = TRUE)
  ) %>%
  filter(count > 20, dest != "HNL")
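To see what the grouped summary above computes without loading the full flights data, here is a sketch on a tiny hand-made data frame. The toy_flights name and all of its values are made up for illustration, and the count threshold is lowered to suit the small data.

```r
library(dplyr)

# Hypothetical toy data: two destinations, one missing arrival delay.
toy_flights <- data.frame(
  dest      = c("ATL", "ATL", "BOS", "BOS", "BOS"),
  distance  = c(760, 760, 190, 190, 190),
  arr_delay = c(10, NA, -5, 5, 30)
)

toy_delays <- toy_flights %>%
  group_by(dest) %>%                          # one group per destination
  summarise(
    count = n(),                              # number of rows in each group
    dist  = mean(distance, na.rm = TRUE),     # mean distance per destination
    delay = mean(arr_delay, na.rm = TRUE)     # mean delay, ignoring missing values
  ) %>%
  filter(count > 2)                           # drop destinations with few flights
```

summarise() collapses each group to a single row, so the final filter() acts on per-destination totals rather than on individual flights.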
arrange(flights, year, month, day)
arrange(flights, desc(dep_delay))
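The two orderings above can be tried on a small example. This is a sketch using a hypothetical scores data frame whose names and values are invented for illustration:

```r
library(dplyr)

# Hypothetical data used only to show the ordering.
scores <- data.frame(
  name  = c("a", "b", "c", "d"),
  group = c(2, 1, 2, 1),
  score = c(5, 9, 7, 3)
)

by_group <- arrange(scores, group, score)  # ascending group, ties broken by score
by_score <- arrange(scores, desc(score))   # highest score first
```

With several columns, each additional column is only used to break ties in the ones before it; desc() reverses the order for the column it wraps.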