Ten Time-saving R Hacks
I use these regularly to try to minimize distractions and keep up my production pace
--
The more coding I do, the more sensitive I become to inefficiency. For me, Nirvana is where you can code super quickly without having to do stuff outside your favorite code editor.
So I get pretty frustrated when I can’t do what I want using lean, efficient code, or when I have to go into my filesystem or another program to configure something, or anything else that I view as taking up unnecessary time and effort.
Here are ten hacks I use regularly to try to minimize distractions and keep up my production pace. When I tell people about these, I often get reactions like ‘Why didn’t I know about this?’ So I hope at least some of these will be new and useful to you.
1. Downloading and reading files straight from source
This tip should help you minimize time spent administering local data files and make your entire project more replicable by others. If you have a data file sitting on the web somewhere, such as in Google Drive or at some other URL, the readr package allows you to read it directly from the URL into a dataframe, using functions like read_csv() or read_rds(). For example:
my_df <- readr::read_csv("https://www.website.com/data.csv")
If the data is in a GitHub repo, you can download the raw file by adding ?raw=true to the URL, so here's how you would get some data on speed dating from one of my GitHub repos:
speed_dating_data <- readr::read_rds("https://github.com/keithmcnulty/speed_dating/blob/master/speed_data_data.RDS?raw=true")
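If you do this often, the ?raw=true step can be wrapped in a small function. The following is only a sketch: github_raw_url() is a hypothetical helper name of my own, not something provided by readr or base R, and it assumes you are starting from the ordinary browser URL of a file on GitHub.

```r
# Hypothetical helper (not part of readr or base R): append raw=true to a
# GitHub file URL so that it points at the raw file rather than the web page.
github_raw_url <- function(blob_url) {
  # Leave the URL alone if it already carries the raw flag
  if (grepl("[?&]raw=true", blob_url)) {
    return(blob_url)
  }
  # Use & if the URL already has a query string, ? otherwise
  separator <- if (grepl("?", blob_url, fixed = TRUE)) "&" else "?"
  paste0(blob_url, separator, "raw=true")
}

# Build the raw URL for the speed dating file used in this section
rds_url <- github_raw_url(
  "https://github.com/keithmcnulty/speed_dating/blob/master/speed_data_data.RDS"
)
```

The resulting rds_url can then be passed to readr::read_rds() in the same way as the hand-written URL.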
If you have a weird file type that readr can't process, you can simply use base R's download.file() function to save a copy locally, where you can then read it using whatever the right package is. You don't need the url() function for this, so to download my speed-dating data this way:
# mode = "wb" writes the file in binary mode, which keeps it intact on Windows
download.file("https://github.com/keithmcnulty/speed_dating/blob/master/speed_data_data.RDS?raw=true", destfile = "speed_dating_data.RDS", mode = "wb")
speed_dating_data <- readRDS("speed_dating_data.RDS")
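If you would rather not leave a copy of the file in your project folder, one option is to download to a temporary file and delete it once it has been read. This is a minimal sketch under that assumption: read_remote_rds() is a hypothetical name of my own, and you would swap readRDS() for whatever reader suits your file type.

```r
# Hypothetical helper: download a remote .RDS file to a temporary location,
# read it into the session, and remove the temporary copy afterwards.
read_remote_rds <- function(url) {
  tmp <- tempfile(fileext = ".RDS")
  # Delete the temporary file when the function exits, even on error
  on.exit(unlink(tmp), add = TRUE)
  # mode = "wb" writes the download in binary mode, which matters on Windows
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  readRDS(tmp)
}
```

Calling read_remote_rds() with the speed dating URL above should return the same object as the two-step version, without leaving speed_dating_data.RDS behind in your working directory.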