Speedup R by moving temporary files into memory

Nov 20, 2013. | By: Paul

There are certain situations when you need to work with temporary files in R. For instance, my package Jaatha requires that an external simulation tool is called, which writes it’s results into files. However, reading and writing to disk is really slow compared to data in memory, and hence the use of many temporary files can be a severe performance bottleneck.

However, there is an solution easy for this on (reasonably modern) Linux systems, at least in case the files fit into memory. There normally is a memory disk available under /run/shm, /var/run/shm or /dev/shm, e.g. a file system that behaves like a ‘normal’ file system in any way, but saves the files into memory. Usually, can check whether you have such a memory disk available doing a

df -h | grep shm

You can tell R to place it’s temporary file in this directory by setting the TMPDIR environment variable before starting R:

TMPDIR=/path/to/shm R

This affects only the started R session. If you want to make the change permanently, add a line like

TMPDIR=${TMPDIR-/path/to/shm}

to your ~/.Renviron file.

To see if it works, check the output of

tempdir()

in R.

Update 04.02.2014: Changed .Renviron file so that it does not overwrite existing values of $TMPDIR.

Android Bloatware compiler benchmark scrm ETL Spark

About

I'm a ML Engineer in Munich, Germany.

Social Links