A Modern and Flexible Web Client for R
The curl() and curl_download() functions provide highly configurable drop-in replacements for base url() and download.file() with better performance, support for encryption (https, ftps), gzip compression, authentication, and other libcurl goodies. The core of the package implements a framework for performing fully customized requests where data can be processed in memory, on disk, or streaming via the callback or connection interfaces. Some knowledge of libcurl is recommended; for a more user-friendly web client see the 'httr' package, which builds on this package with HTTP-specific tools and logic.
About the R package:
Other resources:
- libcurl handle options overview (use with handle_setopt in R)
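As a rough illustration of how libcurl options map onto handle_setopt, the sketch below sets a few common options on a fresh handle. Option names are the lowercase libcurl names without the CURLOPT_ prefix; the values and the variable names (timeout_opts, h_opts) are arbitrary choices made for this example.

```r
library(curl)

# List the option names known to this libcurl build that match "timeout"
timeout_opts <- curl_options("timeout")
print(names(timeout_opts))

# Set a few options on a new handle; the values are illustrative only
h_opts <- new_handle()
handle_setopt(h_opts,
  timeout = 10,            # CURLOPT_TIMEOUT: give up after 10 seconds
  followlocation = TRUE,   # CURLOPT_FOLLOWLOCATION: follow HTTP redirects
  useragent = "A cow"      # CURLOPT_USERAGENT
)
```

The same options can also be passed directly to new_handle(), as the request handle example does with copypostfields.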
There are three download interfaces (memory, disk and streaming). Always start by setting up a request handle:
library(curl)
h <- new_handle(copypostfields = "moo=moomooo")
handle_setheaders(h,
"Content-Type" = "text/moo",
"Cache-Control" = "no-cache",
"User-Agent" = "A cow"
)
Perform request and download response in memory:
# Perform the request
req <- curl_fetch_memory("https://httpbin.org/post", handle = h)
# Show some outputs
parse_headers(req$headers)
cat(rawToChar(req$content))
str(req)
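The in-memory response is a list that includes status_code, headers and content. A minimal sketch of how one might wrap the memory interface to fail on HTTP errors follows; fetch_text is a hypothetical helper written for this example, not part of the package, and the header block is handwritten sample data.

```r
library(curl)

# Hypothetical helper: fetch a URL into memory and return the body as text,
# stopping with an error if the server answered with an HTTP error status.
fetch_text <- function(url, handle = new_handle()) {
  res <- curl_fetch_memory(url, handle = handle)
  if (res$status_code >= 400) {
    stop("Request failed with HTTP status ", res$status_code)
  }
  rawToChar(res$content)
}

# parse_headers() also accepts a raw header block directly; this one is
# handwritten sample data, not a real server response.
sample_headers <- charToRaw("HTTP/1.1 200 OK\r\nContent-Type: text/moo\r\n\r\n")
print(parse_headers(sample_headers))
```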
Or alternatively, write response to disk:
tmp <- tempfile()
curl_download("https://httpbin.org/post", tmp, handle = h)
readLines(tmp)
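When downloading to disk it can be useful to handle transfer errors explicitly. The sketch below assumes a hypothetical helper, safe_download, that returns the destination path on success and NULL on failure; the URL and timeout values are placeholders chosen for illustration.

```r
library(curl)

# Hypothetical helper: download to a temporary file and return its path,
# or NULL if the transfer raised an error (e.g. no network connection).
safe_download <- function(url, handle = new_handle()) {
  dest <- tempfile()
  tryCatch(
    curl_download(url, dest, handle = handle),
    error = function(e) {
      message("Download failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Illustrative timeouts so a missing network fails quickly
path <- safe_download("https://httpbin.org/get",
                      handle = new_handle(connecttimeout = 5, timeout = 10))
if (!is.null(path)) print(file.size(path))
```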
Or stream response via Connection interface:
con <- curl("https://httpbin.org/post", handle = h)
open(con)
# Get 3 lines
readLines(con, n = 3)
# Get remaining lines and close connection
readLines(con)
close(con)
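Streaming is also possible through a callback instead of a connection, using curl_fetch_stream. The following is a small sketch that counts bytes as they arrive; the names count_chunk and total_bytes, the URL, and the timeout values are assumptions made for this example.

```r
library(curl)

# Count bytes as they arrive instead of holding the whole body in memory
total_bytes <- 0
count_chunk <- function(chunk) {
  # 'chunk' is a raw vector with the bytes delivered in this call
  total_bytes <<- total_bytes + length(chunk)
}

res <- tryCatch(
  curl_fetch_stream("https://httpbin.org/get", count_chunk,
                    handle = new_handle(connecttimeout = 5, timeout = 10)),
  error = function(e) NULL  # e.g. no network connection
)
if (!is.null(res)) cat("Received", total_bytes, "bytes\n")
```

The callback form suits cases where each chunk is processed and discarded, whereas the connection form fits line-oriented reading with readLines.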
Binary packages for macOS or Windows can be installed directly from CRAN:
install.packages("curl")
Installation from source on Linux requires libcurl. On Debian or Ubuntu use libcurl4-openssl-dev:
sudo apt-get install -y libcurl4-openssl-dev
On Fedora, CentOS or RHEL use libcurl-devel:
sudo yum install libcurl-devel
On macOS libcurl is included with the system, so usually nothing extra is needed. However, if you want to build against the most recent version of libcurl, which also has many extra features enabled (sftp, http2), install curl from Homebrew and then recompile the R package.
brew install curl pkg-config
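You need to set the PKG_CONFIG_PATH environment variable to help R find the non-default curl when building from source. Run this in a clean R session which does not have the curl package loaded already (the path below assumes Homebrew's /usr/local prefix; adjust it if Homebrew is installed elsewhere, such as /opt/homebrew on Apple Silicon):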
Sys.setenv(PKG_CONFIG_PATH="/usr/local/opt/curl/lib/pkgconfig")
install.packages("curl", type = "source")
Afterwards confirm the version using curl::curl_version().
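As a quick check that the rebuilt package picked up the intended libcurl, something along these lines may help. It is only a sketch: which protocols and features appear depends on the libcurl build in use.

```r
library(curl)

v <- curl_version()
cat("libcurl version:", v$version, "\n")

# Check whether optional protocols/features are available in this build
cat("sftp supported:  ", "sftp" %in% v$protocols, "\n")
cat("HTTP/2 supported:", isTRUE(v$http2), "\n")
```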
Because devtools and httr depend on curl, installing with install_github does not work well. The easiest way to install the development version of curl is from a clean R session:
install.packages("https://github.com/jeroen/curl/archive/master.tar.gz", repos = NULL)
Windows users need Rtools to compile from source.