openneuroR is the GitHub repository for the R package openneuro, a tibble-first client for discovering and downloading public OpenNeuro datasets from R.
The package is built for practical data access workflows:
- search the OpenNeuro catalogue
- inspect dataset metadata, snapshots, files, and subjects
- download full datasets, selected files, or selected subjects
- discover derivative outputs such as fMRIPrep and MRIQC
- reuse a local cache instead of re-downloading the same data
- bridge downloaded datasets into BIDS-aware tooling
Installation
The package is not currently on CRAN. Install it from GitHub:
install.packages("pak")
pak::pak("bbuchsbaum/openneuroR")
# or
install.packages("remotes")
remotes::install_github("bbuchsbaum/openneuroR")
Then load the package:
library(openneuro)
Optional system tools
Basic usage works with the built-in HTTPS backend. For faster or more robust downloads, openneuro can also use external tools when they are available:
- aws CLI for fast S3-based downloads
- datalad plus git-annex for verified, resumable dataset fetches
Check which backends are available on your machine:
Backend selection is automatic by default:
- datalad if available
- otherwise s3 if AWS CLI is available
- otherwise https
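The priority order above can be sketched in base R. This is an illustration only, not the package's own implementation: choose_backend() and tool_found() are hypothetical helpers, and the sketch assumes the datalad backend needs both datalad and git-annex on the PATH.

```r
# Hypothetical helper (not part of the openneuro API): applies the
# documented priority order to two availability flags.
choose_backend <- function(has_datalad, has_aws) {
  if (has_datalad) {
    "datalad"
  } else if (has_aws) {
    "s3"
  } else {
    "https"
  }
}

# Hypothetical helper: TRUE only if every named program is on the PATH.
# Sys.which() returns "" for programs it cannot find.
tool_found <- function(tools) all(nzchar(Sys.which(tools)))

backend <- choose_backend(
  has_datalad = tool_found(c("datalad", "git-annex")),  # assumption: both required
  has_aws     = tool_found("aws")
)
print(backend)
```

Keeping the decision in a pure function makes the fallback order easy to check without installing any of the external tools.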
Quick Start
Search and inspect datasets
library(openneuro)
results <- on_search(modality = "MRI", limit = 10)
results[, c("id", "name", "n_subjects")]
meta <- on_dataset("ds000001")
snaps <- on_snapshots("ds000001")
files <- on_files("ds000001")
subjects <- on_subjects("ds000001")
Download only what you need
Download a few files:
on_download(
id = "ds000001",
files = c("dataset_description.json", "participants.tsv")
)
Download specific subjects without derivatives:
on_download(
id = "ds000001",
subjects = c("01", "02"),
include_derivatives = FALSE
)
Use the regex() helper for subject selection:
on_download(
id = "ds000001",
subjects = regex("sub-0[1-5]")
)
Discover and download derivatives
derivs <- on_derivatives("ds000001")
derivs[, c("dataset_id", "pipeline", "source")]
spaces <- on_spaces(derivs[1, ])
spaces
Download fMRIPrep outputs for selected subjects in a specific space:
on_download_derivatives(
dataset_id = "ds000001",
pipeline = "fmriprep",
subjects = c("01", "02"),
space = "MNI152NLin2009cAsym"
)
Work lazily with handles
If you want to define a dataset reference first and fetch it later, use a handle:
handle <- on_handle("ds000001", files = "participants.tsv")
handle <- on_fetch(handle)
path <- on_path(handle)
This pattern is useful in pipelines where data should only be downloaded when it is actually needed.
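One way to use this in a pipeline is to wrap the handle in a small loader that fetches on first use and then reuses the result. The sketch below is only an illustration. lazy_table() and its fetch and path_of arguments are hypothetical, and it assumes that on_path() returns the local directory holding the fetched files. The demo runs against stand-in functions and a temporary directory, so it does not touch the network.

```r
# Hypothetical helper: returns a function that fetches the handle and reads
# `file` on the first call only, then serves the cached table.
# The defaults point at the package functions shown above; they are evaluated
# lazily, so passing stand-ins (as in the demo) does not require the package.
lazy_table <- function(handle, file,
                       fetch = openneuro::on_fetch,
                       path_of = openneuro::on_path) {
  cached <- NULL
  function() {
    if (is.null(cached)) {
      fetched <- fetch(handle)  # the download would happen here
      # Assumption: path_of() yields the directory that contains `file`.
      cached <<- utils::read.delim(file.path(path_of(fetched), file))
    }
    cached
  }
}

# Demo with stand-ins for the fetch and path functions.
demo_dir <- tempfile("ds-demo")
dir.create(demo_dir)
writeLines(c("participant_id\tage", "sub-01\t25"),
           file.path(demo_dir, "participants.tsv"))

n_fetches <- 0
get_participants <- lazy_table(
  handle  = list(id = "demo"),
  file    = "participants.tsv",
  fetch   = function(h) { n_fetches <<- n_fetches + 1; h },
  path_of = function(h) demo_dir
)

p1 <- get_participants()  # first call triggers the (stand-in) fetch
p2 <- get_participants()  # second call reuses the cached table
print(n_fetches)
```

With the real package the call would look like lazy_table(on_handle("ds000001", files = "participants.tsv"), "participants.tsv"), subject to the assumption about on_path() stated above.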
Core Functions
| Function | Purpose |
|---|---|
| on_search() | Search or list datasets |
| on_dataset() | Retrieve dataset metadata |
| on_snapshots() | List versioned dataset snapshots |
| on_files() | List files within a dataset |
| on_subjects() | List subjects in a dataset |
| on_download() | Download raw data, specific files, or subject subsets |
| on_derivatives() | Discover available derivative datasets |
| on_spaces() | Inspect output spaces for derivatives |
| on_download_derivatives() | Download derivative outputs |
| on_handle() / on_fetch() | Create and materialize lazy dataset handles |
| on_cache_info() / on_cache_list() / on_cache_clear() | Inspect and manage the local cache |
| on_bids() | Convert a fetched dataset to a bidser BIDS project |
Cache And Download Behavior
By default, downloads go into a local cache. Repeated downloads skip files that are already present and tracked in the manifest, which makes it practical to:
- pull just a few files for exploration
- expand a partial download later
- revisit the same dataset without starting from scratch
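The skip rule described above amounts to filtering the requested files against what is already on disk and tracked. The following base R sketch illustrates the idea under stated assumptions: files_to_fetch() is a hypothetical helper, and the manifest is modelled as a plain character vector of relative paths, which may differ from the package's actual manifest format.

```r
# Hypothetical helper: return the requested files that still need downloading.
# A file is skipped only if it exists in the cache AND is listed in the manifest.
files_to_fetch <- function(requested, manifest, cache_dir) {
  present <- file.exists(file.path(cache_dir, requested))
  tracked <- requested %in% manifest
  requested[!(present & tracked)]
}

# Demo against a temporary cache directory holding one tracked file.
cache_dir <- tempfile("ds000001-cache")
dir.create(cache_dir)
writeLines("{}", file.path(cache_dir, "dataset_description.json"))
manifest <- "dataset_description.json"

requested <- c("dataset_description.json", "participants.tsv")
pending <- files_to_fetch(requested, manifest, cache_dir)
print(pending)
```

Requiring both conditions means a file that is on disk but missing from the manifest (for example, after an interrupted transfer) would be fetched again in this sketch.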
Use the cache helpers to inspect or clean up local state:
on_cache_info()
on_cache_list()
on_cache_clear("ds000001", confirm = FALSE)