openneuroR is the GitHub repository for the R package openneuro, a tibble-first client for discovering and downloading public OpenNeuro datasets from R.

The package is built for practical data access workflows:

  • search the OpenNeuro catalogue
  • inspect dataset metadata, snapshots, files, and subjects
  • download full datasets, selected files, or selected subjects
  • discover derivative outputs such as fMRIPrep and MRIQC
  • reuse a local cache instead of re-downloading the same data
  • bridge downloaded datasets into BIDS-aware tooling

Installation

The package is not currently on CRAN. Install it from GitHub:

install.packages("pak")
pak::pak("bbuchsbaum/openneuroR")

# or
install.packages("remotes")
remotes::install_github("bbuchsbaum/openneuroR")

Then load the package:

library(openneuro)

Optional system tools

Basic usage works with the built-in HTTPS backend. For faster or more robust downloads, openneuro can also use external tools when they are available:

  • aws CLI for fast S3-based downloads
  • datalad plus git-annex for verified, resumable dataset fetches

Check which backends are available on your machine:
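
The package's own helper for this check is not reproduced here. As a package-independent sketch, base R's Sys.which() can report whether the external executables are visible on the PATH. The function name external_tools() is hypothetical and not part of openneuro.

```r
# Hypothetical helper (not part of openneuro): report which external
# command-line tools are visible on the PATH.
external_tools <- function(tools = c("aws", "datalad", "git-annex")) {
  # Sys.which() returns "" for executables it cannot find
  found <- nzchar(Sys.which(tools))
  names(found) <- tools
  found
}

tools <- external_tools()
print(tools)
```

A tool that is installed but not on the PATH of the R session will be reported as missing, so the result can differ between a terminal and an IDE.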

By default, the backend is chosen automatically, in this order of preference:

  • datalad if available
  • otherwise s3 if AWS CLI is available
  • otherwise https
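
The preference order above amounts to a simple fallback chain. The following is a minimal sketch of that logic, not the package's internal code; pick_backend() and its arguments are hypothetical names introduced for illustration.

```r
# Hypothetical sketch of the documented preference order; this is not
# the package's internal implementation.
pick_backend <- function(has_datalad = FALSE, has_aws = FALSE) {
  if (has_datalad) {
    "datalad"
  } else if (has_aws) {
    "s3"
  } else {
    "https"  # built-in backend, used when no external tool is found
  }
}

pick_backend(has_datalad = TRUE, has_aws = TRUE)  # "datalad"
pick_backend(has_aws = TRUE)                      # "s3"
pick_backend()                                    # "https"
```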

Quick Start

Search and inspect datasets

library(openneuro)

results <- on_search(modality = "MRI", limit = 10)
results[, c("id", "name", "n_subjects")]

meta <- on_dataset("ds000001")
snaps <- on_snapshots("ds000001")
files <- on_files("ds000001")
subjects <- on_subjects("ds000001")

Download only what you need

Download a few files:

on_download(
  id = "ds000001",
  files = c("dataset_description.json", "participants.tsv")
)

Download specific subjects without derivatives:

on_download(
  id = "ds000001",
  subjects = c("01", "02"),
  include_derivatives = FALSE
)

Use the regex() helper for subject selection:

on_download(
  id = "ds000001",
  subjects = regex("sub-0[1-5]")
)
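
To preview what a pattern such as the one above would select, it can be tried against subject labels with base R first. This sketch uses made-up labels and base R's grep(); whether regex() in the package matches partially or against the full label is not stated here, so the anchored variant is shown as a precaution.

```r
# Hypothetical subject labels, for illustration only
subject_ids <- sprintf("sub-%02d", 1:16)

# Unanchored pattern, as in the call above
unanchored <- grep("sub-0[1-5]", subject_ids, value = TRUE)
unanchored  # "sub-01" "sub-02" "sub-03" "sub-04" "sub-05"

# An unanchored pattern also matches longer labels such as "sub-010";
# anchoring with ^ and $ restricts the match to the whole label.
with_extra <- c(subject_ids, "sub-010")
anchored <- grep("^sub-0[1-5]$", with_extra, value = TRUE)
anchored    # "sub-01" "sub-02" "sub-03" "sub-04" "sub-05"
```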

Discover and download derivatives

derivs <- on_derivatives("ds000001")
derivs[, c("dataset_id", "pipeline", "source")]

spaces <- on_spaces(derivs[1, ])
spaces

Download fMRIPrep outputs for selected subjects in a specific space:

on_download_derivatives(
  dataset_id = "ds000001",
  pipeline = "fmriprep",
  subjects = c("01", "02"),
  space = "MNI152NLin2009cAsym"
)

Work lazily with handles

If you want to define a dataset reference first and fetch it later, use a handle:

handle <- on_handle("ds000001", files = "participants.tsv")
handle <- on_fetch(handle)
path <- on_path(handle)

This pattern is useful in pipelines where data should only be downloaded when it is actually needed.

Bridge into BIDS-aware workflows

If you use bidser, a fetched handle can be converted into a BIDS project:

handle <- on_handle("ds000001")
handle <- on_fetch(handle)
bids <- on_bids(handle)

Core Functions

Function                                              Purpose
on_search()                                           Search or list datasets
on_dataset()                                          Retrieve dataset metadata
on_snapshots()                                        List versioned dataset snapshots
on_files()                                            List files within a dataset
on_subjects()                                         List subjects in a dataset
on_download()                                         Download raw data, specific files, or subject subsets
on_derivatives()                                      Discover available derivative datasets
on_spaces()                                           Inspect output spaces for derivatives
on_download_derivatives()                             Download derivative outputs
on_handle() / on_fetch()                              Create and materialize lazy dataset handles
on_cache_info() / on_cache_list() / on_cache_clear()  Inspect and manage the local cache
on_bids()                                             Convert a fetched dataset to a bidser BIDS project

Cache And Download Behavior

By default, downloads go into a local cache. Repeated downloads skip files that are already present and tracked in the manifest, which makes it practical to:

  • pull just a few files for exploration
  • expand a partial download later
  • revisit the same dataset without starting from scratch

Use the cache helpers to inspect or clean up local state:

on_cache_info()
on_cache_list()
on_cache_clear("ds000001", confirm = FALSE)
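
The skip rule described in this section (a file is skipped only when it is both present on disk and tracked in the manifest) can be sketched in base R. This is an illustration under assumptions, not the package's implementation: files_to_fetch() is a hypothetical function, and the manifest is modelled as a plain character vector of relative paths, which may differ from the real manifest format.

```r
# Hypothetical sketch: decide which requested files still need downloading.
# `manifest` is an assumed character vector of relative paths recorded by
# earlier downloads.
files_to_fetch <- function(requested, cache_dir, manifest) {
  on_disk <- file.exists(file.path(cache_dir, requested))
  tracked <- requested %in% manifest
  # Skip only files that are both present and tracked
  requested[!(on_disk & tracked)]
}

# Simulate a cache that already holds one tracked file
cache_dir <- file.path(tempdir(), "ds000001-demo")
dir.create(cache_dir, showWarnings = FALSE)
writeLines("{}", file.path(cache_dir, "dataset_description.json"))
manifest <- "dataset_description.json"

pending <- files_to_fetch(
  c("dataset_description.json", "participants.tsv"),
  cache_dir,
  manifest
)
pending  # "participants.tsv"
```

Requiring both conditions means a file that exists on disk but is missing from the manifest, for example after an interrupted transfer, is fetched again instead of being trusted.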

License

MIT