Stony Brook University

Research Data Guide

Resources to help you manage your research data.

Storage vs. Preservation vs. Access

Modified Venn diagram showing the overlap between Storage, Preservation, and Access

storage ≠ preservation ≠ access

Just as data storage does not guarantee data preservation, neither storage nor preservation automatically means that your data has been made accessible.

This diagram shows the relationship between preservation (white), access (yellow), and storage (green). For data to be preserved or accessible, it must be stored. However, storing data does not guarantee that the data is preserved or that any kind of public access is provided. Access can also be provided without data being properly preserved.   

There are also preservation repositories that are not set up for access. These are called “dark archives”.

Data Repositories

Data can be made accessible by putting it into a repository.

There are three broad types of repositories:

  1. Many institutions have an institutional repository to store data created by the institution’s researchers.
  2. In some areas, there are discipline-specific repositories to hold data within a specific subject domain.
  3. Finally, cross-disciplinary repositories hold data from across many disciplines. 

Finding a Repository

Data Publishing

Another way to provide access to data is through a data publication. A data journal publishes descriptions of the data itself, unlike most journals, which feature the analysis of that data and the results. Some data journals will also store the dataset.

Examples of Data Journals

Funder Mandates and Publisher Requirements

Federal funders have a wide variety of data management plan and data sharing requirements. The general trend from funding agencies is towards increased openness and stricter requirements around data sharing.

Funders

NSF

All proposals submitted since 2011 require a two-page data management plan. Data sharing is required. Per Chapter XI of the “Proposal & Award Policies & Procedures Guide”: “Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.”

NIH

In 2003, the NIH released a data sharing policy requiring applications requesting $500,000 or more in direct costs in any single year to include a data sharing plan.

In 2020, the NIH issued the new Final NIH Policy for Data Management and Sharing, "which will require NIH funded researchers to prospectively submit a plan outlining how scientific data from their research will be managed and shared." This policy took effect on January 25, 2023, replacing the 2003 NIH Data Sharing Policy.

NEH

The NEH Office of Digital Humanities requires a data management plan that “clearly articulate[s]” how grantees will share their data.

Publishers

As with funding agencies, the trend among publishers is towards increased openness and stricter requirements around data sharing.

PLOS 

In 2014, PLOS became the first publisher to require that authors show proof that they have shared their data. If proof is not provided, PLOS can reject the paper outright. If the paper has already been published and an author removes the data from public view, PLOS can retract it.

Other Publishers

Many other journals, such as Nature, Science, and Cell, have language in their policies stating that the data necessary to understand and assess the conclusions of a manuscript must be shared.