https://www.nature.com/sdata/
Novel DNA sequence, novel RNA sequence, and novel genome assembly data must be deposited to repositories that are part of the International Nucleotide Sequence Database Collaboration (INSDC) or to those which are working towards INSDC inclusion (as listed below), unless there are privacy or ethics restrictions that prevent open sharing of such data. These data may in addition be deposited to regional and national repositories as required. For human data that require special controls, please see our recommended health sciences repositories.
Data types | Repository options | Data and metadata standards |
Raw sequencing data (reads or traces) | INSDC repositories | Browse data and metadata standards endorsed by the Genomic Standards Consortium |
Genetic variation data | dbSNP (human variations less than 50 bp) |
UniProtKB | view FAIRsharing entry |
These repositories accept structural data for small molecules; peptides and proteins (all); and larger assemblies (EMDB).
Small molecule crystallographic data should be uploaded to Dryad or figshare before manuscript submission, and should include a .cif file and structure factors for each structure. Both the structure factors and the structural output must have been checked using the IUCr's checkCIF routine, and a copy of the output must be included at submission, together with a justification for any alerts reported.
These data repositories all accept human-derived data (NeuroMorpho.org and G-Node also accept data from other organisms). Please note that human-subject data submitted to OpenfMRI must be de-identified.
Functional genomics
Functional genomics is a broad experimental category, and Scientific Data's recommendations in this area likewise span disparate research disciplines. Data should be deposited following the relevant community requirements where possible.
Please refer to the MIAME standard for microarray data. Molecular interaction data should be deposited with a member of the International Molecular Exchange Consortium (IMEx), following the MIMIx recommendations.
For data linking genotyping and phenotyping information in human subjects, we strongly recommend submission to dbGaP, EGA or JGA, which have mechanisms in place to handle sensitive data.
Metabolomics & Proteomics
Metabolomics data should be submitted following the MSI guidelines.
We ask authors to submit proteomics data to members of the ProteomeXchange consortium (listed below), following the MIAPE recommendations.
MassIVE | view FAIRsharing entry |
MetaboLights | view FAIRsharing entry |
PeptideAtlas | view FAIRsharing entry |
PRIDE | view FAIRsharing entry |
Panorama Public | view FAIRsharing entry |
BioModels Database | view FAIRsharing entry |
Kinetic Models of Biological Systems (KiMoSys) | view FAIRsharing entry |
The Network Data Exchange (NDEx) | view FAIRsharing entry |
FlowRepository | view FAIRsharing entry |
ImmPort | view FAIRsharing entry |
These resources provide information specific to a particular organism or disease pathogen. They may accept phenotype information, sequences, genome annotations and gene expression patterns, among other types of data. Incorporating data into these resources can be very valuable for promoting reuse within these specific communities; however, where applicable, we ask that data records be submitted both to a community repository and to one suitable for the type of data (e.g. transcriptome profiling; please see above).
Some of the repositories in this section are suitable for datasets requiring restricted data access, which may be required for the preservation of study participant anonymity in clinical datasets. We suggest contacting repositories directly to determine those with data access controls best suited to the specific requirements of your study.
Earth, Environmental and Space sciences
SIMBAD Astronomical Database | view re3data entry |
UK Solar System Data Centre | view re3data entry |
EarthChem | view re3data entry |
Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) | view re3data entry |
World Data Center for Climate at DKRZ (WDCC) | view re3data entry |
TERN Data Discovery Portal | view FAIRsharing entry |
Environmental Data Initiative (formerly LTER Network Information System Data Portal) | view re3data entry |
Global Biodiversity Information Facility (GBIF) | view FAIRsharing entry |
KNB: The Knowledge Network for Biocomplexity | view FAIRsharing entry |
Magnetics Information Consortium (MagIC) | view re3data entry |
Australian Antarctic Data Centre (AADC) | view re3data entry |
Australian Ocean Data Network (DOIs only assigned to deposited data on request) | view re3data entry |
Marine Data Archive | |
Marine Geosciences Data System | view re3data entry |
SEANOE | view FAIRsharing entry |
HEPData | view re3data entry |
NoMaD Repository | view FAIRsharing entry |
Materials Cloud | view FAIRsharing entry |
MPContribs | view re3data entry |
Scientific Data encourages authors to archive data to one of the above data-type specific repositories where possible. Where a data-type specific repository is not available, the following generalist repositories might be suitable. Generalist repositories may also be appropriate for archiving associated analyses, or experimental-control data, supplementing the primary data in a discipline-specific repository.
The generalist repositories listed below are able to accept data from all researchers, regardless of location or funding source. If your institution has its own generalist data repository, this can be used to host your data as long as the repository is able to mint DataCite DOIs and allows data to be shared under open terms of use (for example, the CC0 waiver). Please note that if your chosen repository is unable to support confidential peer review, you will be asked to temporarily deposit a copy of the dataset to one of our integrated generalist repositories to facilitate review of your article. Upon completion of peer review, the temporary copy will be erased. To use a repository which does not appear in the manuscript submission system, select 'DataCite DOI' as the repository name during the submission process.
Repository Name | Information on fees/costs | Size limits | Integrated with Scientific Data's manuscript submission system | re3data / FAIRsharing entry |
Dryad Digital Repository | $120 USD for first 20 GB, and $50 USD for each additional 10 GB | None stated | Yes | view FAIRsharing entry |
figshare | 100 GB free per Scientific Data manuscript | 1 TB per dataset | Yes | view FAIRsharing entry |
Harvard Dataverse | Contact repository for datasets over 1 TB | 2.5 GB per file, 10 GB per dataset | No | view re3data entry |
Open Science Framework | Free of charge | 5 GB per file, multiple files can be uploaded | No | view FAIRsharing entry |
Zenodo | Donations towards sustainability encouraged | 50 GB per dataset | No | view re3data entry |
Science Data Bank | Free of charge | 8 GB per file, no limit to dataset size | No | view FAIRsharing entry |
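
As a rough illustration of how the fees and size limits in the table above might be applied when choosing a generalist repository, the following Python sketch estimates the Dryad charge and lists the repositories whose stated limits a dataset would satisfy. It is not an official tool. The figures are copied from the table as listed and may have changed since. The function names, the rounding of a partial 10 GB block up to a full block, and the conversion 1 TB = 1000 GB are assumptions made for this example.

```python
import math

# Size limits in GB as (per-file limit, per-dataset limit), copied from the
# table above. None means the table states no limit of that kind.
# Assumption: 1 TB is treated as 1000 GB.
LIMITS_GB = {
    "Dryad Digital Repository": (None, None),
    "figshare": (None, 1000.0),
    "Harvard Dataverse": (2.5, 10.0),
    "Open Science Framework": (5.0, None),
    "Zenodo": (None, 50.0),
    "Science Data Bank": (8.0, None),
}


def dryad_fee_usd(dataset_gb):
    """Estimate the Dryad fee: $120 for the first 20 GB, $50 per extra 10 GB.

    Assumption: a partly used 10 GB block is billed as a full block.
    """
    if dataset_gb < 0:
        raise ValueError("dataset size cannot be negative")
    extra_gb = max(0.0, dataset_gb - 20)
    extra_blocks = math.ceil(extra_gb / 10)
    return 120 + 50 * extra_blocks


def repositories_that_fit(file_sizes_gb):
    """Return the repositories whose stated limits the dataset satisfies."""
    largest_file = max(file_sizes_gb, default=0.0)
    total = sum(file_sizes_gb)
    fits = []
    for name, (per_file, per_dataset) in LIMITS_GB.items():
        if per_file is not None and largest_file > per_file:
            continue  # at least one file exceeds the per-file limit
        if per_dataset is not None and total > per_dataset:
            continue  # the dataset as a whole exceeds the per-dataset limit
        fits.append(name)
    return fits


if __name__ == "__main__":
    files = [3.0, 4.0, 6.0]  # hypothetical file sizes in GB
    print(repositories_that_fit(files))
    print(dryad_fee_usd(sum(files)))
```

Under these assumptions, a 45 GB dataset would cost $120 plus three additional 10 GB blocks at $50 each, or $270 in total. Size limits are only one factor; fees, access controls and integration with the manuscript submission system should still be confirmed with the repository itself.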