If you are in the Earth sciences, the USGS data management web site will be a good resource for you. It provides comprehensive information on how to manage spatial data, including the creation of metadata, links to training modules, and federal policy documents.
U.S. funding agencies such as the National Science Foundation and the National Institutes of Health require researchers to supply detailed, cost-effective plans for managing research data, called Data Management Plans.
Several universities and organizations are developing the DMPTool to help researchers meet these new requirements. Specifically, the DMPTool will help researchers:
> Create ready-to-use data management plans for specific funding agencies
> Meet requirements for data management plans
> Get step-by-step instructions and guidance for data management plans
> Learn about resources and services available at their institution to fulfill the data management requirements of their grants
If you need to bundle and transfer a dataset, BagIt is the tool for the job. It is described as "a hierarchical file packaging format designed to support disk-based storage and network transfer of arbitrary digital content. A 'bag' consists of a 'payload' (the arbitrary content) and 'tags', which are metadata files intended to document the storage and transfer of the bag. A required tag file contains a manifest listing every file in the payload together with its corresponding checksum."
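To make the bag layout concrete, here is a minimal sketch in Python that builds and checks a bag using only the standard library. It is an illustration under stated assumptions, not the official BagIt tooling: the function names `make_minimal_bag` and `verify_bag` are hypothetical, SHA-256 is one of several checksum algorithms a bag may use, and optional tag files such as `bag-info.txt` are left out.

```python
import hashlib
import shutil
from pathlib import Path


def make_minimal_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write the two required tag files.

    Hypothetical helper for illustration; bag_dir/data must not exist yet.
    """
    bag = Path(bag_dir)
    payload = bag / "data"
    # The payload (the arbitrary content) lives under data/.
    shutil.copytree(Path(source_dir), payload)

    # Bag declaration: marks the directory as a bag.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Manifest: one "<checksum>  <path relative to the bag>" line per payload file.
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag).as_posix()}\n")
    (bag / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag


def verify_bag(bag_dir):
    """Return the manifest paths that are missing or whose checksum differs."""
    bag = Path(bag_dir)
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    failures = []
    for line in manifest.splitlines():
        expected, rel_path = line.split(maxsplit=1)
        target = bag / rel_path
        if not target.is_file():
            failures.append(rel_path)
        elif hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            failures.append(rel_path)
    return failures
```

After a transfer, the recipient would run something like `verify_bag("my_bag")`; an empty list means every payload file listed in the manifest arrived with a matching checksum. For real projects, an established BagIt implementation is likely a better choice than a hand-rolled script like this one.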
"The Investigator Toolkit is a collection of software tools for finding, using, and contributing data in DataONE. Some of these tools have been custom written for DataONE, some are existing tools that have been modified to use the DataONE Application Programming Interface (API), and some are tools that have well defined interfaces of their own which can be called by DataONE tools."
"Kepler is designed to help scientists, analysts, and computer programmers create, execute, and share models and analyses across a broad range of scientific and engineering disciplines. Kepler can operate on data stored in a variety of formats, locally and over the internet, and is an effective environment for integrating disparate software components, such as merging 'R' scripts with compiled 'C' code, or facilitating remote, distributed execution of models."