Data

Data binds all of today’s research together. Even if you don’t think of your research as data-based, the results of your work become data before they are published. Funding agencies, up to the highest levels, are beginning to demand good data management and openness. Knowing how to manage data is probably one of the most important untaught modern skills.

It’s not just “get it done”: there are good and bad methods of managing data. For example,

  • A bad strategy is to store everything in one folder on your own laptop: there’s a very high chance that you will someday lose it all. A better strategy is to use a secure centralized service - preferably an Aalto service, since you get guaranteed support for free.
  • A bad strategy is for everyone to do their own thing and put no effort into recording what they have done: in five years, when the group has almost completely changed, that data will be unusable. A better strategy is to make sure things are documented and archived as soon as you get them (and to keep this up to date), so that the data can continue to serve you in the future.
  • A bad strategy is to assume all data is proprietary Aalto information: eventually funders will demand more and you won’t be prepared, and Aalto will remain an island, instead of a hub that others want to work with and build on. A better strategy is to always consider openness, licensing, and privacy from the start (even if you don’t do it right away), and always separate data based on level of confidentiality so that you can share or open later.
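
As a rough illustration of the last two points, the sketch below creates one folder per confidentiality level for a project and drops a README stub into each, so that documentation exists from day one. The level names, the README fields, and the function name create_project_layout are assumptions made for this example, not an Aalto convention; adapt them to your own group’s rules.

```python
from pathlib import Path

# Hypothetical confidentiality levels; replace with your project's own scheme.
LEVELS = ("public", "internal", "confidential")

# Hypothetical README stub; the fields are only a starting point.
README_TEMPLATE = """Dataset: {name}
Confidentiality: {level}
Contact: (fill in)
Description: (what the data is, how and when it was produced, by whom)
"""


def create_project_layout(root, name):
    """Create one subfolder per confidentiality level, each with a README stub."""
    created = []
    for level in LEVELS:
        folder = Path(root) / name / level
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.txt"
        if not readme.exists():  # never overwrite documentation that already exists
            readme.write_text(
                README_TEMPLATE.format(name=name, level=level), encoding="utf-8"
            )
        created.append(folder)
    return created
```

Calling create_project_layout with the path of your group’s shared storage and a project name gives every new dataset a documented home, and keeps anything that can be shared or opened later apart from the confidential material.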

You can find more formal information at the Aalto Research Data Management pages, and here we focus on the practical side of things.

Applying for funding

When applying for funding, you may need to submit a data management plan (DMP) along with the grant application. For hints on making one, see our data management plan page or the Aalto Research Data Management pages. However, be aware that a grant application data management plan (“Funder DMP”) usually focuses on what sounds good in a grant application, not on being a usable work plan (“Practical DMP”). Before you start accumulating data, browse the other links on this site and make sure that you organize things well! Aalto info will only help you make a funder DMP, not organize your data during the project.

Grantwriters and the Open Science and ACRIS teams can help you with producing data management plans for funding. Science-IT can help you with funding or practical data management plans.

During the project

Make sure that you manage data well - just think, your data is possibly worth more than all your devices combined. Check out the core lessons to learn about the most common problems, and see if any of them apply to you.

You may want to read our welcome to researchers and outline of data management at Aalto pages. For specific Aalto storage services, see Data storage, and for other options see the general services page.

We recommend that each project or group gets a network drive, which is used as the centralized place for data storage, safekeeping, and possibly daily work. See the outline of data management at Aalto page.

Internal reporting

Data is a top-level research output, even if not everyone considers it valuable now. Open or not, the university wants to know what data exists. Currently, this is recorded in ACRIS (primary instructions). In particular, you should create a “dataset” object for the data you create (it doesn’t have to be open). For now, some hints are available in the ACRIS point on the services page and in the ACRIS instructions on data.

Sharing and collaboration

Obviously, you will often need to share data within projects. Emailing things back and forth is rarely a good way to do this. Look for other data sharing services on our services page or Aalto’s IT services for research page.

We recommend treating this as a storage problem rather than a sharing problem: find a place to store data which everyone can access, and share via that. This promotes long-term organization.

Archival after the project

After a project is done, you may need to store data long-term for follow-up use. You shouldn’t do this just by assuming everyone keeps their own copy: people leave, and eventually the data will get lost. The easiest and recommended way of doing this is to open the data and publish it in a reputable worldwide archive once it is time. For the most part, the university wants to avoid creating its own internal permanent archives, because they end up requiring a large effort to maintain. It’s better to use the publicly funded and managed worldwide services.

Publication

See our list of storage services for recommendations on archival. If you don’t know what to pick (there isn’t something specialized for your field), use Zenodo and report it in ACRIS (see “internal reporting” above).

Licensing and intellectual property

Just because data is “out there” doesn’t mean it’s usable by others: big companies have ensured that data is by default closed. Luckily, it is easy to make data reusable: just add a license. There are plenty of options that can balance between “public domain, do anything” and “if you help us too”.

See our Open Source page for more info.
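
As a minimal sketch of what “just add a license” can look like in practice, the snippet below writes a small metadata file that states the license next to the data. The file name metadata.json, its fields, and the function name write_license_metadata are assumptions made for this example; the license is given as an SPDX identifier such as CC-BY-4.0 or CC0-1.0. Which license fits your data is a separate decision, and an archive you publish to may ask for the license in its own form instead.

```python
import json
from pathlib import Path


def write_license_metadata(dataset_dir, license_id, creator):
    """Write a metadata.json (hypothetical name) declaring the dataset's license."""
    metadata = {
        "license": license_id,  # SPDX identifier, e.g. "CC-BY-4.0"
        "creator": creator,
    }
    path = Path(dataset_dir) / "metadata.json"
    path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")
    return path
```

Keeping the license statement in the same folder as the data means it travels with every copy, so a later reader does not have to guess what they are allowed to do.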