Research Data Management: Managing your Research Data

File formats, management and version control

Create, name and organise your data files according to best practice.

File formats for archiving, preservation and access

State and national archives authorities recommend file formats that are suitable for long-term preservation and access.

Organising your Data folders

Organising your data folders consistently, on network drives and in other locations, will make it easier to find and manage your data files.

Useful file names

A useful file name could include:

  • Project or experiment name or acronym
  • Location or spatial coordinates
  • Researcher name or initials
  • Date or date range of experiment
  • Type of data
  • Conditions
  • Version number of file
  • Three-letter file extension for application-specific files

These are suggestions; include whatever information will allow you to distinguish your files from each other and clearly indicate to you what is in them.
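
As an illustration only, the elements above can be combined into a file name with a short script. This is a minimal sketch: the function name build_file_name, the order of the elements, the underscore separator and the example values (SOIL, JSmith) are all assumptions made for the example, not a required convention.

```python
from datetime import date


def build_file_name(project, researcher, day, data_type, version, extension):
    """Join the naming elements with underscores (hypothetical convention)."""
    parts = [
        project,                   # project or experiment acronym
        researcher,                # researcher name or initials
        day.strftime("%Y%m%d"),    # date as YYYYMMDD
        data_type,                 # type of data
        f"v{version:02d}",         # zero-padded version number
    ]
    return "_".join(parts) + "." + extension


name = build_file_name("SOIL", "JSmith", date(2024, 3, 15), "pH", 2, "csv")
print(name)  # SOIL_JSmith_20240315_pH_v02.csv
```

Generating names from one function keeps every file in a project consistent; adjust the elements and their order to suit your own data.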

Other tips for file naming

  • A good format for dates is YYYYMMDD (or YYMMDD). Because the year comes first, files sorted by name stay in chronological order.
  • Keep file names short; longer names do not work well with all types of software.
  • Avoid special characters, for example ~ ! @ # $
  • Avoid spaces; they are not recognised by some software.
  • Instead of spaces, use underscores (file_name), dashes (file-name), no separation (filename), or camel case (FileName).
  • Include a README.txt file in your directory that explains your naming convention along with any abbreviations or codes you have used.
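
The tips on spaces and special characters can be applied automatically to existing file names. The following is a minimal sketch, assuming you want spaces replaced with underscores and every character other than letters, digits, underscores and dashes removed; the function name tidy_file_name and the example file name are hypothetical.

```python
import re


def tidy_file_name(raw_name):
    """Return a file name without spaces or special characters."""
    # Split off the extension at the last dot, if there is one.
    stem, dot, extension = raw_name.rpartition(".")
    if not dot:
        stem, extension = raw_name, ""
    # Replace each run of whitespace with a single underscore.
    stem = re.sub(r"\s+", "_", stem.strip())
    # Drop anything that is not a letter, digit, underscore or dash.
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)
    return stem + ("." + extension if extension else "")


print(tidy_file_name("soil samples #3 ~final!.csv"))  # soil_samples_3_final.csv
```

Check the tidied names before renaming anything, because two different original names can become identical once characters are removed.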

Data and Text Mining and Library Databases

A number of publishers allow members of subscribing institutions to carry out data and text mining on their licensed resources. This access is generally governed by database licences, terms and conditions, and existing copyright provisions.

Some publishers will require you to use tools that they provide to mine their content, or will conduct the process for you. In this way they can manage the quantity of data being accessed and the impact on their servers.

Downloading large amounts of data can trigger automatic lockouts and prevent access to resources by other users. Some publishers may apply a fee for the additional usage that sits outside of our existing agreement.

Please consult with your Faculty Librarians if you are considering using Library subscription databases as a source of data.

Data Management and Text Mining software

Knowing which software to use for data management and analysis is important, and can differ according to your research discipline.

Software for Statistical analysis

The Quantitative Consulting Unit has tutorials or training for analysis software such as R, SPSS, G*Power 3.1, @Risk and Netica.

Software for Data and Text Mining

NVivo - CSU has licensed NVivo software available for researchers. Training is available through the Research Office PD calendar and through Lynda.com. See also the EndNote and NVivo Library Guide.

Other Open Source Software for Data and Text Mining

Voyant - a free online program for text analysis.

Orange - open source software for data analysis and data visualisation, with add-ons for text mining and bioinformatics.

VOSviewer - a tool for constructing and visualising bibliometric networks. It also offers text mining functionality that can be used to construct networks of terms extracted from a body of scientific literature.

PubVenn - PubVenn takes a complex PubMed search and divides it into its constituent parts. It then shows the citations using a proportionally-sized Venn diagram.

Working with your data

OpenRefine (formerly Google Refine) is a free, open source tool for working with messy data: "... cleaning it; transforming it from one format into another; and extending it with web services and external data...."