Gathering and managing your research data

During your project, you will be gathering and collating data, possibly from different sources. This data may be generated by your research, or it may be information that you have obtained from another source. Either way, you will have to consider how it is managed.

Ethics

"Charles Sturt University is required, like all other Australian universities and research organisations, to ensure that all research involving human participants must be conducted in accordance with the National Statement on the Ethical Conduct in Human Research (National Statement)."

Source: Charles Sturt University. (2018). Ethics and Compliance Unit: Human Research Ethics.

The CSU Ethics application and approval process requires all researchers and HDR candidates to complete and submit a Human Research Ethics Application (HREA).

Legal requirements for retention or disposal of data

Data should be retained in a durable and retrievable format. The requirement for retention of research data is five years from the date of the last publication related to the data, or five years from the date the data was last accessed. This can vary depending on the type of research as detailed in the Research Data Management Guidelines.
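The five-year rule above can be worked through as simple date arithmetic. The Python sketch below is illustrative only: it assumes the retention period starts from the later of the two events (last publication or last access), which is one reading of the rule rather than a confirmed CSU interpretation, and the function names are hypothetical. The Research Data Management Guidelines remain the authority for the period that applies to your type of research.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return d.replace(year=d.year + years, day=28)


def earliest_review_date(last_publication: date, last_access: date,
                         years: int = 5) -> date:
    """Assumption: the retention clock starts at the later of the two events."""
    return add_years(max(last_publication, last_access), years)


# Hypothetical example dates.
review = earliest_review_date(date(2018, 6, 30), date(2019, 2, 14))
print(review)  # 2024-02-14
```

The result is the earliest date on which a disposal decision might be considered under these assumptions; it does not replace the approval step described below.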

If a decision is made to dispose of research data, first obtain approval from the Records Management Unit or CSU Regional Archives. Disposal should then be planned and deliberate, using secure disposal mechanisms to prevent unauthorised re-use.

For advice on retention or disposal of your research data, contact researchsupport@csu.edu.au

File formats, management and version control

Create, name and organise your data files according to best practice.

File formats for archiving, preservation and access

File formats suitable for long-term preservation and access are recommended by State and National Archives authorities.

Organising your Data folders

The UK Data Service: Organising Folders

A suggested folder structure

Folder Structure for Research Data files

Useful file names

  • Project or experiment name or acronym
  • Location or spatial coordinates
  • Researcher name and initials
  • Date or date range of experiment
  • Type of data
  • Conditions
  • Version number of file
  • Three-letter file extension for application-specific files

These are suggestions; include whatever information will allow you to distinguish your files from each other and clearly indicate to you what is in them.

Other tips for file naming

  • A good format for dates is YYYYMMDD (or YYMMDD). This ensures all your files stay in chronological order.
  • Keep file names short; longer names do not work well with all types of software.
  • Avoid special characters, for example ~ ! @ # $
  • Avoid spaces; they are not recognised by some software.
  • Instead, use underscores (file_name), dashes (file-name), no separation (filename), or camel case (FileName).
  • Include a README.txt file in your directory that explains your naming convention along with any abbreviations or codes you have used.
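The naming tips above can be sketched in a few lines of Python. This is a minimal illustration, not a CSU standard: the function names, the order of the elements, the two-digit version format and the sample values are all assumptions chosen for the example.

```python
import re
from datetime import date


def clean(part: str) -> str:
    """Drop spaces and special characters, keeping letters, digits and dashes."""
    return re.sub(r"[^A-Za-z0-9-]", "", part)


def build_file_name(project: str, researcher: str, when: date,
                    data_type: str, version: int, extension: str) -> str:
    """Join the naming elements with underscores, using a YYYYMMDD date."""
    parts = [
        clean(project),
        clean(researcher),
        when.strftime("%Y%m%d"),   # YYYYMMDD keeps files in date order
        clean(data_type),
        f"v{version:02d}",         # hypothetical version format: v01, v02, ...
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


# Hypothetical project, researcher and data type.
name = build_file_name("Soil Survey", "J Smith", date(2018, 3, 7),
                       "pH readings", 2, "csv")
print(name)  # SoilSurvey_JSmith_20180307_pHreadings_v02.csv

# File names that start with a YYYYMMDD date sort chronologically as plain text.
dated = [f"{d.strftime('%Y%m%d')}_notes.txt"
         for d in (date(2018, 12, 1), date(2018, 3, 7), date(2017, 6, 30))]
print(sorted(dated))
```

Whichever elements and order you settle on, record them in your README.txt so that collaborators can interpret the file names.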

Purchasing data

data.gov.au provides an easy way to find, access and reuse public datasets from Government. In addition to open datasets, the data.gov.au catalogue includes unpublished data and data available for purchase.

Marketing EDGE Dataset Library - Datasets are made available online to approved academics for classroom use, dissertations and/or other research, and are free of charge to members of the Marketing EDGE Professors’ Academy. Dataset usage rules and costs may vary.

Data and Text Mining and Library Databases

A number of publishers allow data and text mining of their licensed resources by members of subscribing institutions. This access is generally governed by database licences, terms and conditions, and existing copyright provisions.

Some publishers will require you to use tools that they provide to mine their content, or will conduct the process for you. In this way they can manage the quantity of data being accessed and the impact on their servers.

Downloading large amounts of data can trigger automatic lockouts and prevent access to resources by other users. Some publishers may apply a fee for the additional usage that sits outside of our existing agreement.

Please consult with your Faculty Liaison Librarian if you are considering using Library subscription databases as a source of data.

Data and Text Mining software

Choosing the right software for data management and analysis is important, and the best choice can differ according to your research discipline.

Software for Statistical analysis

The Quantitative Consulting Unit has tutorials or training for analysis software such as R, SPSS, G*Power 3.1, @Risk and Netica.

Software for Data and Text Mining

NVivo - CSU has licensed NVivo software available for download by researchers. Training is available in the Research Office PD calendar, and in Lynda.com Training.

Other Open Source Software for Data and Text Mining

Voyant - Voyant is a free online program for text analysis.

Orange - Open source software for data analysis and data visualisation, with add-ons for text mining and bioinformatics.

VOSviewer - VOSviewer offers text mining functionality that can be used to construct networks of terms extracted from a body of scientific literature.

PubVenn - PubVenn takes a complex PubMed search and divides it into its constituent parts. It then shows the citations using a proportionally sized Venn diagram.

Working with your data

OpenRefine is a free open source tool for working with messy data "... cleaning it; transforming it from one format into another; and extending it with web services and external data...."

Data security

You may have data in CSU storage, on your hard drive or on portable media. You may also have hard copy resources you need to manage. Whatever the format, you will need to keep this data safe and secure during your research. Hard copies such as interview notes, prints of photographs, or video or audio tapes need to be kept securely locked away, for example in a locked filing cabinet that can only be accessed by agreed members of the research team.

The CSU Research Data Storage Framework explains options for digital data during the active phase of your research, as well as once it is finalised.

For further pointers, read through the data storage and data security advice provided by the Research Ethics Guidebook.

The ANDS Guide to data storage looks at the strengths and weaknesses of various storage options.

Datasets available for public access

data.gov.au - Public datasets from many Australian government agencies.

Research Data Australia - Details of datasets from over one hundred Australian research organisations.

Atlas of Living Australia - A collaborative aggregator of biodiversity knowledge.

Australian Ocean Data Network - Interactive maps, geospatial datasets, satellite imagery. This is a primary access point for the Australian marine community.

CSIRO Data Access Portal - Data published by CSIRO, across a range of disciplines.

Australian National Map - An Australian government-sponsored portal of government spatial data.

ATSIDA - The Aboriginal and Torres Strait Islander Data Archive works closely with researchers to support the management and archiving of research data.

Google Dataset Search

Google Dataset Search - Dataset Search (Beta) aims to help users to find data sets stored across thousands of repositories on the Web, making these data sets universally accessible and useful.
