Gathering and managing your research data
During your project, you will be gathering and collating data, possibly from different sources. This data may be generated by your research, or it may be information that you have obtained from another source. Either way, you will have to consider how it is managed.
"Charles Sturt University is required, like all other Australian universities and research organisations, to ensure that all research involving human participants must be conducted in accordance with the National Statement on the Ethical Conduct in Human Research (National Statement)."
Source: Charles Sturt University. (2018). Ethics and Compliance Unit: Human Research Ethics.
The CSU ethics application and approval process requires all researchers and HDR candidates to complete and submit a Human Research Ethics Application (HREA).
Legal requirements for retention or disposal of data
Data should be retained in a durable and retrievable format. The requirement for retention of research data is five years from the date of the last publication related to the data, or five years from the date the data was last accessed. This can vary depending on the type of research as detailed in the Research Data Management Guidelines.
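As a rough illustration of the five-year rule above, the sketch below works out the earliest date on which disposal could be considered. It assumes the retention period runs from the later of the two dates (last publication or last access); that reading, and the function names, are assumptions made for this example. The Research Data Management Guidelines remain the authority, particularly for research types with longer retention periods.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 5  # general period stated above; some research types differ


def add_years(d: date, years: int) -> date:
    """Return the same calendar day a number of years later."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February
        return d.replace(year=d.year + years, day=28)


def earliest_disposal_date(last_publication: Optional[date],
                           last_accessed: Optional[date]) -> date:
    """Assumption: retention runs from the later of the two known dates."""
    known = [d for d in (last_publication, last_accessed) if d is not None]
    if not known:
        raise ValueError("at least one date is required")
    return add_years(max(known), RETENTION_YEARS)


print(earliest_disposal_date(date(2018, 6, 30), date(2019, 11, 4)))
```

Reaching this date does not authorise disposal on its own; approval is still required before any data is destroyed.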
If you decide to dispose of research data, first obtain approval from the Records Management Unit or CSU Regional Archives. Disposal should then be planned and deliberate, using secure disposal mechanisms to prevent unauthorised re-use.
For advice on retention or disposal of your research data contact firstname.lastname@example.org
File formats, management and version control
Create, name and organise your data files according to best practice.
File formats for archiving, preservation and access
File formats suitable for long-term preservation and access are recommended by State and National Archives authorities.
- The UK Data Service Recommended formats
- Tasmanian Archive and Heritage Office: Guideline Digital Preservation Formats : Appendix 1 Table of Recommended File Formats
Organising your Data folders
The UK Data Service: Organising Folders
Useful file names
- Project or experiment name or acronym
- Location or spatial coordinates
- Researcher name and initials
- Date or date range of experiment
- Type of data
- Conditions
- Version number of file
- Three-letter file extension for application-specific files
These are suggestions; include whatever information will allow you to distinguish your files from each other and clearly indicate to you what is in them.
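The elements listed above can be combined into a file name in a consistent order. The following is a minimal sketch only: the function names, the ordering of the parts, the underscore separator, the YYYYMMDD date and the two-digit version number are all choices made for this example, not a CSU standard.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Keep letters and digits only, joining words in camel case."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "".join(w[0].upper() + w[1:] for w in words)


def build_file_name(project: str, location: str, researcher: str,
                    collected: date, data_type: str, version: int,
                    extension: str, conditions: str = "") -> str:
    """Join the naming elements with underscores (illustrative order)."""
    parts = [
        slug(project),
        slug(location),
        slug(researcher),
        collected.strftime("%Y%m%d"),  # sortable date
        slug(data_type),
        slug(conditions),              # optional; skipped when empty
        f"v{version:02d}",
    ]
    stem = "_".join(p for p in parts if p)
    return stem + "." + extension.lower().lstrip(".")


name = build_file_name("soil survey", "Wagga Wagga", "J Smith",
                       date(2018, 3, 5), "interview notes", 2, "csv")
print(name)  # SoilSurvey_WaggaWagga_JSmith_20180305_InterviewNotes_v02.csv
```

Using every element makes for a long name, so in practice you would include only the parts that help you tell your files apart.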
Other tips for file naming
- A good format for dates is YYYYMMDD (or YYMMDD). This ensures all your files stay in chronological order.
- Keep file names short; longer names do not work well with all types of software.
- Avoid special characters, for example ~ ! @ # $
- Avoid spaces; they are not recognised by some software.
- Instead use underscores (file_name), dashes (file-name), no separation (filename), or camel case (FileName).
- Include a README.txt file in your directory that explains your naming convention along with any abbreviations or codes you have used.
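The tips above can be turned into a simple automated check. The sketch below is illustrative only: the function name `check_file_name` and the 50-character limit are assumptions made for this example, and the date check only looks for a six- or eight-digit run rather than validating a real calendar date.

```python
import re
from typing import List

MAX_LENGTH = 50  # hypothetical limit; choose one that suits your software

# Anything other than letters, digits, underscore, dot, dash or space
SPECIAL = re.compile(r"[^A-Za-z0-9_.\- ]")
# A run of exactly 8 or 6 digits, e.g. 20180305 or 180305
DATE = re.compile(r"(?<!\d)(\d{8}|\d{6})(?!\d)")


def check_file_name(name: str) -> List[str]:
    """Return a list of problems found; an empty list means none."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("name is longer than %d characters" % MAX_LENGTH)
    if " " in name:
        problems.append("contains spaces")
    if SPECIAL.search(name):
        problems.append("contains special characters")
    if not DATE.search(name):
        problems.append("no YYYYMMDD or YYMMDD date found")
    return problems


for candidate in ["SoilSurvey_20180305_v02.csv", "my data!.csv"]:
    print(candidate, check_file_name(candidate))
```

A check like this could be run over a project folder before archiving, with the agreed convention itself recorded in the README.txt file.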
data.gov.au provides an easy way to find, access and reuse public datasets from Government. In addition to open datasets, the data.gov.au catalogue includes unpublished data and data available for purchase.
Marketing EDGE Dataset Library - Datasets are made available online to approved academics for classroom use, dissertations and other research, and are free of charge to members of the Marketing EDGE Professors’ Academy. Dataset usage rules and costs may vary.
Data and Text Mining and Library Databases
A number of publishers allow data and text mining of their licensed resources by members of subscribing institutions. This access is generally governed by database licences, terms and conditions, and existing copyright provisions.
Some publishers will require you to use tools that they provide to mine their content, or will conduct the process for you. In this way they can manage the quantity of data being accessed and the impact on their servers.
Downloading large amounts of data can trigger automatic lockouts and prevent access to resources by other users. Some publishers may apply a fee for the additional usage that sits outside of our existing agreement.
Please consult with your Faculty Liaison Librarian if you are considering using Library subscription databases as a source of data.
Data and Text Mining software
Knowing which software to use for data management and analysis is important, and can differ according to your research discipline.
Software for Statistical analysis
The Quantitative Consulting Unit has tutorials or training for analysis software such as R, SPSS, G*Power 3.1, @Risk and Netica.
Software for Data and Text Mining
Other Open Source Software for Data and Text Mining
Voyant - a free online program for text analysis.
Orange - open source software for data analysis and data visualisation, with add-ons for text mining and bioinformatics.
VOSviewer - offers text mining functionality that can be used to construct networks of terms extracted from a body of scientific literature.
PubVenn - takes a complex PubMed search and divides it into its constituent parts, then shows the citations using a proportionally sized Venn diagram.
Working with your data
OpenRefine is a free open source tool for working with messy data "... cleaning it; transforming it from one format into another; and extending it with web services and external data...."
You may have data in CSU storage, on your hard drive or on portable media. You may also have hard copy resources you need to manage. Whatever the format, you will need to keep this data safe and secure during your research. Hard copies such as interview notes, prints of photographs, or video or audio tapes need to be kept securely locked away - for example in a locked filing cabinet that can only be accessed by agreed members of the research team.
The CSU Research Data Storage Framework explains options for digital data during the active phase of your research, as well as once it is finalised.
For further pointers, read the data storage and data security advice provided by the Research Ethics Guidebook.
The ANDS Guide to data storage looks at the strengths and weaknesses of various storage options.
Datasets available for public access
Data.gov.au - Public datasets from many Australian government agencies.
Research Data Australia - Details of datasets from over one hundred Australian research organisations.
Atlas of Living Australia - A collaborative aggregator of biodiversity knowledge.
Australian Ocean Data Network - Interactive maps, geospatial datasets, satellite imagery. This is a primary access point for the Australian marine community.
CSIRO Data Access Portal - Data published by CSIRO, across a range of disciplines.
Australian National Map - An Australian Government-sponsored portal of government spatial data.
ATSIDA - The Aboriginal and Torres Strait Islander Data Archive works closely with researchers to support the management and archiving of research data.