
Research Data Management at Charles Sturt: Collect and Create

Collect or Create Your Data

What type of data are you collecting or creating? 

Consider how you will collect, document and describe the data so that it can be used later. If your data is sensitive, collecting, storing and sharing it will carry extra requirements.

Documentation and Metadata

Providing metadata and documentation means others can find your data (metadata) and make sense of it (documentation).

Documentation is contextual information about your data that you are likely to produce during the course of your research, and it will help anyone else reuse your data. Keep documentation alongside your research data, securely stored and backed up regularly.

Metadata, unlike documentation, is standardised data about your data. It exists to support data preservation (for example, when you deposit your data in a data repository), discovery for sharing, and data citation.

Documentation of your data

Data documentation provides context for your data and ensures that the data can be understood in the long term. Here is a list of what you should consider storing:

Document all this information in a spreadsheet or README.txt file and store it alongside your dataset.

File naming and version control

There is no point in collecting data that you then can't find! Create a strategy for file naming and a folder structure.
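
As a rough illustration of a file naming strategy, the sketch below builds names from a project label, a short description, a date and a version number. The pattern itself (project_description_YYYYMMDD_vNN) and the function name build_file_name are assumptions chosen for this example, not a Charles Sturt requirement; adapt them to your own project.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, created: date,
                    version: int, extension: str) -> str:
    """Build a file name following an assumed project_description_date_version pattern."""

    def slug(text: str) -> str:
        # Lowercase the text and collapse anything that is not a letter
        # or digit into a single hyphen, so names avoid spaces and symbols.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # A YYYYMMDD date sorts chronologically in a file listing, and a
    # zero-padded version number keeps v02 ahead of v10.
    return (f"{slug(project)}_{slug(description)}_"
            f"{created:%Y%m%d}_v{version:02d}.{extension.lstrip('.')}")


name = build_file_name("SoilSurvey", "Site A pH", date(2024, 3, 15), 2, ".csv")
print(name)  # soilsurvey_site-a-ph_20240315_v02.csv
```

Generating names from one function, rather than typing them by hand, is one way to keep them consistent across a project.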


Metadata

Metadata is often described as data about data.

Think of it as the keywords in an article and a way of "selling" your data to others. It helps others find your data and decide whether they need it, so it is important for ensuring the findability, reuse and citation of your work.

Metadata is usually structured using recognised standards or schemas, such as the Data Documentation Initiative (DDI).

Store the metadata within the data (e.g. in file properties) or in separate databases (e.g. XML) or files (README.txt).

See the ARDC Metadata Guide for more details and see the table below outlining the elements you can use to describe your data.

Metadata schemas

Many disciplines have a specific way of structuring metadata - these specific structures are called schemas. A schema will list the information you'll need to include about your data and how that information should be structured. Below are a few examples of schemas:

Discipline: Metadata standard

General: Dublin Core (DC); Metadata Object Description Schema (MODS); Metadata Encoding and Transmission Standard (METS)
Arts: Categories for the Description of Works of Art (CDWA); Visual Resources Association (VRA Core)
Biology: Darwin Core
Ecology: Ecological Metadata Language (EML)
Geospatial: Content Standard for Digital Geospatial Metadata (CSDGM)
Social sciences: Data Documentation Initiative (DDI)
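
To give a sense of what schema-structured metadata can look like when stored as XML, here is a minimal sketch using a few Dublin Core elements (title, creator, date, rights) and Python's standard library. The wrapping element name "metadata", the function name dublin_core_record and all of the values are placeholders assumed for illustration; a repository will usually specify its own record format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, str]) -> str:
    """Serialise a mapping of Dublin Core element names to values as XML."""
    # "metadata" is an assumed wrapper element, not part of Dublin Core.
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only.
record = dublin_core_record({
    "title": "Example soil pH survey",
    "creator": "Researcher, A.",
    "date": "2024-03-15",
    "rights": "CC BY 4.0",
})
print(record)
```

Because the element names come from a shared standard, software that understands Dublin Core can read the record without knowing anything else about the project.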

Metadata example

You can use these metadata elements in a README.txt:

Identifier: Unique alpha-numeric identifier used to identify the data (such as a DOI)
Date: Any key dates associated with the data, including project start and end dates
Title: Name of the research project or dataset
Version: Information on the relevant version(s) of the dataset
Creator(s): Names, contact details and identifiers (such as ORCID) for all organisations and/or persons who collected and created the data
Source: Citations for any data obtained or derived from other sources, including the creator, the year, the title of the dataset, identifier and access information
Location: Relevant geographic information, including cities, regions, states, countries or coordinates
Keywords: Keywords or phrases describing the data; these could also include relevant Field of Research codes
Methodology: Information on how the data was created, including specific software or equipment (with model or version numbers), formulae, algorithms or methodologies
Processing: Information on how the data has been transformed, altered or processed, including quality assurance/control measures
Technical details: All relevant technical information, including a list of all the files that make up the dataset with extensions and relevant file formats and structures, an explanation of any codes or abbreviations used in the file names, a list of all variables in the data files, and the names and version numbers of all software packages required to use, view or analyse the data
Rights: Any known intellectual property rights, statutory rights, licences or restrictions on use of the data
Access: How and where the data can be accessed
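
The elements above can be written to a README.txt by hand, but a short script can help keep the layout consistent across datasets. The following is a minimal sketch: the function names render_readme and write_readme, the "Element: value" layout and every value shown are assumptions made for illustration, and only a few of the elements are filled in.

```python
from pathlib import Path

# Element names follow the list above; the values are made-up placeholders.
metadata = {
    "Identifier": "(DOI, once assigned)",
    "Date": "2024-01-01 to 2024-12-31",
    "Title": "Example soil pH survey",
    "Version": "2",
    "Creator(s)": "Researcher, A. (ORCID to be added)",
    "Keywords": "soil; pH; survey",
    "Rights": "CC BY 4.0",
    "Access": "Contact the creator",
}


def render_readme(elements: dict[str, str]) -> str:
    """Return one 'Element: value' line per metadata element."""
    lines = [f"{name}: {value}" for name, value in elements.items()]
    return "\n".join(lines) + "\n"


def write_readme(folder: Path, elements: dict[str, str]) -> Path:
    """Write README.txt into the dataset folder and return its path."""
    path = folder / "README.txt"
    path.write_text(render_readme(elements), encoding="utf-8")
    return path


print(render_readme(metadata))
```

Calling write_readme(Path("my_dataset"), metadata), where my_dataset is a hypothetical existing folder, would place the README.txt alongside the data files.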

Charles Sturt University acknowledges the traditional custodians of the lands on which its campuses are located, paying respect to Elders, both past and present, and extends that respect to all First Nations Peoples.

Charles Sturt University is an Australian University, TEQSA Provider Identification: PRV12018. CRICOS Provider: 00005F.