Step-by-Step Guide 1

Get your Data Ready

Overview

Here you will learn the following:

Often you will already have a dataset (whether from Qualtrics, SurveyMonkey or MS Forms). Here we will teach you how to get your dataset ready for analysis. This is an extremely important step, but it can be very time-consuming. 

If you need to create a dataset from scratch, then you can do so in Jamovi, this is covered here. Sometimes you may want to clean a dataset that on initial inspection looks ready. You can easily download your Jamovi dataset as a .csv file. 

In the example below, we will be using a .csv file opened in Excel. However, you can use a multitude of programs such as Keynote. If you know how to code you can automate some of the data cleaning functions using R or Python (this is however not covered here). What your dataset looks like and how much cleaning is required depends on where your dataset is from. 

For this example, you should have a basic knowledge of how to use Excel.

Datasets used

Skoczylis, Joshua, 2021, "Extremism, Life Experiences and the Internet", https://doi.org/10.7910/DVN/ICTI8T, Harvard Dataverse, Version 3.

Here we will be using the raw data for the dataset above. You can access the raw data here.

The code book available on the Harvard Dataverse will also be very helpful for this exercise here. 

Cleaning your Dataset(s): Step-by-Step Guide 

Using Find & Replace values

by GCFLearning

Filtering Data in Excel

by GCFLearning

1.

 Deleted Variables (columns) that are not required

Before you delete any variables make sure you do the following:

The image below is raw data from Qualtrics. As you can see there are a lot of unnecessary variables. 

Now select and delete all that are unnecessary or that you will not need.  

Select Columns > Right-click > Delete

2.

Rename Variables (where necessary)

Depending on the dataset you have got, you may want to rename your variables.

If you have conducted your own survey, many of the variables will have the Question as the column name. This is not great. Change them 

e.g. 'What is your age' can be changed to 'Age' or 

'Select your Gender' can be changed to 'Gender'

Keep your variables as short as possible - but give it a name that makes some sense.

Keep a record of the questions asked, as you can add that information into your codebook later or into the description box in Jamovi.

3.

Delete Unnecessary rows (Header related)

Often you may have a number of rows, that give you information about your column that is unnecessary. Delete them, so that you are only left with one row that contains the Variable Name.

Select > RIght Click > Delete

4.

Remove Observations (rows)

If your dataset is survey data, you will have many observations (rows) that are only partially completed. Most survey platforms will give you an extra column that tells you how much they have completed. 

Decide on a cut of point and delete any observations below the agreed threshold in our case we have decided to remove observations that have completed less than 80%.

Now delete the rows. Also delete any header rows that you no longer need. You should be left with only one header row.

5.

Missing data

When you download your data, make sure you look up what the missing value is. Often the missing value is -99, but do check.

This value indicates that an observation is missing. Jamovi recognises 'blanks' and 'na' as missing data. This means we have to convert the missing value of '-99' to 'na'.

You will notice that most datasets will also have blank cells - this may be because the participant was never shown certain questions. In some instances these may be treated as missing data - just leave it blank. In other instances, you may want to place different values in.

To do this do the following:

All your missing values have now been replace with 'na'

Create a Codebook: Step-by-Step Guide 

A codebook will allow everyone to make sense of your dataset. Codebooks should include the following:

Having a good codebook is essential. The below video shows you how to easily create a codebook in Excel. 

Creating a Codebook 

by Courtney Donovan