Test your knowledge
Before completing the tasks, work through each of the sub-pages for tutorial 1.
Task 1: Create a dataset from scratch
Select 10 Twitter Accounts of your choice and create a dataset with the following variables:
Twitter Handle - Record the name of the Twitter account
Gender/Organisation - Record the gender of the Twitter account holder (record males as 0 and females as 1, and add labels so that you can easily tell the codes apart in your data).
Note: Some of you may select organisations rather than people, or a combination of the two. If so, create one variable that records whether the Twitter account belongs to a person or an organisation, and a second variable that records gender where available. For organisations, enter NaN as the missing value.
Followers - Record the number of followers
Following - Record how many people the account follows
Year Joined - Record the year the account joined Twitter.
If you haven't already done so, add a description for each variable and add labels to your Gender levels (Male for 0s and Female for 1s)
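If you prefer to build the Task 1 dataset in code rather than typing it into a spreadsheet, the structure above could be sketched in Python with pandas roughly as follows. This is only an illustration: the handles and counts are invented placeholders, and the column names Account_Type, Year_Joined and Gender_Label are assumptions rather than names the task requires.

```python
import numpy as np
import pandas as pd

# Placeholder rows only: the handles and numbers below are invented for illustration.
accounts = pd.DataFrame(
    {
        "Handle": ["@example_person_a", "@example_person_b", "@example_org"],
        "Account_Type": ["Person", "Person", "Organisation"],
        "Gender": [0, 1, np.nan],  # 0 = Male, 1 = Female, NaN = missing (organisation)
        "Followers": [1200, 58000, 950000],
        "Following": [300, 410, 85],
        "Year_Joined": [2012, 2009, 2015],
    }
)

# Attach readable labels to the numeric gender codes; NaN stays missing.
gender_labels = {0: "Male", 1: "Female"}
accounts["Gender_Label"] = accounts["Gender"].map(gender_labels)

print(accounts)
```

Keeping the numeric code and the label in separate columns mirrors the way Jamovi stores a value and its label side by side.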
Task 2: Clean Existing Data and get it ready for Jamovi
Use the Filter and Replace function in Excel to clean your data
Use the Excel Filter function to delete any rows where Progress is below 75.
Rename the variables, in the following order: Progress; Age; Household_income; HH_Dependents; Gender; Know_Place
Clean the following variables using the replace function
Whole Dataset: Replace -99 with a blank (leave the cell empty)
Household_income: Replace ¬¨¬ with no space
HH_Dependents: Replace None with 0 and 5 or more with 5
Gender: Replace Male with 0, Female with 1, and Other with a blank
Know_Place: Replace text with numbers
Strongly agree with -2
Somewhat agree with -1
Neither agree nor disagree with 0
Somewhat disagree with 1
Strongly disagree with 2
Open the saved file in Jamovi and attach labels and descriptions.
Add the following descriptions to variables:
Household_income: Income of Household in £
HH_Dependents: Number of Children in Household
Know_Place: I know my place in the world
Now add the following labels:
Gender: for 0 add Male; for 1 add Female
Know_Place: add a label for each value
For -2 add Strongly agree
For -1 add Somewhat agree
For 0 add Neither agree nor disagree
For 1 add Somewhat disagree
For 2 add Strongly disagree
Your file is now ready to analyse. You would use a very similar process to clean a bigger dataset.
If you have really big datasets, it is worth learning some Python or R to make data cleaning easier.
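To give a flavour of what that might look like, here is a minimal pandas sketch of the Task 2 cleaning steps. It is an illustration under stated assumptions, not the required method: the small `raw` table is invented stand-in data, the variables are assumed to be renamed already, and the stray characters in Household_income are assumed to be a currency symbol.

```python
import numpy as np
import pandas as pd

# Invented stand-in for the survey export; in practice you would load your own
# file, for example with pd.read_csv or pd.read_excel.
raw = pd.DataFrame(
    {
        "Progress": [100, 60, 90, 100],
        "Age": [34, 51, -99, 67],
        "Household_income": ["£25000", "£40000", "£18000", "-99"],
        "HH_Dependents": ["None", "2", "5 or more", "1"],
        "Gender": ["Male", "Female", "Other", "Female"],
        "Know_Place": [
            "Strongly agree",
            "Somewhat disagree",
            "Neither agree nor disagree",
            "-99",
        ],
    }
)

# Drop rows where Progress is below 75.
clean = raw[raw["Progress"] >= 75].copy()

# Treat -99 as missing everywhere (it may appear as a number or as text).
clean = clean.replace([-99, "-99"], np.nan)

# Strip stray non-numeric characters (assumed here to be a currency symbol).
clean["Household_income"] = pd.to_numeric(
    clean["Household_income"].str.replace(r"[^0-9.]", "", regex=True)
)

# Recode the text answers as numbers.
clean["HH_Dependents"] = pd.to_numeric(
    clean["HH_Dependents"].replace({"None": "0", "5 or more": "5"})
)

# Anything not listed in a mapping (such as Other) becomes missing.
clean["Gender"] = clean["Gender"].map({"Male": 0, "Female": 1})
clean["Know_Place"] = clean["Know_Place"].map(
    {
        "Strongly agree": -2,
        "Somewhat agree": -1,
        "Neither agree nor disagree": 0,
        "Somewhat disagree": 1,
        "Strongly disagree": 2,
    }
)

print(clean)
```

The advantage over Find and Replace is that the whole recipe is written down once and can be re-run unchanged on a file with thousands of rows.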
Task 3: Create New Variables using Transform
Use the Twitter Dataset you created in Task 1
Using the Twitter dataset you created in Task 1, do the following:
Create an ordinal variable using the Followers variable. Call it Followers_cat.
Create the following categories:
1,000 followers or fewer
Between 1,001 and 10,000 followers
Between 10,001 and 100,000 followers
More than 100,000 followers
Change the cut-off values to suit your dataset (e.g. use 1 million instead of 1,000).
You should now have a new variable that has four ordinal categories.
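For comparison, the same kind of binning can be sketched in Python with `pd.cut`. This is a hedged illustration only: the follower counts are invented, the label wording is an assumption, and each upper cut-off is treated as inclusive so that every count falls into exactly one category.

```python
import pandas as pd

# Invented follower counts, chosen to sit on and around the cut-offs.
followers = pd.Series([850, 1000, 1001, 25000, 100000, 2500000], name="Followers")

# Right-closed bins: up to 1,000; 1,001 to 10,000; 10,001 to 100,000; above 100,000.
bins = [float("-inf"), 1_000, 10_000, 100_000, float("inf")]
labels = ["1,000 or fewer", "1,001 to 10,000", "10,001 to 100,000", "More than 100,000"]

# ordered=True makes the result an ordinal (ordered categorical) variable.
followers_cat = pd.cut(followers, bins=bins, labels=labels, right=True, ordered=True)

print(followers_cat)
```

To adapt the sketch to your own data, change only the numbers in `bins` and the matching text in `labels`.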
Use the Dataset you cleaned in Task 2.
For this task, use the dataset you cleaned in Task 2:
Create a nominal variable using the Age variable. Call it Over_50
Create the following categories:
Over 50 - this category should include everyone aged 50 and over
Under 50 - this category should include everyone aged 49 and younger
You should now have a new nominal variable.
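The same two-way split could be sketched in Python as below. The ages are invented, and the sketch assumes that someone aged exactly 50 belongs in the Over 50 group, so that the two categories together cover every age. Missing ages are kept as missing rather than being placed in either group.

```python
import numpy as np
import pandas as pd

# Invented ages, including the boundary value and one missing value.
ages = pd.DataFrame({"Age": [23, 49, 50, 51, np.nan]})

# Assumption: 50 itself counts as "Over 50".
ages["Over_50"] = np.where(ages["Age"] >= 50, "Over 50", "Under 50")

# np.where would otherwise label missing ages "Under 50"; set them back to missing.
ages.loc[ages["Age"].isna(), "Over_50"] = np.nan

print(ages)
```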