Getting Started with psych350data
Overview
The psych350data package provides all the datasets used in PSYC 350 labs at UNL. Every dataset is ready to use the moment you load the package: no file downloading, no manual data entry, and no SPSS license needed.
The package supports three workflows depending on what you need to do:
- Use the raw R data directly: categorical variables are human-readable strings like "Film" or "Control", which work perfectly with ggplot2 and most R functions.
- Prep for numeric analysis matching SPSS: prep_*() functions convert character categories to the same numeric codes used in the SPSS files (e.g., 1, 2, 3), so your R output matches SPSS output exactly. This is useful when working with functions from the psych350lab package that expect numeric grouping variables.
- Export to SPSS: export_*_sav() functions create fully labeled .sav files with numeric codes, value labels, variable labels, and -99 for missing values. Use this to create answer key files or student data files for labs.
Installation
# Install from GitHub (one time only)
# install.packages("pak")
pak::pak("emmarshall/psych350data")
Loading the Package
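The examples in this guide call select(), count(), and ggplot(), so a typical session attaches dplyr and ggplot2 alongside psych350data. A minimal setup sketch, assuming all three packages are already installed:

```r
# Attach the data package plus the helpers used in the examples in this guide
library(psych350data)
library(dplyr)    # select(), count(), mutate()
library(ggplot2)  # ggplot(), geom_boxplot()
```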
Browsing Available Datasets
Use list_datasets() to see every dataset in the package:
list_datasets()
#> [1] "superman" "superman_smes" "superman_movies"
#> [4] "superman_combined" "hotones" "hotones_sauces"
#> [7] "hotones_episodes" "tip_jokes" "mcu"
#> [10] "mock_jury" "candy" "candy_simple"
#> [13] "football" "huskers" "interpersonal_data"
#> [16] "self_descriptive_data" "parent_child_data" "hindsight_mg_data"
#> [19] "hindsight_wg_data" "cheese_data" "lpd_data"
For detailed documentation on any dataset, use ? in the R console:
?superman
?hotones
?mock_jury
?football
?interpersonal_data
See the Dataset Reference for a complete guide to every dataset including sources and variable descriptions.
Workflow 1: Raw Data in R
Every dataset is a tibble that’s available immediately after loading the package. Categorical variables use descriptive character values that are easy to read and plot:
superman |>
  select(num, media, type, clark_grp, age_grp) |>
  head()
#> # A tibble: 6 × 5
#> num media type clark_grp age_grp
#> <int> <chr> <chr> <chr> <chr>
#> 1 1 Superman Film 6ft or taller Average
#> 2 2 Superman: The Movie Film 6ft or taller Average
#> 3 3 Smallville TV Show 6ft or taller Minimal
#> 4 4 Superman Returns Film 6ft or taller Average
#> 5 5 Superman & the Mole Men Film 6ft or taller Big
#> 6 6 Man of Steel Film 6ft or taller Big
football |>
  count(group)
#> # A tibble: 3 × 2
#> group n
#> <chr> <int>
#> 1 Control 25
#> 2 Football no concussion 25
#> 3 Football with concussion 25
Plotting with ggplot2
Because categorical variables are already human-readable strings, they work directly as axis labels and legend entries in ggplot2 — no extra formatting needed:
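A minimal sketch of such a plot, assuming ggplot2 is installed. It uses the group and volume columns of football that appear elsewhere in this guide; the axis titles are illustrative choices, not taken from the package documentation.

```r
library(psych350data)
library(ggplot2)

# group is a character column, so its values become the x-axis labels as-is
football_plot <- ggplot(football, aes(x = group, y = volume)) +
  geom_boxplot() +
  labs(x = "Group", y = "volume")

football_plot
```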
If you need to control the order of categories on the axis (e.g., for a specific ordering in a bar chart), convert to a factor with explicit levels:
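A sketch of that conversion, assuming dplyr and ggplot2 are installed. The level names are the three group values shown by count(group) earlier; the order chosen here is only an illustration. By default ggplot2 orders character values alphabetically, and a factor's level order overrides that.

```r
library(psych350data)
library(dplyr)
library(ggplot2)

# Put the groups in a deliberate order instead of the default alphabetical one
football_ordered <- football |>
  mutate(group = factor(group, levels = c(
    "Football with concussion",
    "Football no concussion",
    "Control"
  )))

# The bars now follow the factor's level order
ggplot(football_ordered, aes(x = group)) +
  geom_bar()
```

Any value that is not listed in levels becomes NA, so spell each level exactly as it appears in the data.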
Handling Missing Values
Missing values in the raw R data are standard NA. Most R functions recognize NA, though summaries such as mean() return NA unless you pass na.rm = TRUE:
# NA values are excluded with na.rm = TRUE
mean(superman$rt_critics_score, na.rm = TRUE)
#> [1] 79.375
Running Analyses with Raw Data
For analyses like ANOVA or t-tests that need a factor grouping variable, wrap the character column in factor():
# One-way ANOVA using the character group variable
model <- aov(volume ~ factor(group), data = football)
summary(model)
#> Df Sum Sq Mean Sq F value Pr(>F)
#> factor(group) 2 44.35 22.174 31.47 1.51e-10 ***
#> Residuals 72 50.73 0.705
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Workflow 2: Prep for Numeric Analysis
If your analysis needs to produce output with numeric codes that match SPSS — for example, when comparing your R results to SPSS output in a lab, or when using functions from the psych350lab package — use prep_*() functions.
What prep_*() Does
Each prep_*() function replaces character categorical variables with the same numeric codes used in the SPSS .sav file. For example, in the superman dataset, type goes from "Film" / "TV Show" / "Serial" to 1 / 2 / 3, and age_grp goes from "Minimal" / "Average" / "Big" to 1 / 2 / 3.
superman_num <- prep_superman(superman)
superman_num |>
  select(num, media, type, clark_grp, age_grp) |>
  head()
#> # A tibble: 6 × 5
#> num media type clark_grp age_grp
#> <int> <chr> <dbl> <dbl> <dbl>
#> 1 1 Superman 1 2 2
#> 2 2 Superman: The Movie 1 2 2
#> 3 3 Smallville 2 2 1
#> 4 4 Superman Returns 1 2 2
#> 5 5 Superman & the Mole Men 1 2 3
#> 6 6 Man of Steel 1 2 3
Using the Generic prep_data() Function
You can also use the generic prep_data() function with any dataset name as a string:
superman_num <- prep_data(superman, "superman")
superman_num |>
  select(num, media, type, age_grp) |>
  head()
#> # A tibble: 6 × 4
#> num media type age_grp
#> <int> <chr> <dbl> <dbl>
#> 1 1 Superman 1 2
#> 2 2 Superman: The Movie 1 2
#> 3 3 Smallville 2 1
#> 4 4 Superman Returns 1 2
#> 5 5 Superman & the Mole Men 1 3
#> 6 6 Man of Steel 1 3
All Prep Functions
Every dataset has its own dedicated prep function:
prep_superman(superman)
prep_superman_smes(superman_smes)
prep_superman_movies(superman_movies)
prep_hotones(hotones)
prep_hotones_sauces(hotones_sauces)
prep_hotones_episodes(hotones_episodes)
prep_mcu(mcu)
prep_mock_jury(mock_jury)
prep_tip_jokes(tip_jokes)
prep_candy(candy)
prep_candy_simple(candy_simple)
prep_football(football)
prep_huskers(huskers)
prep_interpersonal(interpersonal_data)
prep_self_descriptive(self_descriptive_data)
prep_parent_child(parent_child_data)
prep_hindsight_bg(hindsight_mg_data)
prep_hindsight_wg(hindsight_wg_data)
prep_cheese(cheese_data)
prep_lpd(lpd_data)
Some datasets (like mock_jury, tip_jokes, interpersonal_data, and self_descriptive_data) already store their categorical variables as numeric codes. Their prep functions still exist for consistency, but they return the data unchanged.
Compatibility with psych350lab
The psych350lab package provides helper functions for PSYC 350 lab assignments. Many of these functions expect data that looks like an SPSS .sav file — that is, categorical/grouping variables stored as numbers (not character strings) and missing values coded as -99 rather than NA.
To make any psych350data dataset work with psych350lab functions, you need two steps:
Step 1: Convert categories to numeric codes with prep_*():
football_num <- prep_football(football)
football_num |> count(group)
#> # A tibble: 3 × 2
#> group n
#> <dbl> <int>
#> 1 1 25
#> 2 2 25
#> 3 3 25
# group is now 1, 2, 3 instead of "Control", "Football no concussion", ...
Step 2: Replace NA with -99 (if your psych350lab function expects -99 for missing):
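One way to do this replacement is with dplyr, sketched below. This is an assumption about tooling, not a psych350data function: it recodes NA to -99 in numeric columns only and leaves character columns untouched. The name football_spss is a hypothetical choice for the result.

```r
library(psych350data)
library(dplyr)

football_num <- prep_football(football)

# Recode NA to -99 in every numeric column; other column types are left alone
football_spss <- football_num |>
  mutate(across(where(is.numeric), ~ replace(.x, is.na(.x), -99)))
```

After this step, -99 is an ordinary number as far as base R is concerned, so functions like mean() will include it. Keep the NA version of the data for standard R analyses.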
If you’re only running standard R functions like aov(), t.test(), or cor.test(), you do not need the -99 step — R handles NA natively. The -99 replacement is only needed for functions that specifically expect SPSS-style missing value coding.
Keeping Labels for Plotting After Prep
By default, prep_*() replaces the character values with numbers, and the original labels are lost. If you need both the numeric codes (for analysis) and the original labels (for plotting), use keep_labels = TRUE:
superman_both <- prep_superman(superman, keep_labels = TRUE)
superman_both |>
  select(num, type, type_label) |>
  head()
#> # A tibble: 6 × 3
#> num type type_label
#> <int> <dbl> <chr>
#> 1 1 1 Film
#> 2 2 1 Film
#> 3 3 2 TV Show
#> 4 4 1 Film
#> 5 5 1 Film
#> 6 6 1 Film
This is useful when you want numeric codes for analysis but readable labels in ggplot2:
# Use the _label column for readable axis labels
superman_both |>
  ggplot(aes(x = type_label, y = rt_critics_score)) +
  geom_boxplot() +
  labs(x = "Media Type", y = "Rotten Tomatoes Critics Score")
Workflow 3: Export to SPSS
Use export_*_sav() functions to create .sav files for SPSS or JASP. The export functions handle everything automatically — you do not need to run a prep function first.
Each export produces a file with:
- Numeric codes for all categorical variables
- Value labels so SPSS displays category names (e.g., 1 = “Control”)
- Variable labels describing each column in SPSS Variable View
- -99 for missing values, registered as user-defined missing so SPSS excludes them automatically
Quick Export
# Export to current working directory
export_superman_sav()
# Export to a specific location
export_football_sav("~/Desktop/football_data.sav")
Exporting for Answer Keys vs. Student Data
A common workflow is to export the full dataset as an instructor answer key, and a subset of variables as the student version.
Full dataset (answer key):
export_superman_sav("superman_answer_key.sav")
Subset of variables (student version):
# Use select() to choose only the variables students need,
# then pipe to export_sav()
superman |>
  select(num, media, type, clark_height, rt_critics_score) |>
  export_sav(path = "superman_student.sav")
# Same pattern for another dataset (select() also accepts tidyselect helpers)
interpersonal_data |>
  select(age, gender, race, gcb, risc, lsas) |>
  export_sav(path = "interpersonal_student.sav")
Export All Datasets at Once
export_all_sav(dir = "~/Desktop/PSYC350_SPSS/")
This creates one .sav file per dataset in the specified folder.
See the Exporting to SPSS vignette for the full export guide, including how to audit exports and control missing value behavior.
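One way to spot-check an export from R is to read the file back with the haven package. This is a sketch under assumptions rather than part of psych350data: it assumes haven is installed, that export_football_sav() takes the output path as its first argument (as in the Quick Export example), and that group is among the exported columns.

```r
library(psych350data)
library(haven)  # assumption: installed separately; not part of psych350data

# Write the export to a temporary folder so nothing lands in your project
path <- file.path(tempdir(), "football_data.sav")
export_football_sav(path)

# user_na = TRUE keeps user-defined missing codes (such as -99) as values
# instead of converting them to NA on import
football_sav <- read_sav(path, user_na = TRUE)

attr(football_sav$group, "labels")  # value labels, if the export set them
attr(football_sav$group, "label")   # variable label, if the export set one
```

Reading with the default user_na = FALSE instead converts user-defined missing values to NA, which is usually what you want for analysis in R.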
Choosing the Right Workflow
| Situation | Recommended workflow |
|---|---|
| Exploring data, making plots with ggplot2 | Use the raw data as-is |
| Running standard R analyses (t-test, ANOVA, correlation) | Use the raw data with factor() for grouping variables |
| R analysis that needs to match SPSS numeric output | prep_*() |
| Using psych350lab functions that expect numeric groups and -99 | prep_*() then replace NA with -99 |
| Plotting after prepping (need readable axis labels) | prep_*(..., keep_labels = TRUE) |
| Creating a .sav file for SPSS or JASP | export_*_sav() |
| Creating a student data file with fewer variables | select() then export_sav() |
| Creating instructor answer key .sav files | export_*_sav() with full dataset |
R vs. SPSS: What’s Different?
| Aspect | Raw R data | After prep_*() | SPSS .sav export |
|---|---|---|---|
| Categorical values | Character strings ("Film", "Control") | Numeric codes (1, 2, 3) | Numeric codes with value labels |
| Missing values | NA | NA | -99 with user-defined missing |
| Variable descriptions | ?dataset help files | ?dataset help files | Variable labels in Variable View |
| Best for | Plotting, exploration | Matching SPSS output in R | SPSS / JASP labs |