psych350data

Lifecycle: experimental

Datasets for UNL PSYC 350 Labs. All datasets are ready to use in R with human-readable categorical values, and can be exported as fully labeled SPSS (.sav) files with a single function call.

Installation

Install from GitHub:

# install.packages("pak")
pak::pak("emmarshall/psych350data")

Quick Start

library(psych350data)

# See all available datasets
list_datasets()
 [1] "superman"              "superman_smes"         "superman_movies"
 [4] "superman_combined"     "hotones"               "hotones_sauces"
 [7] "hotones_episodes"      "tip_jokes"             "mcu"
[10] "mock_jury"             "candy"                 "candy_simple"
[13] "football"              "huskers"               "interpersonal_data"
[16] "self_descriptive_data" "parent_child_data"     "hindsight_mg_data"
[19] "hindsight_wg_data"     "cheese_data"           "lpd_data"

# Use a dataset directly in R
head(superman)

# Export to SPSS
export_superman_sav("superman_data.sav")
export_football_sav("~/Desktop/football_data.sav")

# Export all datasets at once
export_all_sav(dir = "~/Desktop/PSYC350_SPSS/")

Three Ways to Use the Data

Each dataset supports three workflows depending on what you need:

1. Raw data in R — categorical variables are human-readable character strings, ideal for exploration and plotting with ggplot2:

library(dplyr)                  # count() comes from dplyr

superman |> count(type)         # "Film", "TV Series", "Serial"
football |> count(group)        # "Control", "Football no concussion", ...

2. Prep for numeric R analysis: prep_*() converts character categories to the same numeric codes used in SPSS, so your R output matches SPSS output exactly. This is also useful when working with psych350lab functions that expect numeric grouping variables:

superman_num <- prep_superman(superman)
superman_num |> count(type)     # 1, 2, 3

3. Export to SPSS: export_*_sav() produces a fully labeled .sav file with numeric codes, value labels, variable labels, and -99 for missing values:

export_superman_sav("superman_data.sav")

See vignette("getting-started") for a full walkthrough of all three workflows.
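To make the recoding in workflow 2 more concrete, here is a minimal base R sketch of turning character categories into numeric codes. The toy data, the code order, and the helper name recode_to_codes() are hypothetical and only illustrate the idea; the package's prep_*() functions may assign codes differently.

```r
# Hypothetical toy data, not taken from the package
toy <- data.frame(
  type = c("Film", "TV Series", "Serial", "Film", NA),
  stringsAsFactors = FALSE
)

# Assumed code order, for illustration only
type_levels <- c("Film", "TV Series", "Serial")

# Hypothetical helper: each value's position in `levels` becomes its code;
# values not found in `levels` (including NA) stay NA
recode_to_codes <- function(x, levels) {
  as.numeric(match(x, levels))
}

toy$type_num <- recode_to_codes(toy$type, type_levels)
toy
```

Keeping the level order in one named vector means the same mapping can be reused for value labels later, which is why the sketch separates type_levels from the helper.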

Available Datasets

| Dataset | Description | Source |
|---|---|---|
| superman | Superman actor data with ratings | Rotten Tomatoes, Letterboxd, IMDb |
| superman_smes | SMES ratings by height gap and age difference | Simulated |
| superman_movies | Superman film box office data | IMDb Box Office Mojo |
| hotones | Hot Ones guest data | First We Feast (YouTube) |
| hotones_sauces | Hot sauce data by season and position | First We Feast (YouTube) |
| hotones_episodes | Episode-level YouTube engagement metrics | First We Feast (YouTube) |
| tip_jokes | Tipping experiment | Gueguen (2002) |
| mcu | MCU films box office and ratings | IMDb via openintro |
| mock_jury | Mock jury sentencing | Plaster (1989) |
| candy | Candy power rankings: full | FiveThirtyEight (2017) |
| candy_simple | Candy power rankings: simplified | FiveThirtyEight (2017) |
| football | Football concussion brain measurements | Singh et al. (2014), JAMA |
| huskers | Nebraska football box scores (1962–2024) | Historical records |
| cheese_data | Cheese characteristics and nutrition | cheese.com via TidyTuesday |
| lpd_data | Lincoln Police Dept. traffic stops | LPD Open Data Portal |
| parent_child_data | Parent-child observational study | Simulated |
| hindsight_mg_data | Hindsight bias: between-groups (long) | Simulated |
| hindsight_wg_data | Hindsight bias: within-groups (wide) | Simulated |
| interpersonal_data | Interpersonal relationships survey scales | Simulated |
| self_descriptive_data | Personality survey scales | Simulated |

Exporting for Answer Keys vs. Student Data

Export the full dataset as an instructor answer key, and a subset as the student version:

library(dplyr)

# Full dataset — answer key with all variables
export_football_sav("football_answer_key.sav")

# Student version — only the variables they need
football |>
  select(group, volume) |>
  export_sav(path = "football_student.sav")

R vs. SPSS: What’s Different?

| Aspect | Raw R data | After prep_*() | SPSS .sav export |
|---|---|---|---|
| Categorical values | Character strings ("Film", "Control") | Numeric codes (1, 2, 3) | Numeric codes with value labels |
| Missing values | NA | NA | -99, declared as user-defined missing |
| Variable descriptions | Help files (?superman) | Help files | Variable labels in Variable View |

Example: All Three Workflows

library(dplyr)

# Workflow 1: Raw data — human-readable, great for plots
football |>
  count(group)
# A tibble: 3 × 2
  group                        n
  <chr>                    <int>
1 Control                     25
2 Football no concussion      25
3 Football with concussion    25

# Workflow 2: Prep for numeric analysis matching SPSS
football_num <- prep_football(football)
football_num |>
  count(group)   # 1, 2, 3
# A tibble: 3 × 2
  group     n
  <dbl> <int>
1     1    25
2     2    25
3     3    25

# Run ANOVA with numeric codes
model <- aov(volume ~ factor(group), data = football_num)
summary(model)
              Df Sum Sq Mean Sq F value   Pr(>F)
factor(group)  2  44.35  22.174   31.47 1.51e-10 ***
Residuals     72  50.73   0.705
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Workflow 3: Export to SPSS with full labels and -99 for missing
export_football_sav("football_data.sav")

# Export a subset of variables
superman |>
  select(num, media, type, clark_height, rt_critics_score) |>
  export_sav(path = "superman_subset.sav")

What’s Included in SPSS Exports?

  • Variable labels describing each column
  • Value labels for categorical variables (e.g., 1 = “Control”, 2 = “Football no concussion”)
  • Missing values coded as -99 with SPSS missing value definitions
  • Proper SPSS formats for all variable types
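For readers who want to see what these pieces look like on a single column, the sketch below builds one by hand with the haven package. This is an assumption for illustration: this README does not say which package the export functions use internally, and the toy values and the variable label "Participant group" are made up.

```r
library(haven)

# Hypothetical toy column: three group codes plus one missing observation
group <- labelled_spss(
  c(1, 2, 3, -99),
  labels = c(
    "Control" = 1,
    "Football no concussion" = 2,
    "Football with concussion" = 3
  ),
  na_values = -99,             # declare -99 as a user-defined missing value
  label = "Participant group"  # variable label (made up for this sketch)
)

attr(group, "labels")     # value labels attached to the codes
attr(group, "na_values")  # codes SPSS should treat as missing
is.na(group)              # the -99 entry is reported as missing
```

A column built this way carries its labels and missing-value declaration as attributes, so they travel with the data when it is written with haven::write_sav().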

Auditing an Export

export_huskers_sav("huskers_data.sav")
check_sav_export("huskers_data.sav")

# Show all variables, not just issues
check_sav_export("huskers_data.sav", show_all = TRUE)

# Read back for analysis
huskers_clean <- get_spss_data("huskers_data.sav")
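If you want to spot-check a file independently of the package helpers, a round trip through haven is one option. The sketch below is an assumption-laden illustration: it uses haven directly rather than check_sav_export() or get_spss_data(), and the toy column and temporary file are made up.

```r
library(haven)

# Hypothetical toy data with -99 declared as user-defined missing
toy <- tibble::tibble(
  score = labelled_spss(c(10, -99, 30), na_values = -99)
)

path <- tempfile(fileext = ".sav")
write_sav(toy, path)

# Default read: user-defined missing values come back as NA
as_na <- read_sav(path)

# user_na = TRUE keeps the raw codes along with their missing declaration
as_codes <- read_sav(path, user_na = TRUE)

sum(is.na(as_na$score))      # number of values treated as missing
as.numeric(as_codes$score)   # raw codes, including -99
```

Comparing the two reads shows whether -99 was stored as a declared missing value or as an ordinary number: in the second case the default read would return -99 instead of NA.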

Documentation

  • vignette("getting-started") — Introduction and all three workflows, including psych350lab compatibility
  • vignette("exporting-spss") — Detailed SPSS export guide with answer key vs. student workflows
  • vignette("dataset-reference") — Complete reference for all datasets with source attribution
  • ?dataset_name — Help for any specific dataset

License

MIT