psych350data

Datasets for UNL PSYC 350 Labs. All datasets are ready to use in R with human-readable categorical values, and can be exported as fully labeled SPSS (.sav) files with a single function call.

Installation

Install from GitHub:

# install.packages("pak")
pak::pak("emmarshall/psych350data")

Quick Start

library(psych350data)

# See all available datasets
list_datasets()

 [1] "superman"              "superman_smes"         "superman_movies"
 [4] "superman_combined"     "hotones"               "hotones_sauces"
 [7] "hotones_episodes"      "tip_jokes"             "mcu"
[10] "mock_jury"             "candy"                 "candy_simple"
[13] "football"              "huskers"               "interpersonal_data"
[16] "self_descriptive_data" "parent_child_data"     "hindsight_mg_data"
[19] "hindsight_wg_data"     "cheese_data"           "lpd_data"

# Use a dataset directly in R
head(superman)

# Export to SPSS
export_superman_sav("superman_data.sav")
export_football_sav("~/Desktop/football_data.sav")

# Export all datasets at once
export_all_sav(dir = "~/Desktop/PSYC350_SPSS/")

Three Ways to Use the Data

Each dataset supports three workflows depending on what you need:

1. Raw data in R — categorical variables are human-readable character strings, ideal for exploration and plotting with ggplot2:

library(dplyr)                  # provides count()

superman |> count(type)         # "Film", "TV Series", "Serial"
football |> count(group)        # "Control", "Football no concussion", ...

2. Prep for numeric R analysis — prep_*() converts character categories to the same numeric codes used in SPSS, so your R output matches SPSS output exactly. This is also useful when working with psych350lab functions that expect numeric grouping variables:

superman_num <- prep_superman(superman)
superman_num |> count(type)     # 1, 2, 3

3. Export to SPSS — export_*_sav() produces a fully labeled .sav file with numeric codes, value labels, variable labels, and -99 for missing values:

export_superman_sav("superman_data.sav")

See vignette("getting-started") for a full walkthrough of all three workflows.
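To make the character-to-code step concrete, here is a minimal base R sketch of the kind of recoding that prep_*() performs. It is not the package's implementation: the toy data frame, the `type_codes` lookup, and the specific code assignments (Film = 1, TV Series = 2, Serial = 3) are assumptions for illustration. The real mapping is whatever psych350data defines.

```r
# Hypothetical lookup table; the actual codes are defined inside psych350data
type_codes <- c("Film" = 1, "TV Series" = 2, "Serial" = 3)

# Toy data standing in for a dataset with a character category column
toy <- data.frame(
  actor_id = 1:5,
  type = c("Film", "Serial", "TV Series", "Film", NA),
  stringsAsFactors = FALSE
)

# Replace each label with its numeric code; unmatched or missing labels become NA
toy_num <- toy
toy_num$type <- unname(type_codes[toy$type])

table(toy_num$type, useNA = "ifany")
```

In this sketch, missing categories stay `NA` in R. The `-99` code only appears in the exported .sav file.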

Available Datasets

Dataset	Description	Source
`superman`	Superman actor data with ratings	Rotten Tomatoes, Letterboxd, IMDb
`superman_smes`	SMES ratings by height gap and age difference	Simulated
`superman_movies`	Superman film box office data	IMDb Box Office Mojo
`hotones`	Hot Ones guest data	First We Feast (YouTube)
`hotones_sauces`	Hot sauce data by season and position	First We Feast (YouTube)
`hotones_episodes`	Episode-level YouTube engagement metrics	First We Feast (YouTube)
`tip_jokes`	Tipping experiment	Gueguen (2002)
`mcu`	MCU films box office and ratings	IMDb via openintro
`mock_jury`	Mock jury sentencing	Plaster (1989)
`candy`	Candy power rankings — full	FiveThirtyEight (2017)
`candy_simple`	Candy power rankings — simplified	FiveThirtyEight (2017)
`football`	Football concussion brain measurements	Singh et al. (2014), JAMA
`huskers`	Nebraska football box scores (1962–2024)	Historical records
`cheese_data`	Cheese characteristics and nutrition	cheese.com via TidyTuesday
`lpd_data`	Lincoln Police Dept. traffic stops	LPD Open Data Portal
`parent_child_data`	Parent-child observational study	Simulated
`hindsight_mg_data`	Hindsight bias — between-groups (long)	Simulated
`hindsight_wg_data`	Hindsight bias — within-groups (wide)	Simulated
`interpersonal_data`	Interpersonal relationships survey scales	Simulated
`self_descriptive_data`	Personality survey scales	Simulated

Exporting for Answer Keys vs. Student Data

Export the full dataset as an instructor answer key, and a subset as the student version:

library(dplyr)

# Full dataset — answer key with all variables
export_football_sav("football_answer_key.sav")

# Student version — only the variables they need
football |>
  select(group, volume) |>
  export_sav(path = "football_student.sav")

R vs SPSS: What’s Different?

Aspect	Raw R data	After `prep_*()`	SPSS `.sav` export
Categorical values	Character strings (`"Film"`, `"Control"`)	Numeric codes (1, 2, 3)	Numeric codes with value labels
Missing values	`NA`	`NA`	`-99`, declared as a user-defined missing value
Variable descriptions	Help files (`?superman`)	Help files	Variable labels in Variable View
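One practical consequence of the missing-value row: if an exported file is opened by a tool that ignores SPSS missing-value definitions, `-99` arrives as an ordinary number. The base R sketch below uses made-up values (not taken from any package dataset) to show why `-99` should be recoded to `NA` before summarizing in that situation.

```r
# Toy values as they might look if -99 were read in as a plain number
volume_raw <- c(6.2, 7.1, -99, 5.8)

# Treating -99 as real data badly distorts the mean
mean(volume_raw)

# Recode the sentinel to NA, then summarize the observed values only
volume_clean <- replace(volume_raw, volume_raw == -99, NA)
mean(volume_clean, na.rm = TRUE)
```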

Example: All Three Workflows

library(dplyr)

# Workflow 1: Raw data — human-readable, great for plots
football |>
  count(group)

# A tibble: 3 × 2
  group                        n
  <chr>                    <int>
1 Control                     25
2 Football no concussion      25
3 Football with concussion    25

# Workflow 2: Prep for numeric analysis matching SPSS
football_num <- prep_football(football)
football_num |>
  count(group)   # 1, 2, 3

# A tibble: 3 × 2
  group     n
  <dbl> <int>
1     1    25
2     2    25
3     3    25

# Run a one-way ANOVA; factor() makes R treat the numeric codes as categories
model <- aov(volume ~ factor(group), data = football_num)
summary(model)

              Df Sum Sq Mean Sq F value   Pr(>F)
factor(group)  2  44.35  22.174   31.47 1.51e-10 ***
Residuals     72  50.73   0.705
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Workflow 3: Export to SPSS with full labels and -99 for missing
export_football_sav("football_data.sav")

# Export a subset of variables
superman |>
  select(num, media, type, clark_height, rt_critics_score) |>
  export_sav(path = "superman_subset.sav")
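The ANOVA in Workflow 2 wraps `group` in factor(), and that wrapper matters once the categories are numeric codes. The sketch below uses simulated data (the `toy` data frame and its means are invented, not the football dataset) to show that a bare numeric predictor is fit as a single linear trend with 1 degree of freedom, while factor() or the original character labels give the one-way ANOVA with 2.

```r
set.seed(350)

# Simulated stand-in: 3 groups of 10, coded 1/2/3 as after a prep_*() call
toy <- data.frame(
  group = rep(c(1, 2, 3), each = 10),
  volume = c(rnorm(10, mean = 7.5), rnorm(10, mean = 6.5), rnorm(10, mean = 5.8))
)
# Character labels corresponding to the numeric codes
toy$group_label <- c("Control", "Football no concussion",
                     "Football with concussion")[toy$group]

fit_numeric <- aov(volume ~ group, data = toy)          # codes treated as a number
fit_factor  <- aov(volume ~ factor(group), data = toy)  # codes treated as categories
fit_label   <- aov(volume ~ group_label, data = toy)    # character labels

summary(fit_numeric)[[1]]$Df   # 1 28
summary(fit_factor)[[1]]$Df    # 2 27

# The factor() model and the character-label model give the same F statistic
f_factor <- summary(fit_factor)[[1]][["F value"]][1]
f_label  <- summary(fit_label)[[1]][["F value"]][1]
all.equal(f_factor, f_label)   # TRUE
```

Forgetting factor() does not raise an error, so checking that the group term has (number of groups - 1) degrees of freedom is a quick sanity check.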

What’s Included in SPSS Exports?

Variable labels describing each column
Value labels for categorical variables (e.g., 1 = “Control”, 2 = “Football no concussion”)
Missing values coded as -99 with SPSS missing value definitions
Proper SPSS formats for all variable types
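For readers curious how these pieces can be represented on the R side, here is a small sketch using the haven package's `labelled_spss()` class. Whether psych350data builds its exports this way is an assumption. The variable label and the code for the third group are also illustrative guesses.

```r
library(haven)  # assumption: haven is installed; the package may work differently

group <- labelled_spss(
  c(1, 2, 3, -99),
  labels = c(
    "Control" = 1,
    "Football no concussion" = 2,
    "Football with concussion" = 3   # assumed code for the third group
  ),
  na_values = -99,                   # declare -99 as user-defined missing
  label = "Participant group"        # hypothetical variable label
)

attr(group, "label")      # variable label
attr(group, "labels")     # value labels
attr(group, "na_values")  # user-defined missing code
is.na(group)              # the -99 entry is reported as missing
```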

Auditing an Export

# Export, then audit the file (by default only variables with issues are shown)
export_huskers_sav("huskers_data.sav")
check_sav_export("huskers_data.sav")

# Show all variables, not just issues
check_sav_export("huskers_data.sav", show_all = TRUE)

# Read back for analysis
huskers_clean <- get_spss_data("huskers_data.sav")

Documentation

vignette("getting-started") — Introduction and all three workflows, including psych350lab compatibility
vignette("exporting-spss") — Detailed SPSS export guide with answer key vs. student workflows
vignette("dataset-reference") — Complete reference for all datasets with source attribution
?dataset_name — Help for any specific dataset

License

MIT