Dataset Reference

A complete reference for all datasets in the psych350data package. Each entry includes a description, source attribution, variable overview, and the categorical variable codes used in SPSS exports and after prep_*().

For guidance on how to use these datasets, see the Getting Started vignette.

Quick Reference

Dataset	Related Functions	Source
`superman`	`prep_superman()` `label_superman()` `export_superman_sav()`	Rotten Tomatoes, Letterboxd, IMDb
`superman_smes`	`prep_superman_smes()` `label_superman_smes()` `export_superman_smes_sav()`	Simulated
`superman_movies`	`prep_superman_movies()` `label_superman_movies()` `export_superman_movies_sav()`	IMDb Box Office Mojo
`hotones`	`prep_hotones()` `label_hotones()` `export_hotones_sav()`	Hot Ones / First We Feast (YouTube)
`hotones_sauces`	`prep_hotones_sauces()` `label_hotones_sauces()` `export_hotones_sauces_sav()`	Hot Ones / First We Feast (YouTube)
`hotones_episodes`	`prep_hotones_episodes()` `label_hotones_episodes()` `export_hotones_episodes_sav()`	Hot Ones / First We Feast (YouTube)
`tip_jokes`	`prep_tip_jokes()` `label_tip_jokes()` `export_tip_jokes_sav()`	Gueguen (2002)
`mcu`	`prep_mcu()` `label_mcu()` `export_mcu_sav()`	IMDb via openintro package
`mock_jury`	`prep_mock_jury()` `label_mock_jury()` `export_mock_jury_sav()`	Plaster (1989)
`candy`	`prep_candy()` `label_candy()` `export_candy_sav()`	FiveThirtyEight
`candy_simple`	`prep_candy_simple()` `label_candy_simple()` `export_candy_simple_sav()`	FiveThirtyEight
`football`	`prep_football()` `label_football()` `export_football_sav()`	Singh et al. (2014), JAMA
`huskers`	`prep_huskers()` `label_huskers()` `export_huskers_sav()`	Historical records
`cheese_data`	`prep_cheese()` `label_cheese()` `export_cheese_sav()`	cheese.com via TidyTuesday
`lpd_data`	`prep_lpd()` `label_lpd()` `export_lpd_sav()`	Lincoln Police Dept. Open Data
`parent_child_data`	`prep_parent_child()` `label_parent_child()` `export_parent_child_sav()`	Simulated
`hindsight_mg_data`	`prep_hindsight_bg()`¹ `label_hindsight_mg()` `export_hindsight_mg_sav()`	Simulated
`hindsight_wg_data`	`prep_hindsight_wg()` `label_hindsight_wg()` `export_hindsight_wg_sav()`	Simulated
`interpersonal_data`	`prep_interpersonal()` `label_interpersonal()` `export_interpersonal_sav()`	Simulated
`self_descriptive_data`	`prep_self_descriptive()` `label_self_descriptive()` `export_selfdescriptive_sav()`	Simulated

Note

A combined Superman dataset (superman_combined) can be exported with export_superman_combined_sav(), which joins superman_movies with the superman actor data using join_superman_data().

Use list_datasets() to see all available dataset names from the R console.

How Categorical Variables Work

In R (raw): Categorical variables use human-readable character values like "Film", "Control", or "Minimal". These are ideal for plotting and exploration.

After prep_*(): Character categories are replaced with numeric codes that match SPSS. Use this when your R output needs to match SPSS output or when working with psych350lab functions.

In SPSS exports: Numeric codes are stored with value labels, so SPSS displays both the number and the category name.

# Raw R data: human-readable categories
superman |>
  select(num, media, type, age_grp) |>
  head(4)
#> # A tibble: 4 × 4
#>     num media               type    age_grp
#>   <int> <chr>               <chr>   <chr>  
#> 1     1 Superman            Film    Average
#> 2     2 Superman: The Movie Film    Average
#> 3     3 Smallville          TV Show Minimal
#> 4     4 Superman Returns    Film    Average

# After prep: numeric codes matching SPSS
prep_superman(superman) |>
  select(num, media, type, age_grp) |>
  head(4)
#> # A tibble: 4 × 4
#>     num media                type age_grp
#>   <int> <chr>               <dbl>   <dbl>
#> 1     1 Superman                1       2
#> 2     2 Superman: The Movie     1       2
#> 3     3 Smallville              2       1
#> 4     4 Superman Returns        1       2
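
The recoding step can be pictured as a lookup from category label to numeric code. The sketch below uses base R only and is not the package's implementation of prep_superman(); it applies the age_grp codes listed in the Superman Actor Data section, and the object names are hypothetical.

```r
# Illustrative values copied from the first four rows of `superman` shown above
age_grp_raw <- c("Average", "Average", "Minimal", "Average")

# Code book for age_grp as listed in this reference:
# 1 = Minimal, 2 = Average, 3 = Big
age_grp_levels <- c("Minimal", "Average", "Big")

# match() returns each value's position in the code book, i.e. its numeric code
age_grp_code <- match(age_grp_raw, age_grp_levels)
age_grp_code
#> [1] 2 2 1 2
```

The result agrees with the age_grp column in the prep_superman() output above.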

Superman Actor Data

Dataset: superman | 11 rows, 27 columns

Physical characteristics and ratings data for actors who have played Superman across film, TV, and serial media. Includes actor heights, Rotten Tomatoes scores, Letterboxd ratings, and popularity metrics.

Source: Compiled from Rotten Tomatoes, Letterboxd, and IMDb.

glimpse(superman)
#> Rows: 11
#> Columns: 27
#> $ type              <chr> "Film", "Film", "TV Show", "Film", "Film", "Film", "…
#> $ media             <chr> "Superman", "Superman: The Movie", "Smallville", "Su…
#> $ year              <dbl> 2025, 1978, 2001, 2006, 1951, 2013, 1948, 2021, 1993…
#> $ clark_actor       <chr> "David Corenswet", "Christopher Reeve", "Tom Welling…
#> $ clark_height      <dbl> 1.93, 1.93, 1.90, 1.89, 1.86, 1.85, 1.85, 1.82, 1.81…
#> $ lois_actor        <chr> "Rachel Brosnahan", "Margot Kidder", "Erica Durance"…
#> $ lois_height       <dbl> 1.60, 1.72, 1.71, 1.65, 1.63, 1.63, 1.62, 1.68, 1.68…
#> $ rt_critics_score  <dbl> 83, 88, 78, 72, NA, 57, 83, 88, 86, NA, NA
#> $ rt_critics_count  <dbl> 484, 121, 111, 290, 7, 340, 484, 55, 20, NA, NA
#> $ rt_audience_score <dbl> 90, 86, 72, 60, 79, 75, 90, 84, 86, NA, NA
#> $ rt_audience_count <dbl> 25000, 250000, 2500, 250000, 250, 250000, 25000, 100…
#> $ ldb_likes         <dbl> 1105511, 99115, NA, 26076, 744, 204463, NA, NA, NA, …
#> $ ldb_scores        <dbl> 3.9, 3.7, NA, 2.7, 2.6, 3.0, NA, NA, NA, NA, NA
#> $ num               <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
#> $ clark_age         <dbl> 32.01, 26.22, 24.47, 26.72, 37.84, 30.11, 37.24, 33.…
#> $ lois_age          <dbl> 35.00, 30.16, 23.32, 23.48, 24.81, 38.82, 27.11, 40.…
#> $ age_diff          <dbl> 2.99, 3.94, 1.15, 3.24, 13.03, 8.71, 10.13, 6.65, 1.…
#> $ age_grp           <chr> "Average", "Average", "Minimal", "Average", "Big", "…
#> $ clark_height_in   <dbl> 75.9841, 75.9841, 74.8030, 74.4093, 73.2282, 72.8345…
#> $ lois_height_in    <dbl> 62.9920, 67.7164, 67.3227, 64.9605, 64.1731, 64.1731…
#> $ height_diff       <dbl> 12.9921, 8.2677, 7.4803, 9.4488, 9.0551, 8.6614, 9.0…
#> $ height_gap        <chr> "Big", "Big", "Average", "Big", "Big", "Big", "Big",…
#> $ clark_grp         <chr> "6ft or taller", "6ft or taller", "6ft or taller", "…
#> $ tomatometer       <chr> "Fresh", "Fresh", "Fresh", "Fresh", NA, "Rotten", "F…
#> $ rt_avg            <dbl> 86.5, 87.0, 75.0, 66.0, NA, 66.0, 86.5, 86.0, 86.0, …
#> $ rt_diff           <dbl> -86.71433, -85.91582, -65.62313, -59.84706, NA, -74.…
#> $ popular           <chr> "High", "Mid", NA, "Mid", "Low", "High", NA, NA, NA,…

Categorical Variables (SPSS codes)

Variable	Values
`type`	1 = Film, 2 = TV Series, 3 = Serial
`clark_grp`	1 = Under 6ft, 2 = 6ft or taller
`height_gap`	1 = Minimal, 2 = Average, 3 = Big
`age_grp`	1 = Minimal, 2 = Average, 3 = Big
`tomatometer`	1 = Rotten, 2 = Fresh
`popular`	1 = Low, 2 = Mid, 3 = High
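
Several columns in superman look like they are derived from others. The values in the glimpse() output are consistent with the calculations sketched below, but this is inferred from the printed data rather than taken from the package source; the conversion factor of 39.37 inches per metre is an assumption.

```r
# First three rows of glimpse(superman)
clark_height      <- c(1.93, 1.93, 1.90)   # metres
lois_height       <- c(1.60, 1.72, 1.71)   # metres
rt_critics_score  <- c(83, 88, 78)
rt_audience_score <- c(90, 86, 72)

# Assumed conversion factor: 39.37 inches per metre
m_to_in <- 39.37

clark_height_in <- clark_height * m_to_in
lois_height_in  <- lois_height * m_to_in
height_diff     <- clark_height_in - lois_height_in

# Mean of the critics and audience scores
rt_avg <- (rt_critics_score + rt_audience_score) / 2

round(height_diff, 4)
#> [1] 12.9921  8.2677  7.4803
rt_avg
#> [1] 86.5 87.0 75.0
```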

Superman SMES Data

Dataset: superman_smes | 47 rows, 7 columns

Simulated participant ratings on the Subjective Media Experience Scale (SMES), grouped by both the height gap and the age difference between the Superman and Lois Lane actors. The emotion variable is stored as a numeric code and must be converted to a factor during data prep; all other variables are already numeric.

Source: Simulated data for teaching purposes.

glimpse(superman_smes)
#> Rows: 47
#> Columns: 7
#> $ num                  <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15…
#> $ height_gap           <int> 3, 3, 3, 3, 1, 1, 3, 3, 3, 2, 2, 1, 2, 3, 1, 3, 3…
#> $ age_grp              <int> 2, 3, 2, 2, 1, 3, 2, 2, 2, 1, 1, 3, 1, 3, 1, 3, 3…
#> $ emotional_impact     <dbl> 10, 10, NA, 12, 12, 17, 14, 16, 14, 18, 16, 8, 13…
#> $ aesthetic_appeal     <dbl> 14, 9, 6, 13, 12, 6, 15, 7, 10, 8, 9, 9, 11, 11, …
#> $ cognitive_engagement <dbl> 5.4, 4.5, 4.5, 4.5, 3.6, 3.7, 4.7, 4.7, 4.0, 3.9,…
#> $ emotion              <int> 1, 2, 4, 2, 2, 2, 6, 5, 2, 2, 3, 6, 2, 4, 5, 4, 1…

Categorical Variables (SPSS codes)

Variable	Values
`height_gap`	1 = Minimal, 2 = Average, 3 = Big
`age_grp`	1 = Minimal, 2 = Average, 3 = Big
`emotion`	1 = Fear, 2 = Joy, 3 = Sadness, 4 = Anger, 5 = Disgust, 6 = Anxiety
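
As a minimal sketch of the factor conversion for emotion, the base R call below attaches the labels from the table above to the numeric codes. It is not the body of prep_superman_smes(), and the object names are hypothetical.

```r
# First eight emotion codes from glimpse(superman_smes)
emotion <- c(1L, 2L, 4L, 2L, 2L, 2L, 6L, 5L)

# Labels in code order, taken from the table above
emotion_labels <- c("Fear", "Joy", "Sadness", "Anger", "Disgust", "Anxiety")

# levels = 1:6 ties each label to its code, even for codes absent from this sample
emotion_f <- factor(emotion, levels = 1:6, labels = emotion_labels)

as.character(emotion_f[1:3])
#> [1] "Fear"  "Joy"   "Anger"
```

Setting levels explicitly keeps the code-to-label mapping fixed regardless of which codes occur in the data.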

Scale Information

Subscale	Range
`emotional_impact`	4–20
`aesthetic_appeal`	3–15
`cognitive_engagement`	0–7

Superman Movies Data

Dataset: superman_movies | 8 rows, 21 columns

Box office and production data for Superman theatrical films, including budget, domestic and international grosses, and MPAA ratings.

Source: IMDb Box Office Mojo.

glimpse(superman_movies)
#> Rows: 8
#> Columns: 21
#> $ imdb_id             <chr> "tt5950044", "tt0078346", "tt0770828", "tt0348150"…
#> $ title               <chr> "Superman", "Superman: The Movie", "Man of Steel",…
#> $ year                <int> 2025, 1978, 2013, 2006, 1980, 1983, 1987, 2016
#> $ description         <chr> "Superman must reconcile his alien Kryptonian heri…
#> $ domestic_gross      <dbl> 354.22380, 134.47845, 291.04552, 200.08119, 108.18…
#> $ domestic_pct        <dbl> 57.3, 44.8, 43.4, 51.2, 50.0, 74.7, 51.8, 37.8
#> $ international_gross <dbl> 264.5000, 166.0000, 379.1000, 191.0000, 108.2000, …
#> $ international_pct   <dbl> 42.7, 55.2, 56.6, 48.8, 50.0, 25.3, 48.2, 62.2
#> $ worldwide_gross     <dbl> 618.72380, 300.47845, 670.14552, 391.08119, 216.38…
#> $ distributor         <chr> "Warner Bros.", "Warner Bros.", "Warner Bros.", "W…
#> $ opening_weekend     <dbl> 125.021735, 7.465343, 116.619362, 52.535096, 14.10…
#> $ budget              <dbl> 225, 55, 225, 270, 54, 39, 17, 250
#> $ release_date        <chr> "7/11/25", "12/15/78", "6/12/13", "6/28/06", "6/19…
#> $ mpaa                <chr> "PG-13", "PG", "PG-13", "PG-13", "PG", "Unrated", …
#> $ runtime_min         <int> 129, 143, 143, 154, 127, 125, 90, 151
#> $ genres              <chr> "Action Adventure Fantasy Sci-Fi", "Action Adventu…
#> $ poster_url          <chr> "https://m.media-amazon.com/images/M/MV5BZjFhZmU5N…
#> $ clark_actor         <chr> "David Corenswet", "Christopher Reeve", "Henry Cav…
#> $ roi                 <dbl> 1.7498836, 4.4632445, 1.9784245, 0.4484489, 3.0071…
#> $ budget_cat          <chr> "High", "Medium", "High", "High", "Medium", "Low",…
#> $ box_office_cat      <chr> "High", "Medium", "High", "Medium", "Medium", "Low…

Categorical Variables (SPSS codes)

Variable	Values
`mpaa`	1 = G, 2 = PG, 3 = PG-13, 4 = R
`budget_cat`	1 = Low, 2 = Medium, 3 = High (tercile-based)
`box_office_cat`	1 = Low, 2 = Medium, 3 = High (tercile-based)
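
The gross, percentage, and roi columns appear to be linked by simple arithmetic. The sketch below reproduces the first two rows of the glimpse() output under that assumption; the formulas are inferred from the printed values, not taken from the package source, and the dollar amounts are assumed to be in millions.

```r
# First two rows of glimpse(superman_movies)
domestic_gross      <- c(354.22380, 134.47845)
international_gross <- c(264.5, 166.0)
budget              <- c(225, 55)

# Assumed relationships between the columns
worldwide_gross <- domestic_gross + international_gross
domestic_pct    <- round(100 * domestic_gross / worldwide_gross, 1)
roi             <- (worldwide_gross - budget) / budget

domestic_pct
#> [1] 57.3 44.8
round(roi, 4)
#> [1] 1.7499 4.4632
```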

Hot Ones Guest Data

Dataset: hotones | Export: export_hotones_sav()

Data on guests from the YouTube show Hot Ones, including demographic information, Scoville ratings for each sauce consumed (SHU_1 through SHU_10), and whether the guest succeeded in eating all ten wings.

Source: Hot Ones / First We Feast (YouTube).

glimpse(hotones)
#> Rows: 369
#> Columns: 25
#> $ subn        <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
#> $ name        <chr> "Tony Yayo", "Anthony Rizzo", "Machine Gun Kelly", "Gunpla…
#> $ gender      <chr> "Male", "Male", "Male", "Male", "Male", "Male", "Male", "M…
#> $ age         <dbl> 36.94795, 25.75890, 28.04932, 35.93151, 39.44658, 26.77808…
#> $ occupation  <chr> "Rapper", "Athlete", "Rapper", "Rapper", "Rapper", "Rapper…
#> $ wing_total  <dbl> 4.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 3.0, 10.0, 10.0, …
#> $ alt_food    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ helpers     <chr> "Water", "Milk", "Beer,Milk,Water", "Milk", "Milk,Water", …
#> $ SHU_1       <dbl> 747, 747, 747, 747, 747, 747, 747, 747, 2200, 2200, 2200, …
#> $ SHU_2       <dbl> 3600, 3600, 3600, 3600, 3600, 3600, 3600, 3600, 3000, 3000…
#> $ SHU_3       <dbl> 5790, 5790, 5790, 5790, 5790, 5790, 5790, 5790, 5790, 5790…
#> $ SHU_4       <dbl> 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 13…
#> $ SHU_5       <dbl> 13000, 13000, 13000, 13000, 13000, 13000, 13000, 13000, 15…
#> $ SHU_6       <dbl> 40600, 40600, 40600, 40600, 40600, 40600, 40600, 40600, 34…
#> $ SHU_7       <dbl> 30000, 30000, 30000, 30000, 30000, 30000, 30000, 30000, 40…
#> $ SHU_8       <dbl> 57000, 57000, 57000, 57000, 57000, 57000, 57000, 57000, 13…
#> $ SHU_9       <dbl> 180000, 180000, 180000, 180000, 180000, 180000, 180000, 18…
#> $ SHU_10      <dbl> 357000, 357000, 357000, 357000, 357000, 357000, 357000, 35…
#> $ result      <chr> "Failed", "Succeeded", "Succeeded", "Succeeded", "Succeede…
#> $ appearances <dbl> 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1…
#> $ season      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
#> $ order       <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 10, …
#> $ views       <dbl> 1.371525, 0.825218, 3.164341, 1.164724, 1.143213, 0.861112…
#> $ likes       <dbl> 18108, 11301, 49026, 15473, 13761, 9627, 7572, 79735, 9395…
#> $ comments    <dbl> 1011, 1041, 3609, 1226, 1148, 681, 595, 24666, 740, 1669, …

Categorical Variables (SPSS codes)

Variable	Values
`gender`	1 = Male, 2 = Female
`result`	1 = Succeeded, 2 = Failed, 3 = Incomplete
`occupation`	1 = Rapper, 2 = Athlete, 3 = Actor, 4 = Actor-Comedian, 5 = Comedian, 6 = Chef, 7 = Actor-Musician, 8 = Musician, 9 = DJ, 10 = YouTuber, 11 = Model, 12 = Wrestler, 13 = Magician, 14 = Other

Hot Ones Sauces Data

Dataset: hotones_sauces | Export: export_hotones_sauces_sav()

Data on the hot sauces used in each season and position of Hot Ones, including Scoville Heat Unit (SHU) ratings. All variables are numeric or character, so no categorical conversion is needed.

Source: Hot Ones / First We Feast (YouTube).

glimpse(hotones_sauces)
#> Rows: 250
#> Columns: 4
#> $ season     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…
#> $ order      <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1…
#> $ sauce_name <chr> "Texas Pete Original Hot Sauce", "Cholula Original Hot Sauc…
#> $ SHU        <dbl> 747, 3600, 5790, 15000, 13000, 40600, 30000, 57000, 180000,…

Hot Ones Episodes Data

Dataset: hotones_episodes | Export: export_hotones_episodes_sav()

Episode-level data from Hot Ones including guest names, episode titles, and YouTube engagement metrics (views, likes, comments). All variables are numeric or character — no categorical conversion needed.

Source: Hot Ones / First We Feast (YouTube).

glimpse(hotones_episodes)
#> Rows: 345
#> Columns: 11
#> $ season            <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
#> $ order             <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1…
#> $ guest             <chr> "Tony Yayo", "Anthony Rizzo", "Machine Gun Kelly", "…
#> $ episode_title     <chr> "Tony Yayo Talks Shmoney Dance & Eminem's Taco Habit…
#> $ publish_date      <dbl> 42075, 42136, 42166, 42178, 42227, 42242, 42284, 422…
#> $ views             <dbl> 1.371525, 0.825218, 3.164341, 1.164724, 1.143213, 0.…
#> $ likes             <dbl> 18108, 11301, 49026, 15473, 13761, 9627, 7572, 79735…
#> $ comments          <dbl> 1011, 1041, 3609, 1226, 1148, 681, 595, 24666, 740, …
#> $ short_description <chr> "First We Feast videos offer an iconoclastic view in…
#> $ img               <chr> "https://i.ytimg.com/vi/aGhqumcE6_w/hqdefault.jpg", …
#> $ video_id          <chr> "aGhqumcE6_w", "4iSCOtYs_6Q", "H7pSH4YL-T4", "e5Ipfn…

Tip-Jokes Experiment Data

Dataset: tip_jokes | 211 rows, 5 columns

Experimental data examining whether a waiter leaving a joke or an advertisement on a card affects tipping behavior. All variables are already stored as numeric codes.

Source: Gueguen, N. (2002). The effects of a joke on tipping when it is delivered at the same time as the bill. Journal of Applied Social Psychology, 32(9), 1955–1963.

glimpse(tip_jokes)
#> Rows: 211
#> Columns: 5
#> $ card <dbl> 3, 2, 1, 3, 3, 3, 1, 1, 3, 3, 3, 1, 3, 1, 2, 2, 2, 3, 2, 3, 1, 1,…
#> $ tip  <dbl> 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1,…
#> $ ad   <dbl> 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1,…
#> $ joke <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0,…
#> $ none <dbl> 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0,…

Variables (SPSS codes)

Variable	Values
`card`	1 = Advertisement, 2 = Joke, 3 = None
`tip`	0 = No, 1 = Yes
`ad`	0 = No, 1 = Yes (received advertisement)
`joke`	0 = No, 1 = Yes (received joke)
`none`	0 = No, 1 = Yes (no card)
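
The ad, joke, and none columns are dummy (0/1) versions of card. A minimal base R sketch of that relationship, using the first eight card values from the glimpse() output:

```r
# First eight card values from glimpse(tip_jokes)
card <- c(3, 2, 1, 3, 3, 3, 1, 1)

# One indicator per card condition: 1 = Advertisement, 2 = Joke, 3 = None
ad   <- as.numeric(card == 1)
joke <- as.numeric(card == 2)
none <- as.numeric(card == 3)

ad
#> [1] 0 0 1 0 0 0 1 1
joke
#> [1] 0 1 0 0 0 0 0 0
none
#> [1] 1 0 0 1 1 1 0 0
```

Each row has exactly one indicator equal to 1, so the three columns always sum to 1.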

MCU Films Data

Dataset: mcu | 23 rows, 11 columns

Box office performance and Rotten Tomatoes scores for Marvel Cinematic Universe films through the Infinity Saga (Phases 1–3).

Source: Internet Movie Database (IMDb), via the openintro package.

glimpse(mcu)
#> Rows: 23
#> Columns: 11
#> $ movie              <chr> "Iron Man", "The Incredible Hulk", "Iron Man 2", "T…
#> $ length_hrs         <dbl> 2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 2, 2, …
#> $ length_min         <dbl> 6, 52, 4, 55, 4, 23, 10, 52, 16, 1, 21, 57, 27, 55,…
#> $ release_date       <dttm> 2008-05-02, 2008-06-12, 2010-05-07, 2011-05-06, 20…
#> $ opening_weekend_us <dbl> 98618668, 55414050, 128122480, 65723338, 65058524, …
#> $ gross_us           <dbl> 319034126, 134806913, 312433331, 181030624, 1766545…
#> $ gross_world        <dbl> 585796247, 264770996, 623933331, 449326618, 3705697…
#> $ phase              <chr> "Phase 1", "Phase 1", "Phase 1", "Phase 1", "Phase …
#> $ critics            <dbl> 94, 68, 72, 77, 80, 91, 79, 67, 90, 91, 75, 83, 90,…
#> $ audience           <dbl> 91, 69, 71, 76, 75, 91, 78, 74, 92, 92, 82, 85, 89,…
#> $ favor              <chr> "Critics", "Audience", "Critics", "Critics", "Criti…

Categorical Variables (SPSS codes)

Variable	Values
`phase`	1 = Phase 1, 2 = Phase 2, 3 = Phase 3
`favor`	1 = Critics, 2 = Audience

Mock Jury Data

Dataset: mock_jury | 114 rows, 17 columns

Data from a study examining the effects of defendant physical attractiveness on mock jury sentencing decisions. Participants rated defendants of varying attractiveness levels who were charged with either burglary or swindle. All categorical variables are already stored as numeric codes.

Source: Plaster, M. E. (1989). The effects of physical attractiveness on mock jury decisions. Unpublished manuscript, East Carolina University.

glimpse(mock_jury)
#> Rows: 114
#> Columns: 17
#> $ attr          <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ crime         <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ years         <dbl> 10, 3, 5, 1, 7, 7, 3, 7, 2, 3, 5, 1, 1, 1, 2, 2, 10, 10,…
#> $ serious       <dbl> 8, 8, 5, 3, 9, 9, 4, 4, 5, 2, 9, 2, 3, 2, 4, 6, 7, 8, 4,…
#> $ exciting      <dbl> 6, 9, 3, 3, 1, 1, 5, 4, 4, 6, 5, 6, 7, 4, 6, 7, 7, 6, 6,…
#> $ calm          <dbl> 9, 5, 4, 6, 1, 5, 6, 9, 8, 8, 3, 7, 8, 6, 8, 8, 4, 9, 4,…
#> $ independent   <dbl> 9, 9, 6, 9, 5, 7, 7, 2, 8, 7, 5, 9, 7, 6, 7, 6, 1, 9, 9,…
#> $ sincere       <dbl> 8, 3, 3, 8, 1, 5, 6, 9, 7, 5, 6, 7, 8, 7, 7, 7, 1, 8, 7,…
#> $ warm          <dbl> 5, 5, 6, 8, 8, 8, 7, 6, 1, 7, 8, 4, 8, 4, 5, 5, 1, 5, 7,…
#> $ phyattr       <dbl> 9, 9, 7, 9, 8, 8, 8, 5, 9, 8, 9, 9, 9, 7, 9, 8, 9, 9, 7,…
#> $ sociable      <dbl> 9, 9, 4, 9, 9, 9, 7, 2, 1, 9, 5, 4, 9, 7, 6, 6, 8, 9, 5,…
#> $ kind          <dbl> 9, 4, 2, 9, 4, 5, 5, 9, 5, 7, 7, 3, 7, 2, 9, 5, 1, 9, 7,…
#> $ intelligent   <dbl> 6, 9, 4, 9, 7, 8, 7, 9, 9, 9, 8, 7, 9, 3, 9, 7, 1, 6, 8,…
#> $ strong        <dbl> 9, 5, 5, 9, 9, 9, 5, 2, 7, 5, 2, 4, 8, 7, 3, 6, 1, 9, 6,…
#> $ sophisticated <dbl> 9, 5, 4, 9, 9, 9, 6, 2, 7, 6, 5, 7, 5, 2, 7, 7, 1, 9, 7,…
#> $ happy         <dbl> 5, 5, 5, 9, 8, 9, 5, 2, 6, 8, 2, 6, 7, 1, 6, 6, 1, 5, 4,…
#> $ ownPA         <dbl> 9, 7, 5, 9, 7, 9, 6, 5, 3, 6, 9, 9, 9, 9, 6, 9, 5, 9, 9,…

Categorical Variables (SPSS codes)

Variable	Values
`attr`	1 = Beautiful, 2 = Average, 3 = Unattractive
`crime`	1 = Burglary, 2 = Swindle

Candy Rankings Data

Full dataset: candy | 85 rows, 13 columns
Simplified: candy_simple | 85 rows, 5 columns

Candy power rankings based on 269,000 head-to-head matchups. The full dataset includes nine binary ingredient indicators, plus sugar percentile, price percentile, and win percentage. The simplified version keeps only competitorname, chocolate, sugarpercent, pricepercent, and winpercent.

Source: The Ultimate Halloween Candy Power Ranking, FiveThirtyEight (2017).

glimpse(candy_simple)
#> Rows: 85
#> Columns: 5
#> $ competitorname <chr> "100 Grand", "3 Musketeers", "One dime", "One quarter",…
#> $ chocolate      <chr> "Yes", "Yes", "No", "No", "No", "Yes", "Yes", "No", "No…
#> $ sugarpercent   <dbl> 0.732, 0.604, 0.011, 0.011, 0.906, 0.465, 0.604, 0.313,…
#> $ pricepercent   <dbl> 0.860, 0.511, 0.116, 0.511, 0.511, 0.767, 0.767, 0.511,…
#> $ winpercent     <dbl> 66.97173, 67.60294, 32.26109, 46.11650, 52.34146, 50.34…

Binary Variables (SPSS codes)

All binary ingredient variables: 0 = No, 1 = Yes.

Variables in the full dataset: chocolate, fruity, caramel, peanutyalmondy, nougat, crispedricewafer, hard, bar, pluribus.

Of these binary variables, the simplified dataset contains only chocolate.

Football Concussion Data

Dataset: football | 75 rows, 3 columns

Brain volume measurements comparing football players (with and without concussion history) to non-football-playing controls.

Source: Singh, R., Meier, T. B., Kuplicki, R., Savitz, J., Mukai, I., Cavanagh, L., Allen, T., Teague, T. K., Nerio, C., Polanski, D., & Bellgowan, P. S. F. (2014). Relationship of collegiate football experience and concussion with hippocampal volume and cognitive outcomes. JAMA, 311(18), 1883–1888.

glimpse(football)
#> Rows: 75
#> Columns: 3
#> $ group  <chr> "Control", "Control", "Control", "Control", "Control", "Control…
#> $ years  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ volume <dbl> 6.175, 6.220, 6.360, 6.465, 6.540, 6.780, 6.980, 7.075, 7.120, …

Categorical Variables (SPSS codes)

Variable	Values
`group`	1 = Control, 2 = Football no concussion, 3 = Football with concussion

Nebraska Football Box Scores

Dataset: huskers | Export: export_huskers_sav()

Game-level team statistics for all Nebraska Cornhuskers football games from September 1962 through the 2024 season. Includes scoring, rushing, passing, turnovers, penalties, point spreads, and weather data. This is a large dataset with 59 columns.

Source: Compiled from historical Nebraska football records. Weather data sourced from DarkSky API and Weather Underground.

glimpse(huskers)
#> Rows: 770
#> Columns: 59
#> $ date                 <dttm> 1962-09-22, 1962-09-29, 1962-10-06, 1962-10-13, …
#> $ time_ct              <dttm> 1899-12-31 14:00:00, 1899-12-31 12:30:00, 1899-1…
#> $ season               <dbl> 1962, 1962, 1962, 1962, 1962, 1962, 1962, 1962, 1…
#> $ opp                  <chr> "South Dakota", "Michigan", "Iowa State", "North …
#> $ site                 <chr> "Home", "Away", "Home", "Home", "Home", "Away", "…
#> $ conference           <chr> "Non-conference", "Non-conference", "Conference",…
#> $ opp_rank             <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 10, NA, NA, N…
#> $ ne_rank              <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ result               <chr> "Win", "Win", "Win", "Win", "Win", "Win", "Loss",…
#> $ opp_score            <dbl> 0, 13, 22, 14, 6, 6, 16, 16, 0, 34, 34, 7, 7, 7, …
#> $ ne_score             <dbl> 53, 25, 36, 19, 26, 31, 7, 40, 14, 6, 36, 58, 14,…
#> $ opp_score_q1         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_score_q2         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_score_q3         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_score_q4         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_score_ot         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_q1          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_q2          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_q3          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_q4          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_ot          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_rush_att         <dbl> 42, 43, 36, 44, 26, 42, 57, 30, 25, 50, 43, 32, 3…
#> $ opp_rush_yards       <dbl> 89, 170, 147, 130, 38, 94, 199, 122, 96, 191, 181…
#> $ ne_rush_att          <dbl> 42, 49, 62, 47, 61, 58, 34, 71, 72, 32, 42, 58, 5…
#> $ ne_rush_yards        <dbl> 313, 222, 234, 154, 317, 365, 141, 369, 272, 68, …
#> $ opp_pass_comp        <dbl> 4, 8, 7, 4, 7, 14, 1, 1, 4, 9, 24, 5, 9, 5, 9, 17…
#> $ opp_pass_att         <dbl> 8, 21, 14, 8, 22, 28, 5, 3, 12, 15, 46, 17, 23, 1…
#> $ opp_pass_yards       <dbl> 14, 83, 70, 48, 156, 150, 33, 11, 31, 182, 321, 4…
#> $ ne_pass_comp         <dbl> 11, 8, 9, 15, 6, 5, 2, 10, 6, 10, 9, 4, 4, 3, 5, …
#> $ ne_pass_att          <dbl> 17, 15, 15, 25, 15, 15, 14, 20, 13, 25, 14, 8, 8,…
#> $ ne_pass_yards        <dbl> 142, 119, 153, 202, 87, 90, 7, 149, 125, 130, 146…
#> $ opp_first_downs      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_first_downs       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_third_down_comp  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_third_down_att   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_third_down_comp   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_third_down_att    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_fourth_down_comp <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_fourth_down_att  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_fourth_down_comp  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_fourth_down_att   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_int              <dbl> 1, 0, 2, 0, 0, 1, 1, 1, 2, 1, 2, 0, 1, 2, 2, 4, 1…
#> $ opp_fum              <dbl> 0, 3, 1, 2, 3, 1, 0, 0, 0, 0, 2, 2, 0, 1, 0, 1, 3…
#> $ ne_int               <dbl> 0, 0, 0, 1, 3, 1, 3, 0, 1, 1, 0, 1, 1, 0, 2, 1, 0…
#> $ ne_fum               <dbl> 3, 2, 2, 0, 2, 1, 3, 1, 1, 0, 2, 2, 1, 0, 3, 3, 1…
#> $ opp_pen_num          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_pen_yards        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_pen_num           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_pen_yards         <chr> "65", "65", "55", "43", "45", "68", "15", "15", "…
#> $ opp_possession       <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ ne_possession        <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ spread               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, NA…
#> $ total                <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ temp                 <dbl> 66.7, 62.7, 64.7, 78.7, 56.8, 64.1, 41.8, 62.8, 3…
#> $ humidity             <dbl> 0.66, 0.43, 0.91, 0.70, 0.53, 0.15, 0.65, 0.35, 0…
#> $ wind_speed           <dbl> 6.9, 17.9, 10.3, 9.2, 9.2, 18.4, 5.8, 15.0, 11.6,…
#> $ wind_bearing         <dbl> 90, 293, 135, 225, 0, 45, 338, 158, 0, 293, 330, …
#> $ win                  <chr> "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "…
#> $ home                 <chr> "Yes", "No", "Yes", "Yes", "Yes", "No", "Yes", "N…

Categorical Variables (SPSS codes)

Variable	Values
`result`	1 = Win, 2 = Loss, 3 = Tie
`site`	1 = Home, 2 = Away, 3 = Neutral (home), 4 = Neutral (away)
`conference`	0 = Non-conference, 1 = Conference
`win`	0 = No, 1 = Yes (derived from `result`: Win → 1, else → 0)
`home`	0 = No, 1 = Yes (derived from `site`: Home/Neutral-home → 1, else → 0)
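
The derivation rules for win and home in the table above can be sketched in base R as follows. The input values are illustrative rather than rows from huskers, and the neutral-site strings are assumed to match the labels in the table.

```r
# Illustrative values, one per category (not actual rows from huskers)
result <- c("Win", "Win", "Loss", "Tie")
site   <- c("Home", "Away", "Neutral (home)", "Neutral (away)")

# Win -> 1, anything else (Loss, Tie) -> 0
win <- as.integer(result == "Win")

# Home and Neutral (home) -> 1, anything else -> 0
home <- as.integer(site %in% c("Home", "Neutral (home)"))

win
#> [1] 1 1 0 0
home
#> [1] 1 0 1 0
```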

Cheese Characteristics Data

Dataset: cheese_data | Export: export_cheese_sav()

Cleaned subset of the cheese.com dataset originally featured in TidyTuesday (June 2024). Contains cheeses with known calcium content, with fat content and milk source cleaned and recoded for teaching data entry and transformation.

Source: cheese.com via TidyTuesday 2024-06-04.

glimpse(cheese_data)
#> Rows: 25
#> Columns: 12
#> $ id              <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,…
#> $ cheese          <chr> "Affidelice au Chablis", "Amul Emmental", "Amul Gouda"…
#> $ url             <chr> "https://www.cheese.com/affidelice-au-chablis/", "http…
#> $ milk            <chr> "cow", "cow", "cow", "cow", "cow, water buffalo", "goa…
#> $ milk_source     <int> 1, 1, 1, 1, 5, 5, 1, 1, 1, 1, 1, 3, 5, 5, 1, 5, 1, 1, …
#> $ country         <chr> "France", "India", "India", "India", "India", "Greece"…
#> $ family          <chr> NA, "Swiss Cheese", "Gouda", "Mozzarella", "Cheddar", …
#> $ type            <chr> "soft", "semi-hard", "semi-hard", "semi-soft, processe…
#> $ vegetarian      <int> 0, 1, 1, 1, 1, NA, 0, 0, 0, 0, 1, 0, NA, 0, 0, 1, 1, N…
#> $ color           <chr> "orange", "yellow", "yellow", "yellow", "yellow", "whi…
#> $ fat_content     <dbl> 55.0, 46.0, 46.0, NA, 26.0, 30.0, 25.5, NA, NA, NA, NA…
#> $ calcium_content <dbl> 26, 488, 492, 492, 343, 318, 700, 450, 725, 430, 90, 1…

Categorical Variables (SPSS codes)

Variable	Values
`milk_source`	1 = Cow, 2 = Goat, 3 = Sheep, 4 = Buffalo, 5 = Multiple, 6 = Other
`vegetarian`	0 = No, 1 = Yes

Lincoln Police Department Traffic Stops

Dataset: lpd_data | Export: export_lpd_sav()

Traffic stop records from the Lincoln Police Department, compiled from a multi-sheet Excel file with one sheet per year and stacked into a single data frame. Date and time variables have been parsed and a time-of-day category added. All categorical variables are stored as numeric codes with SPSS-style value labels.

Source: Lincoln Police Department Open Data Portal.

glimpse(lpd_data)
#> Rows: 408,288
#> Columns: 11
#> $ year        <int> 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023…
#> $ race        <dbl> 1, 1, 1, 1, 1, 1, 4, 1, 2, 2, 1, 1, 1, 2, 1, 2, 1, 2, 3, 1…
#> $ sex         <dbl> 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1…
#> $ reason      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ outcome     <dbl> 1, 1, 1, 2, 1, 1, 3, 5, 2, 2, 2, 2, 4, 2, 2, 2, 2, 1, 2, 2…
#> $ search      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 1, 5, 1, 1, 1, 2, 1, 1, 1…
#> $ fid         <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
#> $ date        <date> 2023-01-01, 2023-01-01, 2023-01-01, 2023-01-01, 2023-01-0…
#> $ month       <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ time_of_day <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 2, 2…
#> $ time        <chr> "00:13", "00:17", "00:20", "00:47", "00:47", "00:53", "01:…

Categorical Variables (SPSS codes)

Variable	Values
`time_of_day`	1 = Morning (5am–noon), 2 = Afternoon (noon–5pm), 3 = Evening (5pm–9pm), 4 = Night (9pm–5am)
`race`	1 = White, 2 = Black or African American, 3 = Hispanic or Latino, 4 = Asian, 5 = American Indian or Alaska Native, 6 = Native Hawaiian or Pacific Islander, 7 = Two or More Races, 8 = Unknown
`sex`	1 = Male, 2 = Female, 3 = Unknown
`reason`	1 = Traffic probable cause, 2 = Criminal probable cause, 3 = Other
`outcome`	1 = Traffic warning, 2 = Traffic official, 3 = Criminal cite & release, 4 = Lodged in jail, 5 = None
`search`	1 = None, 2 = Incident to arrest, 3 = Inventory, 4 = Consent, 5 = Probable cause
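
As an illustration of how the time_of_day category relates to time, the hypothetical helper below maps "HH:MM" strings to the four codes. How the package treats the boundary hours is not stated here, so the sketch assumes each window includes its start hour and excludes its end hour.

```r
# Hypothetical helper: map "HH:MM" strings to the time_of_day codes listed above
time_of_day_code <- function(time) {
  hour <- as.integer(substr(time, 1, 2))
  ifelse(hour >= 5 & hour < 12, 1L,          # Morning: 5am up to noon
    ifelse(hour >= 12 & hour < 17, 2L,       # Afternoon: noon up to 5pm
      ifelse(hour >= 17 & hour < 21, 3L,     # Evening: 5pm up to 9pm
        4L)))                                # Night: 9pm up to 5am
}

tod <- time_of_day_code(c("00:13", "08:30", "12:00", "18:45", "21:00"))
tod
#> [1] 4 1 2 3 4
```

The first value, "00:13", maps to 4 (Night), matching the first row of the glimpse() output.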

Parent-Child Observation Data

Dataset: parent_child_data | 147 rows, 11 columns

Simulated observational data from a study examining the effects of a remediation treatment on parenting behaviours. Includes maternal and child demographics alongside coded behavioural counts (praise, directives, and negative behaviours) from structured observations. All categorical variables are already stored as numeric codes.

Source: Simulated data generated to resemble a plausible parent-child observational study.

glimpse(parent_child_data)
#> Rows: 147
#> Columns: 11
#> $ casenum   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1…
#> $ mage      <dbl> 19, 30, 31, 22, 29, 28, 24, 34, 29, 18, 34, 24, 32, 24, 22, …
#> $ magegroup <int> 1, 2, 2, 1, 2, 2, 1, 2, 2, 1, 2, 1, 2, 1, 1, 1, 1, 2, 1, 2, …
#> $ cage      <dbl> 3.000000, 2.000000, 2.166667, 2.000000, 4.750000, 2.750000, …
#> $ cagegroup <int> 1, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 2, 2, 2, 2, 1, 1, 2, 1, …
#> $ famtype   <int> 1, 1, 2, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 1, 1, 2, 1, 2, 2, 1, …
#> $ CLINREM   <int> 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, …
#> $ tx        <int> 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, …
#> $ praise    <dbl> 10, 1, 3, 10, 11, 4, 3, 6, 5, 7, 7, 6, 7, 11, 12, 5, 7, 7, 9…
#> $ direct    <dbl> 4, 6, 3, 6, 7, 5, 3, 6, 4, 6, 0, 6, 4, 4, 3, 4, 9, 4, 2, 5, …
#> $ negat     <dbl> 9, 7, 7, 8, 3, 7, 6, 2, 5, 5, 9, 3, 4, 3, 2, 7, 8, 8, 6, 5, …

Categorical Variables (SPSS codes)

Variable	Values
`magegroup`	1 = 18–27, 2 = 28–35
`cagegroup`	1 = 2–3 years, 2 = 4–5 years
`famtype`	1 = 2-parent family, 2 = Mother-only
`CLINREM`	0 = Remediation suggested, 1 = Not suggested
`tx`	0 = Control, 1 = Remediation
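
The grouped variables appear to be derived from their continuous counterparts. As a rough check, the sketch below rebuilds `magegroup` from `mage` for the first ten rows shown in the glimpse() output. The cut-off rule (27 and under versus 28 and over) is an assumption read off the table above, not a documented package rule.

```r
# First ten maternal ages and stored group codes, copied from the glimpse() output
mage      <- c(19, 30, 31, 22, 29, 28, 24, 34, 29, 18)
magegroup <- c(1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L)

# Assumed rule: ages up to 27 fall in group 1, ages 28 and over in group 2
derived <- ifelse(mage <= 27, 1L, 2L)

# Compare the derived codes with the stored codes
identical(derived, magegroup)
```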

Hindsight Bias Data — Between-Groups

Dataset: hindsight_mg_data | 600 rows, 7 columns

Simulated data from a hindsight bias study in which participants viewed celebrity faces and estimated recognition time at baseline and again after receiving an outcome cue. The between-groups factor is whether participants saw old or new faces in the hindsight phase. Each row represents one participant-face trial (60 participants × 10 faces). All categorical variables are already stored as numeric codes.

Source: Simulated data generated to illustrate hindsight bias in a between-groups design.

glimpse(hindsight_mg_data)
#> Rows: 600
#> Columns: 7
#> $ participant_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2…
#> $ face_id        <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, …
#> $ condition      <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ fame_level     <int> 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2…
#> $ score_1        <dbl> 6.280091, 7.756997, 7.826435, 10.903807, 9.235433, 32.1…
#> $ score_2        <dbl> 5.933358, 2.283858, 5.795783, 5.056316, 2.311482, 26.41…
#> $ correct        <int> 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0…

Categorical Variables (SPSS codes)

Variable	Values
`condition`	1 = Old, 2 = New
`fame_level`	1 = Extremely Famous, 2 = Moderately Famous
`correct`	0 = Incorrect, 1 = Correct
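
One simple way to look at the data is a per-trial difference between the two estimates. The sketch below uses the first five trials for participant 1, copied from the glimpse() output. It assumes that `score_1` is the baseline estimate and `score_2` the hindsight estimate, and the `shift` column is a hypothetical name introduced here for illustration.

```r
# First five trials for participant 1, copied from the glimpse() output
trials <- data.frame(
  participant_id = 1L,
  face_id        = 1:5,
  score_1        = c(6.280091, 7.756997, 7.826435, 10.903807, 9.235433),
  score_2        = c(5.933358, 2.283858, 5.795783, 5.056316, 2.311482)
)

# Assumed index: baseline estimate minus hindsight estimate
trials$shift <- trials$score_1 - trials$score_2

# Average shift across these five trials
mean(trials$shift)
```

Under that assumption, a positive value means the second estimate was shorter than the first.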

Hindsight Bias Data — Within-Groups

Dataset: hindsight_wg_data | 60 rows, 6 columns

Participant-level summary of the hindsight bias study, averaged across faces within each fame level. Each row is one participant with separate columns for extremely and moderately famous faces at baseline and hindsight phases. Suitable for a 2×2 repeated-measures or mixed ANOVA. All categorical variables are already stored as numeric codes.

Source: Simulated data generated to illustrate hindsight bias in a within-groups design.

glimpse(hindsight_wg_data)
#> Rows: 60
#> Columns: 6
#> $ participant_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, …
#> $ condition      <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ EXTREMEavg_1   <dbl> 8.400553, 7.868591, 7.324184, 7.676370, 7.051691, 8.242…
#> $ MODERATEavg_1  <dbl> 31.20088, 30.01027, 30.89756, 29.80273, 31.88939, 30.84…
#> $ EXTREMEavg_2   <dbl> 4.276159, 5.562062, 2.631611, 5.132468, 4.899965, 3.488…
#> $ MODERATEavg_2  <dbl> 27.32385, 23.89094, 25.56715, 27.47061, 26.04688, 24.42…

Categorical Variables (SPSS codes)

Variable	Values
`condition`	1 = Old, 2 = New
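
The participant-level columns can be reproduced from the trial-level data by averaging within participant and fame level. The sketch below does this for the five `fame_level` 1 trials of participant 1, copied from the glimpse(hindsight_mg_data) output. Using aggregate() here is an assumption for illustration and not necessarily how the package built this dataset.

```r
# Five fame_level 1 trials for participant 1, copied from glimpse(hindsight_mg_data)
trials <- data.frame(
  participant_id = 1L,
  fame_level     = 1L,
  score_1        = c(6.280091, 7.756997, 7.826435, 10.903807, 9.235433),
  score_2        = c(5.933358, 2.283858, 5.795783, 5.056316, 2.311482)
)

# Average both scores within each participant and fame level
wide <- aggregate(cbind(score_1, score_2) ~ participant_id + fame_level,
                  data = trials, FUN = mean)

# Rename to match the column naming used in hindsight_wg_data
names(wide)[3:4] <- c("EXTREMEavg_1", "EXTREMEavg_2")

round(wide[, c("EXTREMEavg_1", "EXTREMEavg_2")], 6)
```

The two means agree with the first `EXTREMEavg_1` and `EXTREMEavg_2` values in the glimpse() output above (8.400553 and 4.276159) to the printed precision.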

Interpersonal Relationships Survey Data

Dataset: interpersonal_data | 574 rows, 33 columns

Simulated survey data from college students at a predominantly white rural state university. Contains demographics, relationship variables, and scores on several interpersonal and psychological scales. Categorical variables are stored as character strings in the raw dataset; the numeric codes listed below are those used in SPSS exports and after prep_*().

Source: Simulated data generated to resemble plausible survey responses from undergraduate psychology students.

glimpse(interpersonal_data)
#> Rows: 574
#> Columns: 33
#> $ age       <int> 19, 19, 19, 19, 21, 19, 21, 21, 19, 23, 19, 21, 19, 19, 20, …
#> $ gender    <chr> "Female", "Female", "Female", "Female", "Male", "Female", "F…
#> $ sexorient <chr> "Straight or heterosexual", "Straight or heterosexual", "Str…
#> $ race      <chr> "White", "White", "White", "Other", "White", "White", "Indig…
#> $ hand      <chr> "Right", "Right", "Right", "Right", "Right", "Right", "Right…
#> $ community <chr> "Rural", "Rural", "Urban", "Rural", "Suburban", "Suburban", …
#> $ parentedu <chr> "Yes", "Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes",…
#> $ famclass  <chr> "Upper middle class", "Lower middle class", "Working class",…
#> $ faminc    <int> 84923, 60859, 36202, 114697, 90626, 74183, 35218, 103487, 34…
#> $ numsib    <int> 3, 0, 5, 2, 3, 6, 2, 2, 2, 4, 1, 0, 1, 3, 2, 1, 1, 2, 0, 1, …
#> $ move      <int> 0, 4, 1, 4, 2, 1, 3, 0, 2, 3, 1, 1, 4, 2, 2, 5, 4, 2, 4, 1, …
#> $ clsfrn    <int> 12, 8, 4, 7, 6, 3, 5, 5, 8, 3, 5, 7, 2, 4, 6, 5, 8, 13, 5, 4…
#> $ clsfrlst  <int> 11, 12, 3, 6, 5, 1, 6, 4, 9, 1, 3, 8, 0, 7, 3, 1, 11, 16, 6,…
#> $ greek_in  <chr> "Greek", "Independent", "Independent", "Independent", "Greek…
#> $ campus    <chr> "Yes", "No", "No", "Yes", "No", "No", "No", "Yes", "No", "No…
#> $ relsp     <chr> "Not in a relationship", "Not in a relationship", "In a mono…
#> $ rlength   <int> 11, 1, 7, 3, 5, 1, 4, 1, 6, 1, 7, 12, 9, 6, 4, 3, 1, 2, 1, 3…
#> $ serious   <int> NA, NA, 3, NA, 6, 5, 3, 6, 5, NA, NA, 7, 4, 2, NA, 3, NA, NA…
#> $ numrels   <int> 0, 0, 5, 0, 3, 3, 6, 6, 2, 0, 0, 4, 2, 5, 0, 2, 0, 4, 1, 0, …
#> $ gcb       <dbl> 2.36, 1.00, 2.75, 2.20, 2.01, NA, 2.10, 1.00, NA, 2.73, 3.47…
#> $ datdaq    <int> 59, 53, 51, 40, 37, NA, 34, 44, 56, 35, 38, 52, 43, 61, 61, …
#> $ assrtdaq  <int> 56, 33, 58, NA, 65, 57, 61, 46, 34, 57, 55, 54, 38, 37, 38, …
#> $ emorel    <int> 41, 42, 41, 40, 34, 33, 32, 44, 29, 43, 45, 37, 35, 34, 40, …
#> $ lacksc    <int> 30, 29, 34, 35, 38, 28, 37, 35, 29, 34, 35, 29, 30, 26, 33, …
#> $ auto      <int> 33, 37, 35, 27, 39, 42, 34, 37, 43, 24, 36, 41, 33, 22, 49, …
#> $ perspec   <int> 13, 20, NA, 20, 16, 17, 13, 25, 18, 18, 18, 8, 9, 22, 19, 9,…
#> $ fantasy   <int> 20, 19, 15, 17, 12, 8, 24, 15, 17, 8, 13, 14, 10, 18, 13, 22…
#> $ empath    <int> 15, 26, 16, 12, 17, 26, 15, 16, 20, 17, 21, 14, NA, 24, 20, …
#> $ distress  <int> 13, 16, 17, 13, 13, 14, 18, 9, 18, 9, 21, 6, 11, 14, 11, 21,…
#> $ polsoc    <int> 17, 16, 10, 17, 12, 11, 17, 19, 21, 18, 16, 21, 11, 15, 19, …
#> $ npolsoc   <int> 20, 22, 15, 18, 17, 20, 18, 27, 21, 21, 19, 22, 17, 17, 19, …
#> $ risc      <dbl> 4.19, 5.16, 4.68, 5.83, 3.64, 4.94, 5.50, 4.25, 4.95, 4.80, …
#> $ lsas      <int> 59, 27, 27, 59, 24, 27, 62, 41, 24, 39, 59, 57, 58, 59, 42, …

Categorical Variables (SPSS codes)

Variable	Values
`gender`	1 = Male, 2 = Female, 3 = Another
`race`	1 = Asian, 2 = Black, 3 = Indigenous, 4 = Latino/Hispanic, 5 = Middle Eastern, 6 = White, 7 = Other
`hand`	1 = Right, 2 = Left, 3 = Both
`community`	1 = Rural, 2 = Small town, 3 = Suburban, 4 = Urban
`famclass`	1 = Working, 2 = Lower, 3 = Lower middle, 4 = Upper middle, 5 = Upper
`greek_in`	1 = Independent, 2 = Greek
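
The glimpse() output shows these columns as character strings, while the table gives numeric codes. The sketch below shows one way the two relate, using the first six `gender` values from the output above. It is a base R illustration under the coding in the table, not the package's own prep code, and the object names are hypothetical.

```r
# First six gender values, copied from the glimpse() output
gender <- c("Female", "Female", "Female", "Female", "Male", "Female")

# Level order follows the codes in the table: 1 = Male, 2 = Female, 3 = Another
gender_levels <- c("Male", "Female", "Another")

# Converting the factor to integer yields the position of each value's level
gender_code <- as.integer(factor(gender, levels = gender_levels))

gender_code
```

Any value not listed in `gender_levels` would become NA, which makes unexpected categories easy to spot.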

Scale Variables

The following composite scale scores are included: DAQ (Dating and Assertion Questionnaire) subscales, IDI (Interpersonal Dependency Inventory) subscales, IRI (Interpersonal Reactivity Index) subscales, Sociability subscales, RISC (Relational-Interdependent Self-Construal), GCB (Generic Conspiracist Beliefs), and LSAS-SR (Liebowitz Social Anxiety Scale — Self Report).

Self-Descriptive Survey Data

Dataset: self_descriptive_data | 547 rows, 37 columns

Simulated survey data from college students at a predominantly white rural state university. Contains demographics, relationship variables, and scores on personality and self-concept measures. Categorical variables are stored as character strings in the raw dataset; the numeric codes are those used in SPSS exports and after prep_*().

Source: Simulated data generated to resemble plausible survey responses from undergraduate psychology students.

glimpse(self_descriptive_data)
#> Rows: 547
#> Columns: 37
#> $ age               <int> 20, 21, 23, 20, 22, 19, 22, 20, 20, 19, 19, 19, 19, …
#> $ gender            <chr> "Female", "Male", "Male", "Female", "Male", "Male", …
#> $ sexorient         <chr> "Straight or heterosexual", "Straight or heterosexua…
#> $ race              <chr> "White", "Latino/Hispanic", "White", "Black", "Indig…
#> $ hand              <chr> "Right", "Left", "Right", "Right", "Right", "Right",…
#> $ community         <chr> "Small town", "Rural", "Small town", "Suburban", "Ur…
#> $ parentedu         <chr> "Yes", "Yes", "No", "No", "No", "Yes", "Yes", "No", …
#> $ famclass          <chr> "Working class", "Upper middle class", "Upper middle…
#> $ faminc            <int> 29034, 72518, 105693, 32123, 23907, 158842, 26565, 6…
#> $ numsib            <int> 1, 0, 3, 2, 0, 2, 4, 2, 2, 4, 4, 3, 2, 3, 1, 1, 2, 1…
#> $ move              <int> 3, 0, 5, 1, 4, 2, 2, 2, 2, 2, 1, 3, 1, 4, 2, 2, 4, 2…
#> $ clsfrn            <int> 5, 6, 7, 6, 3, 8, 8, 4, 10, 3, 5, 2, 6, 4, 8, 4, 6, …
#> $ clsfrlst          <int> 1, 5, 6, NA, 0, 5, 9, 4, 8, 1, 6, 1, 5, 3, 14, 5, 8,…
#> $ greek_in          <chr> "Independent", "Independent", "Greek", "Independent"…
#> $ campus            <chr> "No", "Yes", "No", "No", "No", "No", "Yes", "No", "Y…
#> $ relsp             <chr> "Not in a relationship", "Not in a relationship", "I…
#> $ rlength           <int> 10, 4, 11, 1, 9, 8, 13, 5, 1, 8, 1, 1, 1, 8, 2, 10, …
#> $ serious           <int> NA, NA, 7, NA, 6, 5, 4, NA, 7, 5, NA, NA, NA, 3, NA,…
#> $ numrels           <int> 0, 4, 4, 3, 5, 5, 1, 0, 1, 2, 4, 0, 0, 4, 0, 3, 0, 6…
#> $ extraversion      <dbl> 6.0, 4.5, 6.0, 4.0, 4.5, 3.5, 2.0, 5.0, 5.0, NA, 2.5…
#> $ agreeableness     <dbl> 5.5, 4.0, 3.5, 4.0, NA, 7.0, 4.0, 4.5, 7.0, 4.0, 5.5…
#> $ conscientiousness <dbl> 7.0, 4.0, 5.5, 5.5, 5.5, 6.0, 3.5, 5.5, 6.5, 5.0, 5.…
#> $ emot_stability    <dbl> 5.5, 3.0, 5.0, 4.5, 4.0, 4.5, 5.0, 5.0, 4.0, 6.5, 6.…
#> $ openness          <dbl> 5.5, 6.5, 5.5, 4.0, 4.5, 5.0, 3.5, 4.5, 6.0, 4.5, 6.…
#> $ disc              <dbl> 5.98, 3.36, 2.45, 5.45, 2.21, 3.45, 4.87, 3.53, 4.64…
#> $ moral             <dbl> 4.15, 2.16, 4.66, 4.73, 3.81, 1.49, 3.99, 1.50, 3.74…
#> $ comp              <dbl> 5.73, 6.40, 6.05, 5.40, 2.53, 5.75, 4.53, 5.91, 4.22…
#> $ maas              <dbl> 5.29, 3.97, 4.39, 5.19, 2.85, 3.56, 4.46, 3.65, 4.20…
#> $ rse               <dbl> 36.81, 36.19, 30.39, 30.31, 37.52, 19.39, 28.22, 27.…
#> $ promote           <dbl> 2.53, 3.13, 3.31, 2.96, 3.92, 4.24, 2.02, 3.77, 2.44…
#> $ prevent           <dbl> 5.00, 3.03, 2.51, 4.24, 3.39, 2.76, 2.78, 2.02, 3.69…
#> $ atq               <int> 64, 59, 70, 41, 64, 63, 46, 55, 71, 39, 78, 69, 62, …
#> $ pmdc              <dbl> 2.62, 2.15, 2.49, 1.55, 2.53, 2.62, 1.94, 1.89, 2.46…
#> $ nsne              <dbl> 1.66, 2.03, 2.23, 1.35, 2.04, 1.91, 1.00, 1.83, NA, …
#> $ lse               <dbl> 1.48, 1.65, 2.17, 1.00, NA, 2.37, 1.04, 1.00, 1.80, …
#> $ help              <dbl> 2.51, 1.87, 2.40, 1.35, 2.01, 1.56, 1.90, NA, 2.55, …
#> $ ngse              <dbl> 32.66, 29.14, 25.78, 37.73, 29.66, 29.22, 36.58, 35.…

Categorical Variables (SPSS codes)

Same demographic coding as interpersonal_data (see above).

Scale Variables

The following composite scale scores are included: TIPI (Ten-Item Personality Inventory) Big Five subscales, MAAS (Mindful Attention Awareness Scale) subscales, RFQ (Regulatory Focus Questionnaire) subscales, ATQ-30 (Automatic Thoughts Questionnaire) subscales, RSE (Rosenberg Self-Esteem), and NGSE (New General Self-Efficacy).

Footnotes

¹ The prep function is named prep_hindsight_bg() (bg = between-groups), whereas the dataset and the label and export functions use mg.