Dataset Reference

A complete reference for all datasets in the psych350data package. Each entry includes a description, source attribution, variable overview, and the categorical variable codes used in SPSS exports and after prep_*().

For guidance on how to use these datasets, see the Getting Started vignette.

Quick Reference

Dataset | Related Functions | Source
superman | prep_superman(), label_superman(), export_superman_sav() | Rotten Tomatoes, Letterboxd, IMDb
superman_smes | prep_superman_smes(), label_superman_smes(), export_superman_smes_sav() | Simulated
superman_movies | prep_superman_movies(), label_superman_movies(), export_superman_movies_sav() | IMDb Box Office Mojo
hotones | prep_hotones(), label_hotones(), export_hotones_sav() | Hot Ones / First We Feast (YouTube)
hotones_sauces | prep_hotones_sauces(), label_hotones_sauces(), export_hotones_sauces_sav() | Hot Ones / First We Feast (YouTube)
hotones_episodes | prep_hotones_episodes(), label_hotones_episodes(), export_hotones_episodes_sav() | Hot Ones / First We Feast (YouTube)
tip_jokes | prep_tip_jokes(), label_tip_jokes(), export_tip_jokes_sav() | Gueguen (2002)
mcu | prep_mcu(), label_mcu(), export_mcu_sav() | IMDb via openintro package
mock_jury | prep_mock_jury(), label_mock_jury(), export_mock_jury_sav() | Plaster (1989)
candy | prep_candy(), label_candy(), export_candy_sav() | FiveThirtyEight
candy_simple | prep_candy_simple(), label_candy_simple(), export_candy_simple_sav() | FiveThirtyEight
football | prep_football(), label_football(), export_football_sav() | Singh et al. (2014), JAMA
huskers | prep_huskers(), label_huskers(), export_huskers_sav() | Historical records
cheese_data | prep_cheese(), label_cheese(), export_cheese_sav() | cheese.com via TidyTuesday
lpd_data | prep_lpd(), label_lpd(), export_lpd_sav() | Lincoln Police Dept. Open Data
parent_child_data | prep_parent_child(), label_parent_child(), export_parent_child_sav() | Simulated
hindsight_mg_data | prep_hindsight_bg() [1], label_hindsight_mg(), export_hindsight_mg_sav() | Simulated
hindsight_wg_data | prep_hindsight_wg(), label_hindsight_wg(), export_hindsight_wg_sav() | Simulated
interpersonal_data | prep_interpersonal(), label_interpersonal(), export_interpersonal_sav() | Simulated
self_descriptive_data | prep_self_descriptive(), label_self_descriptive(), export_selfdescriptive_sav() | Simulated

Note

A combined Superman dataset (superman_combined) can be exported with export_superman_combined_sav(). The combined dataset is built by joining superman_movies with the superman actor data via join_superman_data().

Use list_datasets() to see all available dataset names from the R console.

How Categorical Variables Work

In R (raw): Categorical variables use human-readable character values like "Film", "Control", or "Minimal". These are ideal for plotting and exploration.

After prep_*(): Character categories are replaced with numeric codes that match SPSS. Use this when your R output needs to match SPSS output or when working with psych350lab functions.

In SPSS exports: Numeric codes are stored with value labels, so SPSS displays both the number and the category name.

library(psych350data)
library(dplyr)  # provides select() and glimpse()

# Raw R data: human-readable categories
superman |>
  select(num, media, type, age_grp) |>
  head(4)
#> # A tibble: 4 × 4
#>     num media               type    age_grp
#>   <int> <chr>               <chr>   <chr>  
#> 1     1 Superman            Film    Average
#> 2     2 Superman: The Movie Film    Average
#> 3     3 Smallville          TV Show Minimal
#> 4     4 Superman Returns    Film    Average

# After prep: numeric codes matching SPSS
prep_superman(superman) |>
  select(num, media, type, age_grp) |>
  head(4)
#> # A tibble: 4 × 4
#>     num media                type age_grp
#>   <int> <chr>               <dbl>   <dbl>
#> 1     1 Superman                1       2
#> 2     2 Superman: The Movie     1       2
#> 3     3 Smallville              2       1
#> 4     4 Superman Returns        1       2
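
The character-to-code mapping described above can be pictured with a small base R sketch. It does not call the package and is not how prep_superman() is implemented; recode_to_codes() and the toy vector are hypothetical, and the level order is taken from the age_grp code table in the Superman Actor Data section.

```r
# Hypothetical helper: map character categories to integer codes
# in a fixed, documented order (not the package's own implementation).
recode_to_codes <- function(x, levels) {
  as.integer(factor(x, levels = levels))
}

# First four age_grp values from the raw superman data shown above
age_grp <- c("Average", "Average", "Minimal", "Average")

# Documented order: 1 = Minimal, 2 = Average, 3 = Big
age_codes <- recode_to_codes(age_grp, c("Minimal", "Average", "Big"))
age_codes

# Going back from codes to labels, similar to what SPSS value labels display
age_labels <- as.character(
  factor(age_codes, levels = 1:3, labels = c("Minimal", "Average", "Big"))
)
age_labels
```

Fixing the level order explicitly matters: factor() would otherwise sort the categories alphabetically, and the resulting codes would no longer match the SPSS value labels.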

Superman Actor Data

Dataset: superman | 11 rows, 27 columns

Physical characteristics and ratings data for actors who have played Superman across film, TV, and serial media. Includes actor heights, Rotten Tomatoes scores, Letterboxd ratings, and popularity metrics.

Source: Compiled from Rotten Tomatoes, Letterboxd, and IMDb.

glimpse(superman)
#> Rows: 11
#> Columns: 27
#> $ type              <chr> "Film", "Film", "TV Show", "Film", "Film", "Film", "…
#> $ media             <chr> "Superman", "Superman: The Movie", "Smallville", "Su…
#> $ year              <dbl> 2025, 1978, 2001, 2006, 1951, 2013, 1948, 2021, 1993…
#> $ clark_actor       <chr> "David Corenswet", "Christopher Reeve", "Tom Welling…
#> $ clark_height      <dbl> 1.93, 1.93, 1.90, 1.89, 1.86, 1.85, 1.85, 1.82, 1.81…
#> $ lois_actor        <chr> "Rachel Brosnahan", "Margot Kidder", "Erica Durance"…
#> $ lois_height       <dbl> 1.60, 1.72, 1.71, 1.65, 1.63, 1.63, 1.62, 1.68, 1.68…
#> $ rt_critics_score  <dbl> 83, 88, 78, 72, NA, 57, 83, 88, 86, NA, NA
#> $ rt_critics_count  <dbl> 484, 121, 111, 290, 7, 340, 484, 55, 20, NA, NA
#> $ rt_audience_score <dbl> 90, 86, 72, 60, 79, 75, 90, 84, 86, NA, NA
#> $ rt_audience_count <dbl> 25000, 250000, 2500, 250000, 250, 250000, 25000, 100…
#> $ ldb_likes         <dbl> 1105511, 99115, NA, 26076, 744, 204463, NA, NA, NA, …
#> $ ldb_scores        <dbl> 3.9, 3.7, NA, 2.7, 2.6, 3.0, NA, NA, NA, NA, NA
#> $ num               <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
#> $ clark_age         <dbl> 32.01, 26.22, 24.47, 26.72, 37.84, 30.11, 37.24, 33.…
#> $ lois_age          <dbl> 35.00, 30.16, 23.32, 23.48, 24.81, 38.82, 27.11, 40.…
#> $ age_diff          <dbl> 2.99, 3.94, 1.15, 3.24, 13.03, 8.71, 10.13, 6.65, 1.…
#> $ age_grp           <chr> "Average", "Average", "Minimal", "Average", "Big", "…
#> $ clark_height_in   <dbl> 75.9841, 75.9841, 74.8030, 74.4093, 73.2282, 72.8345…
#> $ lois_height_in    <dbl> 62.9920, 67.7164, 67.3227, 64.9605, 64.1731, 64.1731…
#> $ height_diff       <dbl> 12.9921, 8.2677, 7.4803, 9.4488, 9.0551, 8.6614, 9.0…
#> $ height_gap        <chr> "Big", "Big", "Average", "Big", "Big", "Big", "Big",…
#> $ clark_grp         <chr> "6ft or taller", "6ft or taller", "6ft or taller", "…
#> $ tomatometer       <chr> "Fresh", "Fresh", "Fresh", "Fresh", NA, "Rotten", "F…
#> $ rt_avg            <dbl> 86.5, 87.0, 75.0, 66.0, NA, 66.0, 86.5, 86.0, 86.0, …
#> $ rt_diff           <dbl> -86.71433, -85.91582, -65.62313, -59.84706, NA, -74.…
#> $ popular           <chr> "High", "Mid", NA, "Mid", "Low", "High", NA, NA, NA,…
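
Several columns in the output above appear to be derived from others. The sketch below is an illustration under stated assumptions rather than the package's code: it assumes heights are recorded in meters, that a conversion factor of 39.37 inches per meter was used (this reproduces the displayed values), and that age_diff is the absolute difference between the two actors' ages.

```r
# First three actor pairs, copied from the glimpse() output above
clark_height <- c(1.93, 1.93, 1.90)   # assumed to be meters
lois_height  <- c(1.60, 1.72, 1.71)
clark_age    <- c(32.01, 26.22, 24.47)
lois_age     <- c(35.00, 30.16, 23.32)

m_to_in <- 39.37  # assumed conversion factor (inches per meter)

clark_height_in <- clark_height * m_to_in
lois_height_in  <- lois_height * m_to_in
height_diff     <- clark_height_in - lois_height_in
age_diff        <- abs(clark_age - lois_age)

round(height_diff, 4)
round(age_diff, 2)
```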

Categorical Variables (SPSS codes)

Variable Values
type 1 = Film, 2 = TV Series, 3 = Serial
clark_grp 1 = Under 6ft, 2 = 6ft or taller
height_gap 1 = Minimal, 2 = Average, 3 = Big
age_grp 1 = Minimal, 2 = Average, 3 = Big
tomatometer 1 = Rotten, 2 = Fresh
popular 1 = Low, 2 = Mid, 3 = High

Superman SMES Data

Dataset: superman_smes | 47 rows, 7 columns

Simulated participant ratings on the Subjective Media Experience Scale (SMES), grouped by both the height gap and age difference between the Superman and Lois Lane actors. The emotion variable requires data prep conversion to factor; all other variables are already numeric.

Source: Simulated data for teaching purposes.

glimpse(superman_smes)
#> Rows: 47
#> Columns: 6
#> $ num                  <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15…
#> $ height_gap           <int> 3, 3, 3, 3, 1, 1, 3, 3, 3, 2, 2, 1, 2, 3, 1, 3, 3…
#> $ age_grp              <int> 2, 3, 2, 2, 1, 3, 2, 2, 2, 1, 1, 3, 1, 3, 1, 3, 3…
#> $ emotional_impact     <dbl> 10, 10, 13, 12, NA, 17, 14, 16, 14, 18, 16, 8, 13…
#> $ aesthetic_appeal     <dbl> 12, 12, 13, 10, 6, 10, 9, 9, 11, 9, 11, 8, 10, 6,…
#> $ cognitive_engagement <dbl> 4.5, 4.5, 4.5, 4.3, 3.7, 4.0, 4.7, 4.0, 4.4, 4.8,…

Categorical Variables (SPSS codes)

Variable Values
height_gap 1 = Minimal, 2 = Average, 3 = Big
age_grp 1 = Minimal, 2 = Average, 3 = Big
emotion 1 = Fear, 2 = Joy, 3 = Sadness, 4 = Anger, 5 = Disgust, 6 = Anxiety

Scale Information

Subscale Range
emotional_impact 4–20
aesthetic_appeal 3–15
cognitive_engagement 0–7

Superman Movies Data

Dataset: superman_movies | 8 rows, 21 columns

Box office and production data for Superman theatrical films, including budget, domestic and international grosses, and MPAA ratings.

Source: IMDb Box Office Mojo.

glimpse(superman_movies)
#> Rows: 8
#> Columns: 21
#> $ imdb_id             <chr> "tt5950044", "tt0078346", "tt0770828", "tt0348150"…
#> $ title               <chr> "Superman", "Superman: The Movie", "Man of Steel",…
#> $ year                <int> 2025, 1978, 2013, 2006, 1980, 1983, 1987, 2016
#> $ description         <chr> "Superman must reconcile his alien Kryptonian heri…
#> $ domestic_gross      <dbl> 354.22380, 134.47845, 291.04552, 200.08119, 108.18…
#> $ domestic_pct        <dbl> 57.3, 44.8, 43.4, 51.2, 50.0, 74.7, 51.8, 37.8
#> $ international_gross <dbl> 264.5000, 166.0000, 379.1000, 191.0000, 108.2000, …
#> $ international_pct   <dbl> 42.7, 55.2, 56.6, 48.8, 50.0, 25.3, 48.2, 62.2
#> $ worldwide_gross     <dbl> 618.72380, 300.47845, 670.14552, 391.08119, 216.38…
#> $ distributor         <chr> "Warner Bros.", "Warner Bros.", "Warner Bros.", "W…
#> $ opening_weekend     <dbl> 125.021735, 7.465343, 116.619362, 52.535096, 14.10…
#> $ budget              <dbl> 225, 55, 225, 270, 54, 39, 17, 250
#> $ release_date        <chr> "7/11/25", "12/15/78", "6/12/13", "6/28/06", "6/19…
#> $ mpaa                <chr> "PG-13", "PG", "PG-13", "PG-13", "PG", "Unrated", …
#> $ runtime_min         <int> 129, 143, 143, 154, 127, 125, 90, 151
#> $ genres              <chr> "Action Adventure Fantasy Sci-Fi", "Action Adventu…
#> $ poster_url          <chr> "https://m.media-amazon.com/images/M/MV5BZjFhZmU5N…
#> $ clark_actor         <chr> "David Corenswet", "Christopher Reeve", "Henry Cav…
#> $ roi                 <dbl> 1.7498836, 4.4632445, 1.9784245, 0.4484489, 3.0071…
#> $ budget_cat          <chr> "High", "Medium", "High", "High", "Medium", "Low",…
#> $ box_office_cat      <chr> "High", "Medium", "High", "Medium", "Medium", "Low…
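
The roi and domestic_pct columns can be reproduced from the gross and budget columns. The sketch below uses the first two films from the output above and assumes that roi is net return divided by budget and that all monetary columns share the same unit; both are inferences from the displayed values, not documented definitions.

```r
# First two films, copied from the glimpse() output above
worldwide_gross <- c(618.72380, 300.47845)
domestic_gross  <- c(354.22380, 134.47845)
budget          <- c(225, 55)

# Assumed definition: return on investment = (gross - budget) / budget
roi <- (worldwide_gross - budget) / budget

# Domestic share of the worldwide gross, as a percentage
domestic_pct <- round(100 * domestic_gross / worldwide_gross, 1)

round(roi, 5)
domestic_pct
```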

Categorical Variables (SPSS codes)

Variable Values
mpaa 1 = G, 2 = PG, 3 = PG-13, 4 = R
budget_cat 1 = Low, 2 = Medium, 3 = High (tercile-based)
box_office_cat 1 = Low, 2 = Medium, 3 = High (tercile-based)

Hot Ones Guest Data

Dataset: hotones | Export: export_hotones_sav()

Data on guests from the YouTube show Hot Ones, including demographic information, Scoville ratings for each sauce consumed (SHU_1 through SHU_10), and whether the guest succeeded in eating all ten wings.

Source: Hot Ones / First We Feast (YouTube).

glimpse(hotones)
#> Rows: 369
#> Columns: 25
#> $ subn        <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
#> $ name        <chr> "Tony Yayo", "Anthony Rizzo", "Machine Gun Kelly", "Gunpla…
#> $ gender      <chr> "Male", "Male", "Male", "Male", "Male", "Male", "Male", "M…
#> $ age         <dbl> 36.94795, 25.75890, 28.04932, 35.93151, 39.44658, 26.77808…
#> $ occupation  <chr> "Rapper", "Athlete", "Rapper", "Rapper", "Rapper", "Rapper…
#> $ wing_total  <dbl> 4.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 3.0, 10.0, 10.0, …
#> $ alt_food    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ helpers     <chr> "Water", "Milk", "Beer,Milk,Water", "Milk", "Milk,Water", …
#> $ SHU_1       <dbl> 747, 747, 747, 747, 747, 747, 747, 747, 2200, 2200, 2200, …
#> $ SHU_2       <dbl> 3600, 3600, 3600, 3600, 3600, 3600, 3600, 3600, 3000, 3000…
#> $ SHU_3       <dbl> 5790, 5790, 5790, 5790, 5790, 5790, 5790, 5790, 5790, 5790…
#> $ SHU_4       <dbl> 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 13…
#> $ SHU_5       <dbl> 13000, 13000, 13000, 13000, 13000, 13000, 13000, 13000, 15…
#> $ SHU_6       <dbl> 40600, 40600, 40600, 40600, 40600, 40600, 40600, 40600, 34…
#> $ SHU_7       <dbl> 30000, 30000, 30000, 30000, 30000, 30000, 30000, 30000, 40…
#> $ SHU_8       <dbl> 57000, 57000, 57000, 57000, 57000, 57000, 57000, 57000, 13…
#> $ SHU_9       <dbl> 180000, 180000, 180000, 180000, 180000, 180000, 180000, 18…
#> $ SHU_10      <dbl> 357000, 357000, 357000, 357000, 357000, 357000, 357000, 35…
#> $ result      <chr> "Failed", "Succeeded", "Succeeded", "Succeeded", "Succeede…
#> $ appearances <dbl> 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1…
#> $ season      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
#> $ order       <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 10, …
#> $ views       <dbl> 1.371525, 0.825218, 3.164341, 1.164724, 1.143213, 0.861112…
#> $ likes       <dbl> 18108, 11301, 49026, 15473, 13761, 9627, 7572, 79735, 9395…
#> $ comments    <dbl> 1011, 1041, 3609, 1226, 1148, 681, 595, 24666, 740, 1669, …

Categorical Variables (SPSS codes)

Variable Values
gender 1 = Male, 2 = Female
result 1 = Succeeded, 2 = Failed, 3 = Incomplete
occupation 1 = Rapper, 2 = Athlete, 3 = Actor, 4 = Actor-Comedian, 5 = Comedian, 6 = Chef, 7 = Actor-Musician, 8 = Musician, 9 = DJ, 10 = YouTuber, 11 = Model, 12 = Wrestler, 13 = Magician, 14 = Other

Hot Ones Sauces Data

Dataset: hotones_sauces | Export: export_hotones_sauces_sav()

Data on the hot sauces used in each season and position of Hot Ones, including Scoville Heat Unit (SHU) ratings. All variables are numeric except sauce_name, which is a character column; no categorical conversion is needed.

Source: Hot Ones / First We Feast (YouTube).

glimpse(hotones_sauces)
#> Rows: 250
#> Columns: 4
#> $ season     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…
#> $ order      <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1…
#> $ sauce_name <chr> "Texas Pete Original Hot Sauce", "Cholula Original Hot Sauc…
#> $ SHU        <dbl> 747, 3600, 5790, 15000, 13000, 40600, 30000, 57000, 180000,…

Hot Ones Episodes Data

Dataset: hotones_episodes | Export: export_hotones_episodes_sav()

Episode-level data from Hot Ones including guest names, episode titles, and YouTube engagement metrics (views, likes, comments). All variables are numeric or character — no categorical conversion needed.

Source: Hot Ones / First We Feast (YouTube).

glimpse(hotones_episodes)
#> Rows: 345
#> Columns: 11
#> $ season            <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
#> $ order             <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1…
#> $ guest             <chr> "Tony Yayo", "Anthony Rizzo", "Machine Gun Kelly", "…
#> $ episode_title     <chr> "Tony Yayo Talks Shmoney Dance & Eminem's Taco Habit…
#> $ publish_date      <dbl> 42075, 42136, 42166, 42178, 42227, 42242, 42284, 422…
#> $ views             <dbl> 1.371525, 0.825218, 3.164341, 1.164724, 1.143213, 0.…
#> $ likes             <dbl> 18108, 11301, 49026, 15473, 13761, 9627, 7572, 79735…
#> $ comments          <dbl> 1011, 1041, 3609, 1226, 1148, 681, 595, 24666, 740, …
#> $ short_description <chr> "First We Feast videos offer an iconoclastic view in…
#> $ img               <chr> "https://i.ytimg.com/vi/aGhqumcE6_w/hqdefault.jpg", …
#> $ video_id          <chr> "aGhqumcE6_w", "4iSCOtYs_6Q", "H7pSH4YL-T4", "e5Ipfn…

Tip-Jokes Experiment Data

Dataset: tip_jokes | 211 rows, 5 columns

Experimental data examining whether a waiter leaving a joke or an advertisement on a card affects tipping behavior. All variables are already stored as numeric codes.

Source: Gueguen, N. (2002). The effects of a joke on tipping when it is delivered at the same time as the bill. Journal of Applied Social Psychology, 32(9), 1955–1963.

glimpse(tip_jokes)
#> Rows: 211
#> Columns: 5
#> $ card <dbl> 3, 2, 1, 3, 3, 3, 1, 1, 3, 3, 3, 1, 3, 1, 2, 2, 2, 3, 2, 3, 1, 1,…
#> $ tip  <dbl> 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1,…
#> $ ad   <dbl> 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1,…
#> $ joke <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0,…
#> $ none <dbl> 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0,…

Variables (SPSS codes)

Variable Values
card 1 = Advertisement, 2 = Joke, 3 = None
tip 0 = No, 1 = Yes
ad 0 = No, 1 = Yes (received advertisement)
joke 0 = No, 1 = Yes (received joke)
none 0 = No, 1 = Yes (no card)
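
The ad, joke, and none columns are dummy (indicator) versions of card. A minimal base R sketch of that relationship, using the first eight card values from the output above (it does not call any package function):

```r
# First eight card values from the glimpse() output above
# 1 = Advertisement, 2 = Joke, 3 = None
card <- c(3, 2, 1, 3, 3, 3, 1, 1)

# One 0/1 indicator per condition
ad   <- as.numeric(card == 1)
joke <- as.numeric(card == 2)
none <- as.numeric(card == 3)

# Each row belongs to exactly one condition, so the indicators sum to 1
all(ad + joke + none == 1)
```

In a regression, only two of the three indicators would be entered at once, with the omitted condition serving as the reference group.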

MCU Films Data

Dataset: mcu | 23 rows, 11 columns

Box office performance and Rotten Tomatoes scores for Marvel Cinematic Universe films through the Infinity Saga (Phases 1–3).

Source: Internet Movie Database (IMDb), via the openintro package.

glimpse(mcu)
#> Rows: 23
#> Columns: 11
#> $ movie              <chr> "Iron Man", "The Incredible Hulk", "Iron Man 2", "T…
#> $ length_hrs         <dbl> 2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 2, 2, …
#> $ length_min         <dbl> 6, 52, 4, 55, 4, 23, 10, 52, 16, 1, 21, 57, 27, 55,…
#> $ release_date       <dttm> 2008-05-02, 2008-06-12, 2010-05-07, 2011-05-06, 20…
#> $ opening_weekend_us <dbl> 98618668, 55414050, 128122480, 65723338, 65058524, …
#> $ gross_us           <dbl> 319034126, 134806913, 312433331, 181030624, 1766545…
#> $ gross_world        <dbl> 585796247, 264770996, 623933331, 449326618, 3705697…
#> $ phase              <chr> "Phase 1", "Phase 1", "Phase 1", "Phase 1", "Phase …
#> $ critics            <dbl> 94, 68, 72, 77, 80, 91, 79, 67, 90, 91, 75, 83, 90,…
#> $ audience           <dbl> 91, 69, 71, 76, 75, 91, 78, 74, 92, 92, 82, 85, 89,…
#> $ favor              <chr> "Critics", "Audience", "Critics", "Critics", "Criti…
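
Runtime is split across length_hrs and length_min, so analyses of film length usually need a single combined variable. A minimal sketch using the first four films above; runtime_min is a hypothetical name and is not a column in mcu:

```r
# First four films, copied from the glimpse() output above
length_hrs <- c(2, 1, 2, 1)
length_min <- c(6, 52, 4, 55)

# Total runtime in minutes (hypothetical derived variable)
runtime_min <- length_hrs * 60 + length_min
runtime_min
```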

Categorical Variables (SPSS codes)

Variable Values
phase 1 = Phase 1, 2 = Phase 2, 3 = Phase 3
favor 1 = Critics, 2 = Audience

Mock Jury Data

Dataset: mock_jury | 114 rows, 17 columns

Data from a study examining the effects of defendant physical attractiveness on mock jury sentencing decisions. Participants rated defendants of varying attractiveness levels who were charged with either burglary or swindle. All categorical variables are already stored as numeric codes.

Source: Plaster, M. E. (1989). The effects of physical attractiveness on mock jury decisions. Unpublished manuscript, East Carolina University.

glimpse(mock_jury)
#> Rows: 114
#> Columns: 17
#> $ attr          <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ crime         <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ years         <dbl> 10, 3, 5, 1, 7, 7, 3, 7, 2, 3, 5, 1, 1, 1, 2, 2, 10, 10,…
#> $ serious       <dbl> 8, 8, 5, 3, 9, 9, 4, 4, 5, 2, 9, 2, 3, 2, 4, 6, 7, 8, 4,…
#> $ exciting      <dbl> 6, 9, 3, 3, 1, 1, 5, 4, 4, 6, 5, 6, 7, 4, 6, 7, 7, 6, 6,…
#> $ calm          <dbl> 9, 5, 4, 6, 1, 5, 6, 9, 8, 8, 3, 7, 8, 6, 8, 8, 4, 9, 4,…
#> $ independent   <dbl> 9, 9, 6, 9, 5, 7, 7, 2, 8, 7, 5, 9, 7, 6, 7, 6, 1, 9, 9,…
#> $ sincere       <dbl> 8, 3, 3, 8, 1, 5, 6, 9, 7, 5, 6, 7, 8, 7, 7, 7, 1, 8, 7,…
#> $ warm          <dbl> 5, 5, 6, 8, 8, 8, 7, 6, 1, 7, 8, 4, 8, 4, 5, 5, 1, 5, 7,…
#> $ phyattr       <dbl> 9, 9, 7, 9, 8, 8, 8, 5, 9, 8, 9, 9, 9, 7, 9, 8, 9, 9, 7,…
#> $ sociable      <dbl> 9, 9, 4, 9, 9, 9, 7, 2, 1, 9, 5, 4, 9, 7, 6, 6, 8, 9, 5,…
#> $ kind          <dbl> 9, 4, 2, 9, 4, 5, 5, 9, 5, 7, 7, 3, 7, 2, 9, 5, 1, 9, 7,…
#> $ intelligent   <dbl> 6, 9, 4, 9, 7, 8, 7, 9, 9, 9, 8, 7, 9, 3, 9, 7, 1, 6, 8,…
#> $ strong        <dbl> 9, 5, 5, 9, 9, 9, 5, 2, 7, 5, 2, 4, 8, 7, 3, 6, 1, 9, 6,…
#> $ sophisticated <dbl> 9, 5, 4, 9, 9, 9, 6, 2, 7, 6, 5, 7, 5, 2, 7, 7, 1, 9, 7,…
#> $ happy         <dbl> 5, 5, 5, 9, 8, 9, 5, 2, 6, 8, 2, 6, 7, 1, 6, 6, 1, 5, 4,…
#> $ ownPA         <dbl> 9, 7, 5, 9, 7, 9, 6, 5, 3, 6, 9, 9, 9, 9, 6, 9, 5, 9, 9,…

Categorical Variables (SPSS codes)

Variable Values
attr 1 = Beautiful, 2 = Average, 3 = Unattractive
crime 1 = Burglary, 2 = Swindle

Candy Rankings Data

Full dataset: candy | 85 rows, 13 columns
Simplified: candy_simple | 85 rows, 5 columns

Candy power rankings based on 269,000 head-to-head matchups. The full dataset includes nine binary ingredient indicators, plus sugar percentile, price percentile, and win percentage. The simplified version includes only competitorname, chocolate, sugarpercent, pricepercent, and winpercent.

Source: The Ultimate Halloween Candy Power Ranking, FiveThirtyEight (2017).

glimpse(candy_simple)
#> Rows: 85
#> Columns: 5
#> $ competitorname <chr> "100 Grand", "3 Musketeers", "One dime", "One quarter",…
#> $ chocolate      <chr> "Yes", "Yes", "No", "No", "No", "Yes", "Yes", "No", "No…
#> $ sugarpercent   <dbl> 0.732, 0.604, 0.011, 0.011, 0.906, 0.465, 0.604, 0.313,…
#> $ pricepercent   <dbl> 0.860, 0.511, 0.116, 0.511, 0.511, 0.767, 0.767, 0.511,…
#> $ winpercent     <dbl> 66.97173, 67.60294, 32.26109, 46.11650, 52.34146, 50.34…

Binary Variables (SPSS codes)

All binary ingredient variables: 0 = No, 1 = Yes.

Variables in the full dataset: chocolate, fruity, caramel, peanutyalmondy, nougat, crispedricewafer, hard, bar, pluribus.

The simplified dataset contains only chocolate.


Football Concussion Data

Dataset: football | 75 rows, 3 columns

Brain volume measurements comparing football players (with and without concussion history) to non-football-playing controls.

Source: Singh, R., Meier, T. B., Kuplicki, R., Savitz, J., Mukai, I., Cavanagh, L., Allen, T., Teague, T. K., Nerio, C., Polanski, D., & Bellgowan, P. S. F. (2014). Relationship of collegiate football experience and concussion with hippocampal volume and cognitive outcomes. JAMA, 311(18), 1883–1888.

glimpse(football)
#> Rows: 75
#> Columns: 3
#> $ group  <chr> "Control", "Control", "Control", "Control", "Control", "Control…
#> $ years  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ volume <dbl> 6.175, 6.220, 6.360, 6.465, 6.540, 6.780, 6.980, 7.075, 7.120, …

Categorical Variables (SPSS codes)

Variable Values
group 1 = Control, 2 = Football no concussion, 3 = Football with concussion

Nebraska Football Box Scores

Dataset: huskers | Export: export_huskers_sav()

Game-level team statistics for all Nebraska Cornhuskers football games from September 1962 through the 2024 season. Includes scoring, rushing, passing, turnovers, penalties, point spreads, and weather data. This is a large dataset with 57 columns.

Source: Compiled from historical Nebraska football records. Weather data sourced from DarkSky API and Weather Underground.

glimpse(huskers)
#> Rows: 770
#> Columns: 57
#> $ date                 <dttm> 1962-09-22, 1962-09-29, 1962-10-06, 1962-10-13, …
#> $ time_ct              <dttm> 1899-12-31 14:00:00, 1899-12-31 12:30:00, 1899-1…
#> $ season               <dbl> 1962, 1962, 1962, 1962, 1962, 1962, 1962, 1962, 1…
#> $ opp                  <chr> "South Dakota", "Michigan", "Iowa State", "North …
#> $ site                 <chr> "Home", "Away", "Home", "Home", "Home", "Away", "…
#> $ conference           <chr> "Non-conference", "Non-conference", "Conference",…
#> $ opp_rank             <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 10, NA, NA, N…
#> $ ne_rank              <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ result               <chr> "Win", "Win", "Win", "Win", "Win", "Win", "Loss",…
#> $ opp_score            <dbl> 0, 13, 22, 14, 6, 6, 16, 16, 0, 34, 34, 7, 7, 7, …
#> $ ne_score             <dbl> 53, 25, 36, 19, 26, 31, 7, 40, 14, 6, 36, 58, 14,…
#> $ opp_score_q1         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_score_q2         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_score_q3         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_score_q4         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_score_ot         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_q1          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_q2          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_q3          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_q4          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_score_ot          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_rush_att         <dbl> 42, 43, 36, 44, 26, 42, 57, 30, 25, 50, 43, 32, 3…
#> $ opp_rush_yards       <dbl> 89, 170, 147, 130, 38, 94, 199, 122, 96, 191, 181…
#> $ ne_rush_att          <dbl> 42, 49, 62, 47, 61, 58, 34, 71, 72, 32, 42, 58, 5…
#> $ ne_rush_yards        <dbl> 313, 222, 234, 154, 317, 365, 141, 369, 272, 68, …
#> $ opp_pass_comp        <dbl> 4, 8, 7, 4, 7, 14, 1, 1, 4, 9, 24, 5, 9, 5, 9, 17…
#> $ opp_pass_att         <dbl> 8, 21, 14, 8, 22, 28, 5, 3, 12, 15, 46, 17, 23, 1…
#> $ opp_pass_yards       <dbl> 14, 83, 70, 48, 156, 150, 33, 11, 31, 182, 321, 4…
#> $ ne_pass_comp         <dbl> 11, 8, 9, 15, 6, 5, 2, 10, 6, 10, 9, 4, 4, 3, 5, …
#> $ ne_pass_att          <dbl> 17, 15, 15, 25, 15, 15, 14, 20, 13, 25, 14, 8, 8,…
#> $ ne_pass_yards        <dbl> 142, 119, 153, 202, 87, 90, 7, 149, 125, 130, 146…
#> $ opp_first_downs      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_first_downs       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_third_down_comp  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_third_down_att   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_third_down_comp   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_third_down_att    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_fourth_down_comp <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_fourth_down_att  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_fourth_down_comp  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_fourth_down_att   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_int              <dbl> 1, 0, 2, 0, 0, 1, 1, 1, 2, 1, 2, 0, 1, 2, 2, 4, 1…
#> $ opp_fum              <dbl> 0, 3, 1, 2, 3, 1, 0, 0, 0, 0, 2, 2, 0, 1, 0, 1, 3…
#> $ ne_int               <dbl> 0, 0, 0, 1, 3, 1, 3, 0, 1, 1, 0, 1, 1, 0, 2, 1, 0…
#> $ ne_fum               <dbl> 3, 2, 2, 0, 2, 1, 3, 1, 1, 0, 2, 2, 1, 0, 3, 3, 1…
#> $ opp_pen_num          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ opp_pen_yards        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_pen_num           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ ne_pen_yards         <chr> "65", "65", "55", "43", "45", "68", "15", "15", "…
#> $ opp_possession       <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ ne_possession        <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ spread               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, NA…
#> $ total                <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ temp                 <dbl> 66.7, 62.7, 64.7, 78.7, 56.8, 64.1, 41.8, 62.8, 3…
#> $ humidity             <dbl> 0.66, 0.43, 0.91, 0.70, 0.53, 0.15, 0.65, 0.35, 0…
#> $ wind_speed           <dbl> 6.9, 17.9, 10.3, 9.2, 9.2, 18.4, 5.8, 15.0, 11.6,…
#> $ wind_bearing         <dbl> 90, 293, 135, 225, 0, 45, 338, 158, 0, 293, 330, …

Categorical Variables (SPSS codes)

Variable Values
result 1 = Win, 2 = Loss, 3 = Tie
site 1 = Home, 2 = Away, 3 = Neutral (home), 4 = Neutral (away)
conference 0 = Non-conference, 1 = Conference

Cheese Characteristics Data

Dataset: cheese_data | Export: export_cheese_sav()

Cleaned subset of the cheese.com dataset originally featured in TidyTuesday (June 2024). Contains cheeses with known calcium content, with fat content and milk source cleaned and recoded for teaching data entry and transformation.

Source: cheese.com via TidyTuesday 2024-06-04.

glimpse(cheese_data)
#> Rows: 25
#> Columns: 12
#> $ id              <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,…
#> $ cheese          <chr> "Affidelice au Chablis", "Amul Emmental", "Amul Gouda"…
#> $ url             <chr> "https://www.cheese.com/affidelice-au-chablis/", "http…
#> $ milk            <chr> "cow", "cow", "cow", "cow", "cow, water buffalo", "goa…
#> $ milk_source     <int> 1, 1, 1, 1, 5, 5, 1, 1, 1, 1, 1, 3, 5, 5, 1, 5, 1, 1, …
#> $ country         <chr> "France", "India", "India", "India", "India", "Greece"…
#> $ family          <chr> NA, "Swiss Cheese", "Gouda", "Mozzarella", "Cheddar", …
#> $ type            <chr> "soft", "semi-hard", "semi-hard", "semi-soft, processe…
#> $ vegetarian      <int> 0, 1, 1, 1, 1, NA, 0, 0, 0, 0, 1, 0, NA, 0, 0, 1, 1, N…
#> $ color           <chr> "orange", "yellow", "yellow", "yellow", "yellow", "whi…
#> $ fat_content     <dbl> 55.0, 46.0, 46.0, NA, 26.0, 30.0, 25.5, NA, NA, NA, NA…
#> $ calcium_content <dbl> 26, 488, 492, 492, 343, 318, 700, 450, 725, 430, 90, 1…

Categorical Variables (SPSS codes)

Variable Values
milk_source 1 = Cow, 2 = Goat, 3 = Sheep, 4 = Buffalo, 5 = Multiple, 6 = Other
vegetarian 0 = No, 1 = Yes

Lincoln Police Department Traffic Stops

Dataset: lpd_data | Export: export_lpd_sav()

Traffic stop records from the Lincoln Police Department, compiled from a multi-sheet Excel file with one sheet per year and stacked into a single data frame. Date and time variables have been parsed and a time-of-day category added. All categorical variables are stored as integers with SPSS-style value labels.

Source: Lincoln Police Department Open Data Portal.

glimpse(lpd_data)
#> Rows: 408,288
#> Columns: 11
#> $ year        <int> 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023…
#> $ race        <dbl> 1, 1, 1, 1, 1, 1, 4, 1, 2, 2, 1, 1, 1, 2, 1, 2, 1, 2, 3, 1…
#> $ sex         <dbl> 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1…
#> $ reason      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ outcome     <dbl> 1, 1, 1, 2, 1, 1, 3, 5, 2, 2, 2, 2, 4, 2, 2, 2, 2, 1, 2, 2…
#> $ search      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 1, 5, 1, 1, 1, 2, 1, 1, 1…
#> $ fid         <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
#> $ date        <date> 2023-01-01, 2023-01-01, 2023-01-01, 2023-01-01, 2023-01-0…
#> $ month       <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ time_of_day <int> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 2, 2…
#> $ time        <chr> "00:13", "00:17", "00:20", "00:47", "00:47", "00:53", "01:…

Categorical Variables (SPSS codes)

Variable Values
time_of_day 1 = Morning (5am–noon), 2 = Afternoon (noon–5pm), 3 = Evening (5pm–9pm), 4 = Night (9pm–5am)
race 1 = White, 2 = Black or African American, 3 = Hispanic or Latino, 4 = Asian, 5 = American Indian or Alaska Native, 6 = Native Hawaiian or Pacific Islander, 7 = Two or More Races, 8 = Unknown
sex 1 = Male, 2 = Female, 3 = Unknown
reason 1 = Traffic probable cause, 2 = Criminal probable cause, 3 = Other
outcome 1 = Traffic warning, 2 = Traffic official, 3 = Criminal cite & release, 4 = Lodged in jail, 5 = None
search 1 = None, 2 = Incident to arrest, 3 = Inventory, 4 = Consent, 5 = Probable cause
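
The time_of_day codes can be pictured as a binning of the hour in the time column. The sketch below is an assumption-laden illustration, not the package's code: time_of_day_code() is a hypothetical helper, and the boundary handling (each interval includes its start hour and excludes its end hour) is a guess, since the table does not say which category a stop at exactly noon or 5pm receives.

```r
# Hypothetical helper: bin "HH:MM" strings into the documented codes
# 1 = Morning (5am-noon), 2 = Afternoon (noon-5pm),
# 3 = Evening (5pm-9pm), 4 = Night (9pm-5am)
time_of_day_code <- function(time) {
  hour <- as.integer(substr(time, 1, 2))
  ifelse(hour >= 5  & hour < 12, 1L,
  ifelse(hour >= 12 & hour < 17, 2L,
  ifelse(hour >= 17 & hour < 21, 3L, 4L)))
}

codes <- time_of_day_code(c("00:13", "05:00", "12:00", "16:59", "17:00", "21:00"))
codes
```

The first value, "00:13", maps to 4 (Night), which agrees with the first row of lpd_data shown above.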

Parent-Child Observation Data

Dataset: parent_child_data | 147 rows, 11 columns

Simulated observational data from a study examining the effects of a remediation treatment on parenting behaviours. Includes maternal and child demographics alongside coded behavioural counts (praise, directives, and negative behaviours) from structured observations. All categorical variables are already stored as numeric codes.

Source: Simulated data generated to resemble a plausible parent-child observational study.

glimpse(parent_child_data)
#> Rows: 147
#> Columns: 11
#> $ casenum   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1…
#> $ mage      <dbl> 19, 30, 31, 22, 29, 28, 24, 34, 29, 18, 34, 24, 32, 24, 22, …
#> $ magegroup <int> 1, 2, 2, 1, 2, 2, 1, 2, 2, 1, 2, 1, 2, 1, 1, 1, 1, 2, 1, 2, …
#> $ cage      <dbl> 3.000000, 2.000000, 2.166667, 2.000000, 4.750000, 2.750000, …
#> $ cagegroup <int> 1, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 2, 2, 2, 2, 1, 1, 2, 1, …
#> $ famtype   <int> 1, 1, 2, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 1, 1, 2, 1, 2, 2, 1, …
#> $ CLINREM   <int> 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, …
#> $ tx        <int> 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, …
#> $ praise    <dbl> 9, 1, 3, 8, 9, 5, 4, 7, 6, 7, 7, 6, 7, 10, 10, 5, 6, 8, 10, …
#> $ direct    <dbl> 4, 5, 2, 6, 9, 6, 2, 7, 5, 4, 0, 4, 4, 6, 5, 4, 9, 5, 3, 6, …
#> $ negat     <dbl> 7, 7, 8, 6, 5, 6, 5, 3, 7, 5, 9, 3, 5, 5, 3, 8, 7, 7, 7, 3, …

Categorical Variables (SPSS codes)

Variable Values
magegroup 1 = 18–27, 2 = 28–35
cagegroup 1 = 2–3 years, 2 = 4–5 years
famtype 1 = 2-parent family, 2 = Mother-only
CLINREM 0 = Remediation suggested, 1 = Not suggested
tx 0 = Control, 1 = Remediation
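As a rough sketch of how the codes above map onto labels, the following base R snippet turns two coded columns into labelled factors. It uses a toy data frame built from the first four cases in the glimpse() output rather than the packaged data, and it is not necessarily how label_parent_child() works internally.

```r
# Toy rows copied from the first four cases shown in the glimpse() output;
# the full dataset is parent_child_data.
pc <- data.frame(
  famtype = c(1L, 1L, 2L, 1L),
  tx      = c(1L, 0L, 0L, 1L),
  praise  = c(9, 1, 3, 8)
)

# Attach the labels from the codes table to the numeric codes.
pc$famtype_f <- factor(pc$famtype, levels = c(1, 2),
                       labels = c("2-parent family", "Mother-only"))
pc$tx_f <- factor(pc$tx, levels = c(0, 1),
                  labels = c("Control", "Remediation"))

# Mean praise count per treatment group, for these toy rows only.
praise_by_tx <- tapply(pc$praise, pc$tx_f, mean)
print(praise_by_tx)
```

Keeping the labelled version in separate columns (famtype_f, tx_f) leaves the original numeric codes available for analyses that expect them.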

Hindsight Bias Data — Between-Groups

Dataset: hindsight_mg_data | 600 rows, 7 columns

Simulated data from a hindsight bias study in which participants viewed celebrity faces and estimated recognition time at baseline and again after receiving an outcome cue. The between-groups factor is whether participants saw old or new faces in the hindsight phase. Each row represents one participant-face trial (60 participants × 10 faces). All categorical variables are already stored as numeric codes.

Source: Simulated data generated to illustrate hindsight bias in a between-groups design.

glimpse(hindsight_mg_data)
#> Rows: 600
#> Columns: 7
#> $ participant_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2…
#> $ face_id        <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, …
#> $ condition      <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ fame_level     <int> 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2…
#> $ score_1        <dbl> 6.280091, 7.756997, 7.826435, 10.903807, 9.235433, 32.1…
#> $ score_2        <dbl> 5.933358, 2.283858, 5.795783, 5.056316, 2.311482, 26.41…
#> $ correct        <int> 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0…

Categorical Variables (SPSS codes)

Variable Values
condition 1 = Old, 2 = New
fame_level 1 = Extremely Famous, 2 = Moderately Famous
correct 0 = Incorrect, 1 = Correct
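One way to look at the trial-level data is to compute the change between the two estimates for each trial. The sketch below assumes that score_1 is the baseline estimate and score_2 is the estimate after the outcome cue; that reading is inferred from the description and column names and is not confirmed by the package documentation. The values are the first five trials from the glimpse() output, rounded to two decimals.

```r
# First five trials for participant 1 (all fame_level 1), rounded from the
# glimpse() output; the full dataset is hindsight_mg_data.
trials <- data.frame(
  face_id = 1:5,
  score_1 = c(6.28, 7.76, 7.83, 10.90, 9.24),  # assumed: baseline estimate
  score_2 = c(5.93, 2.28, 5.80, 5.06, 2.31)    # assumed: post-cue estimate
)

# Per-trial shift: positive values mean the second estimate is lower.
trials$shift <- trials$score_1 - trials$score_2

# Average shift across these five trials.
mean_shift <- mean(trials$shift)
print(trials)
print(mean_shift)
```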

Hindsight Bias Data — Within-Groups

Dataset: hindsight_wg_data | 60 rows, 6 columns

Participant-level summary of the hindsight bias study, averaged across faces within each fame level. Each row is one participant with separate columns for extremely and moderately famous faces at baseline and hindsight phases. Suitable for a 2×2 (fame level × phase) repeated-measures ANOVA, or for a mixed ANOVA with condition as the between-groups factor. All categorical variables are already stored as numeric codes.

Source: Simulated data generated to illustrate hindsight bias in a within-groups design.

glimpse(hindsight_wg_data)
#> Rows: 60
#> Columns: 6
#> $ participant_id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, …
#> $ condition      <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ EXTREMEavg_1   <dbl> 8.400553, 7.868591, 7.324184, 7.676370, 7.051691, 8.242…
#> $ MODERATEavg_1  <dbl> 31.20088, 30.01027, 30.89756, 29.80273, 31.88939, 30.84…
#> $ EXTREMEavg_2   <dbl> 4.276159, 5.562062, 2.631611, 5.132468, 4.899965, 3.488…
#> $ MODERATEavg_2  <dbl> 27.32385, 23.89094, 25.56715, 27.47061, 26.04688, 24.42…

Categorical Variables (SPSS codes)

Variable Values
condition 1 = Old, 2 = New
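The relationship between the trial-level and participant-level layouts can be sketched in base R: average each score across faces within participant and fame level, then spread fame level into columns. The toy values below are invented for illustration, and the mapping of score_1/score_2 onto the _1/_2 column suffixes is an assumption based on the names; the packaged hindsight_wg_data may have been produced differently.

```r
# Toy long-format trials for two hypothetical participants
# (invented values, not taken from hindsight_mg_data).
long <- data.frame(
  participant_id = c(1, 1, 1, 1, 2, 2, 2, 2),
  fame_level     = c(1, 1, 2, 2, 1, 1, 2, 2),  # 1 = Extremely, 2 = Moderately
  score_1        = c(6, 8, 30, 32, 7, 9, 29, 31),
  score_2        = c(4, 6, 26, 28, 5, 5, 25, 27)
)

# Average both scores across faces within participant and fame level.
avg <- aggregate(cbind(score_1, score_2) ~ participant_id + fame_level,
                 data = long, FUN = mean)

# Spread fame level into columns: one row per participant.
spread <- reshape(avg, idvar = "participant_id", timevar = "fame_level",
                  direction = "wide")

# Rename to the wide-format column names used in hindsight_wg_data.
wide <- data.frame(
  participant_id = spread$participant_id,
  EXTREMEavg_1   = spread$score_1.1,
  MODERATEavg_1  = spread$score_1.2,
  EXTREMEavg_2   = spread$score_2.1,
  MODERATEavg_2  = spread$score_2.2
)
print(wide)
```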

Interpersonal Relationships Survey Data

Dataset: interpersonal_data | 574 rows, 33 columns

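Simulated survey data from college students at a predominantly white rural state university. Contains demographics, relationship variables, and scores on several interpersonal and psychological scales. Categorical variables are stored as character strings in the raw data, as the glimpse() output shows; the numeric codes listed under Categorical Variables are the ones used in SPSS exports and after prep_*().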

Source: Simulated data generated to resemble plausible survey responses from undergraduate psychology students.

glimpse(interpersonal_data)
#> Rows: 574
#> Columns: 33
#> $ age       <int> 19, 19, 19, 19, 21, 19, 21, 21, 19, 23, 19, 21, 19, 19, 20, …
#> $ gender    <chr> "Female", "Female", "Female", "Female", "Male", "Female", "F…
#> $ sexorient <chr> "Straight or heterosexual", "Straight or heterosexual", "Str…
#> $ race      <chr> "White", "White", "White", "Other", "White", "White", "Indig…
#> $ hand      <chr> "Right", "Right", "Right", "Right", "Right", "Right", "Right…
#> $ community <chr> "Rural", "Rural", "Urban", "Rural", "Suburban", "Suburban", …
#> $ parentedu <chr> "Yes", "Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes",…
#> $ famclass  <chr> "Upper middle class", "Lower middle class", "Working class",…
#> $ faminc    <int> 84923, 60859, 36202, 114697, 90626, 74183, 35218, 103487, 34…
#> $ numsib    <int> 3, 0, 5, 2, 3, 6, 2, 2, 2, 4, 1, 0, 1, 3, 2, 1, 1, 2, 0, 1, …
#> $ move      <int> 0, 4, 1, 4, 2, 1, 3, 0, 2, 3, 1, 1, 4, 2, 2, 5, 4, 2, 4, 1, …
#> $ clsfrn    <int> 12, 8, 4, 7, 6, 3, 5, 5, 8, 3, 5, 7, 2, 4, 6, 5, 8, 13, 5, 4…
#> $ clsfrlst  <int> 11, 12, 3, 6, 5, 1, 6, 4, 9, 1, 3, 8, 0, 7, 3, 1, 11, 16, 6,…
#> $ greek_in  <chr> "Greek", "Independent", "Independent", "Independent", "Greek…
#> $ campus    <chr> "Yes", "No", "No", "Yes", "No", "No", "No", "Yes", "No", "No…
#> $ relsp     <chr> "Not in a relationship", "Not in a relationship", "In a mono…
#> $ rlength   <int> 11, 1, 7, 3, 5, 1, 4, 1, 6, 1, 7, 12, 9, 6, 4, 3, 1, 2, 1, 3…
#> $ serious   <int> NA, NA, 3, NA, 6, 5, 3, 6, 5, NA, NA, 7, 4, 2, NA, 3, NA, NA…
#> $ numrels   <int> 0, 0, 5, 0, 3, 3, 6, 6, 2, 0, 0, 4, 2, 5, 0, 2, 0, 4, 1, 0, …
#> $ gcb       <dbl> 2.36, 1.00, 2.75, 2.20, 2.01, NA, 2.10, 1.00, NA, 2.73, 3.47…
#> $ datdaq    <int> 59, 53, 51, 40, 37, NA, 34, 44, 56, 35, 38, 52, 43, 61, 61, …
#> $ assrtdaq  <int> 56, 33, 58, NA, 65, 57, 61, 46, 34, 57, 55, 54, 38, 37, 38, …
#> $ emorel    <int> 41, 42, 41, 40, 34, 33, 32, 44, 29, 43, 45, 37, 35, 34, 40, …
#> $ lacksc    <int> 30, 29, 34, 35, 38, 28, 37, 35, 29, 34, 35, 29, 30, 26, 33, …
#> $ auto      <int> 33, 37, 35, 27, 39, 42, 34, 37, 43, 24, 36, 41, 33, 22, 49, …
#> $ perspec   <int> 13, 20, NA, 20, 16, 17, 13, 25, 18, 18, 18, 8, 9, 22, 19, 9,…
#> $ fantasy   <int> 20, 19, 15, 17, 12, 8, 24, 15, 17, 8, 13, 14, 10, 18, 13, 22…
#> $ empath    <int> 15, 26, 16, 12, 17, 26, 15, 16, 20, 17, 21, 14, NA, 24, 20, …
#> $ distress  <int> 13, 16, 17, 13, 13, 14, 18, 9, 18, 9, 21, 6, 11, 14, 11, 21,…
#> $ polsoc    <int> 17, 16, 10, 17, 12, 11, 17, 19, 21, 18, 16, 21, 11, 15, 19, …
#> $ npolsoc   <int> 20, 22, 15, 18, 17, 20, 18, 27, 21, 21, 19, 22, 17, 17, 19, …
#> $ risc      <dbl> 4.19, 5.16, 4.68, 5.83, 3.64, 4.94, 5.50, 4.25, 4.95, 4.80, …
#> $ lsas      <int> 59, 27, 27, 59, 24, 27, 62, 41, 24, 39, 59, 57, 58, 59, 42, …

Categorical Variables (SPSS codes)

Variable Values
gender 1 = Male, 2 = Female, 3 = Another
race 1 = Asian, 2 = Black, 3 = Indigenous, 4 = Latino/Hispanic, 5 = Middle Eastern, 6 = White, 7 = Other
hand 1 = Right, 2 = Left, 3 = Both
community 1 = Rural, 2 = Small town, 3 = Suburban, 4 = Urban
famclass 1 = Working, 2 = Lower, 3 = Lower middle, 4 = Upper middle, 5 = Upper
greek_in 1 = Independent, 2 = Greek
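The glimpse() output shows these variables as character strings, so the codes above only apply once the values have been converted. A minimal base R sketch of that conversion, using a few illustrative rows rather than the packaged data, is to build a factor with levels in code order and take its integer representation. This is not necessarily how the package's own prep or export functions do it.

```r
# Illustrative rows only; the full dataset is interpersonal_data.
raw <- data.frame(
  gender    = c("Female", "Female", "Male", "Another"),
  community = c("Rural", "Urban", "Suburban", "Small town"),
  stringsAsFactors = FALSE
)

# Level order follows the codes table: position in the vector = numeric code.
gender_levels    <- c("Male", "Female", "Another")
community_levels <- c("Rural", "Small town", "Suburban", "Urban")

raw$gender_code    <- as.integer(factor(raw$gender, levels = gender_levels))
raw$community_code <- as.integer(factor(raw$community,
                                        levels = community_levels))
print(raw)
```

Any value not listed in the levels vector becomes NA, which makes unexpected categories easy to spot.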

Scale Variables

The following composite scale scores are included: DAQ (Dating and Assertion Questionnaire) subscales, IDI (Interpersonal Dependency Inventory) subscales, IRI (Interpersonal Reactivity Index) subscales, Sociability subscales, RISC (Relational-Interdependent Self-Construal), GCB (Generic Conspiracist Beliefs), and LSAS-SR (Liebowitz Social Anxiety Scale — Self Report).


Self-Descriptive Survey Data

Dataset: self_descriptive_data | 547 rows, 37 columns

Simulated survey data from college students at a predominantly white rural state university. Contains demographics, relationship variables, and scores on personality and self-concept measures. Categorical variables are stored as character strings in the raw data, as the glimpse() output shows; the numeric codes are the ones used in SPSS exports and after prep_*().

Source: Simulated data generated to resemble plausible survey responses from undergraduate psychology students.

glimpse(self_descriptive_data)
#> Rows: 547
#> Columns: 37
#> $ age               <int> 20, 21, 23, 20, 22, 19, 22, 20, 20, 19, 19, 19, 19, …
#> $ gender            <chr> "Female", "Male", "Male", "Female", "Male", "Male", …
#> $ sexorient         <chr> "Straight or heterosexual", "Straight or heterosexua…
#> $ race              <chr> "White", "Latino/Hispanic", "White", "Black", "Indig…
#> $ hand              <chr> "Right", "Left", "Right", "Right", "Right", "Right",…
#> $ community         <chr> "Small town", "Rural", "Small town", "Suburban", "Ur…
#> $ parentedu         <chr> "Yes", "Yes", "No", "No", "No", "Yes", "Yes", "No", …
#> $ famclass          <chr> "Working class", "Upper middle class", "Upper middle…
#> $ faminc            <int> 29034, 72518, 105693, 32123, 23907, 158842, 26565, 6…
#> $ numsib            <int> 1, 0, 3, 2, 0, 2, 4, 2, 2, 4, 4, 3, 2, 3, 1, 1, 2, 1…
#> $ move              <int> 3, 0, 5, 1, 4, 2, 2, 2, 2, 2, 1, 3, 1, 4, 2, 2, 4, 2…
#> $ clsfrn            <int> 5, 6, 7, 6, 3, 8, 8, 4, 10, 3, 5, 2, 6, 4, 8, 4, 6, …
#> $ clsfrlst          <int> 1, 5, 6, NA, 0, 5, 9, 4, 8, 1, 6, 1, 5, 3, 14, 5, 8,…
#> $ greek_in          <chr> "Independent", "Independent", "Greek", "Independent"…
#> $ campus            <chr> "No", "Yes", "No", "No", "No", "No", "Yes", "No", "Y…
#> $ relsp             <chr> "Not in a relationship", "Not in a relationship", "I…
#> $ rlength           <int> 10, 4, 11, 1, 9, 8, 13, 5, 1, 8, 1, 1, 1, 8, 2, 10, …
#> $ serious           <int> NA, NA, 7, NA, 6, 5, 4, NA, 7, 5, NA, NA, NA, 3, NA,…
#> $ numrels           <int> 0, 4, 4, 3, 5, 5, 1, 0, 1, 2, 4, 0, 0, 4, 0, 3, 0, 6…
#> $ extraversion      <dbl> 6.0, 4.5, 6.0, 4.0, 4.5, 3.5, 2.0, 5.0, 5.0, NA, 2.5…
#> $ agreeableness     <dbl> 5.5, 4.0, 3.5, 4.0, NA, 7.0, 4.0, 4.5, 7.0, 4.0, 5.5…
#> $ conscientiousness <dbl> 7.0, 4.0, 5.5, 5.5, 5.5, 6.0, 3.5, 5.5, 6.5, 5.0, 5.…
#> $ emot_stability    <dbl> 5.5, 3.0, 5.0, 4.5, 4.0, 4.5, 5.0, 5.0, 4.0, 6.5, 6.…
#> $ openness          <dbl> 5.5, 6.5, 5.5, 4.0, 4.5, 5.0, 3.5, 4.5, 6.0, 4.5, 6.…
#> $ disc              <dbl> 5.98, 3.36, 2.45, 5.45, 2.21, 3.45, 4.87, 3.53, 4.64…
#> $ moral             <dbl> 4.15, 2.16, 4.66, 4.73, 3.81, 1.49, 3.99, 1.50, 3.74…
#> $ comp              <dbl> 5.73, 6.40, 6.05, 5.40, 2.53, 5.75, 4.53, 5.91, 4.22…
#> $ maas              <dbl> 5.29, 3.97, 4.39, 5.19, 2.85, 3.56, 4.46, 3.65, 4.20…
#> $ rse               <dbl> 36.81, 36.19, 30.39, 30.31, 37.52, 19.39, 28.22, 27.…
#> $ promote           <dbl> 2.53, 3.13, 3.31, 2.96, 3.92, 4.24, 2.02, 3.77, 2.44…
#> $ prevent           <dbl> 5.00, 3.03, 2.51, 4.24, 3.39, 2.76, 2.78, 2.02, 3.69…
#> $ atq               <int> 64, 59, 70, 41, 64, 63, 46, 55, 71, 39, 78, 69, 62, …
#> $ pmdc              <dbl> 2.62, 2.15, 2.49, 1.55, 2.53, 2.62, 1.94, 1.89, 2.46…
#> $ nsne              <dbl> 1.66, 2.03, 2.23, 1.35, 2.04, 1.91, 1.00, 1.83, NA, …
#> $ lse               <dbl> 1.48, 1.65, 2.17, 1.00, NA, 2.37, 1.04, 1.00, 1.80, …
#> $ help              <dbl> 2.51, 1.87, 2.40, 1.35, 2.01, 1.56, 1.90, NA, 2.55, …
#> $ ngse              <dbl> 32.66, 29.14, 25.78, 37.73, 29.66, 29.22, 36.58, 35.…

Categorical Variables

Same demographic coding as interpersonal_data (see above).

Scale Variables

The following composite scale scores are included: TIPI (Ten-Item Personality Inventory) Big Five subscales, MAAS (Mindful Attention Awareness Scale) subscales, RFQ (Regulatory Focus Questionnaire) subscales, ATQ-30 (Automatic Thoughts Questionnaire) subscales, RSE (Rosenberg Self-Esteem), and NGSE (New General Self-Efficacy).

Footnotes

  1. The prep function uses bg (between-groups) in its name, while the dataset and export function use mg.