Daily 3 - Jan 21

Class Performance

Students: 113 | Mean: 2.61 | Median: 2.5 | SD: 0.59

Scores ranged from 1 to 4 out of 4 points.

[Figure: Score Distribution]

[Figure: Performance by Question]

Questions

Q1: The .Rmd file extension signifies what type of file?

R Markdown — a file format that combines text, code, and output in a single document.

  • Saying “R Script” — .R is the extension for an R script; .Rmd is R Markdown, which adds narrative text and rendered output alongside the code
  • Vague answers — “R file” or “data file” don’t capture the specific meaning of R Markdown
  • Confusing with CSV — CSV is a data format; .Rmd is a document format for reproducible analysis

Q2: What is the unit of record in the CPS?

Person — each row in the CPS represents one individual person surveyed.

  • Answering “young men” — The dataset used in class focused on young men, but the CPS unit of record is the person: any individual surveyed, regardless of demographic group
  • Answering “labor force” — Labor force describes the population being studied, not the unit of observation
  • Answering “ID” or “key variable” — ID is a variable that identifies the person, but “person” is the unit itself
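
The distinction between the unit of record and an ID variable can be checked directly in R. The following is a minimal sketch on a made-up data frame; the name cps and every column name and value are hypothetical and are not taken from the class dataset.

```r
# Hypothetical CPS-style extract: names and values are made up for illustration.
cps <- data.frame(
  person_id      = c(101, 102, 103, 104),
  age            = c(19, 24, 22, 31),
  sex            = c("M", "F", "M", "F"),
  in_labor_force = c(TRUE, TRUE, FALSE, TRUE)
)

# If the unit of record is the person, each ID appears on exactly one row.
# anyDuplicated() returns 0 when there are no repeated values.
one_row_per_person <- anyDuplicated(cps$person_id) == 0

# A subset such as young men is still a collection of person records.
young_men <- subset(cps, sex == "M" & age < 25)

one_row_per_person
nrow(young_men)
```

The ID column only labels each row. The row itself still describes a person, both in the full data and in the young_men subset.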

Q3: For data to be “tidy”, every entry or cell must correspond to a _____.

Single value — each cell contains exactly one piece of data.

  • Missing “single” — Answering just “value” without specifying “single” misses the key constraint
  • Confusing with variable — Variables are columns; each cell contains a single value of that variable
  • Confusing with observation — Observations are rows; each cell is one value within an observation
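
To make the "single value" rule concrete, here is a small sketch in base R, assuming a made-up table in which one cell holds two values separated by a semicolon. The data frame and column names are hypothetical.

```r
# Hypothetical example: person 2 holds two jobs, and both hours values
# were typed into one cell, so that cell is not a single value.
untidy <- data.frame(
  person_id = c(1, 2),
  hours     = c("40", "20;15"),
  stringsAsFactors = FALSE
)

# Flag cells that hold more than one value (here, separated by ";").
has_multiple <- grepl(";", untidy$hours, fixed = TRUE)

# Split each cell and give every value its own row.
pieces <- strsplit(untidy$hours, ";", fixed = TRUE)
tidy <- data.frame(
  person_id = rep(untidy$person_id, lengths(pieces)),
  hours     = as.numeric(unlist(pieces))
)

tidy
```

After the split, every cell in tidy holds exactly one number. Note that each row now describes one job held by a person, so fixing the cells also changed what a row represents.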

Q4: The term ATAC captures four steps in the data value chain: Acquisition, _____, _____, and Communication.

Transformation, Analysis — The full chain is Acquisition, Transformation, Analysis, Communication.

  • Writing “Translation” — The step is “Transformation”, not “Translation”
  • Wrong order — Analysis comes after Transformation and before Communication
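
The four ATAC steps can be traced in order through a short base R script. This is only an illustrative sketch: the inline data, the column names, and the age cutoff are assumptions made for the example, not part of the course materials.

```r
# Acquisition: read raw data. Inline text stands in for a real file,
# and all values are made up.
raw <- read.csv(text = "person_id,age,weekly_wage
1,19,400
2,24,650
3,22,
4,31,900")

# Transformation: drop records with a missing wage and add an age group.
clean <- raw[!is.na(raw$weekly_wage), ]
clean$age_group <- ifelse(clean$age < 25, "under 25", "25 and over")

# Analysis: mean weekly wage by age group.
result <- aggregate(weekly_wage ~ age_group, data = clean, FUN = mean)

# Communication: report the result.
print(result)
```

Each step consumes the output of the one before it, which is why the order matters: the analysis runs on clean, not on raw.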

Key Takeaways

Strengths: Understanding of R Markdown, the ATAC framework, and tidy data concepts is emerging.

Review:

  • Unit of record = person — Don’t confuse it with the specific subset (young men) used in class
  • Single value per cell — Include “single” when describing tidy data
  • ATAC = Acquisition, Transformation, Analysis, Communication — Memorize the sequence