
1.1: Introduction to Econometrics

ECON 480 · Econometrics · Fall 2019

Ryan Safner
Assistant Professor of Economics
safner@hood.edu
ryansafner/metricsf19
metricsF19.classes.ryansafner.com

What is Econometrics?

Why Everyone, Yes Everyone, Should Learn Statistics

We're Not so Good at Statistics: Votes I

  • Votes in the U.S. House of Representatives in favor of passing the Civil Rights Act of 1964:
Democrat Republican
61% 80%
  • Simple enough: "on average, Republicans tended to vote for passage more than Democrats"

We're Not so Good at Statistics: Votes II

  • Broken down further by Northern vs. Southern states:
          Democrat          Republican
North     94% (145/154)     85% (138/162)
South      7% (7/94)         0% (0/10)
Overall   61% (152/248)     80% (138/172)

  • A larger proportion of Democrats (94/248, 38%) than Republicans (10/172, 6%) were from the South

  • The 7% of southern Democrats voting for the Act dragged down the Democrats' overall percentage more than the 0% of southern Republicans
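
The reversal can be checked directly from the counts in the table. A minimal sketch in R, using only the vote tallies above (the data frame and column names are illustrative, not from the slides):

```r
# Vote tallies from the table above (yes votes and total members)
votes <- data.frame(
  region = c("North", "North", "South", "South"),
  party  = c("Democrat", "Republican", "Democrat", "Republican"),
  yes    = c(145, 138, 7, 0),
  total  = c(154, 162, 94, 10)
)

# Share voting yes within each region-party cell
votes$share <- votes$yes / votes$total

# Overall share by party: sum the counts first, then divide
overall <- aggregate(cbind(yes, total) ~ party, data = votes, FUN = sum)
overall$share <- overall$yes / overall$total

# Share of each party's members who come from the South
south <- votes[votes$region == "South", ]
south_weight <- south$total / overall$total[match(south$party, overall$party)]
names(south_weight) <- south$party

print(votes)
print(overall)
print(round(south_weight, 2))
```

Within each region the Democratic share is higher, but the overall Democratic share is lower, because a much larger fraction of Democrats sat in the low-support South.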

We're Not So Good at Statistics: Kidney Stones I

  • Suppose you suffer from kidney stones, and your doctor offers you treatment A or treatment B

  • In clinical trials, treatment B was effective for a larger percentage of patients overall than treatment A

  • Treatment A was effective for a higher percentage of patients with large stones and a higher percentage of patients with small stones

  • Wait, what?

We're Not So Good at Statistics: Kidney Stones II

From a real medical study:

               Treatment A       Treatment B
Small Stones   93% (81/87)       87% (234/270)
Large Stones   73% (192/263)     69% (55/80)
Overall        78% (273/350)     83% (289/350)

C R Charig, D R Webb, S R Payne, and J E Wickham, 1986, "Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy," Br Med J (Clin Res Ed) 292(6524): 879–882.

  • The sizes of the two groups (i.e. who gets A vs B) are very different
  • A lurking variable in the study is the severity of the case: doctors tended to give treatment B for less severe cases
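
One way to see where the reversal comes from is to write each overall success rate as a weighted average of the two stone-size rates, with weights equal to the share of that treatment's patients in each group. Using only the counts from the table:

```latex
\begin{align*}
\text{A: } \frac{273}{350} &= \frac{87}{350}\cdot\frac{81}{87} + \frac{263}{350}\cdot\frac{192}{263}
  \approx 0.25(0.93) + 0.75(0.73) \approx 0.78 \\
\text{B: } \frac{289}{350} &= \frac{270}{350}\cdot\frac{234}{270} + \frac{80}{350}\cdot\frac{55}{80}
  \approx 0.77(0.87) + 0.23(0.69) \approx 0.83
\end{align*}
```

Treatment A's average puts about three quarters of its weight on the harder large-stone cases, while treatment B's puts about three quarters on the easier small-stone cases.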

Simpson's Paradox

Simpson's Paradox: The correlation between two variables can change (even reverse!) when additional variables are considered

We're Not so Good at Statistics: Smoking I

  • 1964: U.S. Surgeon General issued a report claiming that cigarette smoking causes lung cancer

  • Evidence based primarily on correlations between cigarette smoking and lung cancer

We're Not so Good at Statistics: Smoking II

  • Tobacco companies attacked the report, naturally

We're Not so Good at Statistics: Smoking III

Ronald A. Fisher

1890--1962

We're Not so Good at Statistics: Smoking IV

  • There could be a confounding variable ("smoking gene") that causes both lung cancer and the urge to smoke

  • Would imply: decision to smoke or not would have no impact on lung cancer!

  • Correlation between smoking and cancer is spurious!
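
Fisher's argument can be illustrated with simulated data. The sketch below is purely hypothetical: the "gene", the probabilities, and the variable names are all invented for illustration and are not estimates of anything. By construction, smoking has no effect on cancer in this simulation, yet the two are correlated because both depend on the gene.

```r
set.seed(480)  # arbitrary seed so the results are reproducible
n <- 100000

# Hypothetical confounder: 30% of people carry the "smoking gene"
gene <- rbinom(n, size = 1, prob = 0.3)

# The gene raises the probability of smoking...
smoke <- rbinom(n, size = 1, prob = 0.2 + 0.6 * gene)

# ...and the probability of cancer. Note that `smoke` does not appear here.
cancer <- rbinom(n, size = 1, prob = 0.05 + 0.3 * gene)

# Raw correlation: positive, even though smoking has no causal effect here
raw_cor <- cor(smoke, cancer)

# Correlation within each gene group: close to zero
cor_gene0 <- cor(smoke[gene == 0], cancer[gene == 0])
cor_gene1 <- cor(smoke[gene == 1], cancer[gene == 1])

print(c(raw = raw_cor, no_gene = cor_gene0, gene = cor_gene1))
```

Holding the confounder fixed makes the correlation all but disappear, which is what "spurious" means in this setting. With real data the confounder is usually unobserved, which is why research design matters.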

Correlation Does Not Imply Causation I

  • The goal of every intro statistics class ever

Correlation Does Not Imply Causation II

Correlation Can Imply Causation...

  • It's always good to be skeptical of causal claims

  • But this is actually where econometrics shines

...With the Right Tools

  • Econometrics is the application of statistical tools to quantify economic relationships in the real world

  • Uses real data to

    • test economic hypotheses
    • quantitatively estimate the magnitude of relationships between economic variables
    • forecast future events

Causal Inference I

  • What sets econometrics apart from mere statistics (or uses of statistics in other disciplines) is its role in causal inference

  • We can, with proper tools and interpretations, make quantitative causal claims

    • about the effects of individual choices
    • about the effects of policy interventions
    • about the impact of political institutions
    • about economic history and economic development
    • etc...

Causal Inference II

A 50% increase in police presence in a metropolitan area lowers crime rates by 15%, on average [1]

Being an incumbent in office raises the probability of re-election by 40-45 percentage points [2]

European cities with at least one printing press in 1500 were at least 29% more likely to become Protestant by 1600 [3]

[1] Klick, Jonathan and Alexander Tabarrok, 2005, "Using Terror Alert Levels to Estimate the Effect of Police on Crime," Journal of Law and Economics 48(1): 267-279

[2] Lee, David S, 2001, "The Electoral Advantage to Incumbency and Voters' Valuation of Politicians' Experience: A Regression Discontinuity Analysis of Elections to the U.S," NBER Working Paper 8441

[3] Rubin, Jared, 2014, "Printing and Protestants: An Empirical Test of the Role of Printing in the Reformation," Review of Economics and Statistics 96(2): 270-286

Example 1: Education

Does reducing class sizes improve student performance?

  • A policy-relevant tradeoff with a budget constraint
  • What is the precise effect of class size on performance?
  • Is the improvement worth the cost of hiring new teachers and building more schools?

Example 2: Discrimination in Lending

Is there racial discrimination in home mortgage lending?

  • Boston Fed: 28% of African-Americans are denied mortgages compared to only 9% of White Americans
  • Is this due to factors such as credit history, income, or discrimination purely because of race?

Example 3: Public Health and Public Finance

How much do state cigarette taxes reduce smoking rates?

  • Econ 101: raising the price lowers the quantity consumed

  • What is the price elasticity of demand for smoking?

  • How much tax revenue will this generate?

  • Probably:

[Diagram relating Taxes and Smokers]

  • Maybe?:

[Diagram relating Taxes and Smokers]

About this Class

Real Talk I

Real Talk II

  • This will be one of the hardest courses you take at Hood
  • There will be moments where you have no idea WTF is going on (this is normal)
  • Yes, you can still get an A

This Class Is

  • Economics: take your preexisting intuition and models for causal inference
  • Statistics: add regression and statistical inference
  • Computer Programming: using R and R Studio for analyzing and presenting data

This Class Is

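Old School Statistics Courses

  • $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

  • $\sigma_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$

  • $r_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$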

  • Use pre-cleaned "toy" data, if any

Hip New Data Science Courses

  • mean(x)
  • sd(x)
  • cor(x, y)
  • Manipulate raw data from scratch (real life)
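
To connect the two lists, the formulas can be computed by hand and compared with R's built-in functions. A small sketch with made-up numbers (the vectors `x` and `y` are illustrative only). One caveat worth knowing: R's `sd()` divides by n - 1 rather than n, so it matches a 1/n formula only after rescaling.

```r
# Made-up data for illustration
x <- c(2, 4, 4, 5, 7, 9)
y <- c(1, 3, 2, 5, 6, 8)
n <- length(x)

# Mean: sum of the observations divided by n
x_bar <- sum(x) / n
y_bar <- sum(y) / n

# Standard deviation using a 1/n denominator
sd_pop <- sqrt(sum((x - x_bar)^2) / n)

# Correlation coefficient from the sums of deviations
r_xy <- sum((x - x_bar) * (y - y_bar)) /
  sqrt(sum((x - x_bar)^2) * sum((y - y_bar)^2))

# Compare the hand calculations with the built-ins
print(all.equal(x_bar, mean(x)))
print(all.equal(r_xy, cor(x, y)))
print(all.equal(sd_pop, sd(x) * sqrt((n - 1) / n)))  # sd() uses n - 1
```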

Prerequisites

  • Courses:

    • ECON 205
    • ECON 206
    • ECON 305 or ECON 306
    • MATH 112 or ECMG 212
  • Math Skills:

    • Basic algebra
    • Probability-ish
    • Statistics-ish
  • Computer Science Skills:

    • None

What You'll Get Out of This Class

By the end of this semester, you will:

  1. understand how to evaluate statistical and empirical claims;
  2. use the fundamental models of causal inference and research design;
  3. gather, analyze, and communicate with real data in R.

This Class Opens Doors

This Class Gives You a Hybrid of Skills

  • "Data Science": ???
  • Causal Inference: economists' comparative advantage!

Data Science I

R Skills are In Demand

Yada Yada Machine Learning

"When you’re fundraising, it’s AI. When you’re hiring, it’s ML. When you’re implementing, it’s logistic regression."

  • everyone on Twitter ever (Source)

Causal Inference I

  • Machine learning and artificial intelligence are "dumb" [1]
  • With the right models and research designs, we can say "X causes Y" and quantify it!
  • Economists are in a unique position to make causal claims that mere statistics cannot

[1] For more, see my blog post, and Pearl & MacKenzie (2018), The Book of Why

Causal Inference II

"First, the field of economics has spent decades developing a toolkit aimed at investigating empirical relationships, focusing on techniques to help understand which correlations speak to a causal relationship and which do not. This comes up all the time — does Uber Express Pool grow the full Uber user base, or simply draw in users from other Uber products? Should eBay advertise on Google, or does this simply syphon off people who would have come through organic search anyway? Are African-American Airbnb users rejected on the basis of their race? These are just a few of the countless questions that tech companies are grappling with, investing heavily in understanding the extent of a causal relationship."

Building Good Workflow Habits

  • I will show you the tools to make your workflow:
    • Reproducible
    • Computer- and Human-Readable (!)
    • Automated
    • All in one program

A Quick Example

library("ggplot2")    # plotting functions
library("gapminder")  # example dataset
# Life expectancy vs. GDP per capita, one panel per continent
ggplot(data = gapminder,
       aes(x = gdpPercap,
           y = lifeExp,
           color = continent)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "lm") +
  scale_x_log10(breaks = c(1000, 10000, 100000),
                labels = scales::dollar) +
  labs(x = "GDP/Capita",
       y = "Life Expectancy (Years)") +
  facet_wrap(~continent) +
  guides(color = "none") +
  theme_light()

Assignments

  • Research project:
    • Come up with a testable research question
    • Find data
    • Analyze data
    • Present your results (in writing and verbally)
  • HWs
  • Midterm, Final exam (in-class, closed notes)
      Assignment             Percent
1     Research Project       30%
n     Homeworks (Average)    25%
1     Midterm                20%
1     Final                  25%

Your "Textbooks"

Tips for Success In This Course

  • Take notes. On paper. Really.

  • Work together on assignments and study together.

  • Ask questions, come to office hours. Don't struggle in silence, you are not alone!

  • You are learning how to learn [1]

  • See the reference page for more

[1] A properly worded Google search will become your secret weapon. Believe me. It's still mine.

Let's Try This Out

Roadmap for the Semester
