Introduction to Statistical Analysis

img-center-50

Introduction to Statistical Analysis


Instructors: Mark Yarish & Elizabeth DiLuzio

Follow along at: http://bit.ly/intro-stats

See the code at: http://bit.ly/intro-stats-code

1 / 407

Introduction to Statistical Analysis

Welcome

2 / 407
  • Facilitators introduce themselves
  • Facilitators (respectfully) establish their authority to teach the material
  • Facilitators begin creating a safe, comfortable container for participants

Introduction to Statistical Analysis

A Few Ground Rules

3 / 407

Introduction to Statistical Analysis

A Few Ground Rules

  • Step up, step back
4 / 407

Introduction to Statistical Analysis

A Few Ground Rules

  • Step up, step back
  • Be curious and ask questions!
5 / 407

Introduction to Statistical Analysis

A Few Ground Rules

  • Step up, step back
  • Be curious and ask questions!
  • Assume noble regard and positive intent
6 / 407

Introduction to Statistical Analysis

A Few Ground Rules

  • Step up, step back
  • Be curious and ask questions!
  • Assume noble regard and positive intent
  • Respect multiple perspectives
7 / 407

Introduction to Statistical Analysis

A Few Ground Rules

  • Step up, step back
  • Be curious and ask questions!
  • Assume noble regard and positive intent
  • Respect multiple perspectives
  • Listen deeply
8 / 407

Introduction to Statistical Analysis

A Few Ground Rules

  • Step up, step back
  • Be curious and ask questions!
  • Assume noble regard and positive intent
  • Respect multiple perspectives
  • Listen deeply
  • Be present (set aside phone, email, social media, etc.)
9 / 407

Introduction to Statistical Analysis

Introductions and Data Collection

  • Who are you?
  • Where do you work?
  • How many siblings do you have (not including yourself)?
  • How long have you worked for NYC in years?
  • What is your height (in inches)?
10 / 407

Introduction to Statistical Analysis

Goals for the Course

11 / 407

Introduction to Statistical Analysis

Goals for the Course

  • Review descriptive statistics in the context of operational decision making
12 / 407

Introduction to Statistical Analysis

Goals for the Course

  • Review descriptive statistics in the context of operational decision making
  • Discuss correlation and simple linear regression analysis in the context of operational decision making
13 / 407

Introduction to Statistical Analysis

Goals for the Course

  • Review descriptive statistics in the context of operational decision making
  • Discuss correlation and simple linear regression analysis in the context of operational decision making
  • Introduce decision models and their use
14 / 407

Introduction to Statistical Analysis

Goals for the Course

  • Review descriptive statistics in the context of operational decision making
  • Discuss correlation and simple linear regression analysis in the context of operational decision making
  • Introduce decision models and their use
  • Practice calculating descriptive statistics, calculating correlation, and developing predictive models in Excel
15 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

16 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

  • You will be more familiar with basic descriptive statistics
17 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

  • You will be more familiar with basic descriptive statistics
  • You will be better able to describe correlation and simple linear regression
18 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

  • You will be more familiar with basic descriptive statistics
  • You will be better able to describe correlation and simple linear regression
  • You will better understand the value of decision models in operational decision making
19 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

  • You will be more familiar with basic descriptive statistics
  • You will be better able to describe correlation and simple linear regression
  • You will better understand the value of decision models in operational decision making
  • You will be practiced in calculating descriptive statistics, calculating correlation, and developing predictive models in Excel
20 / 407

Introduction to Statistical Analysis

Key Assumptions

21 / 407

Introduction to Statistical Analysis

Key Assumptions

  • You’ve had some previous experience with statistics and probability
22 / 407

Introduction to Statistical Analysis

Key Assumptions

  • You’ve had some previous experience with statistics and probability
  • You’re familiar with using Excel to manipulate data and calculate values
23 / 407

Introduction to Statistical Analysis

Key Assumptions

  • You’ve had some previous experience with statistics and probability
  • You’re familiar with using Excel to manipulate data and calculate values
  • You’re familiar with using formulas in Excel
24 / 407

Introduction to Statistical Analysis

Disclaimer

25 / 407

Introduction to Statistical Analysis

Disclaimer

  • We're not statisticians
26 / 407

Introduction to Statistical Analysis

Disclaimer

  • We're not statisticians
  • You won’t be a statistician by the end of this course
27 / 407

Introduction to Statistical Analysis

Disclaimer

  • We're not statisticians
  • You won’t be a statistician by the end of this course
  • We often apply statistical tools and understanding in the work we do
28 / 407

Introduction to Statistical Analysis

Disclaimer

  • We're not statisticians
  • You won’t be a statistician by the end of this course
  • We often apply statistical tools and understanding in the work we do
  • We're assuming you all do the same, which is why you’re here
29 / 407

Introduction to Statistical Analysis

Housekeeping

30 / 407
  • Facilitator sets expectations with the students
  • Establishes the "contract" for the class

Introduction to Statistical Analysis

Housekeeping

  • We’ll have one 15 minute break in the morning
31 / 407
  • Facilitator sets expectations with the students
  • Establishes the "contract" for the class

Introduction to Statistical Analysis

Housekeeping

  • We’ll have one 15 minute break in the morning
  • We’ll have an hour for lunch
32 / 407
  • Facilitator sets expectations with the students
  • Establishes the "contract" for the class

Introduction to Statistical Analysis

Housekeeping

  • We’ll have one 15 minute break in the morning
  • We’ll have an hour for lunch
  • We’ll have a 15 minute break in the afternoon
33 / 407
  • Facilitator sets expectations with the students
  • Establishes the "contract" for the class

Introduction to Statistical Analysis

Housekeeping

  • We’ll have one 15 minute break in the morning
  • We’ll have an hour for lunch
  • We’ll have a 15 minute break in the afternoon
  • Class will start promptly after breaks
34 / 407
  • Facilitator sets expectations with the students
  • Establishes the "contract" for the class

Introduction to Statistical Analysis

Housekeeping

  • We’ll have one 15 minute break in the morning
  • We’ll have an hour for lunch
  • We’ll have a 15 minute break in the afternoon
  • Class will start promptly after breaks
  • Feel free to use the bathroom during class if you need to
35 / 407
  • Facilitator sets expectations with the students
  • Establishes the "contract" for the class

Introduction to Statistical Analysis

Housekeeping

  • We’ll have one 15 minute break in the morning
  • We’ll have an hour for lunch
  • We’ll have a 15 minute break in the afternoon
  • Class will start promptly after breaks
  • Feel free to use the bathroom during class if you need to
  • Please take any phone conversations into the hall so as not to disrupt the class
36 / 407
  • Facilitator sets expectations with the students
  • Establishes the "contract" for the class

Introduction to Statistical Analysis

Goals for this Morning

37 / 407

Introduction to Statistical Analysis

Goals for this Morning

  • Review basic statistical measures
38 / 407

Introduction to Statistical Analysis

Goals for this Morning

  • Review basic statistical measures
  • Practice using statistics in real-world applications
39 / 407

Introduction to Statistical Analysis

Goals for this Morning

  • Review basic statistical measures
  • Practice using statistics in real-world applications
  • Familiarize you with how to use Excel for statistical analysis
40 / 407

Introduction to Statistical Analysis

We are drowning in information and starving for knowledge.

John Naisbitt

41 / 407
  • Facilitator prompts the participants to reflect on the value of statistics for understanding information
  • Helpful to make the distinction between the raw information and intelligence that can be used for decision making

Introduction to Statistical Analysis

Why Statistics?

42 / 407
  • Facilitator emphasizes the utility of statistics for understanding information

Introduction to Statistical Analysis

Why Statistics?

  • Tools for extracting meaning from data
43 / 407
  • Facilitator emphasizes the utility of statistics for understanding information

Introduction to Statistical Analysis

Why Statistics?

  • Tools for extracting meaning from data
  • Commonly understood ways of communicating meaning to others
44 / 407
  • Facilitator emphasizes the utility of statistics for understanding information
  • Facilitator makes point about being able to compare using statistics
  • I often use the example of comparing one class with the other -> if mean years of service is higher than this class, what might that imply about the

Introduction to Statistical Analysis

Let’s run the statistics on our class today

Download the data for the class

img-center-65

45 / 407
  • Facilitator leads the participants through applying basic descriptive stats to the data collected at the introductions
  • Facilitator reviews each concept with participants then leads them through calculating it in the appropriate place on the spreadsheet

Introduction to Statistical Analysis

Mean

46 / 407

Introduction to Statistical Analysis

Mean

  • A representative value for the data
47 / 407

Introduction to Statistical Analysis

Mean

  • A representative value for the data
  • Usually what people mean by “average”
48 / 407

Introduction to Statistical Analysis

Mean

  • A representative value for the data
  • Usually what people mean by “average”
  • Calculate by adding all the values together and dividing by the number of instances
49 / 407

Introduction to Statistical Analysis

Mean

  • A representative value for the data
  • Usually what people mean by “average”
  • Calculate by adding all the values together and dividing by the number of instances
  • Sensitive to extremes
50 / 407

Introduction to Statistical Analysis

Mean

  • A representative value for the data
  • Usually what people mean by “average”
  • Calculate by adding all the values together and dividing by the number of instances
  • Sensitive to extremes

Calculate the means (number of siblings, years of service, and height) for our class today
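
For readers following along outside Excel, here is a minimal Python sketch of the same calculation. The years-of-service values are hypothetical, not the real class data. It shows the mean as "sum divided by count" and how a single extreme value pulls it upward.

```python
from statistics import mean

# Hypothetical years-of-service values for a small class (assumed data).
years_of_service = [2, 3, 4, 5, 6]

# Mean = add all the values together, then divide by the number of values.
manual_mean = sum(years_of_service) / len(years_of_service)
library_mean = mean(years_of_service)

# Adding one extreme value shows how sensitive the mean is to extremes.
with_extreme = years_of_service + [40]
mean_with_extreme = mean(with_extreme)

print("Mean without extreme value:", manual_mean)
print("Mean with extreme value:   ", mean_with_extreme)
```

In this made-up example, one long-serving participant moves the mean from 4 to 10 years, even though most of the class sits between 2 and 6.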

51 / 407

Introduction to Statistical Analysis

Median

52 / 407

Introduction to Statistical Analysis

Median

  • The “middle” value of a data set
53 / 407

Introduction to Statistical Analysis

Median

  • The “middle” value of a data set
  • Center value of a data set with an odd number of values
54 / 407

Introduction to Statistical Analysis

Median

  • The “middle” value of a data set
  • Center value of a data set with an odd number of values
  • Sum of two middle values divided by 2 if the number of items in a data set is even
55 / 407

Introduction to Statistical Analysis

Median

  • The “middle” value of a data set
  • Center value of a data set with an odd number of values
  • Sum of two middle values divided by 2 if the number of items in a data set is even
  • Resistant to extreme values
56 / 407

Introduction to Statistical Analysis

Median

  • The “middle” value of a data set
  • Center value of a data set with an odd number of values
  • Sum of two middle values divided by 2 if the number of items in a data set is even
  • Resistant to extreme values

img-center-100
57 / 407

Introduction to Statistical Analysis

Median

  • The “middle” value of a data set
  • Center value of a data set with an odd number of values
  • Sum of two middle values divided by 2 if the number of items in a data set is even
  • Resistant to extreme values

img-center-100

Calculate the medians for our class today
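
The odd and even cases above can be sketched in Python as follows. The lists, including the house prices, are invented for illustration. The last part contrasts median and mean on data with one extreme value.

```python
from statistics import mean, median

odd_count = [1, 3, 5, 7, 9]    # five values: the center value is the median
even_count = [1, 3, 5, 7]      # four values: average the two middle values

median_odd = median(odd_count)
median_even = median(even_count)   # (3 + 5) / 2

# Hypothetical house prices in thousands of dollars, with one extreme value.
prices = [300, 320, 350, 380, 5000]
median_price = median(prices)
mean_price = mean(prices)

print("Median (odd count): ", median_odd)
print("Median (even count):", median_even)
print("Median price:", median_price, "| Mean price:", mean_price)
```

The median price stays at the middle house (350), while the mean is dragged up to 1270 by the single expensive one.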

58 / 407

Introduction to Statistical Analysis

Mode

img-right-50

59 / 407

Introduction to Statistical Analysis

Mode

img-right-50

  • The most frequent value in a dataset
60 / 407

Introduction to Statistical Analysis

Mode

img-right-50

  • The most frequent value in a dataset
  • Often used for categorical data
61 / 407

Introduction to Statistical Analysis

Mode

img-right-50

  • The most frequent value in a dataset
  • Often used for categorical data

Calculate the mode for our class today
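
A small Python sketch of finding the mode, using made-up workplace labels and sibling counts (assumptions, not the class data). It also shows what happens when two values tie for most frequent.

```python
from collections import Counter
from statistics import mode, multimode

# Hypothetical categorical answers to "Where do you work?"
workplaces = ["Agency A", "Agency B", "Agency A", "Agency C", "Agency B", "Agency A"]
most_common_workplace = mode(workplaces)

# Counter gives the full frequency table behind the mode.
workplace_counts = Counter(workplaces)

# Hypothetical sibling counts where two values tie for most frequent.
siblings = [1, 2, 2, 0, 1, 3]
sibling_modes = multimode(siblings)   # returns every value tied for the top count

print("Most common workplace:", most_common_workplace)
print("Frequency table:", dict(workplace_counts))
print("Sibling modes:", sibling_modes)
```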

62 / 407

Introduction to Statistical Analysis

Median vs Mean vs Mode

img-center-80

By Cmglee (Own work) CC BY-SA 3.0 or GFDL, via Wikimedia Commons

63 / 407
  • Facilitator describes the difference between mean, median, and mode, emphasizing when you would use one over the other

Introduction to Statistical Analysis

When Do We Use Median rather than Mean (Average)?

64 / 407
  • Facilitator asks participants to recall when they have seen median used rather than mean in their work or in the media
  • Establishes key learning point of when these measures should be used (depends on the shape of the data)

Introduction to Statistical Analysis

When Do We Use Median rather than Mean (Average)?

  • House Prices
65 / 407
  • Facilitator asks participants to recall when they have seen median used rather than mean in their work or in the media
  • Establishes key learning point of when these measures should be used (depends on the shape of the data)

Introduction to Statistical Analysis

When Do We Use Median rather than Mean (Average)?

  • House Prices
  • Household Income
66 / 407
  • Facilitator asks participants to recall when they have seen median used rather than mean in their work or in the media
  • Establishes key learning point of when these measures should be used (depends on the shape of the data)

Introduction to Statistical Analysis

When Do We Use Median rather than Mean (Average)?

  • House Prices
  • Household Income
  • What else?
67 / 407
  • Facilitator asks participants to recall when they have seen median used rather than mean in their work or in the media
  • Establishes key learning point of when these measures should be used (depends on the shape of the data)

Introduction to Statistical Analysis

When Do We Use Median rather than Mean (Average)?

  • House Prices
  • Household Income
  • What else?
  • Why?
68 / 407
  • Facilitator asks participants to recall when they have seen median used rather than mean in their work or in the media
  • Establishes key learning point of when these measures should be used (depends on the shape of the data)

Introduction to Statistical Analysis

Anscombe's Quartet

img-center-70

69 / 407
  • Facilitator uses example of Anscombe's Quartet to demonstrate the need to visually inspect data
  • For more information, see this article

Introduction to Statistical Analysis

Histogram

70 / 407

Introduction to Statistical Analysis

Histogram

img-right-45

  • Charts the frequency of instances in the data
71 / 407

Introduction to Statistical Analysis

Histogram

img-right-45

  • Charts the frequency of instances in the data
  • Shows the frequency distribution
72 / 407

Introduction to Statistical Analysis

Histogram

img-right-45

  • Charts the frequency of instances in the data
  • Shows the frequency distribution
  • Values are grouped into class intervals
73 / 407

Introduction to Statistical Analysis

Histogram

img-right-45

  • Charts the frequency of instances in the data
  • Shows the frequency distribution
  • Values are grouped into class intervals
  • Best to have a consistent size to class intervals
74 / 407

Introduction to Statistical Analysis

Histogram

img-right-45

  • Charts the frequency of instances in the data
  • Shows the frequency distribution
  • Values are grouped into class intervals
  • Best to have a consistent size to class intervals

 
 
http://mathematica.stackexchange.com/questions/59520/histogram-with-variable-bin-size

75 / 407

Introduction to Statistical Analysis

Creating a Histogram for Height in Post-Its

76 / 407

  • Have participants write their height, in inches, on a post-it
  • Collect and graph post-its on poster paper
  • Point out that there are a lot of bars that are quite small
  • Not very helpful in understanding trends in our data
  • What happens when we group the bars together? Group evenly to be sure we can draw conclusions.
  • Can you see a trend better now?

Introduction to Statistical Analysis

Creating a Histogram for Height in Excel

77 / 407

Introduction to Statistical Analysis

Installing Data Analysis ToolPak

img-right-75

  • File
  • Options
  • Add-ins
  • Manage
  • “Go…”
78 / 407

Introduction to Statistical Analysis

Installing Data Analysis ToolPak

img-center-50

79 / 407

Introduction to Statistical Analysis

Setup Your Bins

80 / 407
  • Starting with arbitrarily determining bin size by range helps to introduce the topic
  • Later, introduce equal intervals based on either the range of the data or the lower and upper limits, to exclude outliers
  • Key learning point is that we control how to tell the story of the data with the bin size we select
  • Easy to obscure or manufacture a pattern in the data by picking the wrong bin size

Introduction to Statistical Analysis

Setup Your Bins

img-right-20

  • Use an empty column and label it “Bins”
81 / 407
  • Starting with arbitrarily determining bin size by range helps to introduce the topic
  • Later, introduce equal intervals based on either the range of the data or the lower and upper limits, to exclude outliers
  • Key learning point is that we control how to tell the story of the data with the bin size we select
  • Easy to obscure or manufacture a pattern in the data by picking the wrong bin size

Introduction to Statistical Analysis

Setup Your Bins

img-right-20

  • Use an empty column and label it “Bins”
  • Start with the max of the first bin
82 / 407
  • Starting with arbitrarily determining bin size by range helps to introduce the topic
  • Later, introduce equal intervals based on either the range of the data or the lower and upper limits, to exclude outliers
  • Key learning point is that we control how to tell the story of the data with the bin size we select
  • Easy to obscure or manufacture a pattern in the data by picking the wrong bin size

Introduction to Statistical Analysis

Setup Your Bins

img-right-20

  • Use an empty column and label it “Bins”
  • Start with the max of the first bin
  • Create an entry for each bin you want
83 / 407
  • Starting with arbitrarily determining bin size by range helps to introduce the topic
  • Later, introduce equal intervals based on either the range of the data or the lower and upper limits, to exclude outliers
  • Key learning point is that we control how to tell the story of the data with the bin size we select
  • Easy to obscure or manufacture a pattern in the data by picking the wrong bin size

Introduction to Statistical Analysis

Setup Your Bins

img-right-20

  • Use an empty column and label it “Bins”
  • Start with the max of the first bin
  • Create an entry for each bin you want
  • Use a formula to save time
84 / 407
  • Starting with arbitrarily determining bin size by range helps to introduce the topic
  • Later, introduce equal intervals based on either the range of the data or the lower and upper limits, to exclude outliers
  • Key learning point is that we control how to tell the story of the data with the bin size we select
  • Easy to obscure or manufacture a pattern in the data by picking the wrong bin size
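
The bin setup described on this slide can be sketched in Python. This is an illustration under stated assumptions, not the course's spreadsheet: the heights, the first bin maximum, the bin width, and the number of bins are all hypothetical. Each bin is identified by its maximum value, and a final "More" bucket catches anything above the last bin.

```python
from bisect import bisect_left

# Hypothetical heights in inches (assumed data, not the real class).
heights = [60, 62, 63, 65, 66, 66, 67, 68, 70, 72, 75]

# Start with the max of the first bin, then use a formula for the rest.
first_bin_max = 62   # assumed maximum of the first bin
bin_width = 4        # assumed equal width for every bin
bin_count = 4        # assumed number of bins
bins = [first_bin_max + bin_width * i for i in range(bin_count)]

# One counter per bin, plus a final "More" counter for values above the last bin.
counts = [0] * (len(bins) + 1)
for height in heights:
    # bisect_left finds the first bin whose maximum is >= the value.
    counts[bisect_left(bins, height)] += 1

labels = [f"<= {b}" for b in bins] + ["More"]
for label, count in zip(labels, counts):
    print(f"{label:>6}: {'#' * count}")
```

Changing `bin_width` and rerunning is a quick way to see how the choice of bin size changes the story the histogram tells.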

Introduction to Statistical Analysis

Creating a Histogram (Height)

Under the Data Ribbon

img-center-65

85 / 407

Introduction to Statistical Analysis

Creating a Histogram (Height)

img-center-100

86 / 407

Introduction to Statistical Analysis

Creating a Histogram (Height)

img-center-60

87 / 407

Introduction to Statistical Analysis

Creating a Histogram (Height)

img-center-100

88 / 407

Introduction to Statistical Analysis

Distributions of Data

89 / 407

Introduction to Statistical Analysis

Normal(-ish) Distribution

img-center-90

90 / 407

Introduction to Statistical Analysis

Long-tail Distribution

img-center-95

91 / 407

Introduction to Statistical Analysis

Bi-Modal Distribution

img-center-100

92 / 407

Introduction to Statistical Analysis

Measures of Central Tendency

93 / 407

Introduction to Statistical Analysis

Measures of Central Tendency

  • Quantitative data tends to cluster around some central value
94 / 407

Introduction to Statistical Analysis

Measures of Central Tendency

  • Quantitative data tends to cluster around some central value
  • Contrasts with the spread of data around that center (i.e. the variability in the data)
95 / 407

Introduction to Statistical Analysis

Measures of Central Tendency

  • Quantitative data tends to cluster around some central value
  • Contrasts with the spread of data around that center (i.e. the variability in the data)
  • Mean is a more precise measure and more often used
96 / 407

Introduction to Statistical Analysis

Measures of Central Tendency

  • Quantitative data tends to cluster around some central value
  • Contrasts with the spread of data around that center (i.e. the variability in the data)
  • Mean is a more precise measure and more often used
  • Median is better when there are extreme outliers
97 / 407

Introduction to Statistical Analysis

Measures of Central Tendency

  • Quantitative data tends to cluster around some central value
  • Contrasts with the spread of data around that center (i.e. the variability in the data)
  • Mean is a more precise measure and more often used
  • Median is better when there are extreme outliers
  • Mode is used when the data is categorical (as opposed to numeric)
98 / 407

Introduction to Statistical Analysis

Measuring Variability

99 / 407

Introduction to Statistical Analysis

Range

100 / 407

Introduction to Statistical Analysis

Range

  • The gap between the minimum value and the maximum value
101 / 407

Introduction to Statistical Analysis

Range

  • The gap between the minimum value and the maximum value
  • Calculated by subtracting the minimum from the maximum
102 / 407

Introduction to Statistical Analysis

Range

  • The gap between the minimum value and the maximum value
  • Calculated by subtracting the minimum from the maximum
  • Use the MAX and MIN functions in Excel to calculate this for our data
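
As a quick illustration outside Excel, the same maximum-minus-minimum calculation in Python, using hypothetical heights rather than the class data:

```python
# Hypothetical heights in inches (assumed data).
heights = [60, 62, 63, 65, 66, 66, 67, 68, 70, 72, 75]

# Range = maximum value minus minimum value.
tallest = max(heights)
shortest = min(heights)
height_range = tallest - shortest

print("Range of heights:", height_range, "inches")
```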
103 / 407

Introduction to Statistical Analysis

Quartiles

104 / 407
  • Facilitator introduces concept by discussing percentiles
  • Prompts participants with the scenario: "Your child comes home and says they scored in the 98th percentile on the SAT. What does that mean?"
  • The answer is that they scored at or above 98% of the students who took the SAT
  • This introduces the idea behind quartiles: breaking the data into 4 equal portions

Introduction to Statistical Analysis

Quartiles

img-center-45 Image credit Ark0n CC BY-SA 3.0

105 / 407
  • Facilitator introduces concept by discussing percentiles
  • Prompts participants with the scenario: "Your child comes home and says they scored in the 98th percentile on the SAT. What does that mean?"
  • The answer is that they scored at or above 98% of the students who took the SAT
  • This introduces the idea behind quartiles: breaking the data into 4 equal portions

Introduction to Statistical Analysis

Quartiles

img-center-45 Image credit Ark0n CC BY-SA 3.0

  • Quartiles split the data into four equal groups
106 / 407
  • Facilitator introduces concept by discussing percentiles
  • Prompts participants with the scenario: "Your child comes home and says they scored in the 98th percentile on the SAT. What does that mean?"
  • The answer is that they scored at or above 98% of the students who took the SAT
  • This introduces the idea behind quartiles: breaking the data into 4 equal portions

Introduction to Statistical Analysis

Quartiles

img-center-45 Image credit Ark0n CC BY-SA 3.0

  • Quartiles split the data into four equal groups
  • First quartile is 0-25% of the data
107 / 407
  • Facilitator introduces concept by discussing percentiles
  • Prompts participants with the scenario: "Your child comes home and says they scored in the 98th percentile on the SAT. What does that mean?"
  • The answer is that they scored at or above 98% of the students who took the SAT
  • This introduces the idea behind quartiles: breaking the data into 4 equal portions

Introduction to Statistical Analysis

Quartiles

img-center-45 Image credit Ark0n CC BY-SA 3.0

  • Quartiles split the data into four equal groups
  • First quartile is 0-25% of the data
  • Second quartile is 25-50% of the data
108 / 407
  • Facilitator introduces concept by discussing percentiles
  • Prompts participants with the scenario: "Your child comes home and says they scored in the 98th percentile on the SAT. What does that mean?"
  • The answer is that they scored at or above 98% of the students who took the SAT
  • This introduces the idea behind quartiles: breaking the data into 4 equal portions

Introduction to Statistical Analysis

Quartiles

img-center-45 Image credit Ark0n CC BY-SA 3.0

  • Quartiles split the data into four equal groups
  • First quartile is 0-25% of the data
  • Second quartile is 25-50% of the data
  • Third quartile is 50-75% of the data
109 / 407
  • Facilitator introduces concept by discussing percentiles
  • Prompts participants with the scenario: "Your child comes home and says they scored in the 98th percentile on the SAT. What does that mean?"
  • The answer is that they scored at or above 98% of the students who took the SAT
  • This introduces the idea behind quartiles: breaking the data into 4 equal portions

Introduction to Statistical Analysis

Quartiles

img-center-45 Image credit Ark0n CC BY-SA 3.0

  • Quartiles split the data into four equal groups
  • First quartile is 0-25% of the data
  • Second quartile is 25-50% of the data
  • Third quartile is 50-75% of the data
  • Fourth quartile is 75-100% of the data
110 / 407
  • Facilitator introduces concept by discussing percentiles
  • Prompts participants with the scenario: "Your child comes home and says they scored in the 98th percentile on the SAT. What does that mean?"
  • The answer is that they scored at or above 98% of the students who took the SAT
  • This introduces the idea behind quartiles: breaking the data into 4 equal portions

Introduction to Statistical Analysis

Quartiles

img-center-45 Image credit Ark0n CC BY-SA 3.0

  • Quartiles split the data into four equal groups
  • First quartile is 0-25% of the data
  • Second quartile is 25-50% of the data
  • Third quartile is 50-75% of the data
  • Fourth quartile is 75-100% of the data
  • Use the QUARTILE function in Excel to calculate this
111 / 407
  • Facilitator introduces concept by discussing percentiles
  • Prompts participants with the scenario: "Your child comes home and says they scored in the 98th percentile on the SAT. What does that mean?"
  • The answer is that they scored at or above 98% of the students who took the SAT
  • This introduces the idea behind quartiles: breaking the data into 4 equal portions
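
A minimal Python sketch of finding the quartile cut points, on a made-up list of nine values. The `method="inclusive"` option is an assumption chosen because it should correspond to the interpolation used by Excel's QUARTILE function; other tools may use slightly different rules and give slightly different cut points.

```python
from statistics import quantiles

# Hypothetical data set of nine values (assumed for illustration).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 asks for the three cut points that split the data into four groups.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

print("1st quartile (25%):", q1)
print("2nd quartile (50%, the median):", q2)
print("3rd quartile (75%):", q3)
```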

Introduction to Statistical Analysis

Interquartile Range

112 / 407

Introduction to Statistical Analysis

Interquartile Range

  • “Middle” 50% of data (between 1st Quartile and 3rd Quartile)
113 / 407

Introduction to Statistical Analysis

Interquartile Range

  • “Middle” 50% of data (between 1st Quartile and 3rd Quartile)

img-center-80

Image source

114 / 407

Introduction to Statistical Analysis

Outliers

img-right-30

115 / 407

Introduction to Statistical Analysis

Outliers

img-right-30

  • Any data points more than 1.5x the IQR below the 1st Quartile or more than 1.5x the IQR above the 3rd Quartile are considered outliers
116 / 407

Introduction to Statistical Analysis

Outliers

img-right-30

  • Any data points more than 1.5x the IQR below the 1st Quartile or more than 1.5x the IQR above the 3rd Quartile are considered outliers
  • Helps identify data points that may skew the analysis
117 / 407

Introduction to Statistical Analysis

Outliers

img-right-30

  • Any data points more than 1.5x the IQR below the 1st Quartile or more than 1.5x the IQR above the 3rd Quartile are considered outliers
  • Helps identify data points that may skew the analysis
  • Focus on the “meat” of the data
118 / 407

Introduction to Statistical Analysis

Outliers

img-right-30

  • Any data points more than 1.5x the IQR below the 1st Quartile or more than 1.5x the IQR above the 3rd Quartile are considered outliers
  • Helps identify data points that may skew the analysis
  • Focus on the “meat” of the data

 

Image source FlowingData.com

119 / 407

Introduction to Statistical Analysis

Do We Have Any Outliers in Our Data?

120 / 407
  • First calculate the Upper Limit -> participants will usually calculate the formula as =1.5 * IQR + Q3
  • In calculating the Lower Limit -> participants will use the same order to get =1.5 * IQR - Q1, which leads to an incorrect result; the correct formula is =Q1 - 1.5 * IQR. Make sure to point this out to them
  • Reflect on any outliers with the class
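
The limit calculations can be sketched in Python as follows, with invented years-of-service values (not the class data) and the inclusive quartile method as an assumption. Note the order of the terms in the lower limit: the 1.5 x IQR margin is subtracted from Q1, not the other way around.

```python
from statistics import quantiles

# Hypothetical years of service with one unusually large value (assumed data).
years = [1, 2, 3, 4, 5, 6, 7, 8, 30]

q1, _, q3 = quantiles(years, n=4, method="inclusive")
iqr = q3 - q1                      # interquartile range: the middle 50%

upper_limit = q3 + 1.5 * iqr       # same as 1.5 * IQR + Q3
lower_limit = q1 - 1.5 * iqr       # NOT 1.5 * IQR - Q1

outliers = [y for y in years if y < lower_limit or y > upper_limit]

print("IQR:", iqr)
print("Lower limit:", lower_limit, "| Upper limit:", upper_limit)
print("Outliers:", outliers)
```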

Introduction to Statistical Analysis

Standard Deviation

121 / 407

Introduction to Statistical Analysis

Standard Deviation

  • A measure of the typical distance of each data point from the mean (the square root of the average squared distance)
122 / 407

Introduction to Statistical Analysis

Standard Deviation

  • A measure of the typical distance of each data point from the mean (the square root of the average squared distance)

img-center-40
123 / 407

Introduction to Statistical Analysis

Standard Deviation

  • A measure of the typical distance of each data point from the mean (the square root of the average squared distance)

img-center-40

  • The larger the standard deviation, the greater the spread
124 / 407

Introduction to Statistical Analysis

Standard Deviation

  • A measure of the typical distance of each data point from the mean (the square root of the average squared distance)

img-center-40

  • The larger the standard deviation, the greater the spread

img-center-100
125 / 407

Introduction to Statistical Analysis

Standard Deviation

img-center-90
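
To make the calculation concrete, here is a short Python sketch on a made-up data set. It computes the standard deviation by hand (square root of the average squared distance from the mean) and compares it with the library functions. The distinction between the population version (`pstdev`) and the sample version (`stdev`) is added here for illustration and is not covered on the slides.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

# Hypothetical data set (assumed for illustration).
data = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(data)

# Square each distance from the mean, average them, then take the square root.
squared_distances = [(x - center) ** 2 for x in data]
manual_sd = sqrt(sum(squared_distances) / len(data))

# The plain average of the absolute distances is a different, smaller number.
average_distance = sum(abs(x - center) for x in data) / len(data)

population_sd = pstdev(data)   # divides by n
sample_sd = stdev(data)        # divides by n - 1, so it is slightly larger

print("Manual standard deviation:", manual_sd)
print("Average absolute distance:", average_distance)
print("Population SD:", population_sd, "| Sample SD:", round(sample_sd, 3))
```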

126 / 407

Introduction to Statistical Analysis

Measures of Variability

127 / 407

Introduction to Statistical Analysis

Measures of Variability

  • Describe the distribution of our data
128 / 407

Introduction to Statistical Analysis

Measures of Variability

  • Describe the distribution of our data
  • Range (Maximum – Minimum)
129 / 407

Introduction to Statistical Analysis

Measures of Variability

  • Describe the distribution of our data
  • Range (Maximum – Minimum)
  • Inter-quartile Range
130 / 407

Introduction to Statistical Analysis

Measures of Variability

  • Describe the distribution of our data
  • Range (Maximum – Minimum)
  • Inter-quartile Range
  • Standard Deviation
131 / 407

Introduction to Statistical Analysis

Measures of Variability

  • Describe the distribution of our data
  • Range (Maximum – Minimum)
  • Inter-quartile Range
  • Standard Deviation
  • Identification of outliers (1.5 x IQR)
132 / 407

Introduction to Statistical Analysis

Descriptive Statistics

133 / 407

Introduction to Statistical Analysis

Descriptive Statistics

  • Quantitatively describe the main features of a dataset
134 / 407

Introduction to Statistical Analysis

Descriptive Statistics

  • Quantitatively describe the main features of a dataset
  • Help distinguish distributions and make them comparable
135 / 407

Introduction to Statistical Analysis

Descriptive Statistics

  • Quantitatively describe the main features of a dataset
  • Help distinguish distributions and make them comparable
  • 5 number summary
136 / 407

Introduction to Statistical Analysis

Descriptive Statistics

  • Quantitatively describe the main features of a dataset
  • Help distinguish distributions and make them comparable
  • 5 number summary
        - Minimum
137 / 407

Introduction to Statistical Analysis

Descriptive Statistics

  • Quantitatively describe the main features of a dataset
  • Help distinguish distributions and make them comparable
  • 5 number summary
        - Minimum
        - 1st Quartile
138 / 407

Introduction to Statistical Analysis

Descriptive Statistics

  • Quantitatively describe the main features of a dataset
  • Help distinguish distributions and make them comparable
  • 5 number summary
        - Minimum
        - 1st Quartile
        - Median
139 / 407

Introduction to Statistical Analysis

Descriptive Statistics

  • Quantitatively describe the main features of a dataset
  • Help distinguish distributions and make them comparable
  • 5 number summary
        - Minimum
        - 1st Quartile
        - Median
        - 3rd Quartile
140 / 407

Introduction to Statistical Analysis

Descriptive Statistics

  • Quantitatively describe the main features of a dataset
  • Help distinguish distributions and make them comparable
  • 5 number summary
        - Minimum
        - 1st Quartile
        - Median
        - 3rd Quartile
        - Maximum
141 / 407
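The five number summary listed above can be computed in a few lines. This is a sketch under assumptions: the values are invented, and the "inclusive" quartile method is one of several conventions, so results may differ slightly from other tools.

```python
import statistics

def five_number_summary(values):
    """Minimum, 1st quartile, median, 3rd quartile, maximum."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

# Hypothetical values for illustration only.
summary = five_number_summary([3, 7, 8, 5, 12, 14, 21, 13, 18])
```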

Introduction to Statistical Analysis

Exploratory Data Analysis

142 / 407

Introduction to Statistical Analysis

Exploratory Data Analysis

  • Goal -> Discover patterns in the data
143 / 407

Introduction to Statistical Analysis

Exploratory Data Analysis

  • Goal -> Discover patterns in the data
  • Understand the context
144 / 407

Introduction to Statistical Analysis

Exploratory Data Analysis

  • Goal -> Discover patterns in the data
  • Understand the context
  • Summarize fields
145 / 407

Introduction to Statistical Analysis

Exploratory Data Analysis

  • Goal -> Discover patterns in the data
  • Understand the context
  • Summarize fields
  • Use graphical representations of the data
146 / 407

Introduction to Statistical Analysis

Exploratory Data Analysis

  • Goal -> Discover patterns in the data
  • Understand the context
  • Summarize fields
  • Use graphical representations of the data
  • Explore outliers
147 / 407

Introduction to Statistical Analysis

Exploratory Data Analysis

  • Goal -> Discover patterns in the data
  • Understand the context
  • Summarize fields
  • Use graphical representations of the data
  • Explore outliers

Tukey, J.W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.

148 / 407

Introduction to Statistical Analysis

15 MIN BREAK

img-center-100

Source: https://xkcd.com/539/

149 / 407

Introduction to Statistical Analysis

Let's Try That Again

Vehicle Collisions in NYC

150 / 407

Introduction to Statistical Analysis

But before we get too deep into data...

151 / 407
  • Facilitator demonstrates the concepts of a pivot table by doing a human pivot table
  • Facilitator asks participants to move to the front of the room
  • Facilitator leads them through exercises to show sort, filter, and aggregate

Introduction to Statistical Analysis

img-center-100

152 / 407

Introduction to Statistical Analysis

PivotTables

153 / 407
  • Facilitator introduces concept and usage of pivottables to the class
  • Best to introduce by having someone describe how they use a pivottable in their work (connects to the day-to-day concretely)

Introduction to Statistical Analysis

PivotTables

  • A data summarization tool
154 / 407
  • Facilitator introduces concept and usage of pivottables to the class
  • Best to introduce by having someone describe how they use a pivottable in their work (connects to the day-to-day concretely)

Introduction to Statistical Analysis

PivotTables

  • A data summarization tool
  • Useful to quickly understand data
155 / 407
  • Facilitator introduces concept and usage of pivottables to the class
  • Best to introduce by having someone describe how they use a pivottable in their work (connects to the day-to-day concretely)

Introduction to Statistical Analysis

PivotTables

  • A data summarization tool
  • Useful to quickly understand data
  • Can use to graph data totals
156 / 407
  • Facilitator introduces concept and usage of pivottables to the class
  • Best to introduce by having someone describe how they use a pivottable in their work (connects to the day-to-day concretely)

Introduction to Statistical Analysis

PivotTables

  • A data summarization tool
  • Useful to quickly understand data
  • Can use to graph data totals

img-center-100

157 / 407
  • Facilitator introduces concept and usage of pivottables to the class
  • Best to introduce by having someone describe how they use a pivottable in their work (connects to the day-to-day concretely)

Introduction to Statistical Analysis

Creating a PivotTable

img-center-60

158 / 407
  • Facilitator describes the steps to creating a pivottable in Excel
  • Please don't model selecting all data before selecting "Insert Pivottable" -> creates (blank) field in pivottable
  • Just let Excel select all of the data by itself

Introduction to Statistical Analysis

Creating a PivotTable

img-center-60

  • Should default to all your data unless you have any cells selected
159 / 407
  • Facilitator describes the steps to creating a pivottable in Excel
  • Please don't model selecting all data before selecting "Insert Pivottable" -> creates (blank) field in pivottable
  • Just let Excel select all of the data by itself

  • Remind participants not to have any data selected when inserting a pivottable. When this comes up in class (because someone did it), use it as a teachable moment

Introduction to Statistical Analysis

Creating a PivotTable

img-center-60

  • Should default to all your data unless you have any cells selected
  • Should default to a new worksheet
160 / 407
  • Facilitator describes the steps to creating a pivottable in Excel
  • Please don't model selecting all data before selecting "Insert Pivottable" -> creates (blank) field in pivottable
  • Just let Excel select all of the data by itself

  • Remind participants not to have any data selected when inserting a pivottable. When this comes up in class (because someone did it), use it as a teachable moment

Introduction to Statistical Analysis

Creating a PivotTable

img-right-30

Drag and drop fields to visualize

161 / 407

Introduction to Statistical Analysis

Creating a PivotTable

img-right-30

Drag and drop fields to visualize

  • Row labels
162 / 407

Introduction to Statistical Analysis

Creating a PivotTable

img-right-30

Drag and drop fields to visualize

  • Row labels
  • Values
163 / 407

Introduction to Statistical Analysis

Creating a PivotTable

img-right-30

Drag and drop fields to visualize

  • Row labels
  • Values
  • Filter
164 / 407

Introduction to Statistical Analysis

Creating a PivotTable

img-right-30

Drag and drop fields to visualize

  • Row labels
  • Values
  • Filter
  • Column Labels
165 / 407
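For readers who think in code, the Row Labels and Values areas roughly correspond to "group by a field, then aggregate another field". The sketch below uses only the Python standard library. The rows and the `borough` and `injured` column names are hypothetical and are not taken from the collision dataset.

```python
from collections import defaultdict

# Hypothetical rows; the column names are illustrative.
collisions = [
    {"borough": "BROOKLYN", "injured": 1},
    {"borough": "QUEENS", "injured": 0},
    {"borough": "BROOKLYN", "injured": 2},
    {"borough": "BRONX", "injured": 1},
    {"borough": "QUEENS", "injured": 3},
]

def pivot(rows, row_label, value):
    """Group rows by `row_label`, then count rows and sum `value` per group."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        counts[row[row_label]] += 1
        totals[row[row_label]] += row[value]
    # Sort the row labels, as a PivotTable does by default.
    return {key: {"count": counts[key], "sum": totals[key]} for key in sorted(counts)}

by_borough = pivot(collisions, "borough", "injured")
```

A Filter would correspond to dropping rows before calling `pivot`, and Column Labels to grouping by a second field.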

Introduction to Statistical Analysis

Creating a PivotTable of Dates

img-center-80

166 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-center-100

167 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-center-100

168 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-center-100

169 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-center-100

170 / 407

Introduction to Statistical Analysis

Questions of the Data

171 / 407

Introduction to Statistical Analysis

Questions of the Data

  • What is the mean number of accidents per day?
172 / 407

Introduction to Statistical Analysis

Questions of the Data

  • What is the mean number of accidents per day?
  • Is mean or median the best way to describe this data?
173 / 407

Introduction to Statistical Analysis

Questions of the Data

  • What is the mean number of accidents per day?
  • Is mean or median the best way to describe this data?
  • Are there any outliers in this data?
174 / 407

Introduction to Statistical Analysis

Wrap-Up

175 / 407
  • Facilitator reviews the concepts introduced in the morning and ensures all questions are answered

Introduction to Statistical Analysis

Wrap-Up

  • Reviewed basic descriptive statistics
176 / 407
  • Facilitator reviews the concepts introduced in the morning and ensures all questions are answered

Introduction to Statistical Analysis

Wrap-Up

  • Reviewed basic descriptive statistics
  • Calculated basic descriptive statistics in Excel
177 / 407
  • Facilitator reviews the concepts introduced in the morning and ensures all questions are answered

Introduction to Statistical Analysis

Wrap-Up

  • Reviewed basic descriptive statistics
  • Calculated basic descriptive statistics in Excel
  • Discussed histograms
178 / 407
  • Facilitator reviews the concepts introduced in the morning and ensures all questions are answered

Introduction to Statistical Analysis

Wrap-Up

  • Reviewed basic descriptive statistics
  • Calculated basic descriptive statistics in Excel
  • Discussed histograms
  • Created histograms in Excel
179 / 407
  • Facilitator reviews the concepts introduced in the morning and ensures all questions are answered

Introduction to Statistical Analysis

Wrap-Up

  • Reviewed basic descriptive statistics
  • Calculated basic descriptive statistics in Excel
  • Discussed histograms
  • Created histograms in Excel
  • Analyzed NYC motor vehicle collision data
180 / 407
  • Facilitator reviews the concepts introduced in the morning and ensures all questions are answered

Introduction to Statistical Analysis

Lunch

181 / 407

Introduction to Statistical Analysis

Welcome Back!

182 / 407

Introduction to Statistical Analysis

Let's Get Back to the Data

183 / 407
  • Facilitator allows participants to apply learning in a release exercise

Introduction to Statistical Analysis

Preparing the Data - Insert Column

img-center-80

184 / 407

Introduction to Statistical Analysis

Preparing the Data - Calculate Days Open

img-center-60

185 / 407

Introduction to Statistical Analysis

Preparing the Data - Format Result

img-center-100

186 / 407

Introduction to Statistical Analysis

Preparing the Data - Calculate Hours Open

img-center-80

187 / 407
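The screenshots for these steps are not reproduced here, so the following is only a sketch of the idea. Excel stores dates as numbers of days, so subtracting the created date from the closed date gives days open, and multiplying by 24 gives hours open. The timestamp format and the example values below are assumptions.

```python
from datetime import datetime

def hours_open(created, closed, fmt="%m/%d/%Y %H:%M"):
    """Hours between two timestamps given as text in the assumed format."""
    delta = datetime.strptime(closed, fmt) - datetime.strptime(created, fmt)
    return delta.total_seconds() / 3600

# Hypothetical service request opened late at night and closed after midnight.
example_hours = hours_open("01/15/2016 22:30", "01/16/2016 01:00")
example_days = example_hours / 24
```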

Introduction to Statistical Analysis

Now Let's Calculate the Descriptive Statistics

188 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-center-100

189 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-center-70

190 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-right-45

191 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-right-45

  • What's missing?
192 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-right-45

  • What's missing?
  • Q1
193 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-right-45

  • What's missing?
  • Q1
  • Q3
194 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-right-45

  • What's missing?
  • Q1
  • Q3
  • IQR
195 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-right-45

  • What's missing?
  • Q1
  • Q3
  • IQR
  • Upper Bound
196 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-right-45

  • What's missing?
  • Q1
  • Q3
  • IQR
  • Upper Bound
  • Lower Bound
197 / 407

Introduction to Statistical Analysis

Calculating Descriptive Statistics

img-right-45

  • What's missing?
  • Q1
  • Q3
  • IQR
  • Upper Bound
  • Lower Bound

Calculate those now
(trust us, it'll be useful)

198 / 407

Introduction to Statistical Analysis

Creating a Histogram

img-center-65

199 / 407

Introduction to Statistical Analysis

Creating a Histogram

img-center-100

200 / 407

Introduction to Statistical Analysis

Creating a Histogram - Bin Size 100

img-center-100

201 / 407

Introduction to Statistical Analysis

Creating a Histogram - Bin Size 100

img-center-100

202 / 407

Introduction to Statistical Analysis

Creating a Histogram - Bin Size 50

img-center-80

203 / 407

Introduction to Statistical Analysis

Creating a Histogram - Bin Size 50

img-center-90

204 / 407

Introduction to Statistical Analysis

Creating a Histogram - Formatting Histogram

img-center-100

205 / 407

Introduction to Statistical Analysis

Creating a Histogram

img-center-85

206 / 407

Introduction to Statistical Analysis

Do these tell a true and compelling story?

 

207 / 407

Introduction to Statistical Analysis

Do these tell a true and compelling story?

What do we do about that?

208 / 407

Introduction to Statistical Analysis

Things to Think About

209 / 407

Introduction to Statistical Analysis

Things to Think About

  • Do we need to display all of the data?
210 / 407

Introduction to Statistical Analysis

Things to Think About

  • Do we need to display all of the data?
  • What data do we keep?
211 / 407

Introduction to Statistical Analysis

Things to Think About

  • Do we need to display all of the data?
  • What data do we keep?
  • How do we determine what to show?
212 / 407

Introduction to Statistical Analysis

Things to Think About

  • Do we need to display all of the data?
  • What data do we keep?
  • How do we determine what to show?
  • How can we be clear about what we're not showing?
213 / 407

Introduction to Statistical Analysis

Removing Outliers (Using 1.5 x IQR)

img-center-95

214 / 407

Introduction to Statistical Analysis

Removing Outliers (Using 1.5 x IQR)

215 / 407

Introduction to Statistical Analysis

Removing Outliers (Using 1.5 x IQR)

img-right-25

  • Creating 10 equal bins (upper bound of 9.275 divided by 10)
216 / 407

Introduction to Statistical Analysis

Removing Outliers (Using 1.5 x IQR)

img-right-25

  • Creating 10 equal bins (upper bound of 9.275 divided by 10)
  • Alternative strategy for determining bins
217 / 407

Introduction to Statistical Analysis

Removing Outliers (Using 1.5 x IQR)

218 / 407

Introduction to Statistical Analysis

Removing Outliers (Using 1.5 x IQR)

img-right-40

  • Only 3,659 service requests greater than 9.275 hours (upper bound)
219 / 407

Introduction to Statistical Analysis

Removing Outliers (Using 1.5 x IQR)

img-right-40

  • Only 3,659 service requests greater than 9.275 hours (upper bound)
  • Represents less than 10% (~9.73%) of 37,615 total service requests
220 / 407
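The arithmetic behind these figures can be checked quickly. The three numbers come from the slides; the variable names are illustrative, and the bin edges assume bins that start at zero and end at the upper bound.

```python
upper_bound_hours = 9.275   # upper bound from the slide
total_requests = 37_615     # total service requests from the slide
above_upper_bound = 3_659   # requests above the upper bound from the slide

# Share of requests treated as outliers on the high side.
share_above = above_upper_bound / total_requests

# Ten equal-width bins between 0 and the upper bound.
bin_width = upper_bound_hours / 10
bin_edges = [round(i * bin_width, 4) for i in range(11)]
```

`share_above` works out to roughly 0.0973, which is the ~9.73% quoted on the slide.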

Introduction to Statistical Analysis

What Do We Know?

221 / 407

Introduction to Statistical Analysis

What Do We Know?

  • The median time a noise complaint is open is 2 hours
222 / 407

Introduction to Statistical Analysis

What Do We Know?

  • The median time a noise complaint is open is 2 hours
  • 50% of the noise complaints are closed between 1-4 hours (median is 2 hours, IQR is 3 hours)
223 / 407

Introduction to Statistical Analysis

What Do We Know?

  • The median time a noise complaint is open is 2 hours
  • 50% of the noise complaints are closed between 1-4 hours (median is 2 hours, IQR is 3 hours)
  • There is a long tail of complaints that take longer to close (range of 1158 hours, standard deviation of 36 hours)
224 / 407

Introduction to Statistical Analysis

15 Min Break

img-center-70 Source

225 / 407

Introduction to Statistical Analysis

Correlations

226 / 407

Introduction to Statistical Analysis

Correlations

  • Values tend to have a relationship
227 / 407

Introduction to Statistical Analysis

Correlations

  • Values tend to have a relationship
  • That relationship can be of several types
228 / 407

Introduction to Statistical Analysis

Correlations

  • Values tend to have a relationship
  • That relationship can be of several types
    - Proportional (increase in one increases the other)
229 / 407

Introduction to Statistical Analysis

Correlations

  • Values tend to have a relationship
  • That relationship can be of several types
    - Proportional (increase in one increases the other)
    - Inversely proportional (increase in one decreases the other)
230 / 407

Introduction to Statistical Analysis

Correlations

  • Values tend to have a relationship
  • That relationship can be of several types
    - Proportional (increase in one increases the other)
    - Inversely proportional (increase in one decreases the other)
  • Example -> Height and weight
231 / 407

Introduction to Statistical Analysis

Correlations - Height and Weight

img-center-90

How do we measure this relationship?

232 / 407

Introduction to Statistical Analysis

Coefficient of Correlation

233 / 407
  • Facilitator provides examples of r values with their corresponding visualizations

Introduction to Statistical Analysis

Coefficient of Correlation

  • Quantifies the amount of shared variability between variables
234 / 407
  • Facilitator provides examples of r values with their corresponding visualizations

Introduction to Statistical Analysis

Coefficient of Correlation

  • Quantifies the amount of shared variability between variables
  • Ranges between -1 and +1
235 / 407
  • Facilitator provides examples of r values with their corresponding visualizations

Introduction to Statistical Analysis

Coefficient of Correlation

  • Quantifies the amount of shared variability between variables
  • Ranges between -1 and +1
    - Negative numbers are inversely proportional
236 / 407
  • Facilitator provides examples of r values with their corresponding visualizations

Introduction to Statistical Analysis

Coefficient of Correlation

  • Quantifies the amount of shared variability between variables
  • Ranges between -1 and +1
    - Negative numbers are inversely proportional
    - Positive numbers are directly proportional
237 / 407
  • Facilitator provides examples of r values with their corresponding visualizations

Introduction to Statistical Analysis

Coefficient of Correlation

  • Quantifies the amount of shared variability between variables
  • Ranges between -1 and +1
    - Negative numbers are inversely proportional
    - Positive numbers are directly proportional
    - The closer to either -1 or +1, the greater the correlation
238 / 407
  • Facilitator provides examples of r values with their corresponding visualizations
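As a rough sketch of what the coefficient measures, Pearson's r can be written out by hand. The heights and weights below are invented for illustration and are not the class dataset. Excel's CORREL function computes the same quantity.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # How the two variables move together around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)

# Hypothetical heights (inches) and weights (pounds).
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 140, 155, 165, 180]
r = pearson_r(heights_in, weights_lb)

# A perfectly inverse relationship gives r = -1.
r_inverse = pearson_r([1, 2, 3], [6, 4, 2])
```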

Introduction to Statistical Analysis

Coefficient of Correlation

img-center-100 http://www.statisticshowto.com/what-is-the-correlation-coefficient-formula/

239 / 407

Introduction to Statistical Analysis

Coefficient of Correlation

img-center-85

http://pixshark.com/correlation-examples.htm

240 / 407

Introduction to Statistical Analysis

img-center-100

241 / 407
  • Facilitator provides example of situation where correlation is high because the variables aren't independent (flight departure delay and flight arrival delay)

Introduction to Statistical Analysis

Correlations - Height and Weight

img-center-90

Download the data so we can check this out

242 / 407

Introduction to Statistical Analysis

We can use the formula...

img-center-90

243 / 407

Introduction to Statistical Analysis

...Or we could use Excel

img-center-90

244 / 407

Introduction to Statistical Analysis

Let's Try That Again

245 / 407
  • Recycling Rate is based on the measurements DSNY takes of its trucks entering and leaving the dump facility (called "tipping") -> A truck is weighed on the way in to tip and on the way out. The difference is the amount tipped
  • The recycling rate is the weight of recyclables / the total weight of collected refuse
  • Sanitation districts align with community districts -> trucks collect only in that CD and the CD is logged by the truck

Introduction to Statistical Analysis

Correlations - Recycling and Median Income

img-center-95

These are even slightly more correlated (r = 0.88478). Check it yourself

http://iquantny.tumblr.com/post/79846201258/the-huge-correlation-between-median-income-and

246 / 407

Introduction to Statistical Analysis

Correlations - Recycling and Median Income

img-center-100

247 / 407

Introduction to Statistical Analysis

Coefficient of Determination

248 / 407

Introduction to Statistical Analysis

Coefficient of Determination

  • The percentage of variance in one variable shared with the other
249 / 407

Introduction to Statistical Analysis

Coefficient of Determination

  • The percentage of variance in one variable shared with the other
  • More shared variability implies a stronger relationship
250 / 407

Introduction to Statistical Analysis

Coefficient of Determination

  • The percentage of variance in one variable shared with the other
  • More shared variability implies a stronger relationship
  • Calculate by squaring the correlation coefficient
251 / 407

Introduction to Statistical Analysis

Coefficient of Determination

  • The percentage of variance in one variable shared with the other
  • More shared variability implies a stronger relationship
  • Calculate by squaring the correlation coefficient
    - Ex. The coefficient of determination for median income vs recycling rates is 78%
252 / 407
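The 78% figure follows directly from squaring the correlation coefficient reported earlier for median income and recycling rate (r = 0.88478). A quick check, with illustrative variable names:

```python
r = 0.88478                 # correlation coefficient from the earlier slide
r_squared = r ** 2          # coefficient of determination
percent_shared = round(r_squared * 100)
```

Here `r_squared` is about 0.783, so roughly 78% of the variance is shared between the two variables.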

Introduction to Statistical Analysis

However...

253 / 407

Introduction to Statistical Analysis

Correlations

img-center-70

Source

254 / 407
  • Facilitator provides example of spurious correlation to prompt a discussion about the difference between correlation and causation
  • Great question to ask: "Do you think the consumption of margarine has anything to do with the divorce rate in Maine?" -> Important to point out the R value is very high
  • Lots of opportunity for coincidental correlations to occur and we should be aware of that

Introduction to Statistical Analysis

Correlations

img-center-100

255 / 407
  • Facilitator discusses the first case (X->Y) as the ideal, but then introduces the other situations that are more likely in the real world
  • A good example of this is the correlation between ice cream sales and homicides

Introduction to Statistical Analysis

Correlations

img-center-100

Correlation does not imply causation

256 / 407
  • Facilitator discusses the first case (X->Y) as the ideal, but then introduces the other situations that are more likely in the real world
  • A good example of this is the correlation between ice cream sales and homicides
  • Describe the Randomized Controlled Trial and prompt participants to reflect on whether they have enough control over their data to make causal inferences

Introduction to Statistical Analysis

Predictive Modeling

257 / 407
  • After establishing that we can't draw causal inferences, we can use the correlations to make predictions about data we don't know based on what we do know
  • Facilitator emphasizes that models are only as good as the data and understanding you have of what the data says

Introduction to Statistical Analysis

Prediction

258 / 407

Introduction to Statistical Analysis

Prediction

  • Knowing the relationship between variables (i.e. the correlation), we can predict values based on the relationship
259 / 407

Introduction to Statistical Analysis

Prediction

  • Knowing the relationship between variables (i.e. the correlation), we can predict values based on the relationship
  • Can estimate the magnitude as well as the general trend
260 / 407

Introduction to Statistical Analysis

Prediction

  • Knowing the relationship between variables (i.e. the correlation), we can predict values based on the relationship
  • Can estimate the magnitude as well as the general trend
  • The more data points, the better the prediction
261 / 407

Introduction to Statistical Analysis

Prediction

  • Knowing the relationship between variables (i.e. the correlation), we can predict values based on the relationship
  • Can estimate the magnitude as well as the general trend
  • The more data points, the better the prediction
  • Example -> Knowing the relationship between median income and recycling rates, what can we predict about recycling rates as median incomes grow in communities?
262 / 407

Introduction to Statistical Analysis

Linear Regression

263 / 407

Introduction to Statistical Analysis

Linear Regression

  • Using the known relationship between continuous variables, we can predict unseen values
264 / 407

Introduction to Statistical Analysis

Linear Regression

  • Using the known relationship between continuous variables, we can predict unseen values
  • Assumes relationship is linear
265 / 407
  • Great example of this is the work MODA did in support of the initial roll-out of Pre-K for All when the city needed to find space quickly for centers
  • MODA compared the utility usage with the square footage of city buildings -> utility usage and occupancy is a linear relationship, with each person added to an area consuming a roughly even amount of utilities
  • Outliers with high square footage and low utility usage were likely under-utilized and could potentially be made available

Introduction to Statistical Analysis

Formula for a line

img-center-80

http://www.algebra-class.com/slope-formula.html
266 / 407

Introduction to Statistical Analysis

Formula for a line

img-center-75

http://www.mathwarehouse.com/algebra/linear_equation/slope-of-a-line.php
267 / 407

Introduction to Statistical Analysis

Formula for a line

268 / 407

Introduction to Statistical Analysis

Formula for a line

  • Draw a line that minimizes the distance between the line and each point
269 / 407

Introduction to Statistical Analysis

Formula for a line

  • Draw a line that minimizes the distance between the line and each point
  • “Line of best fit” -> minimizes the sum of squared residuals
270 / 407

Introduction to Statistical Analysis

Formula for a line

  • Draw a line that minimizes the distance between the line and each point
  • “Line of best fit” -> minimizes the sum of squared residuals

img-center-85

http://nbviewer.ipython.org/github/justmarkham/DAT4/blob/master/notebooks/08_linear_regression.ipynb
271 / 407

Introduction to Statistical Analysis

Linear Regression

272 / 407

Introduction to Statistical Analysis

Linear Regression

  • Characteristics of the line define the relationship
273 / 407

Introduction to Statistical Analysis

Linear Regression

  • Characteristics of the line define the relationship
  • Slope -> relationship between independent and dependent variable (how Y increases per unit of X)
274 / 407

Introduction to Statistical Analysis

Linear Regression

  • Characteristics of the line define the relationship
  • Slope -> relationship between independent and dependent variable (how Y increases per unit of X)
  • Intercept -> expected mean value of Y at X=0
275 / 407

Introduction to Statistical Analysis

Linear Regression

  • Characteristics of the line define the relationship
  • Slope -> relationship between independent and dependent variable (how Y increases per unit of X)
  • Intercept -> expected mean value of Y at X=0
  • Values along the line are the predicted values for any given value X
276 / 407
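The slope and intercept of the line of best fit have a closed-form solution, sketched below. The five data points are invented for illustration. Excel's SLOPE and INTERCEPT functions return the same two numbers for a single predictor.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Total squared vertical distance between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)

# Any other line, such as one with a slightly steeper slope, fits worse.
best = sum_squared_residuals(xs, ys, slope, intercept)
nudged = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
```

Comparing `best` with `nudged` shows what "minimizes the sum of squared residuals" means in practice: changing the fitted slope only increases the total.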

Introduction to Statistical Analysis

Displaying a Trendline in Excel

img-left-40

277 / 407

Introduction to Statistical Analysis

Displaying a Trendline in Excel

img-left-40 img-right-50

278 / 407

Introduction to Statistical Analysis

Calculating coefficients in Excel

img-center-100

279 / 407

Introduction to Statistical Analysis

Calculating coefficients in Excel

img-center-100

280 / 407

Introduction to Statistical Analysis

Calculating coefficients in Excel

img-center-100

281 / 407

Introduction to Statistical Analysis

Linear Regression Line

img-center-100

282 / 407

Introduction to Statistical Analysis

Linear Regression Line

img-center-100

RecyclingRate = 0.000001869 * MedianIncome + 0.07480414

283 / 407

Introduction to Statistical Analysis

What is this telling us?

RecyclingRate = 0.000001869 * MedianIncome + 0.07480414

284 / 407

Introduction to Statistical Analysis

What is this telling us?

RecyclingRate = 0.000001869 * MedianIncome + 0.07480414

  • Every dollar increase in median income is associated with an increase in the recycling rate of about 0.000001869
285 / 407

Introduction to Statistical Analysis

What is this telling us?

RecyclingRate = 0.000001869 * MedianIncome + 0.07480414

  • Every dollar increase in median income is associated with an increase in the recycling rate of about 0.000001869
  • If the median income were zero, the predicted recycling rate would be about 0.07480414
286 / 407

Introduction to Statistical Analysis

What is this telling us?

RecyclingRate = 0.000001869 * MedianIncome + 0.07480414

  • Every dollar increase in median income is associated with an increase in the recycling rate of about 0.000001869
  • If the median income were zero, the predicted recycling rate would be about 0.07480414
  • What is the expected recycling rate for a community district with a median Household income of $70,000?
287 / 407

Introduction to Statistical Analysis

What is this telling us?

RecyclingRate = 0.000001869 * MedianIncome + 0.07480414

  • Every dollar increase in median income is associated with an increase in the recycling rate of about 0.000001869
  • If the median income were zero, the predicted recycling rate would be about 0.07480414
  • What is the expected recycling rate for a community district with a median household income of $70,000? 0.000001869 * 70000 + 0.07480414 ≈ 0.206 (about 20.6%)
288 / 407
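The worked calculation can be wrapped in a small function so other incomes are easy to try. This sketch uses the slope as it appears in the slide's worked calculation (0.000001869) and the intercept from the regression equation; the function and constant names are illustrative.

```python
SLOPE = 0.000001869      # change in recycling rate per dollar of median income
INTERCEPT = 0.07480414   # predicted recycling rate at a median income of zero

def predict_recycling_rate(median_income):
    """Predicted recycling rate from the fitted line."""
    return SLOPE * median_income + INTERCEPT

rate_at_70k = predict_recycling_rate(70_000)
```

With the rounded slope this gives about 0.2056. The slide's 0.205677023 is presumably computed from the unrounded coefficient. Either way, the prediction is a recycling rate of roughly 20.6%.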

Introduction to Statistical Analysis

We've created a model to make predictions!

 

 

289 / 407

Introduction to Statistical Analysis

We've created a model to make predictions!

 

The predictions are just not very good

290 / 407

Introduction to Statistical Analysis

img-center-100

291 / 407

Introduction to Statistical Analysis

How is Linear Regression Useful in Cities?

292 / 407

Introduction to Statistical Analysis

How is Linear Regression Useful in Cities?

  • Make predictions
293 / 407

Introduction to Statistical Analysis

How is Linear Regression Useful in Cities?

  • Make predictions
  • Identify outliers
294 / 407

Introduction to Statistical Analysis

Making Decisions in a Resource Constrained World

295 / 407
  • Facilitator invites participants to reflect on whether they have enough resources for the work they do (they don't)
  • Facilitator invites them to reflect on what they often come up short on

Introduction to Statistical Analysis

Making Decisions in a Resource Constrained World

  • Types of constraints
296 / 407
  • Facilitator invites participants to reflect on whether they have enough resources for the work they do (they don't)
  • Facilitator invites them to reflect on what they often come up short on

Introduction to Statistical Analysis

Making Decisions in a Resource Constrained World

  • Types of constraints
    - Money
297 / 407
  • Facilitator invites participants to reflect on whether they have enough resources for the work they do (they don't)
  • Facilitator invites them to reflect on what they often come up short on

Introduction to Statistical Analysis

Making Decisions in a Resource Constrained World

  • Types of constraints
    - Money
    - Time
298 / 407
  • Facilitator invites participants to reflect on whether they have enough resources for the work they do (they don't)
  • Facilitator invites them to reflect on what they often come up short on

Introduction to Statistical Analysis

Making Decisions in a Resource Constrained World

  • Types of constraints
    - Money
    - Time
    - Resources
299 / 407

Introduction to Statistical Analysis

Making Decisions in a Resource Constrained World

  • Types of constraints
    - Money
    - Time
    - Resources
    - Political Concerns
300 / 407

Introduction to Statistical Analysis

Making Decisions in a Resource Constrained World

  • Types of constraints
    - Money
    - Time
    - Resources
    - Political Concerns
  • Need ways to optimize around what’s available
301 / 407

Introduction to Statistical Analysis

Decision Modeling

302 / 407

Introduction to Statistical Analysis

Decision Modeling

  • The use of mathematical or scientific methods to determine allocation of time, money, and/or other resources
303 / 407

Introduction to Statistical Analysis

Decision Modeling

  • The use of mathematical or scientific methods to determine allocation of time, money, and/or other resources
  • Meant to improve or optimize the performance of a system
304 / 407

Introduction to Statistical Analysis

Decision Modeling

  • The use of mathematical or scientific methods to determine allocation of time, money, and/or other resources
  • Meant to improve or optimize the performance of a system
  • Other terms:
305 / 407

Introduction to Statistical Analysis

Decision Modeling

  • The use of mathematical or scientific methods to determine allocation of time, money, and/or other resources
  • Meant to improve or optimize the performance of a system
  • Other terms:
    - Operations research
306 / 407

Introduction to Statistical Analysis

Decision Modeling

  • The use of mathematical or scientific methods to determine allocation of time, money, and/or other resources
  • Meant to improve or optimize the performance of a system
  • Other terms:
    - Operations research
    - Management science
307 / 407

Introduction to Statistical Analysis

Decision Modeling Process

img-center-100

308 / 407
  • Facilitator describes an example of an existing real-world system or process that we're trying to optimize

Introduction to Statistical Analysis

Decision Modeling Process

img-center-100

309 / 407
  • Facilitator describes the process of translating that system, however imperfectly, into a model of how the system or process works

Introduction to Statistical Analysis

Decision Modeling Process

img-center-100

310 / 407
  • Facilitator describes how we model the system: by setting up a series of equations and inequalities
  • These represent how the system works (how many of X can be processed at a time, how many Y hours it takes to produce a given amount, etc.)

Introduction to Statistical Analysis

Decision Modeling Process

img-center-100

311 / 407
  • From manipulating these equations and inequalities, we arrive at some conclusions about how the system works and can be optimized

Introduction to Statistical Analysis

Decision Modeling Process

img-center-100

312 / 407
  • We then connect the model's conclusions back to the real-world system: how does this affect how we manage our resources in the real world?
  • Examples include staffing, expenditure, or other allocation of resources

Introduction to Statistical Analysis

Decision Modeling Process

img-center-100

313 / 407
  • The final step is to implement the conclusions in the real world
  • Because our model is only an approximation, we need to assess the real-world impact
  • Making a change in the "real world" will also impact our models, as they need to change to reflect the new way in which the system operates

Introduction to Statistical Analysis

Requirements of DM Process

314 / 407

Introduction to Statistical Analysis

Requirements of DM Process

  • You have to understand the real world process
315 / 407

Introduction to Statistical Analysis

Requirements of DM Process

  • You have to understand the real world process
  • You have to be able to quantify the real world process
316 / 407

Introduction to Statistical Analysis

Requirements of DM Process

  • You have to understand the real world process
  • You have to be able to quantify the real world process
  • You need to test your assumptions
317 / 407

Introduction to Statistical Analysis

Requirements of DM Process

  • You have to understand the real world process
  • You have to be able to quantify the real world process
  • You need to test your assumptions
  • The decisions made based on the model will have an impact that needs to be accounted for in the future
318 / 407

Introduction to Statistical Analysis

Optimizing Parking Ticket Revenue

Given an understanding of the basic constraints at work, we can optimize the placement of ticket agents around NYC to maximize revenue

319 / 407

Introduction to Statistical Analysis

Constraints

320 / 407

Introduction to Statistical Analysis

Constraints

  • Density of illegally parked vehicles varies by location
321 / 407

Introduction to Statistical Analysis

Constraints

  • Density of illegally parked vehicles varies by location
  • Number of illegally parked vehicles varies by location
322 / 407

Introduction to Statistical Analysis

Constraints

  • Density of illegally parked vehicles varies by location
  • Number of illegally parked vehicles varies by location
  • Amount of fine varies by location
323 / 407

Introduction to Statistical Analysis

Constraints

  • Density of illegally parked vehicles varies by location
  • Number of illegally parked vehicles varies by location
  • Amount of fine varies by location
  • Number of agents is limited
324 / 407

Introduction to Statistical Analysis

Constraints

  • Density of illegally parked vehicles varies by location
  • Number of illegally parked vehicles varies by location
  • Amount of fine varies by location
  • Number of agents is limited
  • Only so many tickets an agent can write in a day
325 / 407

Introduction to Statistical Analysis

Constraints

  • Density of illegally parked vehicles varies by location
  • Number of illegally parked vehicles varies by location
  • Amount of fine varies by location
  • Number of agents is limited
  • Only so many tickets an agent can write in a day
  • Only so many tickets are actually paid
326 / 407

Introduction to Statistical Analysis

Constraints

  • Density of illegally parked vehicles varies by location
  • Number of illegally parked vehicles varies by location
  • Amount of fine varies by location
  • Number of agents is limited
  • Only so many tickets an agent can write in a day
  • Only so many tickets are actually paid
  • Some neighborhoods are more concerned about illegal parking than others
327 / 407

Introduction to Statistical Analysis

Constraints

  • Density of illegally parked vehicles varies by location
  • Number of illegally parked vehicles varies by location
  • Amount of fine varies by location
  • Number of agents is limited
  • Only so many tickets an agent can write in a day
  • Only so many tickets are actually paid
  • Some neighborhoods are more concerned about illegal parking than others
  • Every borough must have at least one ticket agent
328 / 407

Introduction to Statistical Analysis

Constraints

img-center-100

329 / 407

Introduction to Statistical Analysis

Now let’s optimize!

img-center-60 Photo by Jungwoo Hong on Unsplash

Click to open our practice dataset

330 / 407

Introduction to Statistical Analysis

Setup Spreadsheet

img-center-100

331 / 407

Introduction to Statistical Analysis

Sum Assigned Agents

img-center-100

332 / 407

Introduction to Statistical Analysis

Add Total Number of Agents (1000)

img-center-100

333 / 407

Introduction to Statistical Analysis

Sum Revenue

img-center-100

334 / 407

Introduction to Statistical Analysis

Now for those systems of equations...

via GIPHY

335 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets

336 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets

  • Number of Agents
337 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets

  • Number of Agents
  • Multiplied by the number of tickets they can write (200)
338 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets

  • Number of Agents
  • Multiplied by the number of tickets they can write (200)
  • Multiplied by the density of illegally parked cars
339 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets

  • Number of Agents
  • Multiplied by the number of tickets they can write (200)
  • Multiplied by the density of illegally parked cars
    Number of Tickets = # of Agents * 200 * Density of Illegally Parked Cars
340 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets - Example

Number of Tickets = # of Agents * 200 * Density of Illegally Parked Cars
341 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets - Example

Number of Tickets = # of Agents * 200 * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
342 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets - Example

Number of Tickets = # of Agents * 200 * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
  • Max number of tickets an agent can write is 200
343 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets - Example

Number of Tickets = # of Agents * 200 * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
  • Max number of tickets an agent can write is 200
  • The density of illegally parked cars is 40%
344 / 407

Introduction to Statistical Analysis

Calculating Number of Tickets - Example

Number of Tickets = # of Agents * 200 * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
  • Max number of tickets an agent can write is 200
  • The density of illegally parked cars is 40%
Estimated Number of Tickets = 100 * 200 * 0.4 = 8,000
345 / 407
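
The ticket estimate above is a straight product of three factors. A minimal Python sketch of the same arithmetic, using the Manhattan figures from the slide (the function name is an illustrative choice):

```python
TICKETS_PER_AGENT = 200  # maximum tickets one agent can write, per the slide


def estimated_tickets(agents, density):
    """Number of Tickets = # of Agents * 200 * Density of Illegally Parked Cars."""
    return agents * TICKETS_PER_AGENT * density


manhattan_tickets = estimated_tickets(agents=100, density=0.4)
print(f"Estimated tickets: {manhattan_tickets:,.0f}")
```

This mirrors the spreadsheet formula added on the next slide; doubling either the agents or the density doubles the estimate.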

Introduction to Statistical Analysis

Add Ticketing Calculation

Ticketed = # of Agents * 200 * Density of Illegally Parked Cars

img-center-100

346 / 407

Introduction to Statistical Analysis

Calculating Revenue

347 / 407

Introduction to Statistical Analysis

Calculating Revenue

  • Number of Agents
348 / 407

Introduction to Statistical Analysis

Calculating Revenue

  • Number of Agents
  • Multiplied by the number of tickets they can write (200)
349 / 407

Introduction to Statistical Analysis

Calculating Revenue

  • Number of Agents
  • Multiplied by the number of tickets they can write (200)
  • Multiplied by the fine in that borough
350 / 407

Introduction to Statistical Analysis

Calculating Revenue

  • Number of Agents
  • Multiplied by the number of tickets they can write (200)
  • Multiplied by the fine in that borough
  • Multiplied by the percentage of fines collected in that borough
351 / 407

Introduction to Statistical Analysis

Calculating Revenue

  • Number of Agents
  • Multiplied by the number of tickets they can write (200)
  • Multiplied by the fine in that borough
  • Multiplied by the percentage of fines collected in that borough
  • Multiplied by the density of illegally parked cars
352 / 407

Introduction to Statistical Analysis

Calculating Revenue

  • Number of Agents
  • Multiplied by the number of tickets they can write (200)
  • Multiplied by the fine in that borough
  • Multiplied by the percentage of fines collected in that borough
  • Multiplied by the density of illegally parked cars
Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars
353 / 407

Introduction to Statistical Analysis

Calculating Revenue - Example

Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars
354 / 407

Introduction to Statistical Analysis

Calculating Revenue - Example

Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
355 / 407

Introduction to Statistical Analysis

Calculating Revenue - Example

Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
  • Max number of tickets an agent can write is 200
356 / 407

Introduction to Statistical Analysis

Calculating Revenue - Example

Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
  • Max number of tickets an agent can write is 200
  • Fine is $90
357 / 407

Introduction to Statistical Analysis

Calculating Revenue - Example

Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
  • Max number of tickets an agent can write is 200
  • Fine is $90
  • They collect 75% of fines
358 / 407

Introduction to Statistical Analysis

Calculating Revenue - Example

Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
  • Max number of tickets an agent can write is 200
  • Fine is $90
  • They collect 75% of fines
  • The density of illegally parked cars is 40%
359 / 407

Introduction to Statistical Analysis

Calculating Revenue - Example

Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars
  • 100 Agents in Manhattan
  • Max number of tickets an agent can write is 200
  • Fine is $90
  • They collect 75% of fines
  • The density of illegally parked cars is 40%
Estimated Revenue = 100 * 200 * 90 * 0.75 * 0.4 = $540,000
360 / 407
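
The revenue estimate extends the ticket calculation with the fine and the share of fines collected. A minimal Python sketch with the Manhattan figures from the slide (function and parameter names are illustrative):

```python
TICKETS_PER_AGENT = 200  # maximum tickets one agent can write, per the slide


def estimated_revenue(agents, fine, pct_collected, density):
    """Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars."""
    return agents * TICKETS_PER_AGENT * fine * pct_collected * density


manhattan_revenue = estimated_revenue(
    agents=100, fine=90, pct_collected=0.75, density=0.4
)
print(f"Estimated revenue: ${manhattan_revenue:,.0f}")
```

Because every factor is multiplied, revenue per agent is constant within a borough (here 200 * 90 * 0.75 * 0.4 = $5,400), which is what makes the problem linear.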

Introduction to Statistical Analysis

Add Revenue Calculation

Revenue = # of Agents * 200 * Fine * % Collect * Density of Illegally Parked Cars

img-center-100

361 / 407

Introduction to Statistical Analysis

Spreadsheet Setup

img-center-100

Everything is now set up as a set of relationships (equations)

362 / 407

Introduction to Statistical Analysis

Excel will find the optimal number of agents

img-center-100

363 / 407

Introduction to Statistical Analysis

Installing Solver

img-right-75

  • File
  • Options
  • Add-ins
  • Manage
  • “Go…”
364 / 407

Introduction to Statistical Analysis

Installing Solver

img-center-50

365 / 407

Introduction to Statistical Analysis

Using Solver

img-center-100

366 / 407

Introduction to Statistical Analysis

Using Solver - Parameters Dialog Box

img-center-50

367 / 407

Introduction to Statistical Analysis

Using Solver - Setting Objective

img-center-100

368 / 407

Introduction to Statistical Analysis

Using Solver - Setting Objective

img-center-100

  • Maximize the value in cell Sheet1!$I$9 (Total Revenue)

img-center-35

369 / 407

Introduction to Statistical Analysis

Using Solver - Cells to be Changed

img-center-100

370 / 407

Introduction to Statistical Analysis

Using Solver - Cells to be Changed

img-center-100

  • Tell Excel the cells to change to reach the objective

img-center-85

371 / 407

Introduction to Statistical Analysis

Constraints

372 / 407

Introduction to Statistical Analysis

Constraints

  • Number of assigned agents must be greater than or equal to the minimum and less than or equal to the maximum
373 / 407

Introduction to Statistical Analysis

Constraints

  • Number of assigned agents must be greater than or equal to the minimum and less than or equal to the maximum
  • The number of tickets can’t exceed the estimated number of illegally parked cars
374 / 407

Introduction to Statistical Analysis

Constraints

  • Number of assigned agents must be greater than or equal to the minimum and less than or equal to the maximum
  • The number of tickets can’t exceed the estimated number of illegally parked cars
  • The total number of assigned agents must be less than or equal to 1000
375 / 407
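
Written out as a system of inequalities, the three constraints and the revenue objective might look like the following. The symbols are assumptions introduced here, not notation from the course: for each borough b, x_b is the number of assigned agents, f_b the fine, c_b the share of fines collected, d_b the density of illegally parked cars, m_b and M_b the minimum and maximum agents, and P_b the estimated number of illegally parked cars.

```latex
\begin{aligned}
\text{maximize}\quad   & \sum_{b} 200 \, f_b \, c_b \, d_b \, x_b \\
\text{subject to}\quad & m_b \le x_b \le M_b             && \text{for every borough } b \\
                       & 200 \, d_b \, x_b \le P_b       && \text{for every borough } b \\
                       & \sum_{b} x_b \le 1000
\end{aligned}
```

Every term is linear in x_b, which is why a linear programming method is an appropriate choice for this model.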

Introduction to Statistical Analysis

Constraints

  • Number of assigned agents must be greater than or equal to the minimum and less than or equal to the maximum

img-center-75

376 / 407

Introduction to Statistical Analysis

Constraints

  • The number of tickets can’t exceed the estimated number of illegally parked cars

img-center-75

377 / 407

Introduction to Statistical Analysis

Constraints

  • The total number of assigned agents must be less than or equal to 1000

img-center-75

378 / 407

Introduction to Statistical Analysis

Final Touches

379 / 407

Introduction to Statistical Analysis

Final Touches

img-center-60

  • Set solving method to Simplex LP
380 / 407

Introduction to Statistical Analysis

Final Touches

img-center-60

  • Set solving method to Simplex LP
  • Check over parameters and then click Solve
381 / 407
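
For readers without Excel, the same allocation can be sketched in Python. This is not the Simplex method Solver uses. It is a greedy fill (meet every minimum, then add agents to the borough with the highest revenue per agent until it hits its cap), which reaches the same optimum for this particular model because each agent in a borough earns a fixed amount. Manhattan's fine, collection rate, and density come from the slides; every other number and all names below are invented for illustration, and the result is not rounded to whole agents.

```python
from dataclasses import dataclass

TICKETS_PER_AGENT = 200   # from the slides
TOTAL_AGENTS = 1000       # from the slides


@dataclass
class Borough:
    name: str
    fine: float            # dollars per ticket
    pct_collected: float   # share of fines actually paid
    density: float         # density of illegally parked cars (must be > 0)
    min_agents: float
    max_agents: float
    illegal_cars: float    # estimated number of illegally parked cars

    def revenue_per_agent(self) -> float:
        return TICKETS_PER_AGENT * self.fine * self.pct_collected * self.density

    def agent_cap(self) -> float:
        # Tickets cannot exceed illegally parked cars:
        # agents * 200 * density <= illegal_cars
        cars_cap = self.illegal_cars / (TICKETS_PER_AGENT * self.density)
        return min(self.max_agents, cars_cap)


def allocate(boroughs, total_agents=TOTAL_AGENTS):
    """Meet every minimum, then fill the most lucrative boroughs first."""
    assignment = {b.name: float(b.min_agents) for b in boroughs}
    remaining = total_agents - sum(assignment.values())
    if remaining < 0:
        raise ValueError("Minimums alone exceed the total number of agents")
    for b in sorted(boroughs, key=Borough.revenue_per_agent, reverse=True):
        extra = min(remaining, max(0.0, b.agent_cap() - assignment[b.name]))
        assignment[b.name] += extra
        remaining -= extra
    return assignment


def total_revenue(boroughs, assignment):
    return sum(b.revenue_per_agent() * assignment[b.name] for b in boroughs)


# Hypothetical inputs (only Manhattan's 90 / 0.75 / 0.40 appear in the slides).
boroughs = [
    Borough("Manhattan", 90, 0.75, 0.40, 1, 400, 100_000),
    Borough("Brooklyn", 60, 0.50, 0.50, 1, 500, 30_000),
    Borough("Queens", 50, 0.60, 0.25, 1, 500, 100_000),
]
plan = allocate(boroughs)
print(plan)
print(f"Total revenue: ${total_revenue(boroughs, plan):,.0f}")
```

Changing a fine, a cap, or the total number of agents and rerunning is the code equivalent of adjusting the parameters in the spreadsheet and clicking Solve again.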

Introduction to Statistical Analysis

Results

img-center-100

382 / 407

Introduction to Statistical Analysis

Results

img-center-100

  • This is the optimal assignment of ticket agents to maximize revenue
383 / 407

Introduction to Statistical Analysis

Results

img-center-100

  • This is the optimal assignment of ticket agents to maximize revenue
  • We can adjust the parameters to test different conditions
384 / 407

Introduction to Statistical Analysis

Results

img-center-100

  • This is the optimal assignment of ticket agents to maximize revenue
  • We can adjust the parameters to test different conditions
  • Averaging the results of different models helps us arrive at a more reliable estimate
385 / 407

Introduction to Statistical Analysis

Wrap-Up

img-center-50 Source

386 / 407

Introduction to Statistical Analysis

Goals for the Course

387 / 407

Introduction to Statistical Analysis

Goals for the Course

  • Review descriptive statistics in the context of operational decision making
388 / 407

Introduction to Statistical Analysis

Goals for the Course

  • Review descriptive statistics in the context of operational decision making
  • Discuss correlation and simple linear regression analysis in the context of operational decision making
389 / 407

Introduction to Statistical Analysis

Goals for the Course

  • Review descriptive statistics in the context of operational decision making
  • Discuss correlation and simple linear regression analysis in the context of operational decision making
  • Introduce decision models and their use
390 / 407

Introduction to Statistical Analysis

Goals for the Course

  • Review descriptive statistics in the context of operational decision making
  • Discuss correlation and simple linear regression analysis in the context of operational decision making
  • Introduce decision models and their use
  • Practice calculating descriptive statistics, calculating correlation, and developing predictive models in Excel
391 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

392 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

  • You will be more familiar with basic descriptive statistics
393 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

  • You will be more familiar with basic descriptive statistics
  • You will be better able to describe correlation and simple linear regression
394 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

  • You will be more familiar with basic descriptive statistics
  • You will be better able to describe correlation and simple linear regression
  • You will better understand the value of decision models in operational decision making
395 / 407

Introduction to Statistical Analysis

Key Takeaways for the Course

  • You will be more familiar with basic descriptive statistics
  • You will be better able to describe correlation and simple linear regression
  • You will better understand the value of decision models in operational decision making
  • You will be practiced in calculating descriptive statistics, calculating correlation, and developing predictive models in Excel
396 / 407

Introduction to Statistical Analysis

Key Excel Functions for Statistics

  • =AVERAGE(): Calculates the arithmetic mean for a range of numbers
  • =MEDIAN(): Calculates the median for a range of numbers
  • =MODE(): Calculates the mode for a range of numbers
  • =MAX() and =MIN(): Calculate the maximum and minimum values for a range of numbers
  • =QUARTILE(): Calculates quartiles for a range of numbers
  • =CORREL(): Calculates the coefficient of correlation between two ranges of numbers
398 / 407
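
For anyone who wants to check spreadsheet results outside Excel, the functions above have close counterparts in Python's standard library. The sketch below uses made-up sibling counts and made-up x/y pairs. The "inclusive" quantile method is chosen on the assumption that it matches the interpolation Excel's =QUARTILE() uses, and the small `correl` helper is an illustrative stand-in for =CORREL().

```python
import math
import statistics

siblings = [0, 1, 1, 2, 3, 5]  # hypothetical responses

summary = {
    "AVERAGE": statistics.mean(siblings),
    "MEDIAN": statistics.median(siblings),
    "MODE": statistics.mode(siblings),
    "MAX": max(siblings),
    "MIN": min(siblings),
    # First, second, and third quartile cut points.
    "QUARTILES": statistics.quantiles(siblings, n=4, method="inclusive"),
}


def correl(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = correl([1, 2, 3, 4], [2, 4, 6, 8])  # hypothetical, perfectly linear data

for label, value in summary.items():
    print(f"{label}: {value}")
print(f"CORREL: {r:.2f}")
```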

Introduction to Statistical Analysis

Have trouble with the analysis?

399 / 407

Introduction to Statistical Analysis

For More Information

400 / 407

Introduction to Statistical Analysis

For More Information

401 / 407

Introduction to Statistical Analysis

Contact Information

Mark Yarish

  • Email: mark[at]datapolitan[dot]com
  • Twitter: @MarkYarish

Elizabeth DiLuzio

  • Email: elizabeth[at]datapolitan[dot]com
  • Twitter: @LizDiLuzio
406 / 407

Introduction to Statistical Analysis

Thank You!

407 / 407
