layout:true
Excel for Data Analysis I - NYC Parks and Recreation
-- class:center, middle ![img-center-50](images/datapolitan.png) # Excel for Data Analysis I
City of New York Parks and Recreation - - - ## Instructors: Elizabeth DiLuzio and Manon Vergerio ### Follow along at: http://bit.ly/excel-analysis-i #### See the code at: http://bit.ly/excel-analysis-i-code --- class:center,middle # Welcome ??? + Facilitators and TA introduce themselves, establishing mastery but also creating openings for vulnerability and awareness --- # A Few Ground Rules ??? + Facilitators establish the intention we have for the culture of the classroom -- + Step up, step back -- + Be curious and ask questions! -- + Assume noble regard and positive intent -- + Respect multiple perspectives -- + Listen deeply -- + Be present (phone, email, social media, etc.) --- # Introduce Yourself to Your Neighbor + Who are you? + Where do you work? + What has been the proudest moment in your job? + What's your favorite national or state park? ??? + An opportunity for participants to get to know each other and settle into the class + We bring them into a positive space related to their work + Facilitator reminds them the data isn't the end but the means to achieve the actions that make us feel good about our jobs --- # Housekeeping ??? + Facilitator sets expectations with the students + Establishes the "contract" for the class -- + We’ll have one 15 minute break in the morning -- + We’ll have an hour for lunch -- + We’ll have a 15 minute break in the afternoon -- + Class will start promptly after breaks -- + Feel free to use the bathroom if you need during class -- + Please take any phone conversations into the hall to not disrupt the class --- class:center,middle # Let's Get Started ## [Click to download this data](data/Exercise1.xlsx) ??? + Facilitator leads participants through initial exercise that is a "dumb show" of analysis getting to a quick answer + This exercise helps provide a foundation for the discussion of data and Excel as a tool of analysis --- class:center,middle # But before we get too deep into data... ??? + Facilitator demonstrates the concepts of a pivot table by doing a human pivot table + Facilitator asks participants to move to the front of the room + Facilitator leads them through exercises to show sort, filter, and aggregate --- class:center,middle ![img-center-100](images/human_pivot.png) --- name:dumbshow ## Which Borough has the Most 311 Service Requests? ??? + Test of knowledge for the class on the subject and the process + Prompt to ask in order to call out assumptions and then reveal chart to show what the data says + Chart will be what we create -- ![img-center-100](images/parks_311_boro_rev.png) --- # How Do We Get the Answer? .middle[![img-center-100](images/data_graph_rev.png)] ??? + Test of knowledge in the specific skill -> see if class knows about Pivottables (have participants describe) + Instructor introduces pivottables --- # PivotTables ??? + Facilitator introduces concept and usage of pivottables to the class + Best to introduce by having someone describe how they use a pivottable in their work (connects to the day-to-day concretely) -- + A data summarization tool -- + Useful to quickly understand data -- + Can use to graph data totals -- ![img-center-100](images/pt1.png) --- # Creating a PivotTable ![img-center-60](images/pt2.png) ??? + Facilitator describes the steps to creating a pivottable in Excel + Please don't model selecting all data before selecting "Insert Pivottable" -> creates `(blank)` field in pivottable + Just allow to select all data by itself -- + Should default to all your data .red[unless you have any cells selected] ??? + Remind participants not to have any data selected when inserting a pivottable. When this comes up in class (because someone did it), use it as a teachable moment -- + Should default to a new worksheet --- exclude: true # Creating a PivotTable ![img-right-40](images/pt3.png) ## Drag and drop fields to visualize + Row labels + Values + Filter + Column Labels --- exclude:true # Creating a PivotTable ![img-center-100](images/pt4.png) --- # Summarize the `Borough` Field ??? + Facilitator completes the walk-through showing participants how to get the count by borough as shown in the beginning -- + Click `Borough` in the field list and drag into Rows -- + Click `Borough` again in the field list and drag into Values -- ![img-center-90](images/parks_311_boro_rev.png) --- class:center,middle # A Quick Recap from
[Introduction to Data Analytics](http://training.datapolitan.com/parks-intro-analytics/#1) ??? + Facilitator concludes the walk-through to shift to a discussion of data and analysis --- # The Value of Data ??? + Facilitator brings back discussion of the value of data, prompting participants to describe the value of data in their job + Optionally, facilitator can ask the value of the data just aggregated and how it might be useful to decisionmakers -- + Data tells a story about something that's happened -- + Can describe what happened directly or indirectly -- ![img-center-80](images/311data_parks.png) ??? + Facilitator helps participants see the direct measurements of service requests while indirectly measuring when Park incidents actually happen --- class: center,middle # What are some of the data challenges you face? ??? + Facilitator holds discussion with participants about the data issues they face on a regular basis + Facilitator prompts participants to think about their data problems as we go through the data techniques today --- exclude:true class:middle > Facts do not "speak for themselves." They speak for or against competing theories. Facts divorced from theory or visions are mere isolated curiosities. ## -Thomas Sowell *A Conflict of Visions* --- # What is Analysis? -- >“Analysis is simply the pursuit of understanding, usually through detailed inspection or comparison” ## - Carter Hewgley, former Director of Analytics, [Center for Government Excellence](https://govex.jhu.edu/) ??? + Facilitator prompts participants to think about analysis in the context of their work and what it is they are trying to accomplish --- class:center,middle # Overview of Excel ??? + Facilitator holds frank discussion about the good and bad of Excel + If necessary, draw on experiences from previous data exercise (Count by Borough) if participants aren't experienced with Excel --- # Benefits of Excel ??? + Facilitator prompts participants to reflect on what they like about Excel before showing list + Poll class for their key challenges working with Excel + Discuss the ways in which Excel is beneficial in daily work -- + Easy to use -- + Very visual -- + Lots of features and functions -- + Easy to make charts -- + Does a lot of formatting for you --- # Drawbacks of Excel ??? + Facilitator prompts participants to reflect on what they don't like about Excel before showing list + Gauge level of experience with Excel in class -- + Not very intuitive -- + Hard to find what you’re looking for (and they keep moving things around) -- + Lots of features and functions -- + Easy to make (bad) charts -- + Does a lot of formatting for you ??? + Poll class for their key challenges working with Excel + Discuss the ways in which Excel is challenging in their daily work --- class:center,middle # WRAP-UP ![img-center-100](images/imp_guide.png) ??? + Facilitator reflects on material with students to get a view on what we've covered + Facilitator discussed the Implementation Guide and invites them to use it in the course of the class --- class:center,middle # 15 MIN BREAK --- # 5 Data Analytics Tasks ??? + Facilitator introduces the 5 main tasks of analysis + Facilitator reminds participants we've already done the first task + The goal is to dispell the "black box" sense of analytics as something mystic and unknowable -- 1. Aggregating (PivotTable) -- 2. Sorting -- 3. Filtering -- 4. Manipulating -- 5. Visualizing --- # 1. Aggregating Data ??? + Facilitator reminds participants we've already done this work with the available data + Facilitator prompts participants to think of other ways they aggregate data for insights -- + Trends only become clear in aggregate -- + Often where you discover the "so what" -- + Aggregating data meaningfully can be tricky -- ![img-center-70](images/311data_parks.png) --- # 2. Sorting ??? + Facilitator discusses briefly the value of sorting, soliciting examples from participants of when they've sorted data for their work -- + Reorganize rows in a dataset based on the values in a column -- + Can sort on multiple columns --- # Sorting by Date ??? + Facilitator leads participants through simple exercise to sort data in Excel -- ![img-center-100](images/sort_overview.png) -- ![img-center-100](images/sort1_box.png) ??? + Facilitator emphasizes the need to expand the selection, describing (or better having participants describe) the consequences of not expanding the selection --- # Sorting by Date ![img-center-65](images/sort4a.png) ??? + Facilitator demonstrates filtering by multiple columns -- ![img-center-65](images/sort4b.png) --- # Sorting by Date ![img-center-60](images/sort4.png) -- ![img-center-60](images/sort_final.png) ??? + Facilitator finishes demonstration and solicits feedback from participants --- # 3. Filtering ??? + Facilitator leads brief discussion of filtering, soliciting examples from participants on applicability to their own work -- + Only show rows that contain some value -- + Can filter by multiple values -- + Can filter by values in multiple columns --- # Filtering ![img-center-100](images/filter_slide83.png) ??? + Facilitator demonstrates how to filter in Excel --- # Filtering ![img-center-80](images/filtering_new_tree_slide_84.png) ??? + Facilitator demonstrates filtering using the search box + This isn't a Parks example and there's [an open issue to replace the graphic](https://github.com/Datapolitan-Training/parks-excel-analysis-i/issues/14) --- # Remember this? ![img-center-50](images/ideation_post_its.png) ??? + Having concluded the demonstration of initial skills, facilitator conducts a short ideation exercise of no more than 5 mins to brainstorm questions for the exercises to follow + The goal of this is to connect and reinforce [the learning from Introduction to Data Analytics](http://training.datapolitan.com/parks-intro-analytics/#brainstorm) --- # Your Turn ??? + Facilitator moves participants into groups and runs brief ideation session with them to get questions they will explore + This is a release exercise for them to practice skills and meant to go until lunch with an optional check-in if time -- + Working in groups, brainstorm some key questions you'd like to explore in this dataset -- + We'll be using this project you're creating as we go through the class -- + And you can change your minds -- + Focus on practicing the skills rather than arriving at an interesting result -- + **Document your steps in whatever way makes sense** ??? + While we want to invite them to begin working with documentation, the topic of documentation will be taken up after lunch -- + We're here to help if you have any questions --- class:center,middle # WRAP-UP ??? + Facilitator concludes morning session, assessing whether students have any outstanding questions and releases participants for an hour --- class:center,middle # LUNCH --- class:center, middle # WELCOME BACK! --- # A Word About Documentation ??? + Facilitator challenges participants to remember the steps in their analysis + This is a teachable moment to talk about documenting work -- + Take a minute and look over what you wrote down -- + Could you redo your work based on what you did? Why or why not? -- + Imagine this was an important project, what would you do different? --- # Why Document? ??? + Facilitator emphasizes need for work to be auditible and reproducible -- + Keep record of steps -- + Allows us to verify work --
_(Also helps us sleep well at night)_ -- + Hard (but not impossible) to do in Excel -- ## .center[Your future self will thank you] --
.center[Trust me] --- # Documentation Best Practices ??? + Facilitator solicits best practices from participants before offering thoughts + Acknowledge there are many ways to achieve the singular goal -- + Save a clean copy of the data, including source information -- + Save a working copy of the data -- + Keep a list of key steps, ideally in the workbook itself ([create a new tab](https://support.office.com/en-us/article/insert-or-delete-a-worksheet-19d3d21e-a3b3-4e13-a422-d1f43f1faaf2)) -- + Redo the steps to make sure they're complete -- ![img-center-100](images/document_ex1_rev_box.png) ??? + A copy of the example notebook is in the repo and [linked in the slides](data/Exercise1_complete.xlsx) --- # Exploratory Data Analysis ??? + Facilitator introduces key ideas in EDA for discussion -- + Goal -> Discover patterns in the data -- + Understand the context -- + Summarize fields -- + Use graphical representations of the data -- + Explore outliers -- #### Tukey, J.W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley --- # 4. Manipulating Data (In a good way...) ??? + Facilitator introduces topic of manipulating (or "transforming") data + Solicit feedback from participants on when they need to transform the data they work with and how they do it -- + Sometimes available categories don't make sense -- + Values may not be in the format you need (or have mistakes) -- + You always want to have a clean copy of the data to go back to -- + Best to keep track of what you've done --- # Conditional Formatting ![img-center-65](images/cf1.png) ??? + Optionally, the facilitator can introduce conditional formatting to more visually display the information being analyzed + Be mindful of the participants and how easily (or not) they take in this additional information, which is a digression from the main-line of information presented + Solicit examples from participants on when they have used conditional formatting --- # Conditional Formatting ![img-center-70](images/con_formatting_slide86.png) --- # Functions in Excel ??? + Facilitator introduces functions by soliciting examples from participants on the use of functions in their work -- + Allow you to do calculations in Excel -- + Also allow you to manipulate text -- + [And a whole lot more...](#excel-guide) --- exclude:true name:cat-function # Some Categories of Functions in Excel + Database functions + Date and time functions + Financial functions + Logical functions + Lookup and reference functions + Math and trigonometry functions + Statistical functions + Text functions + 13 categories in total + [Click for a list of some useful functions](#excel-guide) --- # Extracting Hour From Timestamp ![img-center-100](images/hour1.png) ??? + Facilitator leads participants through an exercise to use a simple function to extract the hour from the created date + This is a good point to demonstrated the help text that displays when a function name is typed into the cell ![img-center-50](images/function_help.png) --- class:middle,center [![img-center-100](images/hour_def_box.png)](https://support.office.com/en-us/article/HOUR-function-e2833b50-0db0-499b-acc5-e9ae03de8fbb) --- # Extracting Hour From Timestamp ![img-center-100](images/hour2.png) --- # Extracting Hour From Timestamp ![img-center-100](images/hour3.png) ??? + Facilitator discusses data types and how to convert from one to the other (in this case date to number) + It's key to remember we're converting the time to a number that represents the hour of the day, not to just the hour. There is a subtle difference that keeps coming up when participants just want to reformat the date-time to display hour --- # Extracting Hour From Timestamp ![img-center-100](images/hour4.png) --- # Extracting Hour From Timestamp ![img-center-100](images/hour5.png) --- # Extracting Hour From Timestamp ![img-center-100](images/hour6.png) --- # Extracting Hour From Timestamp ![img-center-100](images/hour7.png) --- # Charting Complaints by Week Day ![img-center-100](images/day_graph.png) ??? + Optional exercise to chart the days complaints come in + Not recommended if class is struggling with the basic concept of extracting hour --- # Charting Noise Complaints by Week Day [![img-center-100](images/week2.png)](https://support.office.com/en-us/article/day-function-8a7d1cbb-6c7d-4ba1-8aea-25c134d03101) -- + Not what we want ??? + This is an opportunity to introduce the function wizard if there's time + Demonstrate how to search for a function --- # Charting Noise Complaints by Week Day [![img-center-85](images/week3.png)](https://support.office.com/en-us/article/WEEKDAY-function-60e44483-2ed1-439f-8bd0-e404c190949a) -- + What we want --- # Charting Noise Complaints by Week Day ![img-center-100](images/week1.png) --- # Charting Noise Complaints by Week Day ![img-center-100](images/week4.png) --- # Charting Noise Complaints by Week Day ![img-center-100](images/week5.png) --- # Charting Noise Complaints by Week Day -- + Refresh the PivotTable ??? + Important skill to reinforce with participants -> always refresh pivottable if you've made any changes to your data, even if you create a new pivottable, it won't take your updates until you refresh -- + Add `dow` to Rows -- + Format the chart ![img-center-75](images/dow_parks.gif) ??? + Facilitator pauses to check knowledge and make sure students understand material + Graph data to make sure they get the reason for what we're doing before showing the `choose` + If time, discuss the need to have the same number of each day represented in the data in order to avoid bias --- # Charting Noise Complaints by Week Day + You could also use `CHOOSE` ??? + See not above about skipping this if participants are having difficulty with the previous exercise -- ![img-center-80](images/choose1.png) --- # Charting Noise Complaints by Week Day ![img-center-90](images/choose2.png) -- ![img-center-85](images/choose3.png) --- # 5. Visualizing Data ??? + Facilitator introduces the basic concepts of why we visualize data before leading participants through an exercise to create the chart shown -- + Quickly communicate information -- + Tell a clearer story -- + A picture is worth a thousands words --- class:center,middle # When Are Parks Complaints Received? ![img-center-90](images/311data_parks.png) --- # Charting Parks Complaints by Hour ![img-right-60](images/hour_chart1.png) + Refresh PivotTable ??? + Facilitator reminds participants of the importance of refreshing the pivottable when altering data --- # Charting Parks Complaints by Hour ![img-right-40](images/hour_chart2.png) + Refresh PivotTable + Find `Hour` and add it to "Rows" --- # Charting Parks Complaints by Hour + Refresh PivotTable + Find `Hour` and add it to "Rows" + Style the chart ![img-center-70](images/311data_parks.png) --- # Exporting Graphics ??? + Facilitator demonstrates steps to export graphics from Excel to other applications + Solicit feedback from participants on why this is important and any challenges they've had in doing so -- + You can copy and paste graphics into any Microsoft application -- + The default is a Microsoft Office Graphic Object ??? + I prefer to paste as picture to preserve the formatting, but there are reasons to paste as a graphic object that can be edited + Ideally facilitator will discuss the pros and cons with participants -- ![img-right-50](images/copy1.png) -- + You can change to an image or other format -- + [More information is online](http://www.thespreadsheetguru.com/blog/best-way-to-paste-special-excel-objects-as-powerpoint-images) --- # Your Turn ??? + Facilitator initiates a release exercise for participants to continue working on their mini-projects + The idea is to connect the previous learning in Introduction to Data Analytics to come up with a workable process map sequencing their questions into a logical order -- + Working in your groups, develop a process map for your project -- + Focus on the inputs and outputs, but think about the potential impact -- + Create a story with graphs and other explanatory aids -- + Don't forget to add documentation -- + You'll have more time to work on this and be ready to present -- + Feel free to reach out if you have any questions --- class:center,middle # WRAP-UP ??? + Facilitator closes out session with optional reflection on the work so far and solicits any questions before putting participants on a break --- class:center,middle # 15 MIN BREAK --- class:center,middle # Visualizing Data ??? + Facilitator leads a discussion of the key elements in visualization + The goal is to get participants to recognize their own innate sense of what makes for a "good" visualization and to trust that sense + Throughout this discussion, I ask why they like or dislike a particular visualization compared to another, forcing them to articulate their reasons --- # Why do we visualize data? ![img-center-70](images/vision_zero.png) .caption[Source: Ontodia Civic Dashboards Project] ??? + Facilitator introduces the participants to the Vision Zero initiative and asks where traffic fatalities happen + Note: [Source link no longer works](http://blog.civicdashboards.com/2014/01/vision-zero-visualizing-traffic-related-fatalities/) --- # Why do we visualize data? ![img-center-100](images/vision_zero_table.png) ??? + Facilitator asks if this is a better or worse way to display the data and why --- # Bar Chart ![img-center-95](images/311_chart1_parks.png) ??? + Facilitator prompts with question "What are we visualizing here?" --- #"Barless" Bar Chart ![img-center-95](images/311_chart2_parks.png) ??? + Facilitator prompts with question "What are we visualizing here?" + Important to let them realize this is the same data and then ask if it is better or worse than previous, and why + Good to point out the lack of line between the point and the label, hard to compare across categories --- # Horizontal Bar Chart ![img-center-95](images/311_chart3_parks.png) ??? + Key feature of this chart is labels that are easier to read across and longer range so smaller quantities can be compared + Odd for data to be displayed this way vs. vertical bars --- # Ranked Horizontal Bar Chart ![img-center-95](images/311_chart4_parks.png) ??? + Sorting the data allows for data points to be rank-ordered and easier to understand --- # Grouped Bar Chart ![img-center-95](images/311_chart6_parks.png) ??? + Demonstrates the impact of losing color, and how powerful color is as a discriminator in visualizations --- # Precognitive Processing ![img-center-100](images/chart7.png) ??? + Facilitator prompts participants to find the red circle on the right and then on the left to demonstrate the ease of determining difference in color vs shape + Connect to idea of [precognitive (or "preattentive") visual processing](https://www.interaction-design.org/literature/article/preattentive-visual-properties-and-how-to-use-them-in-information-visualization) to make visualiations easier to understand --- # Grouped Bar Chart ![img-center-95](images/311_chart5_parks.png) ??? + Allows categories to be compared together + Increases the information density (also saves space) --- # Stacked Bar Chart ![img-center-95](images/311_chart7_parks.png) ??? + Facilitator introduces stacked bar chart, discussing pros (able to compare) with cons (hard to interpret without labels) + Solicit feedback from students on what they think of the chart type --- # 100% Stacked Bar Chart ![img-center-95](images/311_chart7a_parks.png) ??? + Facilitator introduces 100% stacked bar chart, discussing pros (able to compare proportions) with cons (hard to interpret without labels) + Solicit feedback from students on what they think of the chart type --- # Stacked Bar Chart + Which of these do you like better? ![img-left-50](images/311_chart7_parks.png) ![img-right-50](images/311_chart7a_parks.png) ??? + Facilitator runs brief discussion of which group likes and why -- ##### + Why? ??? + Most participants don't like either, but they should be able to describe why in clear terms --- # Line Chart ![img-center-95](images/311_chart8_parks.png) ??? + Facilitator discusses when to use a line chart (time-series data connected to each other) + Possible to point out the visualization encodes not only the level but the relationship of one to the next (slope of the line) + Can point out this is like the heart-beat of the city, or at least of NYC Parks --- # Why use a line chart? ![img-center-100](images/311_chart8a_parks.png) ??? + Facilitator asks participants to pick a visualization of the data and why, iwth the opportunity to change if they are compelled one way or the other. + Column chart is more accurate in that each day is a discreet unit. Area chart gives "feeling" of call volume better than simple line chart. No one should like the dot chart. --- # Line Chart with Categorical Data ![img-center-95](images/311_chart9_parks.png) ??? + This is an example of what not to do with the data. Each category is distinct and not related, though the chart implies it is. Participants will likely have done this before, but should be asked to reflect on whether one category has an impact on another. + Good opportunity to reflect on the fact Excel doesn't care about the kind of chart you create and will produce whatever you tell it to --- # Multi-Line Chart ![img-center-90](images/311_chart10_parks.png) ??? + Facilitator introduces multi-line charts and their value (comparison, increased information density) --- # Multi-Line Chart ![img-center-100](images/311_chart11_parks.png) ??? + Facilitator describes situation with too much information + Good point to emphasize "Not everything that is true is interesting" and give them permission to not display all the data if it doesn't contribute to a true and compelling story (so long as they're not ignoring the data) --- # Pie Charts ![img-center-85](images/311_chart13_parks.png) ??? + Facilitator asks "When do you use a pie chart?" or "What are pie charts good for?" (absolutely nothing...) + Pie charts are best used to show proportions (and not time-series data or more than one variable) --- # Pie Charts ![img-center-100](images/chart15_trim.png) ??? + Facilitator prompts participants to identify the largest slices in each pie chart -- ![img-center-75](images/chart16.png) "Piecharts". Licensed under [CC BY 1.0](https://creativecommons.org/licenses/by/1.0/legalcode) via [Wikimedia Commons](http://commons.wikimedia.org/wiki/File:Piecharts.svg#/media/File:Piecharts.svg) with modifications ??? + Comparing the results, it's clear the difficulty discerning nearly similar quantities --- # Pie Charts ![img-center-90](images/chart14.png) "Badpie" by Gilgongo - Own work. Licensed under [CC BY-SA 3.0](https://creativecommons.org/licenses/by-sa/3.0/us/legalcode) via [Wikimedia Commons](http://commons.wikimedia.org/wiki/File:Badpie.png#/media/File:Badpie.png) ??? + Facilitator presents an alternative scenario of many small slices as a potential issue + Good to note tendency to add labels to the point of making it a table with color and point out the color reuse in the chart, making it hard to interpret --- # Pie Charts ![img-center-medium](images/chart17.png) Source: http://simplystatistics.org/2012/11/26/the-statisticians-at-fox-news-use-classic-and-novel-graphical-techniques-to-lead-with-data/ For Analysis: http://peltiertech.com/use-bar-charts-not-pies/ ??? + Facilitator carefully points out this is a common issue and leads them through an exploration of this chart, asking: 1. "Which candidate got the most support?" 2. "Which slice looks bigger?" + Participants pick up on the issue of being more than 100%. Review links for more information on the context (respondents could pick more than one figure to support) --- # Donut Chart ![img-center-80](images/311_chart18_parks.png) ??? + The donut chart is preferred in any instance where one would normally use a pie chart + The loss of visual accuity occurs in the center of the pie (so we pull it out) --- name:treemap # [Treemap](https://en.wikipedia.org/wiki/Treemapping) ![img-center-100](images/treemap_parks.png) ??? + An alternative visualization that is more advanced, but interesting for it's possibilities (still probably need numbers) --- exclude:true # For more on Pie Charts 1. Spence, I. (2005). [No Humble Pie: The Origins and Usage of a Statistical Chart](http://www.psych.utoronto.ca/users/spence/Spence%202005.pdf). *Journal of Educational and Behavioral Statistics*, 30(4), 353–368. 2. Heer, J. and Bostock, M. (2010). [Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design](http://hci.stanford.edu/publications/2010/crowd-perception/heer-chi2010.pdf). *CHI 2010*, April 10–15, 2010, Atlanta, Georgia, USA. --- class:center,middle # How do you learn to create good visualizations? # --- class:center,middle # How do you learn to create good visualizations? # ...Make a lot of bad visualizations --- name:final-ex # Your Turn ??? + Facilitator provides release exercise for end of class, emphasizing this is their time to practice the skills covered and ask for any help to work their maps -- + Looking at the work you've done, apply the principles of good charts to your work -- + Create a quick presentation of your data story to share with the group -- + We'll be here to help -- + Make sure you've updated your documentation to be complete -- + Feel free to use the [list of useful functions](#excel-guide) -- +
Click to submit your work
??? + Task: Students work as a group to develop key insights and tell the story as it relates to the problem defined in the morning session, documenting their work as they go for presentation to their fellow students + Outcome: Students have implemented the process they outlined in morning session and documented their steps for review and feedback from their peers --- class:center,middle # Presentations ??? + Ideally, students will have put both their visualizations and the documentation together into PowerPoint or Word and submitted to the link + Reinforce the work done with positive comments but also suggestions on how to do more wherever possible --- exclude:true # Verification + Review the documentation the group provided + Rerun the analysis (without losing their work) + Be ready to present their work to the class ??? + Task: As a group, review the work of another group, reviewing documentation and verifying analytical steps for accuracy, completeness, and ability to answer the initial question + Outcome: Students practice reviewing documentation and communicating findings to an audience, as well as giving and receiving constructive feedback in a supportive environment + Format: Small group presentations and facilitated discussion + Outline + Students as a group review the work of another group and verifying results + Students present the work of the group they reviewed to the class for brief discussion and feedback --- class:center,middle # WRAP-UP ??? + Facilitator reviews learning with participants --- # 5 Data Analytics Tasks -- 1. Aggregating (PivotTable) -- 2. Sorting -- 3. Filtering -- 4. Manipulating -- 5. Visualizing --- # What Else We've Covered -- + The Value of Data -- + The Definition of Analysis -- + What else? ??? + Facilitator solicits feedback from participants on the exerience ahead of survey --- class:center,middle # Some Warnings ??? + Facilitator provides some basic awareness around issues to be mindful of in analysis --- # To Err is Human > Since 1995, 88% of the 113 spreadsheets audited in 7 studies ... have contained errors. ## - Raymond Panko, "[What We Know About Spreadsheet Errors](http://panko.shidler.hawaii.edu/My%20Publications/Whatknow.htm)" --- # Spreadsheet Errors Have Consequences ![img-center-80](images/athensprotest.jpg) .caption[Image Credit: [http://underclassrising.net](https://www.flickr.com/photos/0742/4163565280), [CC BY-SA 2.0](https://creativecommons.org/licenses/by-sa/2.0/)] --- # Cautionary Tale - Austerity in Europe ![img-center-60](images/reinhart_rogoff_coding_error.png) Source: Mike Konczal, "[Researchers Finally Replicated Reinhart-Rogoff, and There Are Serious Problems](http://rooseveltinstitute.org/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems/)" --- # Data Sets -- + [311 Noise Complaint data](https://data.cityofnewyork.us/Social-Services/311-Service-Requests-from-2010-to-Present/erm2-nwe9) -- + [Exercise Data with formulas and documentation example](data/Exercise1_complete.xlsx) ??? + Facilitator references sources of information for class --- # Resources ??? + Facilitator provides key resources for continued learning -- + [Microsoft Excel keyboard shortcuts](https://support.office.com/en-us/article/keyboard-shortcuts-in-excel-for-windows-1798d9d5-842a-42b8-9c99-9b7213f0040f) -- + [Microsoft Office Community](https://answers.microsoft.com/en-us/msoffice/forum/msoffice_excel?sort=LastReplyDate&dir=Desc&tab=All&status=all&mod=&modAge=&advFil=&postedAfter=&postedBefore=&threadType=All&isFilterExpanded=false&page=1) - Community Q&A Board -- + [7 essential Excel tricks every office worker needs to know](https://www.microsofttraining.net/sharing/11-seven-essential-excel-tricks-every-office-worker-needs-to-know.html) ??? + We cover 6 of the 7 in class -- + [11 Places That Can Turn You Into A MS Excel Power User](https://medium.com/the-mission/eleven-places-that-can-turn-you-into-a-microsoft-excel-power-user-c119f18e138c) -- + [Data Sensemaking](http://fellinlovewithdata.com/teaching/developing-a-data-sensemaking-course) (blogpost by Enrico Bertini) -- + [Caveats for Data Visualization](https://www.data-to-viz.com/caveats.html) (Tips and warnings) -- + [Copy of today's handout](workbook.pdf) -- + [Datapolitan training classes](http://training.datapolitan.com/) -- + [In Praise of Documentation](https://medium.com/@Datapolitan/in-praise-of-documentation-245a2bf3d74b) (A blogpost on documenting) --- name:final # Final Thoughts ??? + Facilitator wraps up class with key points to remember -- + Data can tell a story, but doesn't speak for itself -- + Analysis is the search for understanding and where we learn to tell that story -- + Be good to your data and it will be good to you -- + If you don't document it, it's like it never happened --- exclude: true + Don't try to boil the ocean + Data can tell a story, but doesn't speak for itself + Good analysis is about telling a true and compelling story + Be good to your data and it will be good to you + Try to focus on outcomes, not just outputs --- exclude:true # Post-Class Survey + We'll be reaching out to you in about 2 weeks + This is a pilot and we need your feedback + Please let us know how this has helped you in your work + This will help us improve this class and our other planned classes --- # .center[Contact Information] ## Elizabeth DiLuzio
Email: Elizabeth[at]datapolitan[dot]com
Website:
http://www.evallearn.com
Twitter:
@lizdiluzio
## Manon Vergerio + Email: manon.vergerio[at]gmail[dot]com --- class:center, middle # THANK YOU! # Please take a moment to
[provide your feedback](https://docs.google.com/forms/d/e/1FAIpQLScqxFDbN09n67Od5Cj5hIWatMIwnqBoQhLzTnMn4JT5hcLSBw/viewform?usp=sf_link) --- name: excel-guide # Key Excel Functions for Summarizing Data + [`=SUM()`](https://support.office.com/en-us/article/SUM-function-043e1c7d-7726-4e80-8f32-07b23e057f89): Calculates the sum for a range of numbers + [`=COUNT()`](https://support.office.com/en-us/article/COUNT-function-a59cd7fc-b623-4d93-87a4-d23bf411294c): Counts the number of cells containing numbers in a range + [`=COUNTA()`](https://support.office.com/en-us/article/COUNTA-function-7dc98875-d5c1-46f1-9a82-53f3219e2509): Counts the number of non-blank cells in a range + [`=HOUR()`](https://support.office.com/en-us/article/HOUR-function-a3afa879-86cb-4339-b1b5-2dd2d7310ac7): Extracts the hour from a timestamp + [`=WEEKDAY()`](https://support.office.com/en-us/article/WEEKDAY-function-60e44483-2ed1-439f-8bd0-e404c190949a): Extracts the day of the week from a timestamp + [`=CHOOSE()`](https://support.office.com/en-us/article/CHOOSE-function-FC5C184F-CB62-4EC7-A46E-38653B98F5BC): Uses an index number to return a result from an ordered list of values --- # Key Excel Functions for Summarizing Data + [`=MID()`](https://support.office.com/en-us/article/MID-MIDB-functions-d5f9e25c-d7d6-472e-b568-4ecb12433028): Select a specified number of characters from a text string + [`=LEFT()`](https://support.office.com/en-us/article/LEFT-LEFTB-functions-9203d2d2-7960-479b-84c6-1ea52b99640c): Select a specified number of characters from the beginning of a text string + [`=RIGHT()`](https://support.office.com/en-us/article/RIGHT-RIGHTB-functions-240267ee-9afa-4639-a02b-f19e1786cf2f): Select a specified number of characters from the end of a text string + [`=FIND()`](https://support.office.com/en-us/article/FIND-FINDB-functions-c7912941-af2a-4bdf-a553-d0d89b0a0628): Find the location of a given character in a text string --- # Key Excel Functions for Summarizing Data + [`=CONCATENATE()`](https://support.office.com/en-us/article/CONCATENATE-function-8f8ae884-2ca8-4f7a-b093-75d702bea31d): Combine characters together into a text string + [`=VLOOKUP()`](https://support.office.com/en-us/article/VLOOKUP-function-0bbc8083-26fe-4963-8ab8-93a18ad188a1): An operation to look up a value in another location based on an index value + [`=IF()`](https://support.office.com/en-us/article/IF-function-69aed7c9-4e8a-4755-a9bc-aa8bbff73be2): A function for logical comparison between values to return a desired result given a particular condition