# # Google Analytics - 2 Ask - Module 3 - Spreadsheet Magic
## 0. Overview
>[! quote] Spreadsheets are a key data analytics tool. Here you’ll learn both why and how data analysts use spreadsheets in their work. You’ll also investigate how structured thinking helps analysts understand problems and come up with solutions.
### Learning Objectives
>[! info] Module 3 Learning Objectives
>- [ ] Discuss the data analyst's use of [[Spreadsheet|spreadsheets]] with reference to roles and responsibilities
>- [ ] Demonstrate the use of [[Spreadsheet]] to complete basic tasks of the data analyst including entering and organizing data
>- [ ] Demonstrate an understanding of the use of [[Formula|formula]] in [[Spreadsheet|spreadsheets]] including a definition and examples
>- [ ] Compare [[Formula]] and [[Function]] with reference to similarities and differences
>- [ ] Describe the key ideas associated with [[Structured thinking]] including [[Problem domain]], [[Scope of work (SOW)]], and [[context]]
---
### Glossary for Module 3
- [[AVERAGE]]
- [[Borders]]
- [[Cell reference]] | [[COUNT]]
- [[Equation]]
- [[Fill handle]] | [[Filtering]]
- [[Header]]
- [[Math expression]] | [[Math function]] | [[MAX]] | [[MIN]]
- [[Open data]] | [[Operator]] | [[Order of operations]]
- [[Problem domain]] | [[Range]] | [[Report]] | [[Return on investment (ROI)]] | [[Revenue]] | [[Scope of work (SOW)]] | [[Sorting]] | [[SUM]]
<hr>
## 1. Work with [[Spreadsheet|spreadsheets]]
### 1.1 Video: _The amazing spreadsheet_ (1:00)
- spreadsheets often unsung heroes
- will often be the first tool used.
- because of functions and formulas this enables an analyst to work very quickly within spreadsheet environment.
### 1.2 Video: _Get to work with spreadsheets_ (2:00)
- some of the ways that data analysts use spreadsheets to help with day-to-day stuff
#### Major tasks
acquire data, and add to environment.
- organize data
- use a [[Pivot table]]
- Then can filter the data within the table
- calculate your data
- [[Formula]] and [[Function]]
### 1.3 Reading: _Spreadsheet and the [[data life cycle]]_
- looking at how spreadhseets relate to each phase of the [[data life cycle]] : Plan, capture, manage, analyze, archive and destroy.
#### [[data life cycle - Plan]]
- Define and develop organizational standards for users who will work within spreadsheets.
- How to format cells
- How to name your [[Attribute|attributes]]
- Standardize commenting and sharing of spreadsheet documents.
#### [[data life cycle - Capture]]
- connect a spreadsheet to a data source. No COPY/PASTA
- Doing so, enables automatic updates on the spreadsheet.
#### [[data life cycle - Manage]]
- Users can manage all kinds of data within a spreadsheet.
- Spreadsheets can store data, organize data, filter data.
- Standards can also outline who and how data is accessed within a spreadsheet.
#### [[data life cycle - Analyze]]
- data in a spreadhseet can be easily analyzed
- Tools can include simple visualizations and [[Pivot table]] and [[Pivot chart]]
- Spreadsheets also include [[Math function]] and [[Formula|formula]] to enable fast analysis.
#### [[data life cycle - Archive]]
- spreadsheets not used often can be archived. Handy if the data represents historical data to be saved for later.
#### [[data life cycle - Destroy]]
- delete them if required to do so ending the lifecycle.
>[! cue] [Shortcuts for sheets](https://support.google.com/docs/answer/181110#zippy=%2Cmac-shortcuts)
See the link in the cue for a list of shortcuts for Google Sheets
>[!cue] [MS Excel shortcuts](https://support.microsoft.com/en-us/office/keyboard-shortcuts-in-excel-1798d9d5-842a-42b8-9c99-9b7213f0040f)
See the link in the cue for a list of shortcuts that work in Microsoft Excel.
### 1.4 Hands-on activity: _Intro. to Google Sheets_
Steps
1. Access the applications
2. Create a new spreadsheet. There is a typo here.
3. Edit and format your spreadsheet
4. Share the spreadsheet
5. Share a link to a file or folder
6. Unshare the spreadsheet
The activity took us through the steps. This was very basic but I appreciated the author's being completely inclusive which enables every level of user the ability to learn the spreadsheet. I am sure that in future readings I will begin to expand my knowledge because I don't know everything.
### 1.5 Reading: _Step-by-step: Basic spreadsheet tasks_
This is the first reading that is a step-by-step type. These will be used to follow along in each step of the upcoming video with the same title.
This activity guided us on the precise steps to do what is in the subsequent video.
### 1.6 Video: _Basic spreadsheet tasks_ (4:00)
- best practice is to create a file with a short, descriptive file name.
- best practice is to store the file in a hierarchical folder structure.
### 1.7 Reading:_Learn more about spreadsheet basics_
<hr>
## 2. [[Formula]] in [[Spreadsheet|spreadsheets]]
### 2.1 Step-by-step: _Formulas for success_
Processes demonstrated:
1. Create a [[Formula|formula]]
2. Use [[Cell reference]] in a formula.
3. copy a formula
1. The formula will update with changes to the embedded cell references in this example
4. Calculate average sales
5. Calculate the percent change in sales
6. Correct a formula error
### 2.2 Video: _Formulas for success_ (7:00)
- Explain the basics of using spreadsheet formulas and calculations
- The video covers all of the steps covered already above in 2.1
Discusses [[Math expression]]
Discusses [[Operator]] and the use in [[Formula|formulas]]
Defines [[Range]]
### 2.3 Reading: _Quick reference: Formulas in Spreadsheets_
Define [[Fill handle]]
Define [[Cell reference#Absolute reference example|absolute cell reference]] and [[Cell reference#Relative reference example|relative cell reference]]
Define [[Equation]]
### 2.4 Hands-On: _Bakery Sales insights_
This activity had me updating the spreadsheet to identify highest units sold and highest revenue item. To do this I had to manually aggregate and sum the different categories together using formulas. Then I looked at total revenue which is also using formulas,
The key takeaway from this is that spreadsheet formulas are a tool to quickly process and analyze a dataset.
### 2.5 Video: _Spreadsheet errors and fixes_ (7:00)
Introduced the `#DIV/O` error
the solution to fix in spreadsheet is to use `IFERROR` function. [[IFERROR]]
Then we covered a bunch of errors.
>[! cue] [[Formula#Spreadsheet Error Codes|Common Spreadsheet Error Codes]]
#### common errors in spreadsheets
When using a formula or function sometimes errors occur. Each error code will indicate a different kind of problem
[[ERROR!]] : Unique to Google Sheets. This means the formula cannot be interpreted as entered
[[DIV 0!]]: This error indicates that a cell being referenced is trying to divide by zero.
[[NA Error]]: This means that data in a spreadsheet cannot be found.
[[NAME?]] : the formula name is entered incorrectly.
[[NUM!]]: A calculation cannot be executed as specified. Mixing numerical cells with strings might be the issue.
[[VALUE!]] : a general error the indicates an issue with the cell or cell references.
[[REF!]]: indicates that a referenced cell data has been deleted.
### 2.6 Reading: _More about errors and fixes_
>[! cue] [Google Sheets formula error guide](https://www.benlcollins.com/spreadsheets/formula-parse-error/) & [MS EXCEL error helper docs](https://support.microsoft.com/en-us/office/formulas-and-functions-294d9486-b332-48ed-b489-abe7d0f9eda9?ui=en-US&rs=en-US&ad=US#id0eaabaaa=errors)
The course provided a great PDF file (kept private) that goes into further detail about the above errors.
>[! tip] Use conditional formatting on a worksheet to reveal all cells with errors!
### 2.7 Hands-On: _Resolve spreadsheet errors_
Used a template spreadsheet that had a lot of errors. Practiced resolving the errors. A lot less stressful when you know what to look for. Like a programming language!
This is a connection that I personally never made. I am glad, again, that I am owning my beginner status.
### 2.8 Quiz: _test your knowledge on using formulas in spreadsheets_
4 questions, untimed, 75% minimum to proceed.
#### Results
100%
<hr>
## 3. [[Function]] in [[Spreadsheet|spreadsheets]]
### 3.1 Step-by-step: _Functions 101_
This article will be demonstrated in the following video.
Function reviewed.
>[! cue] [[Function#Spreadsheet functions|List Spreadsheet Functions]]
- [[AVERAGE]]
- [[IFERROR]]
- [[MAX]] | [[MIN]]
- [[SUM]]
### 3.2 Video: _Functions 101_ (6:00)
Practiced implementing the functions on a small dataset
### 3.3 Reading: _Quick Reference: Functions in spreadsheets_
for [[Cell reference|cell references]] there is a difference between absolute, relative and mixed.
>[! cue] Absolute, Relative, and Mixed references
- Relative reference is the default reference in spreadsheets. The _relative_ part means that if a formula is oringally written in H2 as `SUM(B2:D2)` and that formula and function is copied to an adjacent cell, the referenced range will adjust accordingly. This enables the analyst to quickly apply formula across rows or down columns. Copying the formula from H2 to the next row H3 will adjust the formula in that cell to `SUM(B3:D3)` Also, copying from H3 to I3 will adjust the formula as `SUM(C3:E3)` if that behavior is not desired the analyst then uses either absolute references or mixed references.
- absolute references are implemented by `
before the column letter and `
before the row number. An absolute reference will not change the cell reference when copied to another cell.
- `$B$2` is an absolute reference to Column B and Row 2
- Mixed reference is any combination of the two `$B2` where the absolute reference is on column B, but relative to row 2. Or, `B$2` where the absolute reference is on row 2 but relative to column.
>[! tip] use the `F4` key to quickly toggle reference styles on the range of cells when editing a formula.
>[! cue] [[COUNTIF]] Function
The COUNTIF function is a popular function to use where it counts the number of cells that meet a given criteria which is passed to the function as an argument.
### 3.4 Hands-On: _Create a custom data table_
Using a spreadsheet and functions to create a table of useful data.
- Dataset volume of 32.5k observations on 6 attributes, Higher than I expected for a practice activity.
- Understand the data: they define each of the attributes using a table. Much like a data dictionary. Column Name, Column Description, and Sample data all provided. Excellent metadata !
#### Scenario objectives
1. What is the total number of applications received each month?
1. Setup summary data tab
2. Create field Month and provide each of the month names; eg "January"
3. Extract month from date column in raw data sheet using the [[TEXT]] function in a new column named Month
1. Question, does this depends on if the date column is formatted as a datetime column ??
2. in this case `TEXT(A2, "mmmm")` takes the value in A2 `1/1/23 0:01` and converts it to `January` in the cell where we are using the function.
4. then on the summary worksheet, run a [[COUNTIF]] returning the total number of times the month name appears in the raw data data range on the column created in the previous step. Now we have summarized all of the data.
5. Confirm result by summing the total. should equal the number of rows we have. If not, the data needs further cleaning
1. Table data check verified.
2. How many applications were received? 32,596
3. In which months were the lowest and highest number of applications received? February (min), July (max)
4. What was the average number of applications received per month. 2716.3333
Once the data was organized it was very easy to get the answers needed. I wish I had this skill in a recent job application where I was not selected due to my low levels of spreadsheet knowledge.
### 3.5 Quiz: _Test your knowledge on using functions in spreadsheets_
4 questions. all on functions.
#### Results
100%
<hr>
## 4. Save time with [[Structured thinking|structured thinking]]
### 4.1 Video: _Before solving a problem, understand it_ (2:00)
>[! cue] Def [[Problem domain]]
This helps you understand the problem.
>[! info] Problem Domain
>The specific area of analysis that encompasses every activity affecting or affected by the problem
Great metaphor a jigsaw puzzle with no box
we need to train our brains to think structurally.
### 4.2 Video: _[[Scope of work (SOW)]] and [[Structured thinking]]_ (4:00)
>[! cue] Def [[Structured thinking]]
Structured thinking will enable you to understand problems at a high level. This will lead you to be more effective.
Clear list of what is being expected, a timeline for tasks and activities and checkpoints
Structured thinking is essential to success
Start with the [[Problem domain]] acting as the foundation.
Another way to practice structured thinking is to use a [[Scope of work (SOW)]]
what might be in a [[Scope of work (SOW)|SOW]]
-
Having a SOW will be a great to have in order to avoid set backs.
### 4.3 Interactive activity: _Creating a [[Scope of work (SOW)|SOW]]_
This is an interesting exercise where I needed to categorize different activities into [[Scope of work (SOW)|SOW]] categories;
- Deliverables
- The main goals/deliverables of the project. So, anything new trying to be added on to this would be "out of scope" and de-prioritized
- Timeline
- These are all activities that had a due date
- Milestones
- these were mostly things that needed to be confirmed
- Reports
- This were things like feedback, final lists, performance scores
[[Scope of work - Deliverables]]
[[Scope of work - Milestones]]
[[Scope of work - Timeline]]
[[Scope of work - Reports]]
### 4.4 Hands-On: _Create a [[Scope of work (SOW)|SOW]]_
- The point of data analysis projects is complete business tasks.
- SOW documents keep everyone on the same page.
- We want to define and establish what the problem is and ask questions. these questions inform the requirements for the project.
- No standard template for SOW
- essential components are
- [[Scope of work - Deliverables|sow-deliverables]]
- [[Scope of work - Milestones|sow-milestones]]
- [[Scope of work - Timeline|sow-timeline]]
- [[Scope of work - Reports|sow-reports]]
### 4.5 Video type: _Staying Objective_ (4:00)
- data needs context.
>[! cue] Def [[context]]
Numbers don't mean much without context.
Descriptive : What happened?
Diagnostic: Why did it happen?
Predictive: What will happen?
Prescriptive: How can we make it happen?
Each level becomes more and more valuable, harder to do (more technical), and more susceptible to bias or bad data.
If the analysis is not objective, the results will be misleading.
### 4.6 Reading: _The importance of [[context]]_
Context can turn raw data into meaningful information
Giving the data perspective by defining it.
- Who: who funded the data collection? who created the data?
- What: The things in the world this data could have an impact on.
- Where : The origin of the data
- When: The time when the data was created or collected
- Why: The motivation behind the collection of the data.
- How: The method used to create or collect it.
### 4.7 Self-Reflection: _Work with datasets_
In this activity I needed to create a [kaggle](https://www.kaggle.com) account and download a dataset.
Then I had to apply the [[SMART methodology]] and design an [[Scope of work (SOW)|SOW]] on what I would do.
### 4.8 Quiz: _Test your knowledge on [[Structured thinking]]_
4 questions untimed, 75% minimum
#### Result
100%. However, I had to open my notes and review some key terminology. Open notes are allowed, BTW.
[[Scope of work (SOW)|SOW]]
[[Structured thinking]]
[[SMART methodology]]
<hr>
## 5. Module 3 challenge
### 5.1 Glossary from Module 3
- [[AVERAGE]]
- [[Borders]]
- [[Cell reference]] | [[COUNT]]
- [[Equation]]
- [[Fill handle]] | [[Filtering]]
- [[Header]]
- [[Math expression]] | [[Math function]] | [[MAX]] | [[MIN]]
- [[Open data]] | [[Operator]] | [[Order of operations]]
- [[Problem domain]] | [[Range]] | [[Report]] | [[Return on investment (ROI)]] | [[Revenue]] | [[Scope of work (SOW)]] | [[Sorting]] | [[SUM]]
### 5.2 Module 3 challenge QUIZ
8 questions, timed. 80% higher needed to pass.
#### Results:
96.75% missed one .25 of a point on areas in a [[Scope of work (SOW)|scope of work]] document. Basically this was a _select all that apply_ question and I failed to select all of the correct responses.
<hr>
## Summary and takeaways
>[! summary-top] Module 3 Key Take Aways
>- [[Spreadsheet]] and [[Function#Spreadsheet functions|Spreadsheet functions]] make analysis tasks easy and help organize the data on a spreadsheet
>- [[Structured thinking]] , [[SMART methodology]], [[context]], and [[Scope of work (SOW)|SOW]] are all important tools to use in ASK phase of the analytical process.
>- Always identify the [[Problem domain]] to serve as foundation work in understanding and asking key questions.