# Google Analytics - 1 Foundations - Module 2
## 0. Overview
>[! quote] In this part of the course, you'll learn about the data life cycle and data analysis process. They are both relevant to your work in this program and on the job. You’ll also be introduced to applications that help guide data through the data analysis process.
### Learning Objectives
>[! info] Module 2 Learning Objectives
> - [ ] Identify key software applications critical to the data analyst and their work; includes spreadsheets, databases, query languages, and visualization tools.
> - [ ] Identify relationships between the [[data analysis process]] and the courses in the [[Nexus - Google Analytics Certificate Course|Google Analytics Certificate]]
> - [ ] Explain the [[data analysis process]] making specific references to each phase.
> - [ ] Discuss the use of data in everyday life decisions
> - [ ] Discuss the role of the spreadsheet, query languages, and data visualization tools in [[data analytics]]
> - [ ] Discuss the phases of the [[data life cycle]]
---
### Glossary for Module 2
- [[Database]]
- [[Formula]]
- [[Function]]
- [[Query]]
- [[Query language]]
- [[stakeholder]]
- [[Structured Query Language]]
- [[Spreadsheet]]
- [[SQL]]
---
## 1. Follow the [[data life cycle]]
### Video: _Learn about data phases and tools_ (2:00)
Bringing data to life starts with the right tool: a [[Spreadsheet]], a [[Database]], a [[Query language]], or [[data visualizations]].
### Interactive activity: _Phases of data analysis_
This was a categorization exercise where students place cards in the correct order. Each card had a phase of the [[data analysis process]] on it. By now, it should be committed pretty firmly to memory. ==APP-ASA==
>[!cue] [[data analysis process|APP-ASA]]
The phases of the data analysis process are:
- [[data analysis process - Ask]]
- [[data analysis process - Prepare]]
- [[data analysis process - Process]]
- [[data analysis process - Analyze]]
- [[data analysis process - Share]]
- [[data analysis process - Act]]
That was it.
### Video: _Stages of the [[data life cycle]]_ (4:00)
>[!cue] [[data life cycle]] (0:42)
The data analysis process sits in the middle of the data life cycle, at phase 4, Analyze.
>[! info] Data Life Cycle
>1. Plan
>2. Capture
>3. Manage
>4. Analyze
>5. Archive
>6. Destroy
#### [[data life cycle - Plan]]
The planning phase is where a business determines what kinds of data it needs to achieve its goals.
- could they rely on their data strategy to inform the plan?
#### [[data life cycle - Capture]]
Once a business decides what data it needs, it moves to the capture phase. ~~the capture phase outlines how data is acquired, where the data is stored, and how the data is secured.~~
The capture phase is where data is collected from a variety of sources and then brought into the org.
E.g., from an outside source such as NOAA weather data, or from a company's own documents and files, usually stored in a database.
>[! cue] Def [[Database]]
>[! info] Database
>A collection of data stored in a computer system.
#### [[data life cycle - Manage]]
How we care for our data, how and where it is stored, the tools used to secure it and what actions are taken to ensure that it is maintained properly.
>[! tip] This is where my most recent experience was within data management. #til
#### [[data life cycle - Analyze]]
The data is now used to solve problems and the [[data analysis process]] would be applied.
#### [[data life cycle - Archive]]
Storing data in a place where it is still available but it might not be used again.
#### [[data life cycle - Destroy]]
The end of the life cycle is the destruction.
### Reading: _Variations of the [[data life cycle]]_
>[!cue] recap [[data life cycle]]
>[! warning] [[data life cycle]] <> [[data analysis process]]
>both have stages / phases but they are NOT interchangeable.
#### U.S. Fish and Wildlife
Their data life cycle, as presented:
1. Plan
2. Acquire
3. Maintain
4. Access
5. Evaluate
6. Archive
A few more were presented. See the [[data life cycle#Examples and variants|Variants]] section of the main definition.
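
To see how much two variants overlap, a quick sketch that compares the stage names from the course version with the U.S. Fish and Wildlife version listed above. The variable names are my own, and matching on names alone is a simplification: stages with different names (Capture vs Acquire, Manage vs Maintain) may still describe similar work.

```python
# Stage names copied from the two lists in these notes.
course_cycle = ["Plan", "Capture", "Manage", "Analyze", "Archive", "Destroy"]
usfws_cycle = ["Plan", "Acquire", "Maintain", "Access", "Evaluate", "Archive"]

# Stages that appear under the same name in both variants.
shared = [stage for stage in course_cycle if stage in usfws_cycle]

# Stages that appear in only one variant.
only_course = [stage for stage in course_cycle if stage not in usfws_cycle]
only_usfws = [stage for stage in usfws_cycle if stage not in course_cycle]

print("Shared:", shared)
print("Course only:", only_course)
print("USFWS only:", only_usfws)
```

Only Plan and Archive share a name, which is a good reminder that the life cycle is a general pattern and not a fixed list.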
### Practice Assignment: _Test your knowledge on the [[data life cycle]]_
This quiz requires 75%
I am practicing the methods suggested in [[Google Analytics - 1 Foundations - Module 1#Reading _Assessment taking strategies_|Assessment taking strategies]]
- read all the questions first.
- read them fully again before answering.
#### Result:
100%. No issues on this one. Had to remind myself of the activities in [[data life cycle - Manage]] vs [[data life cycle - Capture]].
---
## 2 Outline the [[data analysis process]]
### Video: _The phases of data analysis and this program_ (6:00)
>[! cue] [[data analysis process]] <> [[data life cycle]]
This program is designed to follow the steps in the data analysis **_process_**. Analysis is a process of analyzing data. Sounds similar to [[data life cycle]], but they are different.
#### [[data analysis process - Ask|Ask]]
>[! cue] Update def card.
We do (typically) two key things:
1. Define the problem to be solved.
    - Look at the current state and identify the obstacles in the way of some desired future state.
2. Fully understand stakeholder expectations for the project.
    - Determine who the stakeholders are.
    - They could be managers, other business leaders, etc.
>[! cue] Def [[stakeholder]]
>[! info] Stakeholders
>People who invest time and resources into a project and are interested in its outcome.
==Maintaining communication with stakeholders ensures that you stay engaged in the process. Developing strong communication strategies is key to success over a long-term career.== This is a [[_Self Assessments]] component.
This part of the ask phase helps analysts keep focused on the problem
- [[root cause|5-why's]] are helpful here.
- Will learn how to ask effective questions, plus strategies for sharing what is discovered in a way that keeps people interested.
#### [[data analysis process - Prepare|Prepare]]
>[! cue] Update def card.
Where analysts collect and store the data to be used.
Will learn how to identify the types of data that will be useful for solving a particular problem.
Making fair and impartial insights from data can be hard. We will learn how to identify and avoid bias in the future.
#### [[data analysis process - Process|Process]]
>[! cue] Update def card.
Find and eliminate errors/inaccuracies
Done by:
- cleaning data
- Transforming the dataset into a more useful format
- combining two or more datasets.
- removing outliers
Then will learn how to check the output data to make sure it is complete and correct.
- fix typos
- investigate inconsistencies
- deal with missing or inaccurate information/statements/conclusions.
Then will learn how to verify the data cleaning and share the results with the [[stakeholder]] group.
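
A minimal sketch of what the cleaning steps above could look like in code. Everything here is an assumption for illustration: the sample rows, the `clean` helper, and the plausible temperature range are all made up and do not come from the course.

```python
# Hypothetical raw records; none of these values come from the course.
raw_rows = [
    {"city": "Denver", "temp_f": "71"},
    {"city": "denver ", "temp_f": "71"},   # duplicate with inconsistent formatting
    {"city": "Boise", "temp_f": ""},       # missing value
    {"city": "Austin", "temp_f": "950"},   # implausible outlier (likely a typo)
    {"city": "Salem", "temp_f": "64"},
]


def clean(rows, low=-80.0, high=140.0):
    """Return de-duplicated rows with valid temperatures, plus the rows set aside."""
    cleaned, rejected, seen = [], [], set()
    for row in rows:
        city = row["city"].strip().title()      # fix inconsistent spacing and case
        text = row["temp_f"].strip()
        if not text:                            # missing value: set aside to investigate
            rejected.append((city, "missing temp_f"))
            continue
        temp = float(text)
        if not low <= temp <= high:             # outside the assumed plausible range
            rejected.append((city, "outlier"))
            continue
        if (city, temp) in seen:                # duplicate once formatting is normalized
            continue
        seen.add((city, temp))
        cleaned.append({"city": city, "temp_f": temp})
    return cleaned, rejected


cleaned_rows, rejected_rows = clean(raw_rows)
print(cleaned_rows)
print(rejected_rows)
```

Keeping the rejected rows, instead of silently dropping them, gives something concrete to verify and share with the stakeholder group.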
#### [[data analysis process - Analyze|Analyze]]
>[! cue] Update def card.
- Involves using tools to _do_ the analysis:
    - spreadsheets
    - SQL
#### [[data analysis process - Share|Share]]
>[! cue] Update def card.
In this course, we will learn how to interpret the results from [[data analysis process - Analyze|Analyze]] and how to share them with others to make effective data-driven decisions.
[[data visualizations]] are an essential tool here.
- create compelling slideshows
- be fully prepared to answer questions.
Then will take a break from [[data analysis process]] to learn [[R Programming]]
#### [[data analysis process - Act|Act]]
>[! cue] Update def card.
In this course ... prepare for job search and complete a case study project.
### Reading: _More on the phases of data analysis and this program_
The reading gives more detail on the course's structure and how it relates directly to each phase of the [[data analysis process]].
### Video: _Molly: Example of the data analysis process_ (6:00)
The process is essentially the same for all the types of analysis.
The presenter, Molly, goes through an example of the process.
- All steps are critical
- The process phase is where we meet the data for the first time on its own terms.
- The purpose of analysis is to answer the questions that were considered ahead of time.
- Sharing tells the data's story, not the analyst's.
### Discussion prompt: _Consider importance_
The prompt asks you to consider whether any one step is more important than the others, and to justify the answer.
>[! quote] My response
>I have slowed down to allow the material to sink in fully. I think this question is a good opportunity to reflect on mistakes I have made and what I have learned from this course so far.
>
>At my last organization, our process was represented as a flywheel. It was understood that the enactment of findings only signaled the beginning of the next round of research! Although I understood the flywheel, I struggled to execute projects successfully. In practice, I found myself answering too many questions. Questions, that were, never asked. Ha! I must have been such a headache to my team. Now I know that greater care should have been spent in the ASK and Prepare phase.
>
>Take this anecdote as a cautionary tale, though. Hyperfocus or being the best "slicer and dicer" of a dataset does not make your project effective. I now believe that one should consider each step in the process of vital importance. Each phase deserves careful consideration and careful execution especially when company resources are involved.
### Practice assignment: _Test your knowledge on the [[data analysis process]]_
There were only four questions. All were about the process discussed so far. This time, though, the questions are focussed on some of the refined definitions or examples of the process.
#### Result
100%. ;)
Slowing down the learning process really is helpful for me.
Being able to pause videos and replay points so that I can make notes that are clear is very very helpful.
---
## 3 The data analysis toolbox
### Video: _Explore data analyst tools_
Common analyst tool categories.
- spreadsheets
- Query languages
- Visualization tools.
#### Spreadsheets
>[! cue] Def: [[Spreadsheet]]
>[! info] Spreadsheet
>A digital worksheet
Spreadsheets give data tabular structure
Spreadsheets have some useful features: [[Formula]] and [[Function]].
>[! cue] Def [[Formula]]
>[! info] Formula
>A set of instructions used to perform a calculation using the data in a [[Spreadsheet]]
>[! cue] Def [[Function]]
>[! info] Function
>A preset command that automatically performs a specified process or task using the data in a [[Spreadsheet]]
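
A rough Python analogy for the formula vs function distinction, with made-up values standing in for cells A1:A4. In a spreadsheet, the formula version would look like `=(A1+A2+A3+A4)/4` and the function version like `=AVERAGE(A1:A4)`.

```python
from statistics import mean

# Hypothetical values standing in for spreadsheet cells A1:A4.
cells = [120.0, 80.0, 100.0, 60.0]

# "Formula" style: the calculation is written out step by step.
formula_average = (cells[0] + cells[1] + cells[2] + cells[3]) / 4

# "Function" style: a preset command performs the same task in one call.
function_average = mean(cells)

print(formula_average, function_average)
```

Both give the same answer. The function is shorter and keeps working if the range grows, while the formula has to be rewritten by hand.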
#### Query language
>[! cue] Def [[Query language]]
>[! info] Query language
>A computer programming language that allows you to retrieve and manipulate data from a [[Database]].
>
We will learn [[SQL]], a widely used [[Query language]], and make requests to a [[Database]] using a [[Query]].
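
A small sketch of what a query looks like in practice, using Python's built-in `sqlite3` module and an in-memory database. The `orders` table and its rows are invented for illustration. The course may use a different database tool, so treat this only as a way to try SQL locally.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("Ana", 25.0), ("Ben", 12.5), ("Ana", 40.0)],  # hypothetical sample data
)

# The query: a request for specific data, written in SQL.
query = """
    SELECT customer, SUM(total) AS spent
    FROM orders
    GROUP BY customer
    ORDER BY spent DESC
"""
results = conn.execute(query).fetchall()
print(results)
conn.close()
```

The query both retrieves and manipulates the data: it groups the rows by customer and totals each group before returning anything.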
#### Visualization
After preparing, processing, and analyzing the data, analysts share the insights by using [[data visualizations]].
>[! cue] Data viz tools
[[Tableau]] and [[Looker]] are popular visualization tools.
- Tableau is popular given the simple drag-drop interface
- Looker is popular because it is an easy way to create visuals based on the results of a query.
### Reading: _Key data analyst tools_
More detail about the topics covered in the video.
Python was mentioned here in passing as it is out of scope for this course.
### Reading: _Choose the right tool for the job_
As an analyst, you will have to decide which program or solution is right for a particular project.
>[! cue] Choosing the right tool.
#### Tool comparison [[Spreadsheet|spreadsheet]] vs [[Database]]
| Dimension/Feature                 | Spreadsheet                                                     | Database                                                      |
| --------------------------------- | --------------------------------------------------------------- | ------------------------------------------------------------- |
| How data is accessed | through software application | [[Query language]] |
| How data is structured | row x column on a worksheet | using rules and relationships |
| How data is organized             | organized into cells within a worksheet                         | via rules and relationships, organized in complex collections |
| How much data can connect to tool | limited to what is passed to application, usually small amounts | Can be huge amounts of data |
| How is data entered | manually, any data (string, number) can go into any cell | strict and consistent data entry |
| How many concurrent users | one user at a time | multiple users |
| How is the tool managed | controlled by user/author of worksheet/workbook | DBMS |
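
To make the "How is data entered" row concrete, a sketch contrasting a spreadsheet-like grid with a database table. The grid is just a Python list, and the `stock` table with its `CHECK` rule is a made-up example using the built-in `sqlite3` module. Other database systems enforce data types in their own ways.

```python
import sqlite3

# Spreadsheet-like grid: any value can go into any cell.
sheet = [["item", "qty"], ["pens", 10], ["paper", "ten"]]  # "ten" is accepted silently

# Database table: a rule (CHECK constraint) rejects inconsistent entries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (item TEXT NOT NULL, qty INTEGER CHECK (typeof(qty) = 'integer'))"
)
conn.execute("INSERT INTO stock VALUES (?, ?)", ("pens", 10))

try:
    conn.execute("INSERT INTO stock VALUES (?, ?)", ("paper", "ten"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the database refused the text value in a numeric column

row_count = conn.execute("SELECT COUNT(*) FROM stock").fetchone()[0]
print(rejected, row_count)
conn.close()
```

The grid happily stores the bad value, while the table keeps only the valid row. That is the "strict and consistent data entry" trade-off from the table above.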
### Practice assignment: _Self-reflection: Review past concepts_
>[!cue] [[data life cycle]] vs [[data analysis process]]
This is a writing prompt exercise requiring at least 40 word responses to each question.
The questions are designed in the spirit of a retrospective or self-reflection. The content of the questions focuses on the [[data life cycle]] and the [[data analysis process]].
This is the kind of prompt I like to write about.
Topics: the differences and similarities between [[data analysis process - Ask|Ask]] and [[data life cycle - Plan|Plan]], and the relationship between the two. Likewise for the [[data analysis process]] vs the [[data life cycle]].
I struggled with the precise meaning of `relationship`. Anyway, I will reserve my response because I do not wish to inadvertently reveal the questions. Good food for thought.
### Practice assignment: _Test your knowledge on the data analysis toolbox_
This is a 4-question test on terminology covered in this module.
#### Result
100% :)
---
## 4 Module 2 challenge.
### Glossary for Module 2
- [[Database]]
- [[Formula]]
- [[Function]]
- [[Query]]
- [[Query language]]
- [[stakeholder]]
- [[Structured Query Language]]
- [[Spreadsheet]]
- [[SQL]]
### Quiz
Timed, 50 minutes. 80% required to pass. 10 questions, open book.
#### Results
100%
## 5. Retro
The quiz was not challenging since the material is so very fresh to me. I took the time to meticulously review my documents because I want to build a solid foundation for this program. Really keying into the concepts of the [[data analysis process]] and now the [[data life cycle]].
New to me are the terms in [[data life cycle]]! I keep seeing that terminology in job postings over and over again. Combined with the [[data analysis process]], it can make the challenge of being a relatively inexperienced data analyst less daunting.
<hr>
>[!summary-top] Module 2 key takeaways
>1. [[data life cycle]]; Plan, Capture, Manage, Analyze, Archive, Destroy
>2. [[data analysis process]]; Ask, Prepare, Process, Analyze, Share, Act.
>3. [[data life cycle]] <> [[data analysis process]]
>4. Data Tools: [[Spreadsheet]] vs [[Database]]