Sometimes when people hear the words ‘data analysis’ it sounds a little scary to them, but wipe the sweat off your forehead and read on because I’m about to tell you how to tackle any analysis in just four steps.
It really doesn’t matter whether you’ve been asked to do a short data analysis or a lengthier one; the steps are still the same! Here they are:
- The Data Request
- The Data Pull
- The Datamining
- The Storytelling
Step #1 – The Data Request
Ask any data analyst and you’ll hear that data requests come from everywhere…and often! They come in every way possible. They come via email, phone, text and (for those analysts not yet working remotely) by someone stopping into their offices. Years ago when I was not yet working remotely, I got many data requests by daring to walk down the hallway or by running into someone in the cafeteria! Data Analysts are definitely popular and people know they are in competition with each other to get a piece of the analyst’s time. So…they will almost tackle you when they see you! :)
No matter how the request comes to you (email…I highly recommend email!), there are several pieces of information that you’ll want to gather from the requestor. Let’s look at what you’ll need to get started:
Key Information to Gather From the Requestor |
---|
– The question(s) the requestor is trying to answer through data |
– The study period to be used |
– The date the report is needed |
– The type of deliverable needed |
– Who the Stakeholders are |
It’s critical that you really get to the root of what the requestor is trying to learn, prove, disprove, etc. from the data. Once you have a clear understanding of that, you will avoid wasting time going off down rabbit trails that lead somewhere your requestor doesn’t want to go.
Having the study period is also critical, so that you ensure you’re reporting on the correct block of time. Obviously, the due date is important to know as well, so that you can plan your work accordingly. As a data analyst, you’re juggling lots of projects at once. Your calendar will keep you on track and be your best friend!
The requestor will likely know what form of deliverable is needed, but may need a little direction from you. Maybe he/she will be giving a presentation on the report and a PowerPoint slide deck would be best. Possibly, the report will be discussed in a meeting and an Excel document is better suited. If there’s a lot of data and time is short for discussion, it may be best to build the results into a one-page dashboard format.
It’s very important to know who the Stakeholders are for the request. These will likely be other team members who can provide input, review findings and answer questions for you as you move through the analysis process. You’ll also want a list of these people so that you can keep them all informed if there are any issues or delays with delivering the report by the agreed-upon deadline.
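If it helps to see the checklist from this step as something concrete, here is a minimal sketch in Python. The `DataRequest` class, its field names and the sample dates are all hypothetical; they are just one way to make sure nothing from the list above gets forgotten before you start.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataRequest:
    """Hypothetical record of what to gather from the requestor."""
    questions: list[str]
    study_start: date
    study_end: date
    due_date: date
    deliverable: str
    stakeholders: list[str] = field(default_factory=list)

    def missing_items(self) -> list[str]:
        """Return the names of any items that still need to be gathered."""
        missing = []
        if not self.questions:
            missing.append("questions")
        if self.study_end < self.study_start:
            missing.append("valid study period")
        if not self.deliverable:
            missing.append("deliverable")
        if not self.stakeholders:
            missing.append("stakeholders")
        return missing


# Made-up example: everything is filled in except the stakeholders.
request = DataRequest(
    questions=["What was the average daily occupancy rate?"],
    study_start=date(2023, 1, 1),
    study_end=date(2023, 3, 31),
    due_date=date(2023, 4, 14),
    deliverable="one-page dashboard",
)
print(request.missing_items())
```

Anything the check returns is something to go back and ask the requestor about before the work begins.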
Step #2 – The Data Pull
Once you have the information from Step #1, it’s time to do the data pull. For this step of the process, you’ll use your data source (database, data warehouse, data file, etc.) and you’ll write queries to extract the specific data needed to answer the question or questions posed in your data request.
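To make the idea of a data pull a little more concrete, here is a small sketch using Python’s built-in `sqlite3` module. The `stays` table, its columns and every value in it are invented for illustration; your own data source and query will look different. The point is simply that the study period from Step #1 becomes a filter in the query.

```python
import sqlite3

# An in-memory database stands in for the real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stays (stay_date TEXT, hotel TEXT, rooms_sold INTEGER)")
conn.executemany(
    "INSERT INTO stays VALUES (?, ?, ?)",
    [
        ("2022-12-31", "Downtown", 80),
        ("2023-01-15", "Downtown", 90),
        ("2023-02-10", "Airport", 60),
        ("2023-04-01", "Airport", 70),
    ],
)

# The study period from the data request drives the WHERE clause.
study_start, study_end = "2023-01-01", "2023-03-31"
rows = conn.execute(
    "SELECT stay_date, hotel, rooms_sold FROM stays "
    "WHERE stay_date BETWEEN ? AND ? ORDER BY stay_date",
    (study_start, study_end),
).fetchall()
conn.close()

print(rows)
```

Only the two rows that fall inside the study period come back; the December and April rows are left out.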
Sometimes, it may be necessary to pull data from multiple sources in order to get everything needed to satisfy the request. Once you have all data needed, you’ll move through the Data Wrangling process.
Data Wrangling consists of six phases. You’ll move through each of them in order as you prepare your data. The six phases are: Discovering, Structuring, Cleaning, Enriching, Validating and Publishing.
Discovering your data is the work of finding the location of all the data elements you need for your study. It may be necessary to use more than one data source to capture all the data needed.
Structuring your data is the process of taking a large set of data and breaking it into separate parts, or tables, that allow you to move through your data analysis work much more easily. It also avoids the data duplication that can occur when too many data fields live inside one data table.
Cleaning your data is critical. Clean data is the most critical asset an organization has to guide its management and, thereby, its success. So, the accuracy of the data should be a top priority for all team members. You’ve likely heard the phrase “Garbage In, Garbage Out”, right? This phrase was coined in reference to data. The cleanliness of an organization’s data depends first on the accuracy of the people entering the data in the beginning, and then ultimately on the analyst using the data. Taking the time upfront to ensure you don’t have missing data, duplicate data, incorrectly entered data, etc. will smooth out and speed up your data analysis while also ensuring the validity of your report.
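Here is a rough sketch of what structuring can look like, in plain Python. The flat rows and their field names are made up; the idea is that hotel details repeated on every row get moved into their own table, and the stay records keep only a reference to the hotel.

```python
# Hypothetical flat extract: hotel details are repeated on every row.
flat_rows = [
    {"hotel_id": 1, "hotel_name": "Downtown", "city": "Austin",
     "stay_date": "2023-01-15", "rooms_sold": 90},
    {"hotel_id": 1, "hotel_name": "Downtown", "city": "Austin",
     "stay_date": "2023-01-16", "rooms_sold": 85},
    {"hotel_id": 2, "hotel_name": "Airport", "city": "Austin",
     "stay_date": "2023-01-15", "rooms_sold": 60},
]

hotels = {}  # one entry per hotel, keyed by hotel_id
stays = []   # one entry per stay record, pointing at a hotel by id

for row in flat_rows:
    # Each hotel's details are stored once, however many rows mention it.
    hotels[row["hotel_id"]] = {"hotel_name": row["hotel_name"], "city": row["city"]}
    stays.append({
        "hotel_id": row["hotel_id"],
        "stay_date": row["stay_date"],
        "rooms_sold": row["rooms_sold"],
    })

print(hotels)
print(stays)
```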
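As a small illustration of those three problems (missing, duplicate and incorrectly entered data), here is a sketch in plain Python. The records and the rule that a negative room count is an entry error are assumptions for the example; your own cleaning rules will depend on your data.

```python
# Made-up records containing the three kinds of problems described above.
raw = [
    {"stay_date": "2023-01-15", "hotel": "Downtown", "rooms_sold": 90},
    {"stay_date": "2023-01-15", "hotel": "Downtown", "rooms_sold": 90},    # duplicate
    {"stay_date": "2023-01-16", "hotel": "Downtown", "rooms_sold": None},  # missing
    {"stay_date": "2023-01-16", "hotel": "Airport", "rooms_sold": -5},     # entry error
    {"stay_date": "2023-01-17", "hotel": "Airport", "rooms_sold": 60},
]


def clean(records):
    """Split records into those kept and those rejected, with a reason."""
    seen = set()
    kept, rejected = [], []
    for rec in records:
        key = (rec["stay_date"], rec["hotel"], rec["rooms_sold"])
        if key in seen:
            rejected.append((rec, "duplicate"))
        elif rec["rooms_sold"] is None:
            rejected.append((rec, "missing value"))
        elif rec["rooms_sold"] < 0:
            rejected.append((rec, "negative count"))
        else:
            kept.append(rec)
        seen.add(key)
    return kept, rejected


kept, rejected = clean(raw)
print(len(kept), "kept;", len(rejected), "rejected")
```

Keeping the rejected rows along with a reason, rather than silently dropping them, makes it easier to go back to the data owner and get the source fixed.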
Enriching your data is the process of finding additional data in another source that can add value to your study. For example, if you’re preparing a report for a hotel chain and one of the metrics is the average daily occupancy rate, it would make your report much more meaningful if you can find a benchmark value that reflects the national average daily occupancy rate. This quickly gives your data more meaning because your readers can compare it to what’s going on in the nation.
Validating your data simply means building several checks into your process to ensure accuracy and reasonability. A few examples of these data checks are format checks, code checks, and data type checks.
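Sticking with the hotel example, here is a quick sketch of that comparison in Python. Every number in it is a placeholder: `ROOMS_AVAILABLE` and the daily counts are invented, and `NATIONAL_BENCHMARK` is not a real statistic, so you would swap in a figure from a source you trust.

```python
from statistics import mean

ROOMS_AVAILABLE = 100      # hypothetical number of rooms available each day
NATIONAL_BENCHMARK = 0.63  # placeholder only, NOT a real national figure

daily_rooms_sold = [90, 85, 60, 70]  # made-up daily counts

# Daily occupancy rate = rooms sold / rooms available.
daily_rates = [sold / ROOMS_AVAILABLE for sold in daily_rooms_sold]
avg_rate = mean(daily_rates)
difference = avg_rate - NATIONAL_BENCHMARK

print(f"Average daily occupancy: {avg_rate:.1%}")
print(f"Difference from benchmark: {difference:+.1%}")
```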
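Here is a minimal sketch of those three kinds of checks in Python. The field names, the `YYYY-MM-DD` date format and the list of valid hotel codes are all assumptions made for the example.

```python
import re

VALID_HOTEL_CODES = {"DT", "AP"}  # hypothetical list of accepted codes
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")  # assumed YYYY-MM-DD format


def validate(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    # Format check: the date must look like YYYY-MM-DD.
    if not DATE_PATTERN.fullmatch(str(record.get("stay_date", ""))):
        problems.append("format check: stay_date is not YYYY-MM-DD")
    # Code check: the hotel code must be one of the known codes.
    if record.get("hotel_code") not in VALID_HOTEL_CODES:
        problems.append("code check: unknown hotel_code")
    # Data type check: the room count must be an integer.
    if not isinstance(record.get("rooms_sold"), int):
        problems.append("data type check: rooms_sold is not an integer")
    return problems


good = {"stay_date": "2023-01-15", "hotel_code": "DT", "rooms_sold": 90}
bad = {"stay_date": "01/15/2023", "hotel_code": "XX", "rooms_sold": "90"}

print(validate(good))
print(validate(bad))
```

The first record passes all three checks, while the second fails each one.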
Publishing your data is preparing your data for use. Once you’ve moved through the other steps in the Data Wrangling process, you should feel very comfortable that your data is complete, clean, correct and ready for use.
Step #3 – The Datamining
The Datamining process is the point where an analyst really starts to perform the data analysis on the data set and pulls out the requested data, along with any other points of importance found along the data journey.
You’ll start at a high level with summarized data that will help steer your analysis. Each level of datamining takes you closer to the ultimate answer that you’re looking for. Just as an archeologist digs for treasures, a data analyst digs for answers.
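To show what digging down level by level can look like, here is a short Python sketch. The records and the `total_by` helper are hypothetical; the pattern is to start with one overall number, then break it down by hotel, then by hotel and month.

```python
from collections import defaultdict

# Made-up records for illustration.
stays = [
    {"hotel": "Downtown", "month": "2023-01", "rooms_sold": 90},
    {"hotel": "Downtown", "month": "2023-02", "rooms_sold": 85},
    {"hotel": "Airport", "month": "2023-01", "rooms_sold": 60},
    {"hotel": "Airport", "month": "2023-02", "rooms_sold": 40},
]


def total_by(records, *keys):
    """Sum rooms_sold for each combination of the given key fields."""
    totals = defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group] += rec["rooms_sold"]
    return dict(totals)


# Level 1: one overall number.
overall = sum(rec["rooms_sold"] for rec in stays)
# Level 2: the same total, broken down by hotel.
by_hotel = total_by(stays, "hotel")
# Level 3: broken down by hotel and month.
by_hotel_month = total_by(stays, "hotel", "month")

print(overall)
print(by_hotel)
print(by_hotel_month)
```

In this made-up data, the hotel-level view shows the Airport location trailing, and the month-level view shows the drop happened in February. That is the kind of finding each deeper level is meant to surface.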
Step #4 – The Storytelling
Once all the necessary data has been extracted and evaluated, the analyst will begin to prepare a visual story. This is the deliverable that we discussed earlier. Remember, this story is sometimes presented via a graphical slide deck presentation, other times in a document for print, or possibly put into a one-page dashboard or scorecard, and still other times simply placed in a spreadsheet format.
No matter the deliverable type, you should always take the reader or audience on a short journey through your datamining process and allow them to see how you came to the conclusions in the report. This is extremely helpful and will increase your reader’s comprehension of the information delivered.
There you have it! The process of performing a data analysis. Not so scary, right? I go into lots more detail in my free webinar.
Let me know if I can help,
If you’d like to learn more about life as a Data Analyst and how to break into this amazing career path, check out my free training – The Making of a Data Analyst.