In this article, we’ll break down the process of data analysis into steps and look at each one separately, but before that let’s define it.
What is Data Analysis?
We need to know exactly what data analysis is before we can understand the process.
Data analysis is the process of setting goals for what data you need and what questions you hope it will answer, collecting that information, and then inspecting and interpreting it to sort out the useful parts. The aim is to suggest conclusions and support decision making by various users.
It focuses on knowledge discovery for predictive and descriptive purposes, sometimes uncovering new trends and sometimes confirming or disproving existing ideas.
STEP 1: Setting of Goals
This is the first step in the data analysis procedure. It’s vital that clear, simple, short, and measurable goals are defined before any data collection begins.
These objectives might be set out in question format. For example, if your business is struggling to sell its products, some relevant questions may be, “Are we overpricing our goods?” and “How is the competition’s product different to ours?”
Asking these kinds of questions at the outset is vital because your collection of data will depend on the type of questions you have. So, to answer your question, “How is the competition’s product different to ours?” you will need to gather information from customers regarding what it is they prefer about the other company’s product, and also launch an investigation into their product’s specs.
To answer your question, “Are we overpricing our goods?” you will have to gather data regarding your production costs, as well as details about the price of similar goods on the market.
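As a rough illustration of how the overpricing question turns into numbers, here is a minimal Python sketch. Every figure in it is made up for the example, and the two helper functions (`price_gap_percent` and `margin_percent`) are hypothetical names, not part of any library. It compares your price with the median price of similar goods and works out your gross margin.

```python
from statistics import median

# Hypothetical figures for illustration only; none come from a real business.
our_price = 24.00                                  # what we charge per unit
our_unit_cost = 15.00                              # what one unit costs us to produce
competitor_prices = [19.50, 21.00, 22.50, 20.00]   # similar goods on the market

def price_gap_percent(price, market_prices):
    """Return how far `price` sits above (+) or below (-) the market median, in percent."""
    market_median = median(market_prices)
    return (price - market_median) / market_median * 100

def margin_percent(price, unit_cost):
    """Return the gross margin as a percentage of the selling price."""
    return (price - unit_cost) / price * 100

gap = price_gap_percent(our_price, competitor_prices)
margin = margin_percent(our_price, our_unit_cost)
print(f"Price vs. market median: {gap:+.1f}%")
print(f"Gross margin: {margin:.1f}%")
```

A large positive gap together with a comfortable margin would suggest there is room to lower the price, while a large gap with a thin margin points back at production costs instead.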
As you can appreciate, the type of data you’ll be collecting will differ hugely depending on what questions you need answered. Data analysis is a lengthy and sometimes costly procedure, so it’s essential that you don’t waste time and money by gathering data that isn’t relevant. It’s vital to ask the right questions so the data analysis team knows what information you need.
STEP 2: Clearly Setting Priorities for Measurement
Once your goals have been defined, your next step is to decide what it is you’re going to be measuring, and what methods you’ll use to measure it.
#Determine What You’re Going to be Measuring
At this point, you’ll need to determine exactly what type of data you’ll be needing in order to answer your questions. Let’s say you want to answer the question, “How can we cut down on the number of people we employ without a reduction in the quality of our product?”
The data you’ll need will be along these lines: the number of people the business is currently employing; how much the business pays these employees each month; other benefits the employees receive that are a cost to the company, such as meals or transport; the amount of time these employees are currently spending on actually making the product; and whether or not there are any redundant posts that may have been taken over by technology or mechanization.
As soon as the data surrounding the main question has been obtained, you’ll need to ask other, secondary, questions pertaining to the main one, such as, “Is every employee’s potential being used to the maximum?” and “Are there perhaps ways to increase productivity?”
#Choose a Measurement Method
It’s vital that you choose the criteria that will be used to measure the data you’re going to collect, because the way in which the data is collected will determine how it gets analyzed later.
You need to decide how much time you want to allow for the analysis project. You also need to know the units of measurement you’ll be using. For example, if you market your company’s product overseas, will your money measurements be in dollars or Kenyan shillings?
In terms of the employee question we discussed earlier, you would, for example, need to decide if you’re going to take the employees’ bonuses or their safety equipment costs into the picture or not.
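One way to keep these measurement decisions consistent is to write them down once, in code, before any data arrives. The sketch below is only an illustration: the exchange rate is a placeholder, and `to_usd`, `EmployeeCost`, and `monthly_cost` are hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Placeholder exchange rate for illustration only; look up the real rate
# for the period your data covers.
KES_PER_USD = 130.0

def to_usd(amount, currency):
    """Convert an amount to US dollars so all money figures share one unit."""
    if currency == "USD":
        return amount
    if currency == "KES":
        return amount / KES_PER_USD
    raise ValueError(f"No conversion rule for currency: {currency}")

@dataclass
class EmployeeCost:
    salary: float            # monthly salary
    benefits: float          # meals, transport, and similar
    bonus: float             # monthly share of bonuses
    safety_equipment: float  # monthly share of safety equipment costs

def monthly_cost(employee, include_bonus=True, include_safety=True):
    """Total monthly cost of one employee under the chosen measurement rules."""
    total = employee.salary + employee.benefits
    if include_bonus:
        total += employee.bonus
    if include_safety:
        total += employee.safety_equipment
    return total

# Made-up employee record, used only to show the two measurement choices.
worker = EmployeeCost(salary=1000.0, benefits=150.0, bonus=100.0, safety_equipment=50.0)
print(monthly_cost(worker))
print(monthly_cost(worker, include_bonus=False, include_safety=False))
print(to_usd(13000.0, "KES"))
```

Whichever options you pick, the point is that everyone on the project measures cost the same way from the start.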
STEP 3: Data Gathering
The next step of the data analysis procedure is the actual gathering of data. Now that you know your priorities and what it is that you’re going to be measuring, it’ll be much simpler to collect the information in an organized way.
There are a few things to bear in mind before gathering the data. Check whether any data is already available regarding the questions you have asked. There’s no point in duplicating work if there is already a record of, say, the number of employees the company has. You will also need to find a way of combining all the information you have.
Perhaps you’ve decided to gather employee information by using a survey. Think very carefully about what questions you put onto the survey before sending it out. It’s preferable not to send out lots of different surveys to your employees, but to gather all the necessary details the first time around.
Also, decide if you’re going to offer incentives for filling out the questionnaires to ensure you get the maximum amount of cooperation.
Data preparation involves gathering the data in, checking it for accuracy, and entering it into a computer to develop your database. You’ll need to ensure that you set up a proper procedure for logging the data that’s going to be coming in and for keeping tabs on it before you can do the actual analysis.
If you’ve gathered data to analyze whether your product is overpriced, for instance, check that the dates have been included, as prices and spending habits tend to fluctuate seasonally. Remember to ascertain what budget your company sets aside for data collection and analysis, as this will help you choose the most cost-efficient methods of collection to use.
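The date check mentioned above is easy to automate as part of logging incoming data. Here is a minimal sketch, assuming each record is a dictionary with a `date` field in `YYYY-MM-DD` form. The records and the `has_valid_date` helper are made up for the example.

```python
from datetime import date

# Made-up price records for illustration only.
records = [
    {"product": "A", "price": "19.99", "date": "2023-01-15"},
    {"product": "A", "price": "21.49", "date": ""},            # date left blank
    {"product": "B", "price": "9.50", "date": "15/02/2023"},   # wrong date format
]

def has_valid_date(record):
    """Return True if the record carries a date in YYYY-MM-DD form."""
    try:
        date.fromisoformat(record.get("date") or "")
        return True
    except ValueError:
        return False

# Collect the records that need to be followed up before analysis.
needs_follow_up = [r for r in records if not has_valid_date(r)]
print(f"{len(needs_follow_up)} of {len(records)} records have no usable date")
```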
STEP 4: Data Scrubbing/Cleansing
Data scrubbing, or cleansing, is the process where you’ll find, then amend or remove any incorrect or superfluous data. Some of the information that you’ve gathered may have been duplicated, it may be incomplete, or it may be redundant.
Since computers cannot reason as humans can, the data input needs to be of a high quality. For instance, a human may notice that a zip code on a customer survey is off by one digit, but a computer will not unless it has been given a rule to check against.
It helps to know the main sources of so-called “dirty data”. Poor data capture, such as typos, is one. Others include a lack of company-wide standards, missing data, different departments within the company each having their own separate databases, and old systems containing obsolete data.
The process involves identifying which data sources are not authoritative, measuring the quality of the data, checking for incompleteness or inconsistency, and cleaning up and formatting the data. The final stage in the process will be loading the cleaned information into the log or “data warehouse” as it’s sometimes called.
It’s vital that this process is done, as “junk data” will affect your decision making in the end. For instance, if half of your employees didn’t respond to your survey, that level of non-response needs to be taken into account. Finally, remember that data scrubbing is no substitute for getting good quality data in the first place.
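To make the scrubbing steps more concrete, here is a small sketch that tidies up a handful of survey rows: it normalizes the text, drops duplicates, and sets aside rows that are invalid or incomplete. The rows are invented, the five-digit zip rule is an assumption, and `normalize` and `scrub` are hypothetical helper names.

```python
import re

# Made-up survey rows for illustration only.
raw_rows = [
    {"name": "Ann Lee ", "zip": "90210", "rating": "4"},
    {"name": "ann lee", "zip": "90210", "rating": "4"},   # duplicate once normalized
    {"name": "Bob Ray", "zip": "9021", "rating": "5"},    # zip is one digit short
    {"name": "Cy Dunn", "zip": "10001", "rating": ""},    # incomplete: no rating
]

# Assumption: five-digit, US-style zip codes. A pattern like this catches a
# missing digit, but not a well-formed zip code that is simply the wrong one.
ZIP_PATTERN = re.compile(r"\d{5}")

def normalize(row):
    """Strip stray whitespace and give names a consistent capitalization."""
    cleaned = {key: value.strip() for key, value in row.items()}
    cleaned["name"] = cleaned["name"].title()
    return cleaned

def scrub(rows):
    """Split rows into clean ones and rejected ones, dropping duplicates."""
    clean, rejected, seen = [], [], set()
    for row in map(normalize, rows):
        key = (row["name"], row["zip"])
        if key in seen:
            continue  # already have this respondent
        seen.add(key)
        if not ZIP_PATTERN.fullmatch(row["zip"]):
            rejected.append((row, "invalid zip"))
        elif not row["rating"]:
            rejected.append((row, "missing rating"))
        else:
            clean.append(row)
    return clean, rejected

clean_rows, rejected_rows = scrub(raw_rows)
print(f"{len(clean_rows)} clean, {len(rejected_rows)} rejected")
```

Keeping the rejected rows, along with the reason for each, lets you go back to the source rather than silently losing information.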
STEP 5: Analysis of Data
Now that you have collected the data you need, it is time to analyze it. There are several methods you can use for this: for instance, data mining, business intelligence, data visualization, or exploratory data analysis. The latter is a way in which sets of information are analyzed to determine their distinct characteristics. In this way, the data can finally be used to test your original hypothesis.
Descriptive statistics is another method of analyzing your information. The data is examined to find what the major features are. An attempt is made to summarize the information that has been gathered. Under descriptive statistics, analysts will generally use some basic tools to help them make sense of what sometimes amounts to mountains of information.
The mean, or average, of a set of numbers can be used. This helps to determine the overall trend, and is easy and quick to calculate. On its own it can give a misleading picture, though, because a few extreme values can pull it away from what is typical, so other tools are also used.
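A quick sketch shows why the mean alone may not be enough. The salary figures below are invented for the example. The median is used here as one possible companion to the mean, although the article itself doesn’t name it.

```python
from statistics import mean, median

# Made-up monthly salaries; the last one is a deliberately extreme value.
salaries = [2000, 2100, 2200, 2300, 15000]

average = mean(salaries)
middle = median(salaries)

print(f"Mean:   {average}")
print(f"Median: {middle}")
```

Four of the five salaries sit close to 2,000, yet the single large value drags the mean far above them, while the median stays with the bulk of the data.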
Sample size determination is another of these tools. When you’re measuring information that has been gathered from a large workforce, for example, you may not need to use the information from every single member to get an accurate idea.
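The article doesn’t say how the sample size should be worked out. One common approach, shown here purely as an illustration, is Cochran’s formula with a finite population correction. The 95% confidence level, 5% margin of error, and the `sample_size` function name are all assumptions made for this sketch.

```python
import math
from statistics import NormalDist

def sample_size(population, confidence=0.95, margin_of_error=0.05, proportion=0.5):
    """Estimate how many people to sample from a population of a known size.

    Uses Cochran's formula with a finite population correction.
    `proportion=0.5` is the most cautious choice when the true split is unknown.
    """
    # z-score for a two-sided interval at the chosen confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for a very large (effectively unlimited) population
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Shrink it to reflect the finite size of the actual population
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# Hypothetical workforce of 2,000 employees.
print(sample_size(2000))
```

Under these assumptions, a survey of a 2,000-person workforce needs only a fraction of the staff to respond, which is the saving the paragraph above describes.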
STEP 6: Result Interpretation
Once the data has been sorted and analyzed, it can be interpreted. You will now be able to see if what has been collected is helpful in answering your original question. Does it help you with any objections that may have been raised initially?
Are any of the results limiting, or inconclusive? If this is the case, you may have to conduct further research. Have any new questions been revealed that weren’t obvious before?
If all your questions are dealt with by the data currently available, then your research can be considered complete and the data final. It may now be utilized for the purpose for which it was gathered: to help you make good decisions.
It is of paramount importance that the data you have gathered is carefully interpreted. It’s vital that your company has access to experts who can give you the correct results. Call us in case you require our service.