homepageose.blogg.se

Different statistical tools for data analysis

Some variables may not be related to the question the organization is asking; these can be set aside and used in future analysis. Understanding the data is essential for finding such variables. In the employee attrition example, there may be data related to the family, such as family members, years of experience in a previous company, social status, etc., and every variable has to be understood so that the data can be split in a way that answers the question.

How is the data cleaned? Data cleaning is the process of modifying the data: removing duplicate variables, creating dummy variables if needed, and deleting unwanted columns that are not related to the question. If the data is not cleaned properly, the model may be less accurate and the conclusions may be misleading. Once the cleaning is done, the right data to answer the question is ready.

Data manipulation can take many forms, such as plotting the data, creating pivot tables for the variables, computing correlations, running regressions, and detecting outliers. During manipulation, it may be necessary to proceed with the existing dataset, to remove some data, or to add more data to answer the question. After all these stages, the required data is ready for analysis.
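The cleaning steps described here (removing duplicates, deleting unrelated columns, creating dummy variables) and one of the manipulation steps (detecting outliers) can be sketched in plain Python. This is only a minimal illustration: the records, the column names, and the choice of the 1.5 x IQR rule for outliers are assumptions made for the example, not something the employee attrition case prescribes.

```python
from statistics import quantiles

# Hypothetical employee records; every column name and value is invented.
columns = ["emp_id", "department", "working_hours", "social_status"]
raw = [
    (1, "Sales", 40, "single"),
    (2, "HR", 42, "married"),
    (2, "HR", 42, "married"),  # exact duplicate of the previous row
    (3, "Sales", 41, "single"),
    (4, "IT", 39, "married"),
    (5, "IT", 38, "single"),
    (6, "HR", 43, "married"),
    (7, "Sales", 40, "single"),
    (8, "IT", 80, "single"),  # suspiciously high working hours
]
records = [dict(zip(columns, row)) for row in raw]


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def drop_columns(rows, unwanted):
    """Delete columns that are not related to the question."""
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


def add_dummies(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        result.append(new_row)
    return result


def iqr_outliers(values):
    """Return values outside 1.5 * IQR of the quartiles (an assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


# Treating social_status as unrelated to the question is an assumption here.
cleaned = add_dummies(
    drop_columns(drop_duplicates(records), {"social_status"}), "department"
)
outliers = iqr_outliers([row["working_hours"] for row in cleaned])
print(len(cleaned), "rows after cleaning; outlying working hours:", outliers)
```

Whether a flagged value should be removed or kept is a judgment call that depends on the question being asked; the sketch only points at it.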

Preparing the data for analysis is done after understanding the data. For that, we first need to study all the variables and determine whether each one is nominal or ordinal. While understanding the data, we learn about the data types, the rows and columns, the missing values, and which variables are independent and which are dependent.
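As a rough sketch of what this understanding step can look like in practice, the snippet below reports the number of rows and columns, then the stored type, missing-value count, measurement level, and role of each variable. The rows, the `ORDINAL_LEVELS` mapping, and the choice of `left` as the dependent variable are all hypothetical; in a real dataset these would come from knowledge of the data, not from code.

```python
# Hypothetical rows; None marks a missing value.
rows = [
    {"age": 29, "education": "Bachelor", "department": "Sales", "left": 0},
    {"age": None, "education": "Master", "department": "HR", "left": 1},
    {"age": 41, "education": "Doctorate", "department": None, "left": 0},
]

# Assumed measurement levels: ordinal categories have a meaningful order,
# while nominal categories (such as department) do not.
ORDINAL_LEVELS = {"education": ["Bachelor", "Master", "Doctorate"]}
TARGET = "left"  # assumed dependent variable (did the employee leave?)


def summarize(rows):
    """Describe each column: stored type, missing count, level, and role."""
    summary = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        if col in ORDINAL_LEVELS:
            level = "ordinal"
        elif present and all(isinstance(v, str) for v in present):
            level = "nominal"
        else:
            level = "numeric"  # stored as numbers; may still encode a category
        summary[col] = {
            "type": type(present[0]).__name__ if present else "unknown",
            "missing": len(values) - len(present),
            "level": level,
            "role": "dependent" if col == TARGET else "independent",
        }
    return summary


n_rows, n_cols = len(rows), len(rows[0])
summary = summarize(rows)
print(f"{n_rows} rows x {n_cols} columns")
for col, info in summary.items():
    print(col, info)
```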

Why do we need to understand the data? Once the data is collected, there may be many variables that relate directly or indirectly to the objective. Before collecting new data, identify the existing data that is already available in the database. Beyond that, collect whatever additional data is relevant to the objective. For example, in the employee attrition case, the data to be collected include experience in the company, working hours, educational qualification, distance from home, traveling hours, promotion, age of the employee, and increment or hike; these are important for finding the reasons for attrition. Some variables may already be available in the database, and any new variables that are needed can be added. Finally, organize the existing data together with the new data to proceed with the analysis.
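Organizing the existing data with the newly collected data usually comes down to joining the two on a shared key. Here is a minimal sketch, assuming both sources are keyed by an employee ID; the IDs, field names, and values are invented, and keeping only employees already in the database is one possible design choice among several.

```python
# Hypothetical data already available in the database, keyed by employee ID.
existing = {
    101: {"experience_years": 5, "age": 31},
    102: {"experience_years": 2, "age": 26},
}

# Hypothetical newly collected data for some of the same employees.
new_data = {
    101: {"working_hours": 42, "distance_from_home_km": 12},
    103: {"working_hours": 38, "distance_from_home_km": 3},
}


def merge_records(existing, new_data):
    """Keep every existing employee and attach new fields where available.

    Employees without new data get None for the new fields, so the gaps
    stay visible as missing values instead of being silently dropped.
    """
    new_fields = sorted({field for rec in new_data.values() for field in rec})
    merged = {}
    for emp_id, rec in existing.items():
        extra = new_data.get(emp_id, {})
        merged[emp_id] = {**rec, **{f: extra.get(f) for f in new_fields}}
    return merged


merged = merge_records(existing, new_data)
for emp_id, rec in merged.items():
    print(emp_id, rec)
```

Employee 103 appears only in the new data, so this version leaves that record out; if such records matter for the objective, the merge would need to iterate over the union of both key sets instead.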






