Identify -> Collect -> Clean -> Analyze -> Interpret
Data Analysis Methods
Descriptive analysis – What happened
Predictive analysis – What will happen
Prescriptive analysis – What should be done
What are sampling techniques in Data Analysis?
Sampling is the practice of selecting a subset (a sample) of a population in order to draw conclusions about the whole population.
Types of sampling techniques
Random sampling – selects the participants randomly
Systematic sampling
Cluster sampling
Stratified sampling
Judgmental or purposive sampling
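The first four techniques listed above can be sketched with Python's standard library. This is only a minimal illustration, assuming a hypothetical population of 20 numbered participants; the names `population`, `strata`, and `clusters`, and the odd/even grouping, are made up for the example.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: participant IDs 1..20
population = list(range(1, 21))

# Random sampling: every participant has an equal chance of selection
random_sample = random.sample(population, 5)

# Systematic sampling: pick every k-th participant after a random start
k = 4
start = random.randrange(k)
systematic_sample = population[start::k]

# Stratified sampling: split into strata, then sample from every stratum
strata = {
    "odd": [p for p in population if p % 2 == 1],
    "even": [p for p in population if p % 2 == 0],
}
stratified_sample = [p for group in strata.values()
                     for p in random.sample(group, 2)]

# Cluster sampling: split into clusters, then keep whole clusters chosen at random
clusters = [population[i:i + 5] for i in range(0, 20, 5)]
cluster_sample = [p for c in random.sample(clusters, 2) for p in c]
```

Judgmental (purposive) sampling is left out because the selection depends on the analyst's own judgment rather than on a rule that can be coded.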
What is univariate Analysis?
A data analysis where the data being analyzed contains only one variable.
What is bivariate Analysis?
A data analysis that examines two variables to find the relationship between them.
What is multivariate Analysis?
An analysis of three or more variables to understand the relationship of each variable with the other variables
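The three kinds of analysis can be compared side by side in a short sketch. The data below is hypothetical, and the Pearson correlation coefficient is used here only as one possible measure of the relationship between variables; the notes do not prescribe a specific measure.

```python
from statistics import mean, stdev

# Hypothetical observations: three variables measured on the same five people
height = [150, 160, 170, 180, 190]   # cm
weight = [50, 58, 66, 74, 82]        # kg
age = [30, 25, 40, 35, 45]           # years

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Univariate: summarize one variable on its own
height_summary = (mean(height), stdev(height))

# Bivariate: relationship between two variables
r_height_weight = pearson(height, weight)

# Multivariate: relationship of each variable with every other variable
variables = {"height": height, "weight": weight, "age": age}
corr_matrix = {a: {b: pearson(x, y) for b, y in variables.items()}
               for a, x in variables.items()}
```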
How can you handle missing values?
What is your process for cleaning data?
Missing data
Duplicate data
Data from different sources
Structural errors
Outliers
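Several of the cleaning steps above (structural errors, duplicates, missing data) can be sketched on a small list of records. The records, field names, and the choice to fill a missing age with the mean are all assumptions made for illustration; other strategies, such as dropping the row, are equally valid.

```python
# Hypothetical raw records merged from two sources
raw = [
    {"name": " Alice ", "city": "NYC", "age": "30"},
    {"name": "alice", "city": "nyc", "age": "30"},     # duplicate once normalized
    {"name": "Bob", "city": "Boston", "age": None},    # missing value
    {"name": "Carol", "city": "boston", "age": "41"},
]

def normalize(record):
    """Fix structural errors: stray whitespace, inconsistent case, text numbers."""
    return {
        "name": record["name"].strip().title(),
        "city": record["city"].strip().upper(),
        "age": int(record["age"]) if record["age"] is not None else None,
    }

# Duplicate data: keep only the first record for each (name, city) pair
cleaned, seen = [], set()
for rec in map(normalize, raw):
    key = (rec["name"], rec["city"])
    if key in seen:
        continue
    seen.add(key)
    cleaned.append(rec)

# Missing data: fill with the mean of the known ages (one option of many)
known = [r["age"] for r in cleaned if r["age"] is not None]
fill = sum(known) / len(known)
for r in cleaned:
    if r["age"] is None:
        r["age"] = fill
```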
What is Quantitative data?
Quantitative data are measures of values or counts and are expressed as numbers.
Quantitative data are data about numeric variables (e.g. how many; how much; or how often).
What is Qualitative data?
Qualitative data are measures of ‘types’ and may be represented by a name, symbol, or number code.
Which validation methods are employed by data analysts?
Field Level Validation
Form Level Validation
Data Saving Validation
What is an outlier?
In data analytics, outliers are values within a dataset that vary greatly from the others.
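One common way to flag such values is the interquartile range (IQR) rule: anything more than 1.5 × IQR below the first quartile or above the third quartile is treated as an outlier. The sketch below assumes a hypothetical list of measurements; the 1.5 multiplier is a convention, not a requirement.

```python
from statistics import quantiles

# Hypothetical measurements with one suspicious value
data = [10, 12, 11, 13, 12, 11, 95]

q1, _, q3 = quantiles(data, n=4)   # first quartile, median, third quartile
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Values outside the fences are flagged as outliers
outliers = [x for x in data if x < lower or x > upper]
```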
Full form of BI tools
Business Intelligence (BI) tools
What are BI tools?
Business intelligence (BI) tools are types of application software that collect and process large amounts of unstructured data from internal and external systems
Which BI tools are available in MS Excel?
Tables, PivotTables, charts, conditional formatting, slicers, timelines, Power Pivot
What is Data Mining?
Data mining is the process of discovering relevant information in data that has not been identified before.
What is Diamond mining?
It is the act of digging into large amounts of unrefined ore to discover precious gems or nuggets.
What is Text Mining?
Text mining is the art and science of discovering knowledge, insights, and patterns from an organized collection of textual databases.
What is Web Mining?
Web mining is the art and science of discovering patterns and insights from the World Wide Web.
Types of Web Mining
Web content mining
Web structure mining
Web usage mining
What is Data profiling?
Data profiling is the process of examining, analyzing, and creating useful summaries of data.
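A profile is usually a per-column summary. The sketch below assumes a hypothetical dataset stored as one list per column, with `None` marking a missing value; the statistics chosen (count, missing, distinct, min, max) are just a typical starting set.

```python
# Hypothetical dataset: one list of values per column, None marks a missing value
columns = {
    "age": [34, 28, None, 45, 28],
    "city": ["Pune", "Surat", "Pune", None, "Delhi"],
}

def profile(values):
    """Summarize one column: row count, missing count, distinct values, range."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present),
        "max": max(present),
    }

summary = {name: profile(values) for name, values in columns.items()}
```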
What is Data Wrangling?
It is the process wherein raw data is cleaned, structured, and enriched into a desired usable format for better decision-making.
What is Cluster analysis?
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.
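The notes do not name a particular algorithm; k-means is used below purely as one well-known example of clustering. The sketch assumes hypothetical one-dimensional data and two clusters, with the first and last points taken as a naive starting guess for the centroids.

```python
# Hypothetical one-dimensional data, e.g. customer spend
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids = [points[0], points[-1]]   # naive initial guess: first and last points

for _ in range(10):  # a few refinement passes are enough for this toy data
    # Assignment step: attach each point to its nearest centroid
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        groups[nearest].append(p)
    # Update step: move each centroid to the mean of its group
    centroids = [sum(g) / len(g) for g in groups]
```

After the loop, `groups` holds the points split into a low group and a high group, and each point is closer to its own group's centroid than to the other one.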
What is Cohort analysis?
Cohort analysis is a kind of behavioral analytics that breaks the data in a data set into related groups before analysis.
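A typical use is retention by signup month. The sketch below assumes hypothetical activity events of the form (user, signup month, active month); the cohort definition and the retention measure are illustrative choices, not the only ones possible.

```python
from collections import defaultdict

# Hypothetical events: (user, signup month, month in which the user was active)
events = [
    ("u1", "2024-01", "2024-01"), ("u1", "2024-01", "2024-02"),
    ("u2", "2024-01", "2024-01"),
    ("u3", "2024-02", "2024-02"), ("u3", "2024-02", "2024-03"),
    ("u4", "2024-02", "2024-02"), ("u4", "2024-02", "2024-03"),
]

# Break the data into cohorts by signup month before analyzing activity
active = defaultdict(lambda: defaultdict(set))
for user, cohort, month in events:
    active[cohort][month].add(user)

# Share of each cohort that is active in each month
retention = {}
for cohort, months in active.items():
    size = len(months[cohort])   # users active in their signup month
    retention[cohort] = {m: len(users) / size
                         for m, users in sorted(months.items())}
```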
What is EDA (Exploratory Data Analysis)?
It refers to the critical process of performing initial investigations on data to discover patterns.
What is a Decision tree?
Decision trees are a simple way to guide one’s path to a decision. The decision may be a simple binary one, such as whether to approve a loan or not.
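The loan example can be written as a small tree of nested conditions. The function name `approve_loan`, the input fields, and the thresholds (a credit score of 700, an income of 50,000) are hypothetical and chosen only to show the shape of a tree; a real tree would be learned from data or set by policy.

```python
def approve_loan(income, credit_score, has_defaulted):
    """Walk a tiny hand-written decision tree; return True to approve."""
    if has_defaulted:              # root node: any past default?
        return False
    if credit_score >= 700:        # branch: good credit is enough on its own
        return True
    return income >= 50_000        # branch: weaker credit, rely on income
```

Each `if` is one node of the tree, and each `return` is a leaf holding the binary decision.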
What is Big Data?
Big data is an umbrella term for a collection of data sets so large and complex that it becomes difficult to process them using traditional data management tools.
Full form of KNN
K-nearest neighbors
Python libraries used in data analysis
NumPy
Bokeh
Matplotlib
Pandas
SciPy
scikit-learn
Explain Collaborative Filtering
Collaborative filtering makes recommendations based on user behavioral data. For example, online shopping sites use it when they show phrases such as “recommended for you”.
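A very small user-based version of the idea can be sketched as follows. The purchase history and the `recommend` function are hypothetical, and similarity is measured simply as the number of shared items, which is an assumption made to keep the example short.

```python
from collections import Counter

# Hypothetical purchase history per user
purchases = {
    "ann": {"laptop", "mouse", "keyboard"},
    "ben": {"laptop", "mouse", "monitor"},
    "cat": {"phone", "case"},
}

def recommend(user, top_n=2):
    """Suggest items bought by users whose purchases overlap with this user's."""
    own = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(own & items)       # similarity = number of shared items
        for item in items - own:         # only items the user does not have yet
            scores[item] += overlap
    return [item for item, score in scores.most_common(top_n) if score > 0]
```

Here `recommend("ann")` suggests the monitor, because ben bought the same laptop and mouse, while cat's purchases share nothing with ann's and contribute no recommendations.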
What is Predictive Accuracy?
Predictive Accuracy = Correct Predictions / Total Predictions
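The formula can be applied directly to two lists of labels. The labels below are hypothetical, and the function name `predictive_accuracy` is chosen for the example.

```python
def predictive_accuracy(actual, predicted):
    """Correct predictions divided by total predictions."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical labels for ten loan decisions
actual    = ["yes", "no", "yes", "yes", "no", "no",  "yes", "no", "yes", "no"]
predicted = ["yes", "no", "no",  "yes", "no", "yes", "yes", "no", "yes", "no"]

accuracy = predictive_accuracy(actual, predicted)   # 8 correct out of 10
```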
What is the maximum predictive accuracy?
100%
What is the minimum predictive accuracy considered acceptable for use?
70% (a rule of thumb; the acceptable threshold depends on the application)
How have you used Excel for data analysis in the past?
.
What is a VLOOKUP, and what are its limitations?
.
What is a pivot table, and how do you make one?
.
How do you find and remove duplicate data?
.
What is a Sparkline?
A sparkline is a tiny chart in a worksheet cell that provides a visual representation of data.
What are Slicers?
Slicers provide buttons that you can click to filter tables or PivotTables.
What is a Timeline in MS Excel?
Microsoft Excel’s timeline object is a dynamic filter option that filters PivotTables and PivotCharts by Date/Time values.
What is Power Pivot in MS Excel?
It is an Excel add-in you can use to perform powerful data analysis and create sophisticated data models.
Difference between a normal PivotTable and Power Pivot
A normal PivotTable lists only the fields of the single table or source it points to. Power Pivot gives access to any field in any table of the data model, and lets us analyze those fields based on the relationships defined between the tables.