Business Analytics, 2ed: The Science of Data-Driven Decision Making
ISBN: 9789354246197
648 pages
Description
Business Analytics has become one of the most important skills that every student of Management and Engineering should acquire to succeed in their careers. The use of analytics across industries for decision making, problem solving, and driving organizational innovation makes it an essential skill to develop. Many successful companies use analytics as a competitive strategy.
1. Introduction to Business Analytics
1.1 Introduction to Business Analytics
1.2 Analytics Landscape
1.3 Why Analytics
1.4 Business Analytics: The Science of Data-Driven Decision Making
1.5 Descriptive Analytics
1.6 Predictive Analytics
1.7 Prescriptive Analytics
1.8 Descriptive, Predictive, and Prescriptive Analytics Techniques
1.9 Big Data Analytics
2. Foundations of Data Science: Descriptive Analytics
2.1 Introduction to Descriptive Analytics
2.2 Data Types and Scales of Variable Measurement
2.3 Types of Variable Measurement Scales
2.4 Population and Sample
2.5 Measures of Central Tendency
2.6 Percentile, Decile and Quartile
2.7 Measures of Variation
2.8 Measures of Shape – Skewness and Kurtosis
2.9 Data Visualization
2.10 Feature Engineering Using Visualization
3. Introduction to Probability
3.1 Introduction to Probability Theory
3.2 Probability Theory – Terminology
3.3 Fundamental Concepts in Probability – Axioms of Probability
3.4 Application of Simple Probability Rules – Association Rule Learning
3.5 Bayes’ Theorem
3.6 Random Variables
3.7 Probability Density Function and Cumulative Distribution Function of a Continuous Random Variable
3.8 Binomial Distribution
3.9 Poisson Distribution
3.10 Geometric Distribution
3.11 Parameters of Continuous Distributions
3.12 Uniform Distribution
3.13 Exponential Distribution
3.14 Normal Distribution
3.15 Chi-Square Distribution
3.16 Student’s t-Distribution
3.17 F-Distribution
4. Sampling and Estimation
4.1 Introduction to Sampling
4.2 Population Parameters and Sample Statistic
4.3 Sampling
4.4 Probabilistic Sampling
4.5 Non-probability Sampling
4.6 Sampling Distribution
4.7 Central Limit Theorem (CLT)
4.8 Sample Size Estimation for Mean of the Population
4.9 Estimation of Population Parameters
4.10 Method of Moments
4.11 Estimation of Parameters Using Method of Moments
4.12 Estimation of Parameters Using Maximum Likelihood Estimation
5. Confidence Intervals
5.1 Introduction to Confidence Interval
5.2 Confidence Interval for Population Mean
5.3 Confidence Interval for Population Proportion
5.4 Confidence Interval for Population Mean When Standard Deviation is Unknown
5.5 Confidence Interval for Population Variance
6. Hypothesis Testing
6.1 Introduction to Hypothesis Testing
6.2 Setting up a Hypothesis Test
6.3 One-Tailed and Two-Tailed Test
6.4 Type I Error, Type II Error, and Power of the Hypothesis Test
6.5 Hypothesis Testing for Population Mean When Population Variance is Known: One-Sample Z-Test
6.6 Hypothesis Testing of Population Proportion: Z-Test for Proportion
6.7 Hypothesis Test for Population Mean When Population Variance is Unknown: One-Sample t-Test
6.8 Paired-Sample t-Test
6.9 Comparing Two Populations: Two-Sample Z- and t-Test
6.10 Hypothesis Test for Difference in Population Proportion Under Large Samples: Two-Sample Z-Test for Proportions
6.11 Effect Size: Cohen’s D
6.12 Hypothesis Test for Equality of Population Variances (F Test)
6.13 Non-Parametric Tests: Chi-Square Tests
7. Analysis of Variance
7.1 Introduction to ANOVA
7.2 Multiple t-Tests for Comparing Several Means
7.3 One-Way ANOVA
7.4 Two-Way ANOVA
8. Correlation Analysis
8.1 Introduction to Correlation
8.2 Pearson Correlation Coefficient
8.3 Spearman Rank Correlation
8.4 Point Bi-Serial Correlation
8.5 The Phi-Coefficient
9. Simple Linear Regression
9.1 Introduction to Simple Linear Regression
9.2 History of Regression – Francis Galton’s Regression Model
9.3 SLR Model Building
9.4 Estimation of Parameters Using OLS
9.5 Interpretation of SLR Coefficients
9.6 Validation of the SLR Model
9.7 Outlier Analysis
9.8 Confidence Interval for Regression Coefficients β0 and β1
9.9 Confidence Interval for the Expected Value of Y for a Given X
9.10 Prediction Interval for the Value of Y for a Given X
10. Multiple Linear Regression
10.1 Introduction
10.2 Ordinary Least Squares Estimation for MLR
10.3 MLR Model Building
10.4 Part (Semi-Partial) Correlation and Regression Model Building
10.5 Interpretation of MLR Coefficients – Partial Regression Coefficient
10.6 Standardized Regression Coefficient
10.7 Regression Models with Qualitative Variables
10.8 Validation of Multiple Regression Model
10.9 Coefficient of Multiple Determination (R-Square) and Adjusted R-Square
10.10 Statistical Significance of Individual Variables in MLR – t-Test
10.11 Validation of Overall Regression Model – F-test
10.12 Validation of Portions of an MLR Model – Partial F-Test
10.13 Residual Analysis in MLR
10.14 Multi-Collinearity and Variance Inflation Factor
10.15 Auto-Correlation
10.16 Distance Measures and Outliers Diagnostics
10.17 Feature Selection in Regression Model Building (Forward, Backward and Stepwise Regression)
10.18 Avoiding Overfitting – Mallows’s Cp
10.19 Transformations
10.20 Omitted Variable Bias
10.21 Regression Model Deployment
11. Logistic Regression
11.1 Introduction – Classification Problems
11.2 Introduction to Binary Logistic Regression
11.3 Estimation of Parameters in Logistic Regression
11.4 Interpretation of Logistic Regression Parameters
11.5 Logistic Regression Model Diagnostics
11.6 Classification Table, Sensitivity and Specificity
11.7 Optimal Cut-off Probability
11.8 Feature (Variable) Selection in Logistic Regression
11.9 Application of Logistic Regression in Credit Scoring
11.10 Gain Chart and Lift Chart
11.11 Multinomial Logistic Regression
12. Decision Trees
12.1 Decision Trees: Introduction
12.2 Chi-square Automatic Interaction Detection (CHAID)
12.3 Classification and Regression Tree
12.4 Cost-Based Splitting Criteria
12.5 Regression Tree
12.6 Error Matrix and AUC for
13. Forecasting Techniques
13.1 Introduction to Forecasting
13.2 Time-Series Data and Components of Time-Series Data
13.3 Forecasting Techniques and Forecasting Accuracy
13.4 Moving Average Method
13.5 Single Exponential Smoothing (SES)
13.6 Double Exponential Smoothing – Holt’s Method
13.7 Triple Exponential Smoothing (Holt-Winter Model)
13.8 Croston’s Forecasting Method for Intermittent Demand
13.9 Regression Model for Forecasting
13.10 Auto-Regressive (AR), Moving Average (MA) and ARMA Models
13.11 Auto-Regressive (AR) Models
13.12 Moving Average Process MA(q)
13.13 Auto-Regressive Moving Average (ARMA) Process
13.14 Auto-Regressive Integrated Moving Average (ARIMA) Process
13.15 Power of Forecasting Model: Theil’s Coefficient
14. Clustering
14.1 Introduction to Clustering
14.2 Distance and Similarity Measures Used in Clustering
14.3 Quality and Optimal Number of Clusters
14.4 Clustering Algorithms
14.5 K-Means Clustering
14.6 Hierarchical Clustering
15. Prescriptive Analytics
15.1 Introduction to Prescriptive Analytics
15.2 Linear Programming
15.3 Linear Programming (LP) Model Building
15.4 Linear Programming Problem (LPP) Terminologies
15.5 Assumptions of Linear Programming
15.6 Sensitivity Analysis in LPP
15.7 Solving a Linear Programming Problem Using Graphical Method
15.8 Range of Optimality
15.9 Range of Shadow Price
15.10 Dual Linear Programming
15.11 Primal-Dual Relationships
15.12 Multi-Period (Stage) Models
15.13 Linear Integer Programming (ILP)
15.14 Multi-Criteria Decision-Making (MCDM) Problems
16. Stochastic Models and Reinforcement Learning
16.1 Introduction to Stochastic Process
16.2 Poisson Process
16.3 Compound Poisson Process
16.4 Markov Chains
16.5 Classification of States in a Markov Chain
16.6 Markov Chains with Absorbing States
16.7 Expected Duration to Reach a State from Other States
16.8 Calculation of Retention Probability and Customer Lifetime Value Using Markov Chains
16.9 Markov Decision Process (MDP) and Reinforcement Learning
16.10 Value Iteration Algorithm
17. Ensemble Methods
17.1 Ensemble Methods: Introduction
17.2 Condorcet’s Jury Theorem
17.3 Random Forest
17.4 Choice of Hyper-parameter Values in Random Forest
17.5 Random Forest Model Development
17.6 Variable Importance
17.7 Sampling Procedures to Improve Accuracy in Random Forest Model
17.8 Boosting
17.9 Gradient Boosting
18. Six Sigma
18.1 Introduction to Six Sigma
18.2 What is Six Sigma?
18.3 Origins of Six Sigma
18.4 Three-Sigma Versus Six-Sigma Process
18.5 Cost of Poor Quality
18.6 Sigma Score
18.7 Industrial Applications of Six Sigma
18.8 Six Sigma Measures
18.9 Defects Per Million Opportunities (DPMO)
18.10 Yield
18.11 Sigma Score (or Sigma Quality Level)
18.12 DMAIC Methodology
18.13 Six Sigma Project Selection for DMAIC Implementation
18.14 DMAIC Methodology – Case of Armoured Vehicle
18.15 Six Sigma Toolbox
Summary
Multiple Choice Questions
Exercises
Case Study: Era of Quality at the Akshaya Patra Foundation
References
Appendix
Index