Strategy Formulation for businesses: looking beyond variable importance
For businesses, an interpretable model is as important as an accurate one. Apart from wanting to know what our model’s prediction is, we also want to know why it is high or low and which features matter most in determining the forecast. (Most machine learning algorithms produce variable importance as part of their output. Classification models are typically evaluated using a confusion matrix and metrics derived from it, such as accuracy, precision, recall, and F1 score; which metric takes precedence depends on the business circumstance.)
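To make the metrics mentioned above concrete, here is a minimal sketch of how they follow from the four confusion-matrix counts. The function name and the churn counts are hypothetical, chosen only for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything flagged as positive, how much was right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all true positives, how many did we catch?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical churn model: 40 churners caught, 10 false alarms,
# 20 churners missed, 930 loyal customers correctly left alone.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=930)
print(metrics)
```

In this made-up example accuracy looks excellent while recall is much lower, which is why a business that cares about catching churners would look beyond accuracy.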
As with many real-life problems, providing a classification algorithm with good accuracy is not enough; there is a need to indicate the (prioritised!) action areas on which a business needs to focus. Take predicting customer churn: it is very nice to have a model that successfully predicts which customers or employees are prone to churn, but identifying which variables are important can help us with early detection and maybe even with improving the product or service!
In over 15 years of my career, I have derived variable importance using many methods, from stated methods (rating, ranking, etc.) to derived methods like regression. Lately, we have been using machine learning methods like decision trees, neural networks, random forests, etc. Many a time, I have also used trade-off utilities (MaxDiff scores, conjoint) for ‘importance’.
Irrespective of how I get the variable importance, to make these findings actionable I like to go back to a very old and commonly used approach: importance-performance plots. These add another dimension, ‘performance’, giving a simple 2×2 with four quadrants (as illustrated below):
This quadrant plot visually shows marketers where they should focus their efforts and plan their marketing activities. You would fix ‘poor’ performance on important elements, back off where your performance is ‘overkill’, and ignore low-priority elements.
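The quadrant logic can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed method: the cut-offs are taken as the mean importance and mean performance (medians or fixed thresholds are equally plausible), the label "maintain" for the high-importance, high-performance quadrant is my own, and the attribute names and scores are hypothetical.

```python
from statistics import mean


def ipa_quadrant(importance, performance, imp_cut, perf_cut):
    """Assign one attribute to an importance-performance quadrant."""
    if importance >= imp_cut:
        # Important attributes: keep doing well, or fix what is weak.
        return "maintain" if performance >= perf_cut else "fix"
    # Unimportant attributes: strong performance is overkill.
    return "overkill" if performance >= perf_cut else "low priority"


def classify(attributes):
    """attributes maps name -> (importance, performance)."""
    # Assumption: split each axis at its mean across attributes.
    imp_cut = mean(imp for imp, _ in attributes.values())
    perf_cut = mean(perf for _, perf in attributes.values())
    return {name: ipa_quadrant(imp, perf, imp_cut, perf_cut)
            for name, (imp, perf) in attributes.items()}


# Hypothetical data: (importance share, mean performance rating on 1-5).
attributes = {
    "attribute_a": (0.30, 3.2),
    "attribute_b": (0.35, 4.4),
    "attribute_c": (0.10, 4.5),
    "attribute_d": (0.05, 3.1),
}
quadrants = classify(attributes)
for name, quadrant in quadrants.items():
    print(f"{name}: {quadrant}")
```

The same dictionary can be fed to any plotting library as a scatter with the two cut-offs drawn as reference lines.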
A case study to illustrate:
One of the leading shampoo brands wanted to ascertain what was driving brand satisfaction, where the gaps were, and therefore what the future action plan should be (in terms of marketing activities and positioning).
We applied supervised machine learning algorithms, with ‘overall satisfaction’ as the dependent variable and various imagery attributes as independent variables, and derived importance scores.
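As a rough illustration of what a derived importance score looks like, the sketch below uses a much simpler stand-in than the machine learning models used in the case study: the absolute Pearson correlation of each attribute with overall satisfaction, normalised to sum to 1. The attribute names and ratings are hypothetical, not the client's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no variance, so treat its correlation as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def derived_importance(ratings, target):
    """Normalise |correlation with target| so the scores sum to 1."""
    raw = {name: abs(pearson(vals, target)) for name, vals in ratings.items()}
    total = sum(raw.values())
    return {name: (v / total if total else 0.0) for name, v in raw.items()}


# Hypothetical survey: six respondents, ratings on a 1-5 scale.
overall_satisfaction = [5, 4, 2, 3, 1, 4]
ratings = {
    "attribute_a": [5, 4, 2, 3, 1, 4],  # moves with overall satisfaction
    "attribute_b": [3, 3, 3, 4, 3, 3],  # barely varies
}
importance = derived_importance(ratings, overall_satisfaction)
print(importance)
```

A tree-based model or regression would replace `pearson` here; the point is only that importance is inferred from the relationship with the dependent variable rather than asked directly.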
This plot gave the client a clear action plan, not only in terms of newer areas to focus on but also areas to de-focus on.