All too often, companies have only the vaguest idea of what kind of data they hold, because that data is frequently buried in a variety of databases and fragmented across different departments. We identify this data and bring it to light, making it visible, cohesive, comparable and easy to understand, so that it really does support YOU in making the right decisions. If need be, we can also identify any missing data and define a concept to fill the gap.

An ensemble segmentation approach to a healthcare problem

Client: A pharma major
Time: May 2015



The client had launched a new biologic therapy and wanted to understand the potential drivers and barriers associated with it. The client also wanted a clear picture of the impact of the existing therapies on the market. To find this out, we addressed the following objectives in our online survey:

– To uncover the attributes that drive the biologic therapies in the treatment of Ulcerative Colitis.
– To capture the views and experiences of physicians managing Ulcerative Colitis patients.
– To identify the key challenges faced by physicians while managing the patients.
– To understand the reasons for ending a particular biologic therapy.
– To identify the most challenging symptoms of Ulcerative Colitis.


We surveyed gastroenterologists who treat inflammatory bowel disease in an online survey. Besides the patient’s demographics, the survey covered the diagnosis and history, the tests already undertaken, the consultation history, the type of Ulcerative Colitis the patient was suffering from, the concomitant conditions, the current treatment and the reasons for it, the previous treatment, the satisfaction with the current treatment, hospitalization procedures, the patient’s nutrition and diet, the patient’s compliance and knowledge about Ulcerative Colitis and, finally, the patient’s health care policies.

The challenge was to apply a statistical technique that could handle a large volume of data as well as many variables. We opted for a data-adaptive approach called random forest, in which many models (decision trees) are generated instead of a single model. Combining the predictions of multiple models reduces the error rate compared with relying on any one of them. An algorithm was developed to let the machine search the data and find useful predictors. A measure called the “Gini Index” was used to decide whether to include or exclude a variable from the analysis. The Gini Index measures how mixed the outcome classes are within a group of cases: a variable whose splits leave a low Gini Index in the resulting groups, that is, a large reduction in impurity, is a useful predictor. The random forest technique gives flexibility in exploring and grouping variables, and it provides a clear ranking of the variables in order of importance.
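To make the Gini Index idea concrete, here is a minimal sketch in Python of how the impurity reduction for a single split can be computed and used to rank variables. It is an illustration only, not the algorithm developed for the project: the variable names and the toy data are hypothetical, and a real random forest averages such reductions over many trees grown on random subsets of cases and variables.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def gini_decrease(feature_values, labels):
    """Reduction in Gini impurity from splitting on a binary (0/1) variable."""
    left = [y for x, y in zip(feature_values, labels) if x == 0]
    right = [y for x, y in zip(feature_values, labels) if x == 1]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
             + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted

# Hypothetical toy data, NOT taken from the survey:
# 1 = patient on biologic therapy, 0 = not on biologic therapy.
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical binary predictors (names invented for illustration).
features = {
    "steroid_dependent":     [1, 1, 1, 1, 0, 0, 0, 0],  # separates the classes perfectly
    "prior_hospitalization": [1, 1, 1, 0, 1, 0, 0, 0],  # partly informative
    "age_over_50":           [1, 0, 1, 0, 1, 0, 1, 0],  # uninformative
}

# Rank variables by impurity reduction, most useful first.
ranking = sorted(
    features,
    key=lambda name: gini_decrease(features[name], labels),
    reverse=True,
)

for name in ranking:
    print(f"{name}: {gini_decrease(features[name], labels):.3f}")
```

The larger the reduction, the lower the Gini Index left in the groups after the split, and the higher the variable ranks.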

The findings of the survey helped our client to better understand the reasons why biologic therapy is used to treat Ulcerative Colitis. Biologic therapy is chosen when it reduces the use of steroids and relieves the patient of abdominal cramps, acute pain and inflammatory symptoms. Moreover, the findings showed that people diagnosed with Ulcerative Colitis should control and balance their nutrition, because the disease often reduces the appetite while increasing the body’s needs. Furthermore, the results pointed out the areas of the digestive tract that were most affected by the inflammation. Gastroenterologists found it useful to know the top three most challenging symptoms of Ulcerative Colitis; it helped them to develop biologic therapies that effectively cure Ulcerative Colitis.

Vinita Jaichander

I hold a distinction in M.Sc (Applied Statistics). As an analytics professional, I feel energetic and confident when introducing new ideas into my work. My work on ‘Bio efficacy on Plant Products’ has been published as a paper in a national magazine. Leisure to me means listening to music and watching movies and sports.
