
Tapping the pulse of ‘Digital India’: understanding challenges and solutions

Posted on Jan 18, 2018 in Case Studies, Data Visualization


‘Digital India’ has emerged as one of the most discussed phrases in the country, and it piqued our interest too. To understand the challenges and solutions related to achieving a truly ‘Digital India’, we conducted an online study (the India Online Study) in association with the India Open Data Association (IODA) and IIM Lucknow. The study, which included questions allowing free-text answers, received over 2,500 responses from across India.

That was not all. To gather more data, IODA undertook a road trip of over 1,000 km, recording video interviews with a variety of respondents. This article, however, focuses on the techniques used to derive insights on the challenges and solutions related to ‘Digital India’ from the online responses received.


The free-text responses on ‘Benefits’ and ‘Challenges’ were filtered and cleaned in the following steps:

  • Removal of stop words, punctuation, special characters, and social media symbols.
  • Normalising the text (stemming and lemmatization).
  • Tokenising the corpus and weighting the terms (TF-IDF).
  • Splitting the responses into sentences.
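The cleaning steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code used in the study: the stop-word list, the `crude_stem` helper, the smoothed TF-IDF formula and the sample responses are all assumptions made for the example, and a real pipeline would more likely use an NLP library's stemmer or lemmatizer. The sketch also splits responses into sentences first, so that each sentence can later be treated as a document.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the study's list is not given).
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to", "will", "with"}


def split_sentences(response):
    """Split a free-text response into sentences on '.', '!' or '?'."""
    parts = re.split(r"[.!?]+", response)
    return [p.strip() for p in parts if p.strip()]


def crude_stem(token):
    """Very rough suffix stripping; a stand-in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean_tokens(sentence):
    """Lowercase, drop links, @mentions, symbols and stop words, then stem."""
    text = sentence.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # links and mentions
    text = re.sub(r"[^a-z\s]", " ", text)  # punctuation, digits, '#' and other symbols
    return [crude_stem(t) for t in text.split() if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenised document (smoothed IDF)."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return weights


# Made-up responses, purely for illustration.
responses = [
    "Digital payments are saving time! Internet access in villages is poor.",
    "Poor internet is the main challenge.",
]
sentences = [s for r in responses for s in split_sentences(r)]
docs = [clean_tokens(s) for s in sentences]
weights = tf_idf(docs)
```

Terms that appear in fewer sentences receive a higher weight than terms that appear in many, which is what lets the later topic model focus on distinctive vocabulary.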


A Latent Semantic Analysis (LSA) model was applied to analyse the data. LSA is typically used to uncover distributional semantics: it analyses the relationships between a set of documents and the terms they contain by producing a set of concepts related to both the documents and the terms.

The steps undertaken were as follows:

  • Applied Latent Semantic Analysis (LSA), which relies on Singular Value Decomposition (SVD), and created 20 topics each for “Benefits” and “Challenges” (treating each sentence as a document).
  • For each sentence, identified the appropriate topic based on the document-topic scores. (Note: since a response can contain several sentences, each respondent may belong to multiple topics.)
  • Profiled each topic based on its terms.
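As a rough sketch of these steps, the snippet below runs a truncated SVD on a small sentence-by-term matrix with NumPy, assigns each sentence to its strongest topics, and profiles each topic by its highest-loading terms. The toy matrix, the vocabulary, the helper names and the `rel_threshold` rule are assumptions for illustration only; the study used 20 topics per question, whereas this example uses 2. In practice a library routine such as scikit-learn's `TruncatedSVD` applied to a TF-IDF matrix would be a common choice.

```python
import numpy as np


def lsa_topics(X, n_topics):
    """Truncated SVD of a documents-by-terms matrix X.

    Returns (doc_topic, topic_term): per-document topic scores and
    per-topic term loadings.
    """
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(n_topics, len(S))
    doc_topic = U[:, :k] * S[:k]  # scale document vectors by singular values
    topic_term = Vt[:k, :]
    return doc_topic, topic_term


def assign_topics(doc_topic, rel_threshold=0.5):
    """List, per document, the topics whose absolute score is at least
    rel_threshold times that document's strongest absolute score."""
    assigned = []
    for row in np.abs(doc_topic):
        top = row.max()
        if top == 0:
            assigned.append([])
        else:
            assigned.append([int(i) for i in np.flatnonzero(row >= rel_threshold * top)])
    return assigned


def top_terms(topic_term, vocab, n=3):
    """Profile each topic by the terms with the largest absolute loadings."""
    return [[vocab[i] for i in np.argsort(-np.abs(row))[:n]] for row in topic_term]


# Toy weights: rows are sentences, columns are terms (made-up values).
vocab = ["internet", "poor", "village", "payment", "time", "sav"]
X = np.array([
    [2.0, 2.0, 1.0, 0.0, 0.0, 0.0],
    [2.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
])

doc_topic, topic_term = lsa_topics(X, n_topics=2)
sentence_topics = assign_topics(doc_topic)
profiles = top_terms(topic_term, vocab, n=2)
```

Mapping each sentence back to the respondent who wrote it then gives the set of topics per respondent, which is why one respondent can appear under several topics.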


Using the identified topics and the demographic profile of each respondent, we created a Tableau dashboard for easy visualisation and interpretation of the insights derived from the responses. Here is a glimpse of the dashboard:


The insights and analysis served as the basis for discussion at the co-creation workshop held to chalk out an action plan for achieving a truly ‘Digital India’, conducted by IODA and IIM-L at the Fifth PAN IIM World Management Conference 2017. The ‘India Online’ Study will continue in the coming months to build a more thorough understanding of the matter and to give multiple stakeholders the knowledge they need to take suitable and sustainable action on the ground.

Suresh Chekuri

Suresh is a Data Science Intern at Juxt-Smart Mandate with a strong desire to keep learning about Data Science and Deep Learning. He is currently working on one of JSM's internal projects based on Natural Language Processing (NLP) and Deep Learning. Prior to this, he worked at Oracle India Pvt. Ltd. on ERP and database technologies for 13 years. Outside work, he loves to read books and listen to his favourite music.
