Individual Psychographs & Activity

Naveen Ashish, PhD

An individual’s attributes, such as his or her interests, preferences, views, hobbies, passions, aversions, and political inclinations, are of significant interest for a variety of commercial, research, and community applications. Marketers and market developers, as just one example, stand to gain significantly from understanding such attributes, whether for specific individuals or for a given population as a whole. There is also interest in information about an individual’s “current state”, such as their current location and activity. For instance, health intervention applications on mobile devices can optimize and personalize their capabilities by leveraging activity details such as where an individual currently is and what they are doing.

Social-media is the only source that offers a realistic prospect of obtaining such information. Beyond availability, there is also the issue of access, and Twitter is the most promising platform in this regard.

It is important, however, to understand a few key limitations of this capability, given that we rely solely on open social-media to glean such information:

  • For a given population, i.e., the set of individuals of interest, we can obtain such information for only a small subset of those individuals.
  • The attributes or aspects of an individual will not be comprehensively synthesized. In other words, we have to work with what we get.

That said, such psychographic profile and activity information has to be usable in practical predictive analytics solutions. The profile information needs to be as accurate as possible. The activity information also needs to be accurate and up to date, as in many cases it serves real-time applications.

The TeraCrunch Socratez Profiler™ is an AI tool for synthesizing psychographic and activity information from social-media. It builds upon the TeraCrunch Socratez™ Unstructured Data Understanding Platform and offers the following capabilities:

  • Data access. The system has real-time, highly scalable access to data from social-media feeds, accessed via APIs as well as data stream providers.
  • Individual resolution. Multiple individuals exist with the same name, including on social-media, and psychographic aspects are sought for specific individuals, not (just) their namesakes. The challenge, then, is to identify the social-media identity of a specific individual, given some additional context such as their location, profession, or any other available information. Socratez Profiler employs state-of-the-art entity resolution and disambiguation algorithms and technologies to identify individuals from amongst namesakes, with high probabilistic confidence.
  • Confident information synthesis. The aspects in an individual’s profile typically have to be derived. For instance, the fact that “outdoor hiking” is one of a user’s interests is not something that may be explicitly stated, but is rather derived from multiple, regular posts about hiking activities. As another example, we would have more confidence in the derived aspect of an individual being vegetarian from multiple posts expressing positive sentiment about vegetarianism than from a one-off post about the same. Socratez Profiler determines a probabilistic confidence for every derived aspect using machine-learning models.
  • Privacy preserving. Socratez Profiler accesses and derives profile aspects and activity information only from social-media content that is open. Moreover, individual-level information is provided to applications only with user consent. In all other cases the tool provides only aggregated, non-personally-identifying aspects and activity.
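
To make the idea of a per-aspect confidence more concrete, the following is a minimal sketch, assuming each post has already been given an evidence score between 0 and 1 for a given aspect (for example by a sentiment or topic classifier). It combines the scores with a simple noisy-OR rule. The function name `aspect_confidence` and the example scores are hypothetical, and noisy-OR is only a stand-in here, not a description of the machine-learning models that Socratez Profiler actually uses.

```python
from math import prod


def aspect_confidence(post_scores):
    """Combine per-post evidence scores (each in [0, 1]) with a noisy-OR rule.

    Illustrative only: each post is treated as independent evidence for the
    aspect, so the confidence grows as more supporting posts are observed.
    """
    if not all(0.0 <= s <= 1.0 for s in post_scores):
        raise ValueError("scores must lie in [0, 1]")
    # Probability that at least one post is genuine evidence for the aspect.
    return 1.0 - prod(1.0 - s for s in post_scores)


# Hypothetical scores: one post about vegetarianism versus four such posts.
one_off = aspect_confidence([0.4])
regular = aspect_confidence([0.4, 0.4, 0.4, 0.4])
print(round(one_off, 3), round(regular, 3))  # prints: 0.4 0.87
```

Under these assumptions, a single post leaves the confidence at 0.4, while four similar posts raise it to about 0.87, which mirrors the vegetarianism example above.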

Dr. Naveen Ashish is the Co-founder of TeraCrunch and its Data Scientist Advisor for AI technologies. He is also the Head of Data Sciences at Fred Hutch (the Fred Hutchinson Cancer Research Center in Seattle, WA). The views and information in this article have no relationship with, and are not in any way representative of, the work at Fred Hutch.

Intelligent Chatbots in Health

Naveen Ashish, PhD

Chatbots, or Conversational Intelligent Agents, such as Siri, Alexa, and Cortana, have gained a significant presence in modern society. In addition to their impact on personal life, intelligent agents and devices, as well as conversational chat agents for applications such as customer support and travel reservations, have transformed the way modern business is conducted. By definition, “a chatbot (also known as a talkbot, chatterbot, Bot, chatterbox, Artificial Conversational Entity) is a computer program which conducts a conversation via auditory or textual methods.”

Outside of personal and business settings, healthcare is a promising area for Chatbots. Chatbots can be instrumental in enabling or optimizing multiple key health care and medical research applications that have proven beneficial for patients, clinicians, caregivers, insurers, responders, and researchers.

One of the first operative bots was in fact built in a healthcare research context, half a century ago. ELIZA was created to emulate a Rogerian psychotherapist, who asks patients questions by rearranging the patients’ own responses.
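
The “rearranging” technique can be illustrated with a short sketch. The rules and the pronoun table below are hypothetical and far smaller than ELIZA’s original script; they only show the general idea of matching a pattern in the patient’s statement, swapping first-person words for second-person ones, and returning the fragment as a question.

```python
import re

# Hypothetical pronoun swaps used to turn the patient's words back on them.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are", "mine": "yours"}

# Hypothetical (pattern, question template) rules, tried in order.
RULES = [
    (re.compile(r"\bi feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.*)", re.IGNORECASE), "How long have you been {0}?"),
]


def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())


def respond(statement):
    """Return a question built by rearranging the patient's own statement."""
    text = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please tell me more."  # fallback when no rule matches


print(respond("I feel anxious about my treatment."))
# prints: Why do you feel anxious about your treatment?
```

No understanding of the statement is involved here; the response is produced purely by pattern matching and substitution, which is why such simple bots can look conversational.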

So in what ways and areas can Chatbots make an important contribution to healthcare? We highlight several key areas below.

Information and Support

Personalization is regarded as a key aspect of the coming evolution of patient care and medical treatment.

Recent surveys suggest that people nowadays rely on general-purpose internet search platforms such as Google for health information and advice, on conditions ranging from common colds to severe chronic diseases. However, without sufficient medical training or background, users face the challenge of identifying the information that is most effective for understanding and addressing their condition. In this scenario, a Chatbot-style interface, equipped with algorithms for medical diagnosis procedures and treatment plans, can act as an intermediary that bridges the knowledge gap between users and the search engine. Specifically, a Chatbot interface can provide a) a natural language (NL) interface, b) filtered and focused search, and c) personalized and organized search results and interactions. Chatbots can also assist with tasks such as medication reminders, prescription refills, advice on medications, and administrative reminders such as scheduling appointments.

Chatbots for Doctors and Medical Staff

Another area that can benefit from Chatbots is the optimization of service, costs, and the overall efficiency of health care delivery and management, especially in the United States. A major contributing factor to the current inefficiency and high costs is the lack of engagement and follow-up care with patients. A scalable and cost-effective Chatbot or Virtual Assistant can be deployed to follow up and engage with patients after they have left the clinic or hospital. Recent studies have found that using a Chatbot or Virtual Assistant for follow-up care can result in a better patient care experience outside of the clinic or hospital setting and can help prevent future readmissions. Additionally, doctors and other healthcare providers can review the information the Chatbot collects from its interactions with patients, including medication intake, overall adherence, treatment visits, tests, and ER visits, and intervene in a patient’s treatment if necessary.


Depending on the severity and requirements of a health condition, intervention may be required to improve a patient’s overall health and prevent future readmission. App-based interventions have yielded successful results in recent years in areas such as smoking reduction, alcohol or substance use, anger management, sedentary behavior, and diet management. Patients can now be proactive in achieving their health goals through continuous interaction with guidelines and interactive, menu-driven interfaces in apps on their smart devices. The success of these apps suggests that there is significant demand and opportunity for Chatbots and Virtual Assistants to create a more personalized experience for the users of these interventions.

TeraCrunch Chatbot Development

TeraCrunch has significant core technology as well as solution development expertise in the Chatbot space. Specifically:

  • Automated Natural Language Processing (NLP) and Semantic Text Understanding are at the core of any Chatbot. In TeraCrunch Chatbots, these are powered by TeraCrunch’s patent-pending Socratez text understanding engine.
  • TeraCrunch has developed and deployed custom Chatbots in other domains including home automation and travel assistance.
  • TeraCrunch data scientists have solution development expertise and knowledge of the health and biomedical domain, particularly in cancer.

Lifestyle Attributes Database & Analysis for Cancer Research

Naveen Ashish, PhD

Lifestyle and Cancer

In cancer research, researchers have traditionally analyzed genetic data, patient medical record data, and other data such as patient MRI images or pathology reports. Cancer has largely been considered as emanating “from genetic factors”. It is becoming increasingly clear, however, that the primary factors behind cancers of various kinds are not solely genetic, and that environment and lifestyle are key contributing factors as well.

By ‘lifestyle’ factors, we mean aspects such as diet, exercise habits, other physical activity, weight and body mass index (BMI), alcohol, tobacco, or other substance use, sleep habits, stress and anxiety, etc. Many interesting studies evaluating the impact of such lifestyle factors have appeared in recent years. For instance, a number of risk factors have been identified in the pathogenesis of breast tumors, and a great number of these are linked to nutrition and lifestyle, such as alcohol consumption, obesity, and eating patterns (Study Link). A number of epidemiological studies (Study Link, Study Link) have provided convincing evidence that alcohol consumption is an important risk factor for the incidence of and mortality from breast cancer. On the other hand, soybean products have been shown to act as cancer-preventive agents in rodents and other animals (Study Link).

Interestingly, soy products have been a staple of the Asian diet for centuries (they are the predominant source of isoflavones, which belong to the family of phytoestrogens). Studies investigating the relationship between soy food intake after a breast cancer diagnosis and health status have reported a slightly protective effect, especially among the Asian population (Study Link). There have been almost 200 publications in the last 10 years, with the scope of research spanning genetic risk factors, late effects from treatment, comorbidities, second malignant neoplasms, reproductive health, psychosocial outcomes, long-term health, and lifestyle behaviors.

Getting Lifestyle Data: Social-Media

While lifestyle attributes are important, this data is typically hard to obtain because it is usually not part of patient medical records or other data. Many attributes of a particular individual are also dynamic and evolving: for instance, exercise schedules, diet, and other habits can and do change with time, place, and other context. A promising source for gathering such information is social-media. A person’s posts, conversations, comments, likes and dislikes, and other expressions often provide valuable information from which lifestyle attributes can be derived. However, gathering such information in a scalable fashion is a data science challenge.

TC-PAD: Lifestyle Data in a Database

TeraCrunch has developed TC-PAD (Personal Attributes Database), a comprehensive solution that provides a structured database of key personal attributes of an individual’s lifestyle. TC-PAD requires individuals to authorize access to their social-media feeds and profiles, which it then accesses for content. It then applies sophisticated natural language and semantic understanding algorithms (part of the TeraCrunch Socratez text understanding suite) to synthesize a variety of lifestyle attributes per individual, and delivers this data as a structured database. This is an “on-demand” solution: such a lifestyle database can be created virtually instantly for a cohort of individuals once authorization credentials have been provided.
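
As a rough illustration of what a structured lifestyle database could look like, the sketch below stores derived attributes in an in-memory SQLite table and retrieves the most recent value of each attribute for one individual. The table name, column names, identifiers, and sample rows are all hypothetical; the actual TC-PAD schema is not described in this article.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory table of derived lifestyle attributes (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lifestyle_attribute ("
        " individual_id TEXT NOT NULL,"
        " attribute TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " observed_on TEXT NOT NULL)"  # ISO date, so text order is date order
    )
    conn.executemany("INSERT INTO lifestyle_attribute VALUES (?, ?, ?, ?)", rows)
    return conn


def latest_attributes(conn, individual_id):
    """Return the most recently observed value of each attribute for one individual."""
    query = (
        "SELECT a.attribute, a.value FROM lifestyle_attribute AS a"
        " WHERE a.individual_id = ?"
        " AND a.observed_on = ("
        "   SELECT MAX(b.observed_on) FROM lifestyle_attribute AS b"
        "   WHERE b.individual_id = a.individual_id AND b.attribute = a.attribute)"
        " ORDER BY a.attribute"
    )
    return dict(conn.execute(query, (individual_id,)).fetchall())


# Made-up sample rows for a two-person cohort.
rows = [
    ("p001", "diet", "omnivore", "2016-03-01"),
    ("p001", "diet", "vegetarian", "2017-01-15"),
    ("p001", "exercise", "runs weekly", "2016-11-20"),
    ("p002", "diet", "vegan", "2016-08-05"),
]
conn = build_db(rows)
print(latest_attributes(conn, "p001"))
# prints: {'diet': 'vegetarian', 'exercise': 'runs weekly'}
```

Keeping an observation date on every row is one way to accommodate attributes that change over time, since older values remain available for longitudinal analysis while queries can still ask for the current state.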