Data Discovery In Healthcare

A few days ago, QlikTech and Epic announced a technology partnership that will strengthen the integration between their software products as well as provide a forum for their joint customers to share best practices and innovative ways to use both technologies.

Coming from a firm like Ranzal, which is currently implementing several population health discovery applications, my first reaction was simply that this partnership made sense.  Both companies are leaders in their respective domains and are very well-regarded.  Beyond that, discovery technologies like Qlik, Tableau and Endeca are quickly establishing a foothold in the blossoming domain of healthcare analytics.  Unlike traditional BI technologies, data discovery tools are meant to quickly mash up disparate data sources and allow users to ask in-the-moment, unanticipated questions.  This alternative approach to analytics is allowing healthcare providers to build self-service discovery applications for broad audiences at speeds unimaginable in the world of the clinical data warehouse.  Since almost all healthcare analytics applications rely on data from the EMR, this partnership seemed natural, if not overdue.

My second reaction was that there was something missing.  In my experience, to get a holistic view of the health system, all of the relevant data must be tapped.  Data discovery on structured data, while powerful, can only tell part of the story.  With 60% of a health system’s data tied up in unstructured medical notes, reports and journals, Qlik is not fully equipped to allow healthcare practitioners to gain a 360-degree view of their health system.

Endeca shines when structured and unstructured data are both required to paint a complete picture.  In healthcare, properly analyzing clinical data can mean drastically better outcomes at lower costs.  Understanding the “why” behind the “what” means properly tapping the narratives in the medical notes, and tools like Endeca are best suited to unlock value when unstructured data is prominent.

QlikView is a powerful tool and one cannot question its ease of use and numerous discovery features.  However, in industries rife with unstructured data, products like Endeca that treat unstructured data as a first-class citizen (in the way they acquire, enrich, model, search, and visualize it) are better suited to unlock the whole story.

So, I couldn’t help but think that a strong partnership could also be made between other EMR vendors and Oracle Endeca.  We spend a lot of time sizing up the relevant technologies in the data discovery space, trying to understand differentiators.  For the types of discovery we’re seeing in healthcare, where unstructured data is necessary to tell the whole story, our money remains on Endeca.

OEID 3.0 First Look – Democratizing Data Discovery

Adjectives like “agile” and “self-service” have long been used to describe approaches to BI that enable organizations to ask their own questions and produce their own answers.  Applied to both processes and products, these labels are applicable any time an organization can relax the “IT bottleneck”.  Over the past decade, the core tenets of the Endeca vision (“no data left behind, ease of use, and agile delivery”) have shaped a product that has empowered organizations to unlock insights in their enterprise data in ways never before possible while simultaneously reducing their reliance on IT to do so.  Notice I said “reduce” their reliance, not “eliminate”.

Data discovery is a quest, not a destination.  It is a never-ending initiative.  As soon as new truths come to light from your discovery apps, inevitably, new questions arise as well.  Ideally, these new questions can be answered within the application at hand.  Sometimes, however, finding answers to these new questions requires experimentation and alternative data “mash-ups”.  Almost always in these cases, the time comes to pick up the phone, call IT... and wait.

All of the discovery tools on the market today that promise self-service and agility still require IT’s involvement when new data sources or new data models are required, OEID included.  However, through some new features in the latest v3.0 release, it appears as if Oracle is making strides to address this dependency.

Granted, this is just one man’s opinion and largely speculative, but a few of the new features in the product have me convinced that Oracle is pushing to democratize data discovery.  Through subtle (and not so subtle) changes, it seems they’re shifting the product to a platform, one that empowers the business to broaden its own exploration and answer the next round of questions, further reducing your organization’s reliance on IT.


Here’s what got me thinking


A Collaboration Platform

The revamped “home page” experience surfaces new ways to provision and share your applications.  Casual users can now create their own applications, associate them to a data domain, and start composing their apps.  Initially, the applications are “private”, and only made accessible to a group of users hand-picked by you.  You can make your application “public” once you feel it is ready for prime time and mass consumption.

Self-Service Data Upload

Another nod in v3.0 to democratization comes with the introduction of self-service data upload.  Not only will the upload move data into your data domain, but it will profile your data and (usually) arrive at the proper attribute configuration (data types, etc.).  Currently, this only supports Excel file formats, but if you’re like me, you can see where this is heading…
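
To make the idea of "profiling" an upload concrete, here is a minimal sketch in Python of how a tool might guess an attribute type for each column.  This is an illustration only and not how OEID implements it: the function names (infer_type, profile), the list of candidate types, the single date format, and the sample rows are all assumptions made up for this example.

```python
from datetime import datetime


def _parse_bool(value):
    # Hypothetical rule: only the literal words true/false count as booleans.
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    raise ValueError(value)


def _parse_date(value):
    # Assumption: dates arrive as YYYY-MM-DD; a real profiler would try more formats.
    return datetime.strptime(value, "%Y-%m-%d")


# Candidate types, tried from most specific to least specific.
CANDIDATES = [
    ("boolean", _parse_bool),
    ("integer", int),
    ("double", float),
    ("date", _parse_date),
]


def _parses(parser, value):
    try:
        parser(value)
        return True
    except ValueError:
        return False


def infer_type(cells):
    """Guess one type for a column of raw string cells, ignoring blanks."""
    non_empty = [c.strip() for c in cells if c is not None and c.strip() != ""]
    if not non_empty:
        return "string"
    for name, parser in CANDIDATES:
        if all(_parses(parser, c) for c in non_empty):
            return name
    return "string"  # fall back to text when nothing else fits every cell


def profile(rows):
    """Map each column name to its guessed type; rows is a list of dicts."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return {name: infer_type(cells) for name, cells in columns.items()}


# Made-up sample rows standing in for an uploaded spreadsheet.
rows = [
    {"patient_id": "1001", "admit_date": "2013-04-02", "los_days": "3.5",
     "note": "Patient reports mild chest pain"},
    {"patient_id": "1002", "admit_date": "2013-04-05", "los_days": "2",
     "note": ""},
]
print(profile(rows))
```

A column only gets a type when every non-blank cell fits it, which is why a guess like this is right "usually" rather than always: one stray value in an otherwise numeric column pushes the whole column back to text.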


Better Cluster Management

At first I was a bit miffed by Endeca Server’s move from Jetty to WebLogic 11g (and even a little frustrated by the involved installation process), but reading the v3.0 literature around improved cluster management, it became clear that more sophistication in the cluster support might mean there is a future in the cloud for the product.  Adding and subtracting nodes from your data domains will be required if end users are actively adding more data or opening up their data mashups to more users in their organization.  Elastic computing would have to underpin such a platform with such dynamic, unpredictable resource demands.

A Vision

Again, this is just one man’s hope for the product.  These changes indicate a shift in the way “self-service” is approached.  In future releases, “self-service” and “agile” BI may no longer mean simply asking your own unanticipated questions.  It may mean introducing new data, new applications and collaborating across the enterprise to further fulfill the promise of data discovery without IT.

I hope Oracle continues down this path.  I long for a future where data discovery happens in the cloud so organizations do not have to fumble with infrastructure, scale and upgrades.  I see a future with data uploads across a variety of formats which can then be added to a data marketplace within the product for the whole organization to leverage.  I hope for new capabilities in Studio so that the data configuration, joining, and cleansing that happens in Integrator today by ETL experts and data stewards can be accomplished intuitively by the end users and analysts.

It is my hope that 3.0 is not the end game, but the first step of many towards democratizing data discovery and offering a broader definition of “self-service” BI.