Data Discovery In Healthcare

A few days ago, QlikTech and Epic announced a technology partnership that will strengthen the integration between their software products as well as provide a forum for their joint customers to share best practices and innovative ways to use both technologies.

For a firm like Ranzal, which is currently implementing several population health discovery applications, my first reaction was simply that this partnership made sense.  Both companies are leaders in their respective domains and are very well-regarded.  Beyond that, discovery technologies like Qlik, Tableau and Endeca are quickly establishing a foothold in the blossoming domain of healthcare analytics.  Unlike traditional BI technologies, data discovery tools are meant to quickly mash up disparate data sources and allow users to ask in-the-moment, unanticipated questions.  This alternative approach to analytics is allowing healthcare providers to build self-service discovery applications for broad audiences at speeds unimaginable in the world of the clinical data warehouse.  Since almost all healthcare analytics applications rely on data from the EMR, this partnership seemed natural, if not overdue.

My second reaction was that there was something missing.  In my experience, to get a holistic view of the health system, all of the relevant data must be tapped.  Data discovery on structured data, while powerful, can only tell part of the story.  With 60% of a health system’s data tied up in unstructured medical notes, reports and journals, Qlik is not fully equipped to allow healthcare practitioners to gain a 360-degree view of their health system.

Endeca shines when structured and unstructured data are both required to paint a complete picture.  In healthcare, properly analyzing clinical data can mean drastically better outcomes at lower costs.  Understanding the “why” behind the “what” means properly tapping the narratives in the medical notes, and tools like Endeca are best suited to unlock value when unstructured data is prominent.

QlikView is a powerful tool and one cannot question its ease of use and numerous discovery features.  However, in industries rife with unstructured data, products like Endeca that treat unstructured data as a first-class citizen (in the way they acquire, enrich, model, search, and visualize it) are better suited to unlock the whole story.

So, I couldn’t help but think that a strong partnership could also be made between other EMR vendors and Oracle Endeca.  We spend a lot of time sizing up the relevant technologies in the data discovery space, trying to understand differentiators.  For the types of discovery we’re seeing in healthcare, where unstructured data is necessary to tell the whole story, our money remains on Endeca.

OEID 3.0 First Look – The Little Things

There’s so much new “goodness” in Oracle Endeca Information Discovery (OEID) 3.0, it’s been a little bit of a challenge to “spread the word” in small enough chunks.  We start writing these posts, get a little excited and pretty soon we’ve got Ranzal’s very own version of the Iliad.

In the coming weeks, there will be a few Iliads, and maybe an Odyssey as well, but before we get too deep into the platform, I wanted to illustrate and elaborate on a couple of “small changes” that should prove beneficial to people just coming up to speed and OEID veterans alike.

The Guided Navigation Histogram

As one of my colleagues pointed out, I neglected to highlight a key enhancement to the Guided Navigation user experience when posting to the blog earlier this week.  Often when doing data modeling for an OEID application, you’ll be transforming, joining, denormalizing and performing all sorts of other operations on your data as it is being brought into your Endeca Server.  What often happens is that you lose some of the original context that was present in the source system.  For example, you may have a set of sales records that a user has the ability to refine by State, by City and by Product.  Before OEID 3.0, when you wanted to give the user the ability to understand “how much data” was behind a given Guided Navigation option, the typical answer was to use Refinement Counts.


As you can see above, this construct gives a numerical value to the frequency of a given attribute value in the current data set.  However, this number often causes confusion for users.  Is it the number of Invoices?  Is it the number of line items on my invoices?  Is it the number of Shipments?  Often, it’s none of these things and simply an artifact of how the data is being modeled.  With OEID 3.0, there is a new way to visually display this frequency data, without the messiness of (often) meaningless numbers.
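
To make the ambiguity concrete, here is a minimal sketch in plain Python (not OEID or LQL code).  The records, field names and helper functions are all hypothetical; the only assumption is that the data has been denormalized to one record per invoice line item, so a record count per city no longer matches the number of invoices.

```python
from collections import Counter

# Hypothetical denormalized data: one record per invoice line item.
records = [
    {"invoice": "INV-1", "line": 1, "city": "Toronto"},
    {"invoice": "INV-1", "line": 2, "city": "Toronto"},
    {"invoice": "INV-1", "line": 3, "city": "Toronto"},
    {"invoice": "INV-2", "line": 1, "city": "Ottawa"},
    {"invoice": "INV-3", "line": 1, "city": "Toronto"},
]


def record_counts(rows, attribute):
    """Count records per attribute value, the way a refinement count would."""
    return Counter(row[attribute] for row in rows)


def distinct_counts(rows, attribute, key):
    """Count distinct business keys (e.g. invoices) per attribute value."""
    seen = {}
    for row in rows:
        seen.setdefault(row[attribute], set()).add(row[key])
    return {value: len(keys) for value, keys in seen.items()}


refinement = record_counts(records, "city")
invoices = distinct_counts(records, "city", "invoice")

for city in sorted(refinement):
    print(f"{city}: {refinement[city]} records, {invoices[city]} invoices")
```

In this made-up data set the record count for Toronto is twice its invoice count, which is exactly the kind of number a user is likely to misread.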

As you can see above, I can convey to users that most sales are occurring in Toronto with both versions of the product.  However, OEID 3.0 provides the immediate, visceral context that tells the user, my Toronto transactions are nearly three times as numerous.  In addition, the aforementioned absence of “strange numbers” eliminates confusion and encourages users to explore rather than over-analyze.
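
The idea behind the histogram can be sketched in a few lines of Python.  This is an illustration of the general technique (scaling every count against the largest one), not how OEID Studio renders the component; the function name, the bar width and the counts below are invented for the example.

```python
def histogram_bars(counts, width=20):
    """Scale each count against the largest one and return text bars."""
    if not counts:
        return {}
    largest = max(counts.values())
    if largest <= 0:
        return {value: "" for value in counts}
    return {
        value: "#" * round(width * count / largest)
        for value, count in counts.items()
    }


# Made-up record counts per city, for illustration only.
city_counts = {"Toronto": 300, "Montreal": 105, "Ottawa": 45}
bars = histogram_bars(city_counts)

for city, bar in bars.items():
    print(f"{city:<10} {bar}")
```

The raw numbers never reach the user; only the relative bar lengths do, which is what keeps the focus on "where is most of my data" rather than on what the number is counting.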

Multi-Lingual LQL Parsing And Validation

Continuing with the theme of Internationalization, the LQL Parsing Service now supports a language parameter when compiling and validating queries.  While English is still the lingua franca of the internals of the platform, having the ability to troubleshoot your queries in your native language is a huge plus.  Below, you can see the Metrics Bar Portlet returning my syntax error in Portuguese:

Note: For those of you following along, this is the “Unexpected Symbol” error where the per-select Where clause expects the criteria to be in parentheses.  At least I think it is; my Portuguese is a little rusty.

This concept is supported by the Parsing Service itself, so any application making use of the Endeca Server web services can leverage this functionality as well.

Languages in Studio vs. Languages in the Engine

One additional note on support for multiple languages is that Endeca Server actually supports more languages than OEID Studio has been translated into so far.  Users in OEID Studio have ten locales to choose from in the application:

  • German
  • English (United States)
  • French
  • Portuguese (Portugal)
  • Italian
  • Japanese
  • Chinese (Traditional) zh_TW
  • Chinese (Simplified) zh_CN
  • Korean
  • Spanish (Spain)

However, Endeca Server supports the above 10 in addition to the following 12 (with their language codes, as Endeca Server expects them, in parentheses):

  • Catalan (ca)
  • Czech (cs)
  • Greek (el)
  • Hebrew (he)
  • Hungarian (hu)
  • Dutch (nl)
  • Polish (pl)
  • Romanian (ro)
  • Russian (ru)
  • Swedish (sv)
  • Thai (th)
  • Turkish (tr)

Note that Endeca Server expects RFC 3066 codes, which will differ slightly from the locales that are used in Studio.  For example, setting the language of a given attribute to en_US would not work in Endeca Server while being a perfectly good locale in Studio.  The language would be “en” for Endeca Server in this case.
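
If you need to bridge the two, a small helper along these lines may be enough.  This is a sketch in plain Python, not part of any Endeca API: the function name and the keep_region parameter are hypothetical, and it simply reduces a Studio-style locale to its primary language subtag, as in the en_US to “en” example above.  Whether Endeca Server expects a region for the two Chinese variants (for instance zh-TW) is an assumption you should verify against the Endeca Server documentation before relying on it.

```python
def to_server_language(locale, keep_region=frozenset()):
    """Reduce a Studio-style locale such as en_US to a language code.

    keep_region is a hypothetical escape hatch: languages listed there
    keep their region, written with a hyphen as RFC 3066 tags are.
    """
    parts = locale.strip().replace("_", "-").split("-")
    language = parts[0].lower()
    if not language.isalpha():
        raise ValueError(f"not a locale: {locale!r}")
    if language in keep_region and len(parts) > 1:
        return f"{language}-{parts[1].upper()}"
    return language


print(to_server_language("en_US"))
print(to_server_language("zh_TW", keep_region={"zh"}))
```

Keeping the mapping in one function means that any exception you discover later only has to be handled in one place.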

That’s all for now.  More posts coming later today and tomorrow.  Happy Exploring!