Beginning to Bloom in Health Care
ADVANCE Assistant Editor
Data repositories. Data warehouses. In today’s computer-oriented environment, every health information management (HIM) professional is certain to come across these nebulous terms. What do these designations mean, and how do they differ?
The answer seems to depend on who you ask. Some professionals make no distinctions between a “data repository” and a “data warehouse,” and they use these terms interchangeably. Others argue there are differences between the two, however subtle.
But one thing is certain: these data depots are here to stay. In today’s managed care environment, accessing and correctly utilizing these pools of information can be crucial to an organization’s survival. Furthermore, many observers indicate that these data vaults are an important component in implementing the computer-based patient record (CPR) of the future.
Is It a Repository or a Warehouse?
“I get the impression that the health care industry is having difficulty making a distinction between these two terms,” noted Cynthia Miller, manager of marketing and sales support in the health care division of ALLTEL Information Services in Atlanta. “In other industries, such as banking, there are clear definitions for and distinctions between these entities.”
Michael Lopez, vice president and chief information officer of the Moses Cone Health System in Greensboro, NC, agreed. “In health care, there are no standard, precise definitions for either one of these terms,” he stated. “When people do make distinctions between the two, there is usually a tremendous amount of overlap.”
To understand either one of these terms, you must first understand the difference between “data” and “information.” Chris Maxwell, vice president of Healthcare Research Affiliates Inc. in Mechanicsburg, PA, explained, “Raw data is meaningless; it is nothing but numbers and facts jumbled together. Data has to undergo various transformation processes before it becomes meaningful information that people can use. What people really want from either data repositories or data warehouses is information, not data.”
Among those who have identified distinctions between the terms, there is agreement on certain factors. The term “repository” is thought by some to be synonymous with “storage area”—to them, the title implies nothing more or less. The term “warehouse,” on the other hand, conjures images of a more meaningful information source.
Claire Dixon-Lee, PhD, RRA, product marketing manager of Medicus Systems Corp. in Evanston, IL, believes that in a repository, data are simply deposited as captured, whereas in a warehouse, the data are categorized and sorted first. “In a warehouse, data are input and maintained in a specified fashion, to guarantee easy access and retrieval.”
Some feel the crux of the disparity lies in the different uses for data stored in a repository vs. data contained in a warehouse. “When I hear the term ‘repository,’ I think of multiple clients feeding data into a centralized area,” stated Richard Bankowitz, MD, director of clinical information management at the University Health System Consortium in Oak Brook, IL. “It’s a working collection of data that people can access immediately; however, it is not presented in a format that facilitates higher-level queries.”
“A repository is a collection of data that supports real-time transactions, such as patient care order entry systems or patient billing functions,” Miller commented. “The name warehouse implies that the information stored within is going to be used for different purposes—to conduct research studies, outcomes analysis, variance analysis, managed care contract analysis or market population studies.”
Most experts concur that a data warehouse consists of databases that have been linked together. “A database is an organized collection of data that can be utilized with a data management tool or application,” Dr. Dixon-Lee explained. “Databases are designed with one or more specific purposes in mind. For example, you might have a database that will generate routine analysis reports based on financial transactions. The problem with databases, however, is that you cannot stretch the information to suit any other purpose. Databases definitely have limits.”
“A warehouse is often a conglomerate of databases that have been linked together,” said Dr. Bankowitz. “But, upon entering the warehouse, the data are aggregated, edited, summarized and/or indexed in order to convert it from a raw state into something useable.
“Redundancies must be eliminated within raw data,” Dr. Bankowitz elaborated. “For example, there might be 20 different ways to characterize a diabetic, or three trade names for the same drug; you have to accommodate these disparities. After the data are edited, other factors have to be taken into account. You might want to risk-adjust or cost-adjust the data based on certain variables. When the information is in a coherent format, users can access the data and view it in whatever form they desire.”
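The editing step Dr. Bankowitz describes can be pictured as a lookup that maps every variant spelling or trade name onto one canonical term before the data are stored. The following is only a minimal sketch under stated assumptions: the synonym tables, field names and sample records are hypothetical and invented for illustration, and a real warehouse would rely on a maintained clinical vocabulary rather than a hand-built dictionary.

```python
# Hypothetical synonym tables: every entry here is illustrative only.
DRUG_SYNONYMS = {
    "tylenol": "acetaminophen",
    "panadol": "acetaminophen",
    "acetaminophen": "acetaminophen",
}
DIAGNOSIS_SYNONYMS = {
    "diabetic": "diabetes mellitus",
    "dm": "diabetes mellitus",
    "diabetes": "diabetes mellitus",
    "diabetes mellitus": "diabetes mellitus",
}


def normalize(value, synonyms):
    """Return the canonical term for value, or the cleaned value if unknown."""
    key = value.strip().lower()
    return synonyms.get(key, key)


def normalize_record(record):
    """Rewrite one raw record so diagnosis and drug use canonical terms."""
    return {
        "patient_id": record["patient_id"],
        "diagnosis": normalize(record["diagnosis"], DIAGNOSIS_SYNONYMS),
        "drug": normalize(record["drug"], DRUG_SYNONYMS),
    }


# Three raw records that describe the same situation in three different ways.
raw_records = [
    {"patient_id": "P001", "diagnosis": "Diabetic", "drug": "Tylenol"},
    {"patient_id": "P002", "diagnosis": "DM", "drug": "panadol "},
    {"patient_id": "P003", "diagnosis": "diabetes mellitus", "drug": "Acetaminophen"},
]

clean_records = [normalize_record(r) for r in raw_records]

# After editing, all three records share one diagnosis/drug combination.
distinct_pairs = {(r["diagnosis"], r["drug"]) for r in clean_records}
```

Once the variants collapse to a single term, a query for diabetic patients on a given drug no longer has to anticipate every way the source systems might have recorded them.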
Uses of the Data Repository
By making information instantly accessible and allowing for many different types of transactions, data repositories empower users by enabling them to make more informed choices based on comparative analysis. In the health care arena today, who is tapping into and utilizing these information arsenals?
Across the board, there seem to be two types of data repositories common in the field of health care. The first is a repository dedicated to activities associated with patient care and financial transactions occurring on a “real-time” basis; the second is a repository containing clinical, administrative and financial data that will be used to support informational analysis.
“Currently, there is no single database program available that will support both types of repositories simultaneously,” explained Miller. “In the health care market today, you need to maintain two separate databases. One will contain clinical and administrative information, which must be accessible in real time to identify patients and determine their eligibility and to support patient care functions such as order entry, results retrieval and patient care documentation. This database must be available 24 hours a day and provide quick responses, so that health care providers will utilize it.
“The second will store much of the same clinical, administrative and financial information,” Miller elaborated. “But this repository is designed and optimized for data analysis purposes, such as clinical research, patient outcome analysis, population studies and administrative decision support, for example.”
Lopez noted that attempting to combine the two data banks could cause the system to become very slow. “If you try to unite the information into one repository and a user performs a very complicated research study, it could bring the transactional function of the system to a grinding halt,” he cautioned. “By keeping the two types of data repositories separate, you can ensure a very quick response time on the administrative side, where this is crucial.”
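The separation Miller and Lopez describe can be sketched with two small databases: one that takes real-time transactions and one that is refreshed periodically and carries the analysis workload. This is an illustrative sketch only, not a description of any vendor’s product; it uses Python’s built-in sqlite3 module with in-memory databases, and the table names, column names and sample charges are assumptions made up for the example.

```python
import sqlite3

# Two separate stores (both in memory here, purely for illustration).
transactional = sqlite3.connect(":memory:")  # real-time order entry
analytical = sqlite3.connect(":memory:")     # research and decision support

transactional.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, patient_id TEXT, department TEXT, charge REAL)"
)
analytical.execute(
    "CREATE TABLE department_summary ("
    "department TEXT PRIMARY KEY, order_count INTEGER, total_charge REAL)"
)


def place_order(patient_id, department, charge):
    """A quick, single-row write against the transactional store."""
    with transactional:  # commits on success
        transactional.execute(
            "INSERT INTO orders (patient_id, department, charge) VALUES (?, ?, ?)",
            (patient_id, department, charge),
        )


def refresh_analytical_store():
    """Periodic load: summarize the orders and copy the result across.

    In practice this would be scheduled for a quiet period so the read
    does not compete with patient care transactions.
    """
    rows = transactional.execute(
        "SELECT department, COUNT(*), SUM(charge) FROM orders GROUP BY department"
    ).fetchall()
    with analytical:
        analytical.execute("DELETE FROM department_summary")
        analytical.executemany(
            "INSERT INTO department_summary VALUES (?, ?, ?)", rows
        )


place_order("P001", "laboratory", 40.0)
place_order("P002", "laboratory", 60.0)
place_order("P001", "radiology", 150.0)
refresh_analytical_store()

# Analysis queries run only against the second store.
summary = analytical.execute(
    "SELECT department, order_count, total_charge "
    "FROM department_summary ORDER BY department"
).fetchall()
```

Because researchers query only the second store, a long-running study slows nothing on the order entry side; the trade-off is that the analysis data are only as current as the last refresh.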
Having information at users’ fingertips should foster better-informed decision making. “Data repositories are a technological tool to assist users in making informed choices,” asserted Maxwell. “Employers gain a more accurate portrait of health care services available on the market; practitioners have accurate analyses pertaining to treatment outcomes; and patients will receive higher quality care.”
Aside from saving facilities monetary and human resources, data repositories perform another function for the entities that utilize them. “Repositories will be used in tandem with the CPR in the future,” stated Dr. Dixon-Lee. “You can’t operate a CPR without having your data grouped together and stored in a useable format.”
John Morgan, PhD, a consultant for 3M Health Information Systems, explained that instituting and maintaining data repositories is a step in the right direction for the future. “No one will be able to create a CPR in one step; it’s too huge a process,” Dr. Morgan stressed. “The products being sold now allow users to make gradual notches up the ladder toward implementing a CPR.”
Essential Elements for Creating a Data Repository
Perhaps you’ve been sold on the benefits of data repositories and now want to implement one within your facility. Is the process merely a matter of filtering information into a common source, pressing a button and—Presto!—facts and figures will spit out in a useable format?
Not exactly. Dr. Bankowitz cautioned that planning is essential. “The first thing an organization must do is decide how the information will be used. This will define how the data within the repository needs to be structured.”
As a first step toward information synthesis, integration at the technological level is important. “It is imperative to have certain significant feeder systems automated in order to get vital information into the repository,” asserted Lopez. “The biggest three are usually pharmacy, radiology and laboratory systems because the information coming from these departments constitutes a critical mass of information. Without a critical mass of data, you have the equivalent of an empty vault—there’s nothing in there worth retrieving.
“To achieve this, a facility’s systems should be based on open architecture,” Lopez continued. “There also should be desktop tools, such as Microsoft Office or Lotus spreadsheet programs, which can be used to access the information.”
Equally essential is the need to integrate at the information level. “It’s relatively simple to link up disparate computer systems so that they ‘communicate’ with one another,” Dr. Bankowitz asserted. “However, unless the data have been aggregated, standardized and synthesized, it won’t make any sense. Don’t think creating a repository is as simple as buying a piece of technology and implementing it—if that’s all you plan to do, you’ll end up with a worthless collection of unrelated data.”
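The difference between linking systems and integrating information can be illustrated with the three feeder systems Lopez names. In the sketch below, each department’s feed labels the patient identifier differently, so the records must be standardized on a common key before they can be combined into a single view per patient. The feed contents, field names and function name are hypothetical, chosen only to make the example self-contained.

```python
from collections import defaultdict

# Hypothetical feeds: each department names the patient identifier differently.
pharmacy_feed = [{"pt_id": "P001", "drug": "acetaminophen"}]
radiology_feed = [{"patient": "P001", "exam": "chest x-ray"}]
laboratory_feed = [
    {"PatientID": "P001", "test": "glucose"},
    {"PatientID": "P002", "test": "glucose"},
]

# For each source: its name, its records, and the field holding the patient ID.
FEEDS = [
    ("pharmacy", pharmacy_feed, "pt_id"),
    ("radiology", radiology_feed, "patient"),
    ("laboratory", laboratory_feed, "PatientID"),
]


def build_patient_view(feeds):
    """Group records from every feed under one standardized patient ID."""
    view = defaultdict(lambda: defaultdict(list))
    for source, records, id_field in feeds:
        for record in records:
            patient_id = record[id_field]
            # Keep everything except the source-specific ID field.
            details = {k: v for k, v in record.items() if k != id_field}
            view[patient_id][source].append(details)
    # Convert the nested defaultdicts to plain dictionaries for callers.
    return {pid: dict(sources) for pid, sources in view.items()}


patient_view = build_patient_view(FEEDS)
```

Without the mapping of each feed’s identifier onto a common key, the three systems could exchange records all day and still leave the user with the unrelated collection Dr. Bankowitz warns about.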
As the speed limit is raised on the information highway, the role of data repositories in the health care industry is going to increase as well. “I think managed care companies are going to want a lot more information on the patients they are covering,” Dr. Dixon-Lee predicted. “They will demand a more complete and detailed profile of patients, in an attempt to ascertain how many times each individual will access the system. This means amassing even more clinical information and creating larger data repositories from which to work.”