Is Data Making Healthcare Sick?

Dinesh Rai MD
7 min read · Aug 14, 2022

Data is critical to the success of the healthcare industry. Healthcare data helps providers make informed decisions about patient care, allows pharmaceutical companies to develop new and better treatments, and aids in the overall advancement of medical science. The healthcare industry has many different ways to generate and collect data. Electronic medical records (EMRs) are the core source of information that allows providers to track and manage their patients’ healthcare needs, as well as results from bloodwork, imaging, biopsies, etc. Patient surveys collect feedback on patient satisfaction and hospital performance. Clinical trials and experiments gather data on new treatments and medications. Finally, wearable technology such as smartwatches, fitness trackers, and even ingestible microchips collects biomedical data.

However, despite its importance, healthcare data is often plagued by three big problems: it’s siloed, it’s unstructured, and it’s often of poor quality. These problems can make it difficult for different parts of the healthcare industry to communicate and collaborate effectively, which can ultimately lead to subpar patient care. In this article, we’ll explore each of these three big problems in detail and offer some possible solutions.

What is healthcare data?

In healthcare, data is everything. From patient medical records to financial information, every byte of data collected by a healthcare organization has the potential to be analyzed and used to improve patient care. In today’s digitized world, the volume of data created by healthcare organizations is growing exponentially. Making sense of this growing data deluge requires understanding how healthcare data is stored, accessed, and analyzed. The challenge for healthcare organizations today is finding solutions that make it easier to manage the growing volume of health information cost-effectively.

What data problems does healthcare have?

One of the biggest problems with healthcare data is that it’s siloed. That is, it’s stored in isolated databases that are neither connected nor interoperable. This can make it difficult for different parts of the healthcare industry to access and share data. Even if various databases contain essential information about a patient’s health, they may not be linked together, making it difficult for hospitals and physicians to access all relevant data. Imagine a patient goes to the ER with chest pain. The ER doctor may order a number of tests, including a CT scan and an EKG. However, suppose the hospital’s database is not linked to the database used by the patient’s primary care physician. In that case, the ER doctor may not be able to access the results of the patient’s last CT scan, which was performed six months ago. As a result, the ER doctor may not have all of the information they need to make an informed decision about how to treat the patient. Silos can also increase the risk of security breaches, because the same information has to be retrieved from multiple databases.

This problem is compounded by the fact that healthcare data is often unstructured. That is, it’s not organized in a way that makes it easy to search and analyze. One way to think of unstructured data is to imagine a filing cabinet. In a traditional filing cabinet, information is organized into folders and files. This structure makes it easy to find the information you’re looking for. However, if the information in the filing cabinet isn’t organized, it can be challenging to find what you need. The same is true for unstructured data: without organization, the insights you need are hard to find. There are many types of unstructured data in healthcare. Some of the most common include physician notes, radiology images, and free-text reports, which don’t reside in neat databases.

The third big problem with healthcare data is that it’s often of poor quality. One of the main causes of poor data quality in healthcare is errors in data entry. This can happen when someone enters incorrect information or doesn’t follow the proper procedures for inputting data. For example, if a patient’s weight is entered in pounds in a field that expects kilograms, every downstream calculation that uses that weight will be wrong. Another common cause of poor data quality is incorrect coding. This can happen when the wrong code is used to describe a patient’s condition or treatment. Inconsistencies in data collection and storage can also lead to poor data quality. Different medical record systems follow different standards for data storage, which leads to interoperability difficulties. Data is also often incomplete for several reasons, including patients declining to provide information, data never being collected, and data being lost. Quality data is vital to the delivery of excellent healthcare, at both the patient and population levels.
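To make the weight example concrete, here is a minimal sketch of how storing a unit alongside each value lets software normalize it before any calculation. The function name and the sample value of 154 are illustrative assumptions, not taken from any particular record system. The conversion factor is the standard definition of the pound.

```python
LB_TO_KG = 0.45359237  # one pound in kilograms, exact by definition


def weight_in_kg(value: float, unit: str) -> float:
    """Convert a recorded weight to kilograms based on its stated unit."""
    unit = unit.strip().lower()
    if unit == "kg":
        return value
    if unit == "lb":
        return value * LB_TO_KG
    raise ValueError(f"unsupported weight unit: {unit!r}")


# Hypothetical entry: a patient weighs 154 lb.
correct = weight_in_kg(154, "lb")   # roughly 69.9 kg
mistaken = weight_in_kg(154, "kg")  # the pounds value treated as kilograms
overestimate = mistaken / correct   # how far off downstream math would be

print(f"correct: {correct:.1f} kg, mistaken: {mistaken:.1f} kg")
print(f"downstream calculations are off by a factor of {overestimate:.1f}")
```

In this sketch, any calculation that scales with weight would come out a little more than twice as large as it should.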

What are some solutions?

As you can see, several serious issues need to be addressed in healthcare data. One possible solution to the siloed healthcare data problem is to use a data warehouse. A data warehouse is a database designed to store and analyze large amounts of data. Data warehouses are often used to combine data from multiple sources, such as different hospitals or different clinics. A unified database can make it easier for decision-makers to access the information they need. In healthcare, data warehouses are often used to combine claims data, clinical data, and financial data. Other solutions are data virtualization and data federation. Data virtualization is the process of creating a single view of data from multiple sources. It acts as a single point of access for distributed databases. Data federation is similar to data virtualization, except it also creates a standard data model. Let’s say you want to trend a patient’s white blood cell count across outpatient and inpatient records. These two databases may label the value differently, such as “WBC” and “white blood cell count.” With virtualization, a user can reach both databases from one access point but still has to issue two commands, one for each naming convention. Data federation would use standard terminology, like “WBC,” and map it to the naming conventions used in the various databases. Now the user only needs to issue one command. Federation is much harder to build, since the standard model has to account for many different use cases and data sets.
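The federation idea can be sketched in a few lines. This is a simplified illustration, not how any specific federation product works: the mapping table, the record layout, the function names, and the sample values are all hypothetical.

```python
# Hypothetical mapping from each source's local label to one standard term.
CANONICAL_NAMES = {
    "wbc": "WBC",
    "white blood cell count": "WBC",
}


def to_canonical(local_name: str) -> str:
    """Translate a source-specific label into the standard term, if known."""
    return CANONICAL_NAMES.get(local_name.strip().lower(), local_name)


def federated_query(term: str, sources: list) -> list:
    """Collect (date, value) pairs for `term` from every source, oldest first."""
    results = []
    for records in sources:
        for record in records:
            if to_canonical(record["test"]) == term:
                results.append((record["date"], record["value"]))
    return sorted(results)


# Made-up sample records; each source uses its own naming convention.
outpatient = [{"test": "WBC", "date": "2022-01-10", "value": 6.1}]
inpatient = [{"test": "white blood cell count", "date": "2022-03-02", "value": 11.4}]

# One command covers both sources because the mapping hides the differences.
trend = federated_query("WBC", [outpatient, inpatient])
print(trend)
```

The hard part in practice is building and maintaining the mapping table, which is where the extra technical effort of federation goes.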

There are several solutions to the problem of using unstructured data in healthcare. Natural language processing (NLP) is a subfield of artificial intelligence that deals with the interaction between computers and human (natural) languages. In healthcare, NLP is used to extract relevant information from unstructured sources such as clinical notes, discharge summaries, and radiology reports. That information can support decision-making, improve patient safety, and drive quality improvement initiatives. In other words, NLP creates structured data sets from unstructured data: its algorithms automatically identify key concepts in text documents, structure the information for further analysis, and enable better search and retrieval. With the increasing volume of digital health data generated daily, NLP will play an increasingly important role in supporting healthcare decision-makers. Another solution is manual data entry, in which staff read unstructured sources and enter the information into a structured data set. Manual data entry can be time-consuming and error-prone, but it may be necessary for some data sources.
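As a toy illustration of turning free text into structured fields, the sketch below uses regular expressions to pull two vital signs out of a note. This is a deliberately simple rule-based stand-in, not a real clinical NLP system, which would need to handle negation, abbreviations, misspellings, and context. The patterns, field names, and sample note are assumptions made for the example.

```python
import re

# Simple patterns for "BP 142/88" and "HR 96" style mentions (assumed formats).
BP_PATTERN = re.compile(r"\bBP\s*:?\s*(\d{2,3})\s*/\s*(\d{2,3})\b", re.IGNORECASE)
HR_PATTERN = re.compile(r"\b(?:HR|heart rate)\s*:?\s*(\d{2,3})\b", re.IGNORECASE)


def extract_vitals(note: str) -> dict:
    """Return whichever vital signs the patterns can find in a free-text note."""
    structured = {}
    bp = BP_PATTERN.search(note)
    if bp:
        structured["systolic_mmHg"] = int(bp.group(1))
        structured["diastolic_mmHg"] = int(bp.group(2))
    hr = HR_PATTERN.search(note)
    if hr:
        structured["heart_rate_bpm"] = int(hr.group(1))
    return structured


# Made-up note text for demonstration only.
note = "Pt presents with chest pain. BP 142/88, HR 96. No prior cardiac history."
vitals = extract_vitals(note)
print(vitals)
```

Once the values are in a dictionary like this, they can be stored in a database column, searched, and trended like any other structured field.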

Many initiatives can help to improve the quality of healthcare data. Data governance is the process of managing and protecting data. It includes developing policies and procedures for collecting, storing, accessing, and using data. In medicine, it is common to encounter data that does not conform to the expected standards. By setting and enforcing standards, healthcare organizations can ensure their data is uniform across many databases. Policies should be in place to ensure that only authorized personnel can enter and access specific data. An organization can implement quality control processes to check that data is of the correct type, is filled in completely, and is consistent with the standards set. Data governance procedures can be automated, like server-side validation of vital signs (e.g., all fields must be filled out and pulse oximetry readings must fall between 0% and 100%). They can also involve manual processes such as training staff to double-check their entries.
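The vital-sign check mentioned above might look something like the following. It is a minimal sketch under assumed field names. A real system would validate every field against clinically reviewed ranges, not only pulse oximetry.

```python
# Hypothetical field names for one vital-signs entry.
REQUIRED_FIELDS = ("heart_rate_bpm", "systolic_mmHg", "diastolic_mmHg", "spo2_percent")


def validate_vitals(entry: dict) -> list:
    """Return a list of problems with the entry; an empty list means it passed."""
    errors = []

    # Completeness check: every required field must be present and filled in.
    for field in REQUIRED_FIELDS:
        if entry.get(field) is None:
            errors.append(f"{field} is missing")

    # Type and range check: pulse oximetry is a percentage.
    spo2 = entry.get("spo2_percent")
    if spo2 is not None:
        if isinstance(spo2, bool) or not isinstance(spo2, (int, float)):
            errors.append("spo2_percent must be a number")
        elif not 0 <= spo2 <= 100:
            errors.append("spo2_percent must be between 0 and 100")

    return errors


# Made-up entries for demonstration only.
good = {"heart_rate_bpm": 96, "systolic_mmHg": 142, "diastolic_mmHg": 88, "spo2_percent": 97}
bad = {"heart_rate_bpm": 96, "systolic_mmHg": 142, "spo2_percent": 970}

print(validate_vitals(good))
print(validate_vitals(bad))
```

Running the check on the server, rather than only in the entry form, means the rule applies no matter which application submits the data.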

Conclusion

Data is vital to the success of any healthcare organization. In the rapidly evolving world of data analysis, data management, and artificial intelligence, data is the key to reducing costs and improving quality of care. Organizations that build a culture where data is viewed as an asset will be the ones that advance the science of medicine.


I am a physician interested in the intersection of medicine and technology.