The Challenges of Medical Big Data
In the 1980s, Japan pioneered the application of big data to marketing and business management. Sales data collected at convenience stores in Japan is instantly sent to host computers at company headquarters via point-of-sale systems. The data is accumulated daily and used in merchandise stocking at individual stores and in product development.
I had vaguely thought that progress in the collection and analysis of data regarding day-to-day clinical practice would lead to optimization aimed at simultaneously improving cost-effectiveness, accessibility, and quality. As it turned out, the road was not so smooth when it came to medical data, which is the ultimate personal information, touching as it does on matters of life and death. But as Japan’s social security system faces its biggest challenge ever in the countdown to a super-aged society, there is no time to be lost in tapping into big data in this realm.
Statements of Medical Expenses Go Electronic
Hospitals and clinics have a wealth of big data characterized by what are known as the 3Vs: volume, velocity, and variety. According to the Ministry of Health, Labor, and Welfare’s 2011 Patient Survey, there were 1.34 million inpatients and 7.26 million outpatients daily nationwide. High-mix, low-volume production is the norm on the clinical front, where patients receive a wide variety of services tailored to their individual illnesses.
The breakdown of treatment provided to each patient is itemized in a statement of medical expenses, which is then sent to an organization responsible for reviewing health insurance claims and reimbursing medical facilities. Merely digitizing this process caused a great fuss. Although most hospitals and clinics were using computers to manage medical expense statements by around 2006, as many as 1.4 billion sheets of paper were printed every year to submit the statements. It was only in 2011 that electronic statements and online filing of claims became obligatory in principle.
Since 2009, the government has stored records of electronic statements in its national database. If analyzed properly, these records can be used to generate value, the fourth V characterizing big data, but initiatives to that end have only just begun.
Enhancing Quality and Efficiency
Another type of big data that is already helping to improve the quality and efficiency of medical care is clinical data based on the Diagnosis Procedure Combination system. DPC is also included in the national database.
Medical care in Japan operates on a fee-for-service system, wherein all medical resources that went into a patient’s treatment are itemized in the bill. In response to strong criticism that such a system encourages overtreatment and overmedication, an inclusive payment system was sought as an alternative, in which per-diem rates are applied. DPC provides the classifications for determining the unit rates. Since a per-diem flat-fee payment system was introduced in 2003 at 82 advanced treatment hospitals, such as the head hospitals of university medical centers, more hospitals have joined in on a voluntary basis. As of the end of March 2014, around 1,860 hospitals were either participating or preparing to do so, and over 8 million units of clinical data were being added annually. Because DPC is a structured database, moreover, it has the potential to generate more value.
Hospital Comparisons Made Possible
DPC codes consist of 14 digits. In addition to the name of the disease or injury on which the most medical resources were spent during hospitalization, the code indicates in numerical form the patient’s age, weight, and level of consciousness, surgical and other procedures performed, drugs used, and complications and severity, which may affect the quantity of medical resources required (figure 2). Adding to this other pertinent information, such as days in hospital and medical costs, would make for data sets that can increase the transparency of medical information.
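To make the idea of a structured code concrete, here is a minimal Python sketch that splits a 14-character code into named fields. The field names and widths in `FIELD_LAYOUT` are assumptions chosen for illustration. They are not given in this article and should be checked against the official DPC specification. The sample code string is made up.

```python
# Hypothetical layout: (field name, width in characters). The widths sum to 14.
# These names and widths are illustrative assumptions, not the official definition.
FIELD_LAYOUT = [
    ("diagnosis", 6),                 # disease or injury using the most resources
    ("admission_type", 1),
    ("age_weight_consciousness", 1),  # patient age, weight, level of consciousness
    ("surgery", 2),
    ("procedure_1", 1),               # other procedures performed
    ("procedure_2", 1),               # e.g. drugs used
    ("complication", 1),
    ("severity", 1),
]


def split_dpc_code(code: str) -> dict[str, str]:
    """Split a 14-character code into the fields listed in FIELD_LAYOUT."""
    if len(code) != 14:
        raise ValueError(f"expected 14 characters, got {len(code)}")
    fields = {}
    position = 0
    for name, width in FIELD_LAYOUT:
        fields[name] = code[position:position + width]
        position += width
    return fields


if __name__ == "__main__":
    # Made-up code used only to show the mechanics of the split.
    for name, value in split_dpc_code("12345678901234").items():
        print(f"{name}: {value}")
```

Because every record carries the same fields in the same positions, records from different hospitals can be grouped and compared without manual interpretation, which is what distinguishes this kind of data from free-form billing statements.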
In the United States, the pioneer in the field, Diagnosis Related Groups were developed at Yale University in the early 1970s and marketed by 3M. Japan can pride itself on having developed a more sophisticated version of its own rather than simply “importing” the US system and yielding to US leadership. The Health Ministry compiles the DPC data submitted by hospitals and releases the statistics once a year, with the hospital names included.
Unlike statements of medical expenses, which are merely itemized lists of treatments performed, processed DPC data not only gives an overall picture of the medical services that were provided but also makes clear at a glance how large a discrepancy there can be in treatment between different medical institutions for the same diagnosis.
By viewing the data, patients and others can learn which hospitals offer what kind of care for their particular illness, including the length of hospitalization and costs. They can also compare hospitals for performance on such parameters as the average number of discharges per month, share of the medical district, and average length of stay. There are privately run hospital comparison websites that process the raw data released by the Health Ministry into a more navigable format, with a search function and other added features.
A chart compiled from DPC data comparing the number of surgeries performed at advanced treatment hospitals in Tokyo. The rows indicate hospital names, the columns indicate type of disease, and the figure at each intersection shows how many surgeries a hospital performed in a year for a specific disease.
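The hospital comparisons described in this section come down to grouping discharge records by hospital for a given diagnosis. The following Python sketch shows one way to compute the number of discharges, each hospital's share, and the average length of stay. The record fields (`hospital`, `diagnosis`, `days`) and the sample rows are hypothetical and do not reflect the format of the data the Health Ministry actually releases.

```python
from collections import defaultdict
from statistics import mean

# Made-up discharge records for illustration only.
SAMPLE_DISCHARGES = [
    {"hospital": "A", "diagnosis": "X", "days": 10},
    {"hospital": "A", "diagnosis": "X", "days": 14},
    {"hospital": "B", "diagnosis": "X", "days": 7},
    {"hospital": "A", "diagnosis": "Y", "days": 3},
]


def compare_hospitals(records, diagnosis):
    """Summarize discharges, share, and average stay per hospital for one diagnosis."""
    stays_by_hospital = defaultdict(list)
    for record in records:
        if record["diagnosis"] == diagnosis:
            stays_by_hospital[record["hospital"]].append(record["days"])

    total = sum(len(stays) for stays in stays_by_hospital.values())
    summary = {}
    for hospital, stays in stays_by_hospital.items():
        summary[hospital] = {
            "discharges": len(stays),
            "share": len(stays) / total,  # fraction of all cases with this diagnosis
            "avg_stay_days": mean(stays),
        }
    return summary


if __name__ == "__main__":
    for hospital, stats in compare_hospitals(SAMPLE_DISCHARGES, "X").items():
        print(hospital, stats)
```

A real comparison would also need to adjust for differences in patient severity before drawing conclusions about quality, which this sketch does not attempt.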
Qualitative Analyses May Lead to Shakeout of Hospitals
A data scientist specializing in medicine would be able to make more in-depth analyses that go into the quality of medical care. They could deduce whether or not the procedures at a hospital meet guidelines, for instance, or whether or not any complications developed during hospitalization. In oncology, an area in which drug development is particularly active, hospitals that still use outdated treatment methods could be identified, thereby encouraging the standardization of cancer treatment in Japan.
There are potential plans to eventually link pay-for-performance with the DPC system, but this is bound to arouse much controversy. Nevertheless, as public access increases for data on medical performance, hospitals that cannot maintain a high standard of treatment quality could face the threat of closure.
Lack of Continuity in Clinical Data
Professor Fushimi Kiyohide of Tokyo Medical and Dental University, who has academically supported DPC operation in Japan from an early stage, points out that there are problems to be worked out before DPC data can be utilized to its full potential. The biggest challenge is the inability to track patients over time.
As DPC data is submitted by each hospital, information is available on patients who are currently hospitalized there. But when it comes down to verifying the efficacy of a certain treatment, for instance, one must be able to keep track of patients who underwent the treatment and check whether they recovered, died, or were transferred to another hospital. It is important, then, to be able to connect the dots between multiple records that are highly personal in nature and yet lack personally identifiable information.
Convenience stores in Japan link up purchase data within their chain by having customers register their profile and issuing point cards in return, which the customers then use each time they shop. Needless to say, the importance of linking anonymized personal information is far greater in medical research. Fushimi goes so far as to say, “The fact that clinical data in Japan can’t be correlated with other data is a blemish in our history and undermines national interests.”
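One widely used way to connect records without exposing identities is keyed pseudonymization: each identifier is replaced by a token derived with a secret key, so that the same patient yields the same token at every institution while the token itself reveals nothing about the person. The Python sketch below illustrates the idea with HMAC-SHA256 from the standard library. It is an assumption about how such linking could be done, not a description of any system in use in Japan, and the record fields and key are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable token from an identifier using a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(records, secret_key: bytes):
    """Group records by pseudonym and drop the raw identifier from each record."""
    linked = defaultdict(list)
    for record in records:
        token = pseudonym(record["patient_id"], secret_key)
        # Keep everything except the raw identifier.
        stripped = {key: value for key, value in record.items() if key != "patient_id"}
        linked[token].append(stripped)
    return dict(linked)


if __name__ == "__main__":
    # Made-up records: the same patient appears at two hospitals.
    records = [
        {"patient_id": "0001", "hospital": "A", "event": "surgery"},
        {"patient_id": "0001", "hospital": "B", "event": "follow-up"},
        {"patient_id": "0002", "hospital": "A", "event": "admission"},
    ]
    key = b"example-key-held-by-a-trusted-party"  # hypothetical key
    for token, history in link_records(records, key).items():
        print(token[:12], history)
```

The design choice that matters here is the secret key: without it, tokens cannot be recomputed from known identifiers, so researchers can follow a patient's course across hospitals without being able to learn who the patient is.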
Potential Application of “My Numbers” to Medicine
The purpose of analyzing big data in research is to quantitatively and qualitatively test hypotheses. While age and gender are vital information, individual identities are not, and all that is needed is to be able to correlate data by a relevant personal identification number. But this is where the difficulty comes in.
A social security and taxation number system, wherein an individual’s social security information including pensions and tax payment information are all stored under a single ID number, commonly known as My Number, will be launched as early as 2016. It will be the start of the national identification number system, which had been a pipe dream for so many years. Even this has been met with concerns over security and privacy protection, however, and there is a strong feeling that any attempts to link these ID numbers with medical records need to proceed with extreme caution. The MHLW and Japan Medical Association had been contemplating the introduction of a separate medical ID for information regarding medical and nursing care.
The government is reportedly looking to apply the My Number system in the medical field, so that, with the individual’s consent, medical institutions and nursing care facilities can share his or her medical data. Moreover, a bill to revise the Act on the Protection of Personal Information will be submitted to the ordinary session of the Diet as early as fiscal 2015, enabling personally unidentifiable information to be passed on to third parties without the individual’s consent.
If either the medical ID number or My Number were to be coordinated with health insurance, individuals would be able to keep tabs on their own health-related data, such as health checkups and clinical records, across their lifetime, reducing redundant tests and medications. If it became possible to conduct research using statements of medical expenses, DPC, and other data on the national database, this would promote their application to real-life measures. Another kind of big data in the medical sector is genetic information, which could potentially be applied to genomic drug discovery and individualized medicine. All of these scenarios would help alleviate the current financial pressure on Japan’s medical system.
The True Value of Data Is in Evaluation and Verification
In closing, let us come full circle to convenience stores. Point-of-sale systems first came into widespread use in the United States. But whereas their original purpose was to prevent input errors and fraud at the cash registers, Seven-Eleven Japan, formerly a subsidiary of the US-born chain, came up with the idea of applying POS data to marketing.
Seven-Eleven Japan is known for its rigorous hypothesize-and-test approach in its use of POS data. Rather than simply stocking up on a product because it is selling well, each store builds a hypothesis based on weather, location, and various other factors, orders accordingly, makes creative efforts to sell the products, and then uses the POS data to check how well the products actually sold. Similarly, in the medical sector, I would hope that clinical big data is not simply used for the purpose of quantitative regulation or for denying health insurance but is taken a step further and harnessed to verify the outcomes of measures that are being taken.