UPDATED: This post was updated for 2018 to reflect new information and more examples. Enjoy!
This is the age of data. There’s data on everything; it’s powerful, it’s growing, and it’s even getting its own ‘brain’ through AI. Today, this is known as big data: any voluminous amount of structured, semi-structured, and unstructured data that has the potential to be mined for information. Big data is often characterized by “the Three Vs”, which are volume (the amount of data), velocity (the speed at which data is streamed), and variety (the different forms data comes in). With 287,078 FDA-registered clinical research studies taking place in the United States at the time of this article, a staggering quantity of clinical data is being produced.
As a result, vast amounts of structured and unstructured clinical data necessitate vast amounts of storage space. Along with the tremendous volume of data in existence come concerns over how securely it is being stored. This is a notable concern in clinical research, where the compliance and safety of data are paramount.
So, what should research teams take into consideration when evaluating a data collection and study management solution’s storage capabilities? In this post, we take a look at different elements of database storage and how they apply to clinical trials.
Storage Capacity
We are so accustomed to seeing gigabyte and terabyte storage sizes in the consumer marketplace that it is easy to overlook how little storage space most raw clinical data actually requires. In TrialKit, the developers have designed the database to store and retrieve data efficiently and to guard against data loss. It also makes the most of its storage space to prevent unnecessary costs for customers.
For example, consider that a single study in TrialKit may consist of 40 study sites entering data into the same database. If each site has a target enrollment of 50 patients, that’s a combined enrollment of 2,000 patients. For 2,000 patients, there could be 80,000 to a few hundred thousand records (eCRFs) containing the study data (assuming no external PDF or image files have been uploaded as supporting documents). This would amount to a mere 35 megabytes, or at most a single gigabyte, which your grandmother can store on her thumb-drive from 1997. This, of course, is a fairly large example. Most studies are much smaller than that.
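To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It works backward from the figures above (35 megabytes for 80,000 records) to an implied average record size, then scales that size to a larger record count. The function names, the decimal definition of a megabyte, and the 300,000-record figure standing in for "a few hundred thousand" are assumptions made for illustration, not TrialKit measurements.

```python
MB = 1_000_000  # decimal megabyte, assumed for simplicity


def implied_bytes_per_record(total_bytes: float, record_count: int) -> float:
    """Average record size implied by a total size and a record count."""
    return total_bytes / record_count


def estimate_storage_bytes(record_count: int, avg_bytes_per_record: float) -> float:
    """Storage needed for text-only records of a given average size."""
    return record_count * avg_bytes_per_record


# Figures from the example above: 35 MB for 80,000 eCRF records.
avg_size = implied_bytes_per_record(35 * MB, 80_000)

# Hypothetical upper case: 300,000 records at the same average size.
larger_study_bytes = estimate_storage_bytes(300_000, avg_size)

print(f"Implied average record size: {avg_size:.1f} bytes")
print(f"300,000 records: {larger_study_bytes / MB:.2f} MB")
```

Under these assumptions each record averages a few hundred bytes, and even the larger record count stays far below a gigabyte. Uploaded PDFs or images would change the picture considerably, which is why the example excludes them.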
Database Structure
A well-thought-out database structure translates directly into data safety and speed. For a web-based data collection system, those are important factors. The typical cloud storage service these days offers high-capacity storage driven by the demands of media-intensive consumers. However, those services offer very little functionality, as they are designed simply to store and fetch files. It’s a great model for those purposes, but entering and managing data from clinical trials requires a different approach. The data has large amounts of metadata tied to it; in other words, data about the data. Think in terms of queries, audit trails, and record relationships that need to be maintained.
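As a minimal sketch of what "data about the data" can look like, the Python example below uses the standard sqlite3 module to model records, an audit trail, and data queries as related tables. The table names, column names, and the update_value helper are hypothetical and chosen only for illustration; this is not TrialKit's actual schema.

```python
import sqlite3

# Hypothetical schema: each eCRF value belongs to a subject, and both the
# audit trail and data queries point back to the record they describe.
SCHEMA = """
CREATE TABLE subject (
    id INTEGER PRIMARY KEY,
    site_code TEXT NOT NULL
);
CREATE TABLE ecrf_record (
    id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES subject(id),
    form_name TEXT NOT NULL,
    field_name TEXT NOT NULL,
    value TEXT
);
CREATE TABLE audit_trail (
    id INTEGER PRIMARY KEY,
    record_id INTEGER NOT NULL REFERENCES ecrf_record(id),
    old_value TEXT,
    new_value TEXT,
    changed_by TEXT NOT NULL,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE data_query (
    id INTEGER PRIMARY KEY,
    record_id INTEGER NOT NULL REFERENCES ecrf_record(id),
    message TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'open'
);
"""


def update_value(conn: sqlite3.Connection, record_id: int, new_value: str, user: str) -> None:
    """Change a field value and log the change in the same transaction."""
    with conn:  # commits on success, rolls back on error
        (old_value,) = conn.execute(
            "SELECT value FROM ecrf_record WHERE id = ?", (record_id,)
        ).fetchone()
        conn.execute(
            "UPDATE ecrf_record SET value = ? WHERE id = ?", (new_value, record_id)
        )
        conn.execute(
            "INSERT INTO audit_trail (record_id, old_value, new_value, changed_by) "
            "VALUES (?, ?, ?, ?)",
            (record_id, old_value, new_value, user),
        )


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO subject (id, site_code) VALUES (1, 'SITE-01')")
conn.execute(
    "INSERT INTO ecrf_record (id, subject_id, form_name, field_name, value) "
    "VALUES (1, 1, 'Vitals', 'systolic_bp', '120')"
)
conn.commit()

update_value(conn, 1, "125", "coordinator_a")
history = conn.execute(
    "SELECT old_value, new_value, changed_by FROM audit_trail WHERE record_id = 1"
).fetchall()
print(history)
```

The point of the sketch is that the stored value itself is tiny, while the relationships around it (who changed it, what it was before, which queries are open against it) are what a file-oriented storage service does not provide.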
In clinical research, or any situation that gathers clinical data, customer data, or company data, a database with a dynamic and highly organized structure is critical. Clinical data, for most purposes, consists primarily of simple text and values. This data does not come anywhere close to the size of media files. For most studies taking place today, research databases need less raw capacity and more behind-the-scenes design quality. The TrialKit platform is built on that principle; moreover, it is customizable and simple enough for all users. TrialKit’s native mobile app makes entering and navigating data even faster than is possible in a modern web browser.
Security and Compliance
The threat of data breaches spurs the healthcare industry to protect sensitive and confidential patient data as much as possible. With one too many cyber attacks in 2018, organizations are taking extra precautions to secure medical records, and they seek data collection tools that offer strong security measures.
Database security and system compliance go hand-in-hand. Technology vendors should equip research teams with a solution that adheres to rules outlined by regulatory agencies. For instance, it’s critical that a system that houses clinical data be fully compliant with the FDA’s 21 CFR Part 11 requirements. This regulation, in short, sets the criteria under which electronic records and electronic signatures are considered trustworthy, reliable, and equivalent to paper records; those criteria are met through the use of audit trails, validation, and other system controls.
In TrialKit, all data is protected at the highest levels against potential vulnerabilities. Data entering and exiting the servers is encrypted using the healthcare-industry standard 128-bit SSL and 2048-bit RSA public keys. Additionally, the primary data centers are hosted on SOC-certified AWS (Amazon Web Services) infrastructure to ensure optimal performance, security, and reliability for users in various regions of the globe.
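For readers curious what encrypted transport looks like from the client side, here is a small, hedged sketch using Python's standard ssl module. It builds a client context that verifies the server's certificate and refuses older protocol versions. The make_client_context name and the TLS 1.2 floor are assumptions made for this example; they illustrate the general idea and do not describe TrialKit's actual server configuration.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side context that verifies certificates and hostnames."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Assumed policy for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
print(ctx.minimum_version.name)
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

A context like this can be passed to standard library clients such as http.client.HTTPSConnection so that every request to a data collection server travels over a verified, encrypted channel.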
When clinical research teams are evaluating study sizes and deciding on the value of a data collection and study management system such as TrialKit, the biggest question should not be one of storage or reliability of data. TrialKit eliminates concerns surrounding safe storage for clinical data, freeing research teams to focus on managing their studies and collecting data in the most effective manner. To learn more about how to collect and store data using TrialKit, get in touch with us today.