When discussing AI projects in healthcare, people usually focus on data. They talk about the availability of “clean,” labeled data to train deep learning models and the need for datasets that represent diverse patient populations. They also discuss the importance of an appropriate infrastructure to support moving those large amounts of training data (including imaging, electronic health record [EHR], genomic, and claims data) quickly and efficiently to GPU compute environments.
All those considerations are important, but there’s another key factor in the AI equation that doesn’t get as much attention: data scientists.
Data scientists are the cornerstone of AI research, at the center of teams that include clinical, IT, administrative, legal, and other resources. A relatively new occupation, data science came into its own in the 21st century. It quickly became the focus of much discussion and admiration, perhaps best exemplified in the 2012 Harvard Business Review article “Data Scientist: The Sexiest Job of the 21st Century.”
Although higher-education institutions have jumped to fill the void that’s described in that article, with over 830 different data science programs available, there still aren’t enough data scientists to meet the demand. This excess of demand over supply drives up salaries, just as basic supply-and-demand economics predicts.
What happens in a hot market? People move around to maximize their compensation, and, most importantly, to maximize their ability to do the job that they were hired to do—which is also what they love doing. In such an effervescent milieu, any organization that’s involved in AI projects, regardless of the industry that it operates in, should do whatever is possible to retain its data scientists.
A common complaint among data scientists is the time that they spend looking for, moving, and cleaning data. By many accounts, up to 80% of a data scientist’s time is spent on those tasks. It seems reasonable to conclude that such a skewed use of their time doesn’t contribute to job satisfaction.
So, what can a data management company like NetApp offer to alleviate that pain? In short, it can facilitate an integrated data pipeline with a single management platform that spans the edge, the core, and the cloud. With a data fabric that’s built on NetApp® ONTAP® software, your data scientists can have the right data available at the right time, in the right place, and at the right cost—seamlessly.
Esteban joined NetApp to build a Healthcare AI practice leveraging our full portfolio to help create ML-based solutions that improve patient care and reduce provider burnout. Esteban has been in the Healthcare IT industry for 15 years, having gone from being a storage geek at various startups to spending 12 years as a healthcare-storage geek at FUJIFILM Medical Systems. He's a visible participant in the AI-in-Healthcare conversation, speaking and writing at length on the subject. He is particularly interested in the translation of Machine Learning research into clinical practice and the integration of AI tools into existing workflows. He is a competitive powerlifter in the USAPL federation, so he will try to sneak early-morning training in wherever he's traveling.