This September, our PharmaTec series discussion group focused on cutting-edge data analytics. Biopharma data analytics has seen important advancements over the past few years, connecting different data types across drug discovery and development. New techniques have changed how we can use omics, real-world and clinical data. Our Data Analytics Discussion Group brought together key opinion leaders for an hour of discussion, taking a deep dive into these areas as well as data assets and infrastructure.
Lili Peng, Associate Director (External Innovation Data Sciences, Biogen) opened this month’s discussion group proceedings with an introductory presentation titled “Data Analytics: Innovation Opportunities & Challenges for The Biopharma Industry”. Some of the key topics our experts covered included whether analytics should start with the data or a use case first, different analytical platforms and the application of AI/ML technologies.
Joining Lili as a panellist this month to foster discussion was Giovanni Dall’Olio (Data Strategy and Design Manager, GlaxoSmithKline). We also had representation from several key pharma companies in the field, including Amgen, Roche Pharmaceuticals and GlaxoSmithKline.
Moving Forward: From Data Silos to Data Lakes and Data Meshes
Traditionally, pharma data has been stored in separate “silos”. A data silo is a repository of data that is owned by a single group and isolated from other departments. This creates difficulties when analysing data, especially when dealing with larger data sets. These challenges have created demand for a next-generation solution, which has appeared in the form of “data lakes”.
A data lake is an approach to information management and enterprise reporting that keeps all of an organisation’s information in one place. It is a centralized repository for structured and unstructured data, which can then be used to derive analytics for the business. Designed for big data analytics, data lakes are intended to address the challenges commonly associated with data silos.
Another approach that has been gaining popularity is the data mesh. Giovanni Dall’Olio (Data Strategy and Design Manager, GlaxoSmithKline) explains that he has been “looking at this data mesh philosophy, where your data is a product. So, you apply software engineering practices to data. And then instead of having everything into one data lake, you have data domains. So, for example, if you use the cloud, you can have a catalogue, a page or something where people can search for data sets. Data is owned by the groups that created it. So, the business does not take our research, they build the data product with the help of your team. And all the data products created are within a way that they can be interoperable. Think of it like cameras and buckets where you can just select which data set you want. And in turn, you provide the infrastructure.”
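To make the data mesh idea more concrete, here is a minimal Python sketch of a searchable catalogue of data products, each owned by the domain that created it. Everything in it is a hypothetical illustration rather than anything described by the panel: the `DataProduct` and `Catalogue` classes, the product names, the domains and the schemas are all assumptions. Interoperability is approximated very simply as "columns that two products share with the same type".

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """A data set published as a product by the domain that owns it (hypothetical)."""
    name: str
    domain: str    # owning group, e.g. a research or clinical team
    schema: dict   # column name -> type name
    tags: tuple = ()


class Catalogue:
    """A searchable registry of data products; it stores metadata, not the data."""

    def __init__(self):
        self._products = {}

    def register(self, product):
        # Each product name must be unique across all domains.
        if product.name in self._products:
            raise ValueError(f"{product.name!r} is already registered")
        self._products[product.name] = product

    def search(self, keyword):
        # Match the keyword against product names and tags, ignoring case.
        kw = keyword.lower()
        return [
            p for p in self._products.values()
            if kw in p.name.lower() or any(kw in t.lower() for t in p.tags)
        ]

    def shared_columns(self, name_a, name_b):
        # Columns present in both products with the same type: a crude
        # stand-in for "these two products can be joined".
        a = self._products[name_a].schema
        b = self._products[name_b].schema
        return {col for col, typ in a.items() if b.get(col) == typ}


# Invented example products from two different domains.
catalogue = Catalogue()
catalogue.register(DataProduct(
    name="clinical_visits",
    domain="clinical-operations",
    schema={"patient_id": "str", "visit_date": "date", "site_id": "str"},
    tags=("clinical", "trial"),
))
catalogue.register(DataProduct(
    name="expression_profiles",
    domain="research-omics",
    schema={"patient_id": "str", "gene": "str", "expression": "float"},
    tags=("omics", "rna-seq"),
))

found = [p.name for p in catalogue.search("clinical")]
join_keys = catalogue.shared_columns("clinical_visits", "expression_profiles")
print(found, join_keys)
```

In this sketch the central team only provides the catalogue, while each domain stays responsible for registering and maintaining its own products. A real data mesh would also need access control, versioning and agreed data standards, none of which are modelled here.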
Exploring the Data Analysis Process: Data or Use-Case First?
Kelly Mewes (People & Product Leader in Data Curation & Integration, Roche) argues “I think you should start with the data. And it’s only now that we’re, you know, at 19 different use cases that we’re starting to set quite ambitious goals. We want to deliver 10 insights by the end of the year from these use cases, but we’ve been in creation for four or five years. And it’s only now by learning agile methodologies, that we know what we need to deliver. So, the data scientists can draw insights from that, and we can set ambitious goals, but you have to start with the data first, because otherwise, the scientists can’t do their work.”
Despite this, the long turnaround for usable insights from data-first scenarios is off-putting for many companies. Lili Peng explains that “We’re very use case driven. If you talk to any senior leaders, they’re just going to ask how starting with the data helps us develop better drugs? And that’s actually a very hard thing to answer.”
Marcia Boakye (People & Product Leader, Pharma Development Data Sciences, Roche) has experience working in both data-first and use-case-first scenarios. She argues that the most practical choice may be to handle current and incoming data with a data-first mindset, while harmonising retrospective data only when it is drawn on for a specific use case. She explains that the “work we’re doing is going to be more for prospective data rather than retrospective, because the prospective data becomes your retrospective data. How long are you going to be on the hamster wheel of fixing things because you’re not setting them up correctly in the first place? And that’s the kind of conversations that we had that gave us the time to build things that worked on prospective, rather than retrospective, and where it was about retrospective that became use case driven.”
Machine Learning and Augmented Intelligence: A Symbiotic Relationship
Part of the difficulty of working with historical data is the lack of standardization. AI and ML have proven to be powerful tools for creating insights from variable data sources. Vladimir Anisimov (Principal Data Scientist, Amgen) explains that there is a “consortium of clinical pharma companies, maybe now about 20 pharma companies share their historical trials and can really be used very efficiently. Now, it’s not hundreds, it’s sometimes thousands of clinical trials. And then you can use really quite sophisticated techniques like machine learning techniques, random forests, to analyse this data and predict and create some predictive functions for predicting particular parameters for future trials.”
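As a rough illustration of the random forest idea mentioned above, the following Python sketch trains a small forest of regression trees on synthetic "historical trial" records and uses it to estimate a parameter for a planned trial. It is a simplified, standard-library-only sketch, not the method used by any company or consortium. The features (number of sites, number of patients), the predicted parameter (enrolment duration in months) and the relationship between them are all invented for the example.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(X, y, features):
    """Return the (feature, threshold) with the lowest squared error, or None."""
    best, best_err = None, float("inf")
    for f in features:
        levels = sorted({row[f] for row in X})
        for lo, hi in zip(levels, levels[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            err = sse(left) + sse(right)
            if err < best_err:
                best, best_err = (f, thr), err
    return best


def build_tree(X, y, rng, depth=0, max_depth=3):
    """Grow one regression tree, sampling candidate features at every node."""
    n_features = len(X[0])
    features = rng.sample(range(n_features), max(1, n_features // 2))
    split = None
    if depth < max_depth and len(X) >= 4:
        split = best_split(X, y, features)
    if split is None:
        return ("leaf", mean(y))
    f, thr = split
    left = [(row, t) for row, t in zip(X, y) if row[f] <= thr]
    right = [(row, t) for row, t in zip(X, y) if row[f] > thr]
    return ("node", f, thr,
            build_tree([r for r, _ in left], [t for _, t in left], rng, depth + 1, max_depth),
            build_tree([r for r, _ in right], [t for _, t in right], rng, depth + 1, max_depth))


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's mean target."""
    while tree[0] == "node":
        _, f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree[1]


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(len(X)), k=len(X))
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, row) for tree in forest)


# Synthetic stand-in for historical trials: (sites, patients) -> months.
# The formula below is invented purely to generate toy data.
data_rng = random.Random(42)
X, y = [], []
for _ in range(60):
    sites = data_rng.randint(5, 50)
    patients = data_rng.randint(100, 1000)
    X.append((sites, patients))
    y.append(patients / (sites * 1.5) + data_rng.uniform(-0.5, 0.5))

forest = fit_forest(X, y)
estimate = predict_forest(forest, (20, 400))  # a hypothetical planned trial
print(f"Estimated enrolment duration: {estimate:.1f} months")
```

In practice a maintained library implementation would normally be preferred over hand-written trees. The sketch only shows the two ingredients that define the technique: bootstrap sampling of the training records and random selection of candidate features at each split.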
Machine learning and AI are not foolproof tools, however, and human intervention is often necessary. Lili Peng explains that “there’s another definition of AI as augmented intelligence. And the definition isn’t as strong as the definition of artificial intelligence. Augmented intelligence is defined as an AI to enhance human intelligence or augment human intelligence and human decision making rather than operate entirely independently of it or outright replace them. So augmented intelligence also still involves machines. But the machine’s task is intended to enhance the human worker, rather than to replace the human worker. The machines work together symbiotically, empowering humans to work better and smarter to achieve greater business value.”
Upcoming Discussion Group: Mobile Robotics
At Oxford Global, we couldn’t have been more pleased with the turnout for our first-ever PharmaTec discussion group. The conversation was engaging, the debate stimulating, and the event provided the perfect setting for exchanging ideas.
We will continue our monthly discussion group series next month when we focus on mobile robotics. Learn more about Oxford Global’s discussion group series and our other events here.