How labs become AI ready with advanced informatics infrastructures

Bhanot, Unjulie

Posted: 27 March 2020 | Unjulie Bhanot (IDBS)

To reach the full potential of artificial intelligence (AI) in the pharma industry, it is essential that companies harmonise their data to remove unnecessary human intervention. This article outlines how companies are working towards digital maturation and the obstacles they need to overcome to realise the value of informatics.

Although the biopharmaceutical industry has often been hesitant to adopt new technologies and trends, it is becoming widely accepted that digital transformation is imperative for organisations to keep pace with scientific and technological innovation. In a survey conducted by Deloitte with MIT Sloan Management Review, only 20 percent of biopharma companies considered themselves to be “digitally maturing,” while the largest proportion of companies were in the midst of developing digital capabilities.¹ However, the industry cannot do this alone; by partnering with existing and upcoming technology providers, it is embracing digital implementation and automation in drug discovery and development.

Underpinning the development journey of a candidate drug are the processes that contribute to its development and the data generated as these processes are run. From the moment a cell produces a therapeutic of choice, its suitability, expansion and rate of therapeutic production are monitored throughout the development lifecycle; not to mention the analysis at each step to confirm the correct formation and sufficient concentrations of the molecule. Given the need to make critical decisions at each development step, organisations generate vast amounts of high-value scientific and process data that are often stored in disparate locations and systems in both unstructured and structured formats. This requires an investment to manage, maintain and utilise.

As next-generation investments are implemented within biopharma R&D – such as high-throughput systems (HTS), process analytical technology (PAT) tools and continuous processing platforms – with an aim to accelerate time-to-market by reducing production lead times and helping to lower operating costs, organisations are inevitably faced with the conundrum of huge volumes and a variety of data, at increasing velocities.

Despite the industry being open to small-scale automation to perform certain processes more efficiently, such implementation has resulted in the inefficient use of scientists’ time. Many now devote a large amount of time to extracting data into human-readable formats, contextualising it with process and metadata and then analysing it to draw conclusions. To achieve this, scientists must master the necessary skills to decide which datasets are most relevant, what will provide maximum insight and what should be taken forward for analysis and reporting. More R&D organisations are now making the investment to ensure they hire or train personnel to be confident analysing these large volumes of data.² Instruments such as small‑scale multi-parallel bioreactors that can surface data over several vessels and a series of timepoints across an entire run, or plate-based cell culturing robots that can associate dilutions, movement, genealogy and viability parameters to individual cultures in 96-well or 384-well plates, over several timepoints, can generate an overwhelming amount of data. This is further amplified by image data per well. Some scientists are at a loss with the amount of image, numerical and text data that such instruments can reconcile. Innovation in technology has therefore not necessarily led to a change in the scientist’s tasks. For such technologies to be truly beneficial, they must be complemented by the ability to quickly and easily surface, analyse and gain true value from the data.

These deep pools of diverse data must be capitalised on and this is perhaps where AI holds most value. For example, by enhancing the relationship between scientists and data, AI technologies can help identify patterns, highlight unique observations and implement learnings sooner. In turn, this will enable strategic insight and an ability to make opportune decisions about the viability of a product in the market.

AI tools such as machine learning could remove a degree of algorithmic analysis for the scientist. In addition, because these tools are repeatedly exposed to such large volumes of data, they could be used to judge and predict patterns for future experiments, enhancing predictive modelling. For AI tools to be applied, volume is not a constraint, but having data in structured, aligned and consistent formats can accelerate success rates and remove the need for human intervention. It can also ensure the insights being made are reliable and repeatable.

Experiments, instruments and supplementary business processes all generate surplus data in a multiplicity of formats. To manage this, organisations may resort to implementing a data management strategy. However, without a holistic view of how the global business will need to share and make use of this data, it will remain siloed and difficult to extract. Without standardisation, organisations are also faced with inconsistencies and irregularities in critical data.

For AI to be considered implementable, it is critical to lay the foundations of harmonised data structures to facilitate well-constructed and contextualised data. This might entail a matter as simple as agreeing standard naming conventions and material nomenclature across an organisation or choosing the same terms to define a value across multiple functional groups (eg, variations of “de-ionised water” such as deionised water, deionized water, de-ionized water, DI water, DI, etc).
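As an illustration of the naming-convention point above, the following is a minimal sketch of how spelling variants could be mapped to one agreed term. The names `SYNONYMS`, `_key` and `harmonise` are hypothetical, and choosing “de-ionised water” as the canonical form is an assumption made only for this example; a real organisation would maintain its own controlled vocabulary.

```python
import re

# Assumption for this sketch: the organisation has agreed on this canonical term.
CANONICAL = "de-ionised water"

# Hypothetical lookup table: cleaned-up spelling -> agreed canonical term.
SYNONYMS = {
    "deionised water": CANONICAL,
    "deionized water": CANONICAL,
    "di water": CANONICAL,
    "di": CANONICAL,
}


def _key(term: str) -> str:
    """Lower-case, drop hyphens and collapse whitespace so spelling variants collide."""
    cleaned = term.strip().lower().replace("-", "")
    return re.sub(r"\s+", " ", cleaned)


def harmonise(term: str) -> str:
    """Return the canonical term, or the original text unchanged if it is unknown."""
    return SYNONYMS.get(_key(term), term)


if __name__ == "__main__":
    for raw in ["De-Ionized Water", "DI water", "DI", "ethanol"]:
        print(f"{raw!r} -> {harmonise(raw)!r}")
```

Unknown terms are returned unchanged rather than guessed at, so gaps in the vocabulary stay visible for a person to review.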

Kickstarting such an initiative is usually driven by the paradigm shift from paper to glass; however, this alone is not enough. Breaking down organisational barriers to agree ontologies and data structures allows for data to be recorded consistently, repeatably and in line with the latest ALCOA+ principles.³ Deploying a digital platform that functions on the premise that data is recorded as a by-product of performing a process step, and does not need manual duplication into a different system, sets organisations on the right path for considering AI tools.

Complementary to this are systems that can seamlessly associate process data capture with result data by integrating directly with the data source – be it single point data instruments such as pH meters and balances, or complex chromatography data systems with hundreds of rows of data and custom fields. Not only does this ensure integrity of the data in question, but it facilitates contextualised data capture without the need for human intervention.
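The idea of contextualised capture can be sketched as a record that joins an instrument reading to the process step that produced it at the moment of capture. This is only an illustration under stated assumptions: the class `ContextualisedResult`, the function `capture` and all field names are hypothetical and do not describe any particular vendor's platform or instrument interface.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a captured record cannot be modified afterwards
class ContextualisedResult:
    experiment_id: str
    process_step: str
    instrument_id: str
    parameter: str
    value: float
    unit: str
    recorded_at: str  # ISO 8601 timestamp in UTC


def capture(reading: dict, context: dict) -> ContextualisedResult:
    """Join a raw instrument reading with its process context in a single step."""
    return ContextualisedResult(
        experiment_id=context["experiment_id"],
        process_step=context["process_step"],
        instrument_id=reading["instrument_id"],
        parameter=reading["parameter"],
        value=float(reading["value"]),
        unit=reading["unit"],
        recorded_at=datetime.now(timezone.utc).isoformat(),
    )


if __name__ == "__main__":
    # Hypothetical reading from a pH meter and the step it was taken in.
    reading = {"instrument_id": "PH-01", "parameter": "pH", "value": "7.02", "unit": "pH"}
    context = {"experiment_id": "EXP-001", "process_step": "buffer preparation"}
    print(asdict(capture(reading, context)))
```

Because the context is attached when the value is captured, nobody has to retype the reading into a second system or reconstruct afterwards which step it belonged to.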

Though AI tools such as natural language processing (NLP) can also be applied to unstructured textual data, the diversity of data that is encountered within biopharma development means it is unlikely to provide significant benefit. As with any analysis, the data output is only as good as the data put in.

A recent success story that illustrates the impact that AI can have on reducing therapeutic development time and time to value is that of the DSP-1181 drug, released for Phase 1 clinical trials⁴ within 12 months of starting the drug discovery process. This was a joint venture between Sumitomo Dainippon Pharma Co., Ltd and Exscientia Ltd, where AI tools were used to screen potential compounds against a plethora of parameters.

In the same vein, a pertinent collaboration between BenevolentAI and AstraZeneca began in April 2019,⁵ with plans to use AI and machine learning to determine potential new drugs for chronic kidney disease (CKD) and idiopathic pulmonary fibrosis (IPF). By bringing together pharma and clinical data from AstraZeneca and BenevolentAI’s platform, the organisations hope to improve time to market of a new drug by understanding the core mechanisms of these diseases and identifying suitable targets sooner.

With over 100 drugs in the pipeline making use of existing AI tools⁶ across different medical conditions, there is a shift in the industry to take on these tools – limited somewhat to newer startup organisations in the market. For larger enterprise pharma, it is often beneficial to partner with such startups for their newer ventures. Vas Narasimhan, CEO of Novartis, explains that for historical ventures it can take “years just to clean the datasets. I think people underestimate how little clean data there is out there, and how hard it is to clean and link the data.”⁷

As the uptake of AI continues to increase in the clinical world (for diagnosing illnesses and analysing patient data), it is not unreasonable to expect that the pharma world will reflect this shift in approach. It is estimated that the market for AI in biopharma is set to grow to over $3.8 billion by 2025 and that drug discovery will contribute to almost 80 percent of this with a CAGR of 52 percent (from 2018).⁸ In order to support this shift, it is imperative that organisations look to digitally mature by implementing the proper tools that will ensure data is structured, contextualised and reliable. This will enable them to harness the true potential of their valuable data – and ultimately enable the delivery of high-quality therapeutics to patients faster.

About the author

Unjulie Bhanot has worked in the biologics R&D informatics space for over six years and currently works as the IDBS Solution Owner for Biologics Development in the Strategy team. Prior to joining IDBS, Unjulie worked as an R&D scientist at both Lonza Biologics and UCB and later went on to manage the deployment of IDBS’ E-WorkBook Suite within the analytical services department at Lonza Biologics, UK.

References

1. Survey finds biopharma companies lag in digital transformation: It is time for a sea change in strategy – Deloitte, Greg Reh, Mike Standing, 04 October 2018. https://www2.deloitte.com/us/en/insights/industry/life-sciences/digital-transformation-biopharma.html#endnote-sup-2
2. 7 Data Challenges in the Life Sciences – Technology Networks, Jack Rudd, 02 May 2017. https://www.technologynetworks.com/informatics/lists/7-data-challenges-in-the-life-sciences-288265
3. ALCOA to ALCOA Plus for Data Integrity – PharmaGuidelines, Ankur Choudhary. https://www.pharmaguideline.com/2018/12/alcoa-to-alcoa-plus-for-data-integrity.html
4. Sumitomo Dainippon Pharma and Exscientia Joint Development New Drug Candidate Created Using Artificial Intelligence (AI) Begins Clinical Trial, January 30, 2020. https://www.exscientia.ai/news-insights/sumitomo-dainippon-pharma-and-exscientia-joint-development
5. AstraZeneca starts artificial intelligence collaboration to accelerate drug discovery, April 30, 2019. https://benevolent.ai/news/astrazeneca-starts-artificial-intelligence-collaboration-to-accelerate-drug-discovery
6. 102 Drugs in the Artificial Intelligence in Drug Discovery Pipeline – BenchSci Blog, Simon Smith, last updated Jan 31, 2020. https://blog.benchsci.com/drugs-in-the-artificial-intelligence-in-drug-discovery-pipeline
7. Novartis CEO Who Wanted To Bring Tech Into Pharma Now Explains Why It’s So Hard – Forbes, David Shaywitz, Jan 16, 2019. https://www.forbes.com/sites/davidshaywitz/2019/01/16/novartis-ceo-who-wanted-to-bring-tech-into-pharma-now-explains-why-its-so-hard/#5ac5288d7fc4
8. Intelligent biopharma: Forging the links across the value chain – Deloitte, Mark Steedman, Karen Taylor, Francesca Properzi, Hanno Ronte, John Haughey, 03 October 2019. https://www2.deloitte.com/global/en/insights/industry/life-sciences/rise-of-artificial-intelligence-in-biopharma-industry.html?icid=dcom_promo_featured|global;en
