Headlines: October 12th, 2018

Neo4j’s Emil Eifrem reports on how Germany is using graph technology in combination with AI to make research connections that no one else is making. Is this something the UK’s health sector leaders should also be doing?

Diabetes is one of the most widespread diseases worldwide. Rising rates of both type 2 diabetes in our ageing population and type 1 diabetes will present major challenges to the NHS in the coming years; type 2 diabetes in children, for instance, has risen 40% in three years amid Britain’s obesity epidemic.

Health policymakers are using all the tools they can summon to try to help. In Germany, for example, the country’s national Centre for Diabetes Research (DZD) is investigating the causes of the disease and, through new scientific findings, aims to develop effective prevention and treatment measures to halt the emergence or progression of diabetes. DZD is an instructive example of what can happen in diabetes research when a new way of tackling the problem is explored.

A ‘master database’ to consolidate diabetes information

Based in Munich, DZD brings together experts from across the Federal Republic to develop effective prevention and treatment measures for diabetes across multiple disciplines, and to see what treatment the latest biomedical technologies may offer citizens dealing with the condition. In order to better understand diabetes’ causes, its scientists examine the disease from as many different angles as they can.

DZD’s researchers, then, combine basic research data sources – genetics, epigenetics, metabolic pathways – with data from clinical studies. Connecting this highly heterogeneous data is a challenge, but necessary in order to answer biomedical questions across disciplines.

Recently, its IT leadership decided it needed a better way of connecting this research data from various disciplines, locations and species. Besides connecting data sources, it wanted easy-to-understand visualisation of the data and easy querying, so that scientists could benefit from it. The result is a ‘master database’ that consolidates this information and provides DZD’s 400-strong team of scientists with a holistic view of what is available, enabling them to gain valuable insights into the causes and progression of diabetes.

In search of a suitable data tool to build such a system on, Dr Alexander Jarasch, the Centre’s Head of Bioinformatics and Data Management, drew on experience gleaned from a previous project at Munich’s Helmholtz Zentrum. That project had used a graph database – a positive experience that prompted him to test graph technology at DZD, specifically Neo4j’s graph software. Dr Jarasch has since offered his colleagues a new internal tool, DZDconnect, built on graph software that sits as a layer over the various relational databases, linking different DZD systems and data silos. DZDconnect is not fully implemented yet, but DZD staffers can already access metadata from clinical studies in the prototype – and are particularly impressed by the visualisation and the easy querying it has made possible.
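For readers who want a feel for what a graph layer over separate data silos means in practice, here is a minimal sketch in plain Python. It is not DZDconnect and does not use Neo4j’s software; the record names (study S1, gene G1, mouse model M1) and the helper functions `build_graph` and `find_path` are hypothetical, invented purely for illustration. It assumes each silo can export simple “A relates to B” rows, and shows how linking them lets a single query cross silo boundaries.

```python
from collections import defaultdict, deque

# Hypothetical rows, standing in for exports from three separate silos.
# Each row is (source entity, relationship, target entity).
clinical_rows = [("study:S1", "measures", "marker:HbA1c")]
genetics_rows = [("gene:G1", "influences", "marker:HbA1c")]
animal_rows = [("mouse_model:M1", "carries", "gene:G1")]


def build_graph(*sources):
    """Merge rows from every silo into one undirected adjacency map."""
    graph = defaultdict(set)
    for rows in sources:
        for src, _relationship, dst in rows:
            graph[src].add(dst)
            graph[dst].add(src)
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: return the shortest chain of linked entities, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(clinical_rows, genetics_rows, animal_rows)
path = find_path(graph, "study:S1", "mouse_model:M1")
print(" -> ".join(path))
```

In this toy example no single silo links the clinical study to the mouse model; the connection only appears once the three sources are joined through the shared marker and gene. A production graph database adds persistence, a query language and visualisation on top of the same basic idea.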

‘The more detailed the information, the easier it is to identify relationships and patterns’

Many researchers wonder if graph databases (the technology that powered the Paradise Papers investigation, among other intriguing examples of cracking big data problems) could help in the prevention, early diagnosis and treatment of major illnesses, and so save lives. Why? Because not only is graph technology ideally suited to depicting hidden relationships and discovering unknowns at big data scale, it is also able to handle dynamic and constantly evolving data – something that medical thinkers say is vital in scientific and bioinformatics research.

“With graph technology we were able to combine and query data across various locations,” Dr Jarasch enthuses, adding: “Even though only part of the data has been integrated, queries have already shown interesting connections, which will now be further researched by our scientists.”

In the long term, as much DZD data as possible should be integrated into the graph database, Jarasch believes, noting that the next step is to see how human data from clinical research can be complemented with highly standardised data from animal models, such as mice, to find commonalities or other insights.

It’s not just graph software that is being employed. AI techniques such as machine learning, in combination with graph software, will play a key role going forward, says DZD. One particular area of interest is building a system able to ‘read’ scientific texts and integrate them into the database ready for analysis.

The promise is that the more detailed the information, the easier it is to identify relationships and patterns – which could really help in cracking the diabetes problem. The kind of innovative data management and analysis approach DZD is pioneering could well be the way forward in precision medicine, prevention and treatment of diabetes – and, perhaps, other diseases.

Given that the NHS needs as much help combating diseases as possible, could graph technology’s innate ability to discover relationships between data points have an important role to play in fighting not just diabetes, but many other problems? The DZD example does seem to suggest this is a pathway worth exploring.

The author is co-founder and CEO of Neo4j, the world’s leading graph database (http://neo4j.com/)