The International Organization for Standardization (ISO) has published Phenopackets, the first standardised computational format for responsibly sharing individual patient traits, removing a major barrier to research on disease diagnosis, treatment, and mechanism discovery.
The standard was initially developed by the Global Alliance for Genomics and Health (GA4GH) and championed at ISO under the Canadian Mirror Committee to ISO/TC215/SC1 Genomics informatics, with the support of the Standards Council of Canada.
“We finally have the very first standard for phenotype data available worldwide,” said University of Colorado professor Melissa Haendel, a GA4GH contributor who launched the Phenopacket idea eight years ago.
“Having this ISO standard will encourage software developers, infrastructure developers, healthcare systems to consider Phenopackets as a method for sharing patient-level information — securely and in a deidentified way — that can be useful for everything from rare to infectious diseases, and addressing many kinds of public health questions,” Haendel said.
The standard, “ISO 4454 Genomics informatics — Phenopackets: A format for phenotypic data exchange,” was published on 6 July 2022.
Phenopackets debuted at ISO thanks to the leadership of GA4GH and through the support of Canada’s National Member Body, the Standards Council of Canada (SCC), and its Innovation Initiative.
Beyond furthering the UN Sustainable Development Goals for good health and innovation, Phenopackets has the potential to benefit a range of Canadian and global organisations, from Google to the Canadian Institutes of Health Research to the Red Cross.
A “Phenopacket” is a packet of data — typically a file — that humans and computers can read. It describes a person’s phenotype, a term for the way our genes manifest in our bodies, such as hair colour, hormone levels, or a bad reaction to a drug.
For more than a millennium, physicians have used the same technology for recording and sharing phenotypes: the written note.
While handwritten scrawl has morphed into sophisticated electronic health records, fundamentally, little has changed. Clinicians may record two patients’ identical phenotypes in vastly different ways that are difficult to compare.
For bioinformaticians studying how genes affect our chances of cancer, rare disease, and illness, this freewheeling flexibility is a problem.
“If somebody gives you a piece of paper with a bunch of scribbled stuff and says, ‘Do research with that,’ you’re going to go, ‘Well I don’t know what that is!’ You have to read it, understand it, extract all the data, and make sense of it in your head. And that takes time,” said Julius Jacobsen, a bioinformatics software developer at Queen Mary University of London who co-leads the GA4GH team working on Phenopackets.
“But the Phenopacket provides a sense of how all the bits fit together, like a blank form. All someone has to do is fill in the pre-existing fields, and then they can give you a nice piece of structured information which anyone can understand,” said Jacobsen.
So in 2019, GA4GH — the world’s standards organisation for genomics — approved the Phenopacket standard for storing phenotypic data in predictable formats that computers everywhere can read.
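To make the "blank form" idea concrete, here is a minimal sketch of what a Phenopacket might look like when built and serialised in Python. The field names follow our reading of the Phenopacket v2 JSON layout and should be checked against the published schema; the patient, the identifiers, the ontology version, and the helper `feature_labels` are hypothetical and exist only for illustration.

```python
import json

# Illustrative values only: the patient, IDs, and ontology version are made up.
# Field names are assumed to match the Phenopacket v2 JSON representation.
phenopacket = {
    "id": "example-phenopacket-1",
    "subject": {"id": "patient-1", "sex": "FEMALE"},
    "phenotypicFeatures": [
        {"type": {"id": "HP:0001250", "label": "Seizure"}},
        {"type": {"id": "HP:0001263", "label": "Global developmental delay"}},
    ],
    "metaData": {
        "created": "2022-07-06T00:00:00Z",
        "createdBy": "example-curator",
        "resources": [
            {
                "id": "hp",
                "name": "human phenotype ontology",
                "url": "http://purl.obolibrary.org/obo/hp.owl",
                "version": "example-version",  # placeholder, not a real release
                "namespacePrefix": "HP",
                "iriPrefix": "http://purl.obolibrary.org/obo/HP_",
            }
        ],
        "phenopacketSchemaVersion": "2.0",
    },
}


def feature_labels(packet):
    """Return the human-readable labels of the recorded phenotypic features."""
    return [f["type"]["label"] for f in packet.get("phenotypicFeatures", [])]


# The same structure is readable by people (as indented JSON) and by software.
as_text = json.dumps(phenopacket, indent=2)
round_tripped = json.loads(as_text)
```

Because every feature is an ontology term in a predictable place, a program that has never seen this patient can still find the phenotypes without reading free text.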
Canadian innovation leads Phenopackets to ISO
A few months later, the newly formed ISO Genomics Informatics subcommittee met in the South Korean city of Daegu. The group chose Phenopackets as one of its very first standards to develop, working in tandem with GA4GH contributors updating the original version of the standard. (Phenopackets v2 was adopted for the ISO standard after being approved in February.)
To officially propose Phenopackets to ISO, GA4GH Work Stream manager Lindsay Smith, who is based at the Ontario Institute for Cancer Research in Toronto, collaborated with the Canadian Mirror Committee to ISO/TC215/SC1, with the support of SCC.
Through programs like its Innovation Initiative, SCC helps innovators to commercialise technologies and facilitates their participation on national and international standardisation committees for the benefit of economic growth and the health and safety of Canadians.
“Finding ways to advance health technologies has been an important area of interest for the Innovation Initiative,” said Chantal Guay, CEO of SCC. “Developing an ISO standard is key to aligning perspectives internationally and promoting shared health information across the world.”
Many rounds of reviews from ISO experts in Japan, India, Canada, the U.S., and Korea ensured that Phenopackets would work in diverse healthcare systems.
“In Japan, a network of 14 major biobanks has implemented Phenopackets, with 920,000 samples and 250,000 genomic and omics data provided by 490,000 individuals,” said Soichi Ogishima, a member of the ISO project team that reviewed Phenopackets.
“Researchers can access phenotypic information with GA4GH’s Phenopackets standard for capturing clinical data and integrating them with genomic data to develop precision medicine,” added Ogishima, a professor of genomic informatics at Tohoku University in Sendai, Japan.
The future of clinical data sharing
The humble Phenopacket has the potential to transform the treatment of common disease.
“Asthma, inflammatory bowel disease, schizophrenia, and other complex conditions are unlikely to be one disease. But it’s been difficult to divide these diseases into groups that respond to specific treatments. One reason is because everybody uses their own formats, so you cannot combine data,” said Peter Robinson, a computational biologist at the Jackson Laboratory who co-leads the GA4GH Phenopackets development team.
“By using Phenopackets, we’ll be able to improve precision medicine for individuals by being able to compare and cluster patients based upon their individual characteristics,” he said.
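As a rough illustration of how shared structure makes patients comparable, the sketch below ranks candidate Phenopackets by how many phenotype terms they share with a query patient. It uses Jaccard similarity over term IDs, a deliberately simple stand-in chosen for this example and not necessarily the method used by the GA4GH team or any existing tool. The function names are hypothetical, and the `phenotypicFeatures` and `excluded` fields are assumed to follow the Phenopacket v2 JSON layout.

```python
def feature_ids(packet):
    """Collect the ontology term IDs of features observed in one packet.

    Features flagged as excluded (explicitly absent) are skipped.
    """
    return {
        f["type"]["id"]
        for f in packet.get("phenotypicFeatures", [])
        if not f.get("excluded", False)
    }


def jaccard(a, b):
    """Shared terms divided by all terms; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_matches(query, candidates):
    """Return (score, packet id) pairs, most similar candidate first."""
    q = feature_ids(query)
    scored = [(jaccard(q, feature_ids(c)), c["id"]) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Real matching tools typically also use the ontology hierarchy, so that closely related terms count as partial matches, but even this naive version only works because both records use the same fields and the same vocabulary.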
The standard could also improve study and diagnosis of rare disease.
“There should be a tool for patients to share their information as Phenopackets,” said Haendel. “Right now, there are rare disease patients all over social media sharing free text that could be structured in such a way that we could mine it as data — for example, to identify patients who have the same condition around the world.”
Patient matchmaking would get easier with a database of cases described in the Phenopacket format.
“Many journals in human genetics are willing to consider cajoling or requiring authors to submit Phenopackets together with case reports. Usually if you find a new disease gene, you’ll describe ten patients, but none of that information is accessible at the patient level,” said Robinson.
In June, Robinson, Jacobsen, Haendel, Smith, and collaborators published an article in Nature Biotechnology outlining how Phenopackets lets researchers and clinicians exchange patient characteristics more effectively — and link those data to genomic information.
Connected standards improve patient care
While the Phenopacket schema is still available free of charge from GA4GH, ISO publication significantly broadens its reach. Beyond Japanese biobanks, databases like the widely used BioSamples have already implemented Phenopackets. Electronic health record vendors and national health systems are considering the standard.
As an added benefit, any organisation that adopts Phenopackets can easily link to other powerful clinical and research tools from the GA4GH Genomic Data Toolkit.
Going forward, there are plans to build Phenopackets into standards for sharing electronic health records, such as Fast Healthcare Interoperability Resources (FHIR) from Health Level Seven International (HL7).
“Phenopackets was chosen as one of HL7’s Vulcan Accelerator projects. Accelerator projects try to improve how clinical studies are designed, conducted and reported by advancing the implementation of research-ready standards. A project to represent Phenopackets in the FHIR standard is underway to make sure that this schema — that’s now an ISO standard — can also be used in the context of HL7,” said Haendel.
Phenopackets may be the first GA4GH standard published by ISO, but it will not be the last. Currently, the ISO Genomics Informatics subcommittee is reviewing a proposed standard for genomic surveillance systems — such as the public health systems that track Covid-19 variants spreading around the world. GA4GH standards feature prominently within the requirements.
When standards development organisations align their work, everyone benefits.
“The ISO publication of Phenopackets exemplifies the benefits of standards coordination. When different standards-setting bodies collaborate, it amplifies the impact of all our standards. Truly global standards expand responsible data sharing and bring the benefits of precision medicine to more patients and their families,” said Peter Goodhand, Chief Executive Officer of GA4GH.