Tim Stratton works for Thermo Fisher Scientific as a manager of library technologies. He received his education from the University of Minnesota. Currently, Tim is responsible for mzCloud development and is based in Austin, Texas.
mzCloud is a state-of-the-art mass spectral database that assists analysts in identifying compounds in areas such as life sciences, metabolomics, pharmaceutical research, toxicology, forensic investigations, environmental analysis, food control and various industrial applications. mzCloud™ features a freely searchable collection of high-resolution/accurate-mass spectra using a new third-generation spectra correlation algorithm.
Tim, thanks for taking the time to talk to me about the spectral library mzCloud.
When did the initial idea to create such a library originate and what drove this development?
A lot of this started about five years ago. We were keeping an eye on metabolomics as a tool used for biomarker discovery and biological research. One of the biggest bottlenecks was the identification of the detected components in biological matrices. Of course, there were already some libraries available -- like the National Institute of Standards and Technology (NIST) library, which is critical for gas chromatography analysis, and the NIST MS/MS library, which was very useful for LC-MS -- but we wanted to be able to go beyond what was available. We knew that Orbitrap HRAM data, especially MSn data, was extremely powerful for identifying unknowns, but the vast majority of libraries were only MS2. Additionally, we looked at what had come before -- what worked and what could have been improved -- and that is what really drove our early planning.
How did you start this project and who were the initial contributors?
For us, it was pretty easy to know our "core team." We already had a strong advocate and collaborator with the team at HighChem. They already had quite a bit of experience with Orbitrap HRAM data and MSn analysis, and Thermo Fisher Scientific had already established a strong working relationship with the team over a period of 15 years, so they were a natural choice. Internally, we reached out to many of our collaborators who are key opinion leaders in pharmaceutical biomarker research. We also realized early on that this same challenge -- identifying unknowns -- was prevalent in many research fields. We spoke with experts in research fields such as environmental, forensics, agriculture and food research, got their input on the current tools, and determined what worked and what didn't. Of course, there came a time when we had to start the work of acquiring data and building the libraries. Luckily, since we are Thermo Fisher Scientific, we had access to a huge supply of Fisher Scientific authentic standards to begin building the library.
Recently mzCloud has been implemented in Compound Discoverer 3.0. Tell me about the importance of this functionality.
Compound Discoverer (CD) is our small molecule research software. It's quite unique in that it is a "workflow software," meaning users can customize the data processing workflow by building it from processing steps called "nodes." As you mentioned, searching the mzCloud library is one of those nodes. While anyone can go to mzCloud.org and search spectral data online, being able to do this in a high-capacity batch manner is important. Within CD we can do searches for identification, where we're trying to say we know what a peak is, but we can also do searches for similarity. This is a very useful capability, since even the biggest libraries still contain only a tiny fraction of the possible chemicals, so obtaining a true ID is difficult, but we can still get useful information. Structurally similar molecules will have some similarity in fragmentation, so by searching the unknowns and providing similarity matches we can give researchers clues about the potential structure of real unknowns.
I hear that customers using CD 3.0 love the mzLogic functionality. Why is that?
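To make the idea of a similarity search concrete, here is a minimal sketch that scores an unknown fragment spectrum against a few reference spectra using a plain cosine similarity on binned m/z values. This is a generic textbook measure chosen for illustration, not the correlation algorithm used by mzCloud or Compound Discoverer. The function names, the 0.01 bin width, the peak lists and the reference names are all assumptions made up for the example.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum the intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Cosine similarity between two spectra; 0.0 means no shared bins."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data: (m/z, intensity) pairs invented for illustration.
unknown = [(91.0542, 100.0), (119.0491, 40.0), (147.0441, 15.0)]
library = {
    "reference A": [(91.0542, 80.0), (119.0491, 35.0), (163.0390, 20.0)],
    "reference B": [(77.0386, 100.0), (105.0335, 60.0)],
}

# Rank library entries from most to least similar to the unknown.
ranked = sorted(
    ((name, cosine_similarity(unknown, peaks)) for name, peaks in library.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy data, reference A shares two fragments with the unknown and so ranks first, while reference B shares none. A high but imperfect score is the kind of result that hints at a structurally related compound rather than an exact identification.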
It goes back to that point about getting more information on our unknowns with tools like similarity searching. mzLogic is a new approach to try to provide useful information on, let’s call them "semi-unknown," compounds that do not have reference fragment spectra in a library but are chemicals that exist in a database somewhere so their structure is known. mzLogic combines a search of chemical databases using accurate mass or elemental composition, where we may get tens or hundreds of candidates, with the fragmentation substructure information that an mzCloud similarity search gives us. This means that, using that real reference spectral library data on substructures, mzLogic will sort our hundreds of database candidates into those that are most structurally relevant.
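The ranking idea described here can be illustrated with a toy sketch. It is not the mzLogic algorithm, whose details are not given in this interview. It assumes, purely for illustration, that each database candidate and each library similarity hit can be reduced to a set of substructure labels, and that a candidate earns credit in proportion to how much of each hit's substructure set it contains, weighted by that hit's match score. The function name, labels and scores are all hypothetical.

```python
def rank_candidates(candidates, similarity_hits):
    """Rank database candidates by support from spectral similarity hits.

    candidates: dict mapping candidate name -> set of substructure labels
    similarity_hits: list of (match_score, set of substructure labels)
    Returns a list of (name, score) sorted from best to worst supported.
    """
    scores = {}
    for name, candidate_subs in candidates.items():
        total = 0.0
        for match_score, hit_subs in similarity_hits:
            if not hit_subs:
                continue
            # Fraction of the hit's substructures also present in the candidate.
            shared = len(candidate_subs & hit_subs) / len(hit_subs)
            total += match_score * shared
        scores[name] = total
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical candidates returned by an accurate-mass database search.
candidates = {
    "candidate 1": {"benzene ring", "carboxylic acid"},
    "candidate 2": {"benzene ring", "primary amine"},
    "candidate 3": {"furan ring"},
}

# Hypothetical similarity hits from a spectral library search.
similarity_hits = [
    (0.9, {"benzene ring", "carboxylic acid"}),
    (0.6, {"benzene ring"}),
]

for name, score in rank_candidates(candidates, similarity_hits):
    print(f"{name}: {score:.2f}")
```

With these made-up inputs, the candidate containing both the benzene ring and the carboxylic acid comes out on top, and the candidate sharing no substructure with any hit falls to the bottom. That is the general effect described above: candidates most consistent with the observed fragmentation move to the top of the list.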
Now that you have your own lab running to continue building mzCloud, what is next on your roadmap?
More data, more data, more data. That's always a primary focus for us. We have a joke that "if it ionizes, we will put it in the library," meaning we want to have data on a huge variety of compounds so we have relevant information for every researcher. We always have a long roadmap of new capabilities and tools we are working on, but we always want to hear back from people who use the library about ways we can do a better job. We're always on the lookout for new collaborations to expand the library even more.
Tim, thank you for the insightful information on mzCloud and mzLogic. Any final thoughts you would like to share with me today?
We'd like to let people know that all the tools we use to build mzCloud -- the curation toolkit that allows us to do advanced spectral curation, noise removal, annotation, and recalibration, along with the tools for local library hosting and searching -- will be available for researchers to use to build their own private proprietary libraries later this year. People will be able to build the same level of quality into their own HRAM MSn libraries. Lastly, though I've mentioned it before, we're always looking for feedback from users and potential collaborators to expand the library with new and unique chemistry that can help researchers around the world reach a state of "Know more Unknowns / No more Unknowns."