Research project involving the UOC creates pioneering open library to identify biomolecules

Researchers at the Universitat Oberta de Catalunya (UOC) and the Institute of Photonic Sciences (ICFO) have created an open Raman spectral database, accessible to the scientific community, that covers 140 biomolecules of the main types, including nucleic acids, proteins, lipids and carbohydrates. Raman spectroscopy is a technique for analysing the chemical composition and molecular structure of materials through the interaction of light with matter, specifically through Raman scattering, a phenomenon discovered by the physicist Chandrasekhara Venkata Raman in 1928.

The study, Open Raman spectral library for biomolecule identification, published as open access in the journal Chemometrics and Intelligent Laboratory Systems, was led by Marcelo Terán, a data engineer and researcher in the UOC's Artificial Intelligence for Human Well-being (AIWELL) group, in collaboration with David Masip and David Merino, fellow researchers in the group, and the scientists José Javier Ruiz and Pablo Loza-Alvarez, at the ICFO.

"One of the limitations of the potential of Raman spectroscopy in biomedical applications to date has been the lack of open spectral data for biomolecules. That is why we set out to create an accessible, standardized and useful library for the scientific community, which will act as the basis for future research and clinical applications," said Terán, who is in the fourth year of his doctoral studies with the AIWELL group.

In the project, the researchers implemented two search algorithms. When replicating the results of previous studies on measurements of pure biomolecules, the algorithms proved to be 100% accurate both in top-ten identification of the specific molecule (e.g. collagen) and in identification of the type of molecule (e.g. protein).
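To illustrate what a top-ten library search involves, the following is a minimal sketch in Python. It is not the study's method: the release does not say which search algorithms were used, so cosine similarity is assumed here as one common choice. The function name `top_k_matches`, the helper `gaussian_peaks`, the molecule names and the peak positions are all hypothetical and synthetic.

```python
import numpy as np


def top_k_matches(query, library, k=10):
    """Rank library spectra by cosine similarity to the query (assumed metric)."""
    q = np.asarray(query, dtype=float)
    q = q / np.linalg.norm(q)
    scores = {}
    for name, spectrum in library.items():
        s = np.asarray(spectrum, dtype=float)
        scores[name] = float(q @ (s / np.linalg.norm(s)))
    # Highest similarity first; keep only the k best candidates.
    return sorted(scores, key=scores.get, reverse=True)[:k]


def gaussian_peaks(axis, centers, width=8.0):
    """Build a synthetic spectrum as a sum of Gaussian peaks (illustration only)."""
    return sum(np.exp(-0.5 * ((axis - c) / width) ** 2) for c in centers)


# Synthetic library on a shared Raman-shift axis (values are made up).
axis = np.linspace(400, 1800, 700)
library = {
    "molecule_a": gaussian_peaks(axis, [850, 1250, 1660]),
    "molecule_b": gaussian_peaks(axis, [1060, 1300, 1440]),
    "molecule_c": gaussian_peaks(axis, [780, 1090, 1580]),
}

# A noisy measurement of molecule_b plays the role of the unknown sample.
rng = np.random.default_rng(0)
query = library["molecule_b"] + 0.02 * rng.normal(size=axis.size)

ranked = top_k_matches(query, library, k=2)
print(ranked)
```

Under this scheme, a top-ten identification counts as correct when the true molecule appears anywhere in the ten best-ranked candidates. The sketch also assumes all spectra are sampled on the same axis; real data would need resampling and baseline correction first.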

Open biomedical data for progress in medicine

"Raman spectroscopy can be used to analyse the chemical composition of samples in a non-invasive way, which is very valuable in the field of medicine. This database can facilitate the precise identification of biomolecules and, in the future, it will contribute to studying how their presence varies in biological processes such as cancer," said Terán. "The availability of high-quality biomedical data is essential for progress in the development of AI-based solutions. This need was the starting point for the research."

The researchers collected data from Raman spectra of biomolecules from the leading articles published in the field, and developed an algorithm using classical computer vision techniques to extract the data automatically. One of the challenges in this project was the limited amount of spectral data published in open-access format, which they overcame using experimental validations. "Our work provides a tool that can help identify molecular composition based on its Raman spectrum in an objective, fast and standardized way. This identification is currently carried out by visual analysis of the main peaks in the spectra, and is compared with the references in the literature. Our tool can streamline this process while providing a standard solution that reduces human bias during analysis," said Terán, a doctoral student affiliated to the UOC's eHealth Centre.
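The idea of recovering spectral data from published figures can be sketched as follows. This is only an illustration under strong assumptions, not the researchers' algorithm: it assumes a grayscale image already cropped to the plot area, a single dark curve on a light background, and known axis ranges. The function name `extract_curve` and all values are hypothetical.

```python
import numpy as np


def extract_curve(image, x_range, y_range, threshold=128):
    """Recover (x, y) samples from a cropped grayscale plot image.

    Assumes one dark curve on a light background and linear axes.
    """
    height, width = image.shape
    dark = image < threshold                 # mask of curve pixels
    cols = np.flatnonzero(dark.any(axis=0))  # columns that contain the curve
    rows = np.arange(height)
    # Use the mean row of the dark pixels in each column as the curve position.
    row_pos = np.array([rows[dark[:, c]].mean() for c in cols])
    # Map pixel coordinates to data coordinates (row 0 is the top of the image).
    x = x_range[0] + cols / (width - 1) * (x_range[1] - x_range[0])
    y = y_range[1] - row_pos / (height - 1) * (y_range[1] - y_range[0])
    return x, y


# Draw a synthetic "published figure": one Gaussian peak, one pixel thick.
height, width = 200, 300
axis = np.linspace(400, 1800, width)
true_y = np.exp(-0.5 * ((axis - 1000) / 60.0) ** 2)
image = np.full((height, width), 255, dtype=np.uint8)
pixel_rows = np.round((1.0 - true_y) * (height - 1)).astype(int)
image[pixel_rows, np.arange(width)] = 0

x, y = extract_curve(image, x_range=(400, 1800), y_range=(0.0, 1.0))
print(float(np.max(np.abs(y - true_y))))
```

The recovered values differ from the original curve only by the rounding to whole pixels, so the achievable precision depends on the resolution of the figure. Real figures add complications this sketch ignores, such as axis ticks, grid lines, legends and overlapping curves.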

A database destined to grow with contributions from the community

Looking ahead, the researchers hope that the scientific community will contribute to expanding the database, so that it becomes a leading collaborative Raman spectral library of biomolecules.

"It is still unusual for scientific articles to share data openly, especially in the field of Raman spectroscopy. This lack of access to data limits biomedical research considerably. If AI is to be successfully applied, it needs large volumes of reliable and accessible data, and this is where open science projects play a key role," said Terán.

The aim is that as the database expands, it will boost the training of artificial intelligence models in the field of molecular analysis of biological samples. This will create opportunities for new applications in the diagnosis and monitoring of diseases.

For years, the UOC has been a benchmark in the field of open science. It supports this type of work from its Open Science Office and all the knowledge produced by the university can be found in open-access format in the O2 institutional repository.

This research aligns with the UOC's Ethical and human-centred technology and Planetary health and well-being research missions, and contributes to the UN's Sustainable Development Goal (SDG) 3, Good Health and Well-being.

Terán, M., Ruiz, J. J., Loza-Alvarez, P., Masip, D., & Merino, D. (2025). Open Raman spectral library for biomolecule identification. Chemometrics and Intelligent Laboratory Systems, 264, 105476. https://doi.org/10.1016/j.chemolab.2025.105476
Regions: Europe, Spain
Keywords: Science, Life Sciences, Health, Well being, Applied science, Artificial Intelligence
