Towards Automated Meta-analysis of Biomedical Texts in the Field of Cell-based Immunotherapy


  • D.A. Devyatkin, Federal Research Centre «Computer Science and Control» RAS, 9 60-let Oktyabrya av., Moscow, 119333 Russia
  • A.I. Molodchenkov, Federal Research Centre «Computer Science and Control» RAS, 9 60-let Oktyabrya av., Moscow, 119333 Russia
  • A.V. Lukin, Federal Research Centre «Computer Science and Control» RAS, 9 60-let Oktyabrya av., Moscow, 119333 Russia
  • Y.S. Kim, Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia
  • A.A. Boyko, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, 16/10 Miklukho-Maklaya str., Moscow, 117997 Russia
  • P.A. Karalkin, Hertsen Moscow Oncology Research Center – branch of National Medical Research Radiological Center, 3 2-nd Botkinsky proezd, Moscow, 125284 Russia
  • J.-H. Chiang, National Cheng Kung University, Tainan City, Taiwan
  • G.D. Volkova, Moscow State Technological University «STANKIN», 1 Vadkovsky lane, Moscow, 127994 Russia
  • A.Yu. Lupatov, Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia



Keywords: cancer; cell-based immunotherapy; text mining; automated meta-analysis


Cell-based immunotherapy is a promising approach to the treatment of chronic infections, autoimmune disorders, and malignant tumors. Strategies for cell-based immunotherapy of cancer are numerous. One is the injection of various immune effector cells that have been propagated and «trained» in cell culture. Alternatively, a therapeutic effect can be achieved with cells that present tumor antigens on their surface in a form recognized by the immune system. Research results in this field are reported in thousands of texts, which makes manual analysis laborious. We have developed an approach to automated text analysis for this area of biomedical science. Here we present the first results of an automated analysis of data extracted from the abstracts of scientific articles available in PubMed. These results show associations between tumor types and the cell-based immunotherapy methods most commonly used to treat them.





How to Cite

Devyatkin, D., Molodchenkov, A., Lukin, A., Kim, Y., Boyko, A., Karalkin, P., Chiang, J.-H., Volkova, G., & Lupatov, A. (2019). Towards Automated Meta-analysis of Biomedical Texts in the Field of Cell-based Immunotherapy. Biomedical Chemistry: Research and Methods, 2(3), e00109.