Intelligent Multimodal Neural Network Monitoring Service for the Surveillance Area

Razil Rustemovich Minneakhmetov

Abstract

The article presents an approach to building an intelligent multimodal monitoring service for a surveillance area using large neural network models. The proposed solution analyzes heterogeneous data – video streams, environmental sensor signals (temperature, humidity, etc.), and event logs – to obtain a complete picture of what is happening. The main tools are large language and vision models (for example, LLaMA, MiniCPM-V, etc.) deployed locally on the Ollama platform, which provides autonomous and secure information processing without transferring data to the cloud. A prototype system has been developed that operates offline and detects critical situations, anomalous deviations from normal conditions, and contextually significant events in the observed area. A method for constructing test scenarios and assessing the quality of the models' performance with the Precision, Recall, and F1 metrics on a set of diverse situations is described. The experimental results confirm the applicability of multimodal models to monitoring tasks: the prototype successfully recognizes complex behavior patterns and demonstrates the potential of large models for building adaptive and scalable surveillance systems.
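As a minimal sketch of the pipeline described above, the Python fragment below shows how a locally deployed multimodal model could be queried through the Ollama Python library with a camera frame and the accompanying sensor readings, and how the resulting verdicts could then be scored with Precision, Recall, and F1 using scikit-learn. The model name minicpm-v:8b, the prompt wording, the analyze_frame helper, and the test scenarios are illustrative assumptions rather than the exact implementation from the article.

# Illustrative sketch only: model name, prompt wording and scenarios are assumptions,
# not the configuration used in the article.
import json

import ollama                                          # local Ollama client (pip install ollama)
from sklearn.metrics import f1_score, precision_score, recall_score


def analyze_frame(frame_path: str, sensors: dict) -> bool:
    """Ask a locally deployed vision-language model whether the observed scene is anomalous."""
    prompt = (
        "You monitor a surveillance area. "
        f"Current sensor readings: {json.dumps(sensors)}. "
        "Look at the attached camera frame and answer strictly with "
        '{"anomaly": true} or {"anomaly": false}.'
    )
    response = ollama.chat(
        model="minicpm-v:8b",                          # any multimodal model pulled into Ollama
        messages=[{"role": "user", "content": prompt, "images": [frame_path]}],
    )
    return '"anomaly": true' in response["message"]["content"].lower()


# Hypothetical labelled test scenarios: (frame path, sensor readings, expected verdict).
scenarios = [
    ("frames/fire_in_zone.jpg",   {"temperature_c": 72.0, "humidity_pct": 12}, True),
    ("frames/empty_corridor.jpg", {"temperature_c": 21.5, "humidity_pct": 45}, False),
]

y_true = [expected for _, _, expected in scenarios]
y_pred = [analyze_frame(path, sensors) for path, sensors, _ in scenarios]

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))

The key design point this sketch reflects is that sensor readings are passed to the model as text alongside the image, so a single multimodal prompt covers several modalities at once; the prompt format and decision threshold would need to be tuned for a real deployment.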

Article Details

How to Cite
Minneakhmetov, R. R. “Intelligent Multimodal Neural Network Monitoring Service for the Surveillance Area”. Russian Digital Libraries Journal, vol. 29, no. 1, Feb. 2026, pp. 123-44, doi:10.26907/1562-5419-2026-29-1-123-144.

References

1. Onsu M.A., Lohan P., Kantarci B., Syed A., Andrews M., Kennedy S. Leveraging Multimodal Large Language Models Assisted by Instance Segmentation for Intelligent Traffic Monitoring [Electronic resource] // arXiv. 2025. Available at: https://arxiv.org/abs/2502.11304 (accessed: 15.05.2025).
2. Ferrara E. Large Language Models for Wearable Sensor-Based Human Activity Recognition, Health Monitoring, and Behavioral Modeling // Sensors. 2024. Vol. 24, No. 15. Article 5045.
3. Suh S., Rey V.F., Lukowicz P. Tasked: Transformer-Based Adversarial Learning for Human Activity Recognition Using Wearable Sensors // Knowledge-Based Systems. 2023. Vol. 260. Article 110143.
4. Nauchnyy servis v seti Internet [Scientific Service on the Internet]: Proceedings of the XXVI All-Russian Scientific Conference (September 22–25, 2025, online). Moscow: Keldysh Institute of Applied Mathematics, 2025 (in press).
5. Nath N.D., Behzadan A.H., Paal S.G. Deep Learning for Site Safety: Real-Time Detection of Personal Protective Equipment // Automation in Construction. 2020. Vol. 112. Article 103085.
6. Gupta S. Deep Learning-Based Human Activity Recognition Using Wearable Sensor Data // International Journal of Information Management Data Insights. 2021. Vol. 1. Article 100046.
7. Uçar A., Karakoşe M., Kırımça N. Artificial Intelligence for Predictive Maintenance Applications: Key Components, Trustworthiness, and Future Trends // Applied Sciences. 2024. Vol. 14, No. 2. Article 898.
8. Wu Z., Zhao J., Shen H. Smart Home Automation Based on Human Activity Recognition: A Survey // Future Generation Computer Systems. 2023. Vol. 137. P. 41–57.
9. Han S., Yuan S., Trabelsi M. LogGPT: Log Anomaly Detection via GPT [Electronic resource] // arXiv. 2023. Available at: https://arxiv.org/pdf/2309.14482 (accessed: 15.05.2025).
10. Sharma R., Patel N. Deep Learning-Based Anomaly Detection in Surveillance Videos // Journal of Visual Communication and Image Representation. 2022. Vol. 86. Article 103624.
11. Özüağ S., Ertuğrul Ö. Enhanced Occupational Safety in Agricultural Machinery Factories: Artificial Intelligence-Driven Helmet Detection Using Transfer Learning and Majority Voting // Applied Sciences. 2024. Vol. 14. Article 11278. https://doi.org/10.3390/app142311278.
12. Li X., Chen Y., Hu L. Real-Time Workplace Activity Recognition Using Deep Learning Models // IEEE Transactions on Industrial Informatics. 2023. Vol. 19, No. 2. P. 1520–1532.
13. Wu Z., Zhao J., Shen H. Smart Home Automation Based on Human Activity Recognition: A Survey // Future Generation Computer Systems. 2023. Vol. 137. P. 41–57.
14. Ollama [Electronic resource]. Available at: https://ollama.com/ (accessed: 30.03.2025).
15. Ollama API Documentation [Electronic resource]. Available at: https://github.com/ollama/ollama/blob/main/docs/api.md (accessed: 30.03.2025).
16. Sahoo P., Singh A.K., Saha S., Jain V., Mondal S., Chadha A. A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications [Electronic resource] // arXiv. 2024. Available at: https://arxiv.org/pdf/2402.07927 (accessed: 15.05.2025).
17. Ollama Python Library [Electronic resource]. Available at: https://github.com/ollama/ollama-python (accessed: 30.03.2025).
18. ISO 8601-1:2019 Standard [Electronic resource]. Available at: https://www.iso.org/obp/ui/#iso:std:iso:8601:-1:ed-1:v1:en (accessed: 30.03.2025).
19. OpenAI ChatGPT-4o-mini [Electronic resource]. Available at: https://chatgpt.com/ (accessed: 30.03.2025).
20. Ollama Gemma3:12B Model [Electronic resource]. Available at: https://ollama.com/library/gemma3:12b (accessed: 30.03.2025).
21. Ollama LLaVA:13B Model [Electronic resource]. Available at: https://ollama.com/library/llava:13b (accessed: 30.03.2025).
22. Ollama Llama3.2-Vision:11B Model [Electronic resource]. Available at: https://ollama.com/library/llama3.2-vision (accessed: 30.03.2025).
23. Ollama MiniCPM-V:8B Model [Electronic resource]. Available at: https://ollama.com/library/minicpm-v (accessed: 30.03.2025).
24. Ollama Qwen2.5-VL:7B Model [Electronic resource]. Available at: https://ollama.com/library/qwen2.5vl (accessed: 16.01.2026).
25. Ollama Mistral-Small-3.2 Model [Electronic resource]. Available at: https://ollama.com/library/mistral-small3.2 (accessed: 16.01.2026).
26. Hand D.J., Christen P. F*: An Interpretable Transformation of the F-measure // Journal of Classification. 2021. Vol. 38, No. 1. P. 3–17.
27. Scikit-learn F1-Score [Electronic resource]. Available at: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html (accessed: 30.03.2025).