Post-Correction of Weak Transcriptions by Large Language Models in the Iterative Process of Handwritten Text Recognition

Valerii Pavlovich Zykov

Abstract

This paper addresses the problem of accelerating the construction of accurate editorial annotations for handwritten archival texts within an incremental training cycle based on weak transcription. Unlike our previously published results, the present work focuses on integrating automatic post-correction of weak transcriptions using large language models (LLMs). We propose and implement a protocol for applying LLMs at the line level in a few-shot setup, with carefully designed prompts and strict output-format control (preservation of pre-reform orthography, protection of proper names and numerals, and prohibition of structural changes to lines). Experiments are conducted on the corpus of diaries by A.V. Sukhovo-Kobylin. As the base recognition model, we use the line-level variant of the Vertical Attention Network (VAN). Results show that LLM post-correction, exemplified by the ChatGPT-4o service, substantially improves the readability of weak transcriptions and significantly reduces the word error rate (by roughly 12 percentage points in our experiments) without degrading the character error rate. Another service tested, DeepSeek-R1, demonstrated less stable behavior. We discuss practical prompt engineering and limitations (context-length constraints, risk of “hallucinations”), and provide recommendations for safely integrating LLM post-correction into an iterative annotation pipeline to reduce expert annotators’ workload and speed up the digitization of historical archives.
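The line-level protocol described above can be pictured with the following minimal Python sketch; it is not the authors’ published code. The system prompt wording, the few-shot pair, the length-drift threshold, and the llm_call interface are illustrative assumptions; any chat-completion backend (for example, an OpenAI or DeepSeek client wrapper) can be passed in as llm_call.

import re

# Illustrative (noisy HTR line, expert correction) pair; not taken from the actual corpus.
FEW_SHOT = [
    ("Сегодыя ѣздилъ въ городь по дѣламъ", "Сегодня ѣздилъ въ городъ по дѣламъ"),
]

SYSTEM_PROMPT = (
    "You correct single lines of 19th-century Russian handwriting recognition output. "
    "Keep pre-reform orthography, do not alter proper names or numerals, and return "
    "exactly one line, changing only obvious recognition errors."
)

def build_messages(noisy_line: str) -> list[dict]:
    """Assemble a few-shot chat prompt for one recognized line."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for src, tgt in FEW_SHOT:
        messages.append({"role": "user", "content": src})
        messages.append({"role": "assistant", "content": tgt})
    messages.append({"role": "user", "content": noisy_line})
    return messages

def is_safe_correction(noisy_line: str, corrected: str) -> bool:
    """Reject structural changes: multi-line output, altered numerals, large length drift."""
    if "\n" in corrected.strip():
        return False
    if re.findall(r"\d+", noisy_line) != re.findall(r"\d+", corrected):
        return False
    if abs(len(corrected) - len(noisy_line)) > 0.3 * max(len(noisy_line), 1):
        return False
    return True

def post_correct(noisy_line: str, llm_call) -> str:
    """llm_call(messages) -> str is any chat-completion backend chosen by the user."""
    corrected = llm_call(build_messages(noisy_line))
    # Fall back to the original weak transcription when the model output looks unsafe.
    return corrected if is_safe_correction(noisy_line, corrected) else noisy_line

The point of the guard function is that a post-corrector may only be allowed to repair characters and words inside a line; any output that merges, splits, or rewrites lines wholesale is discarded so that the iterative annotation cycle never inherits hallucinated structure.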

Article Details

How to Cite
Zykov, V. P., and L. M. Mestetskiy. “Post-Correction of Weak Transcriptions by Large Language Models in the Iterative Process of Handwritten Text Recognition”. Russian Digital Libraries Journal, vol. 28, no. 6, Dec. 2025, pp. 1385-1414, doi:10.26907/1562-5419-2025-28-6-1385-1414.

References

1. Penskaya E.N., Kuptsova O.N. The Invisible Quantity. A.V. Sukhovo-Kobylin: Theater, Literature, Life. Moscow: HSE Publishing House, 2024. 472 p. (In Russ.)
2. Mestetskiy L.M., Smirnova V.S. Line segmentation in images of handwritten documents // Proceedings of the International Conference on Computer Graphics and Vision (Grafikon-2025). Yoshkar-Ola: Volga State Technological University, 2025. (In Russ.)
3. Mestetskiy L.M., Zykov V.P. Incremental markup of 19th-century handwritten archival diaries // Software & Systems. 2025. Vol. 38, No. 4. https://doi.org/10.15827/0236-235X.152. (In Russ.)
4. Coquenet D., Chatelain C., Paquet T. End-to-end Handwritten Paragraph Text Recognition Using a Vertical Attention Network // IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023. Vol. 45, No. 1. P. 508–524. https://doi.org/10.1109/TPAMI.2022.3144899
5. Boltunova E.M., Laptev A.K. Handwriting recognition and data mining: Possibilities of neural network technologies (based on admiral Fyodor Lutke's diary) // Imagology and Comparative Studies. 2025. No. 23. P. 358–379. https://doi.org/10.17223/24099554/23/17. (In Russ.)
6. Brown T.B., Mann B., Ryder N., Subbiah M. et al. Language Models are Few-Shot Learners // Advances in Neural Information Processing Systems (NeurIPS). 2020. Vol. 33. P. 1877–1901.
7. Marti U.-V., Bunke H. The IAM-database: an English sentence database for offline handwriting recognition // International Journal on Document Analysis and Recognition (IJDAR). 2002. Vol. 5, No. 1. P. 39–46. https://doi.org/10.1007/s100320200071
8. Sánchez J., Romero V., Toselli A. H., Vidal E. ICFHR2016 competition on handwritten text recognition on the READ dataset // Proceedings of the 15th International Conference on Frontiers in Handwriting Recognition (ICFHR 2016). 2016. P. 630–635.
9. Shi B., Bai X., Yao C. An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition // IEEE Transactions on Pattern Analysis and Machine Intelligence. 2017. Vol. 39, No. 11. P. 2298–2304. https://doi.org/10.1109/TPAMI.2016.2646371
10. Graves A., Fernández S., Gomez F., Schmidhuber J. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks // Proceedings of the 23rd International Conference on Machine Learning (ICML 2006). 2006. P. 369–376. https://doi.org/10.1145/1143844.1143891
11. Coquenet D., Chatelain C., Paquet T. SPAN: A Simple Predict & Align Network for Handwritten Paragraph Recognition // Document Analysis and Recognition – ICDAR 2021. Lecture Notes in Computer Science, Vol. 12823. Springer, 2021. P. 70–84. https://doi.org/10.1007/978-3-030-86334-0_5
12. Yousef M., Bishop T.E. OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by Learning to Unfold // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020). 2020. P. 14710–14719. https://doi.org/10.1109/CVPR42600.2020.01472
13. Li M., Lv T., Chen J., Cui L., Lu Y., Florencio D., Zhang C., Li Z., Wei F. TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models // Proceedings of the AAAI Conference on Artificial Intelligence. 2023. Vol. 37, No. 12. P. 14216–14224.
14. Potanin M., Dimitrov D., Shonenkov A., Bataev V., Karachev D., Novopoltsev M., Chertok A. Digital Peter: New Dataset, Competition and Handwriting Recognition Methods // Proceedings of the 6th International Workshop on Historical Document Imaging and Processing. ACM, 2021. P. 43–48. https://doi.org/10.1145/3476887.3476892
15. Lakshminarayanan B., Pritzel A., Blundell C. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles // Advances in Neural Information Processing Systems (NeurIPS). 2017. Vol. 30. P. 6402–6413.