AI Could Reveal Secrets of Thousands of Handwritten Documents – From Medieval Manuscripts to Hieroglyphics

Over the last ten years, researchers have gradually been working out how to teach computers to read handwritten documents. As in most machine learning, a computer is fed training data: in this case, images of handwriting paired with transcriptions of what they say. It then learns how the marks on each page correspond to letters. It learns that a half circle is a “c”, that a short vertical stroke is an “i”, and that it might therefore be “rice” that you wrote on your shopping list, for example.

How it does this no one is quite sure – machine learning is often a black box. But it seems likely it is at least partly learning which characters are likely to occur in sequence, thus determining that you are unlikely to want to be shopping for “qvjx”, however much the word might look like that.
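To make the idea of "characters likely to occur in sequence" concrete, here is a minimal sketch of a character-pair (bigram) model in Python. It is only a toy illustration: the word list, the function names and the smoothing choice are all assumptions made for this example, and real HTR systems learn such patterns inside a neural network trained on images rather than from an explicit table like this one.

```python
import math
from collections import Counter

# Tiny illustrative word list (an assumption, not real training data).
WORDS = ["rice", "milk", "bread", "cheese", "apples",
         "juice", "butter", "price", "spice", "mice"]


def train_bigrams(words):
    """Count how often each character follows another in the word list."""
    pair_counts = Counter()     # (previous char, next char) -> count
    context_counts = Counter()  # previous char -> count
    for word in words:
        padded = "^" + word + "$"  # "^" marks word start, "$" marks word end
        for prev, nxt in zip(padded, padded[1:]):
            pair_counts[(prev, nxt)] += 1
            context_counts[prev] += 1
    return pair_counts, context_counts


def log_prob(word, pair_counts, context_counts, alphabet_size=28):
    """Log-probability of a word under the bigram model.

    Add-one smoothing keeps unseen pairs from having zero probability.
    alphabet_size = 26 letters plus the two boundary markers (an assumption).
    """
    padded = "^" + word + "$"
    total = 0.0
    for prev, nxt in zip(padded, padded[1:]):
        numerator = pair_counts[(prev, nxt)] + 1
        denominator = context_counts[prev] + alphabet_size
        total += math.log(numerator / denominator)
    return total


pair_counts, context_counts = train_bigrams(WORDS)
rice_score = log_prob("rice", pair_counts, context_counts)
qvjx_score = log_prob("qvjx", pair_counts, context_counts)

# A higher (less negative) score means a more plausible character sequence.
print(f"rice: {rice_score:.2f}")
print(f"qvjx: {qvjx_score:.2f}")
```

With this word list, “rice” scores higher than “qvjx” because pairs such as “ic” and “ce” appear often in the examples, while none of the pairs in “qvjx” appear at all. A recognizer that weighs this kind of evidence alongside the shapes on the page would prefer the more plausible reading.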

This technology has been applied to handwriting from many countries and periods, from medieval manuscripts to 19th-century diaries (if not yet 21st-century shopping lists), in languages from Latin to Old French to Hebrew.

Because the technology works on the basis of image analysis, it is in theory applicable to any writing whatsoever, from Egyptian hieroglyphs to copperplate. Ten years after the initial development of handwritten text recognition (HTR) techniques, some truly exciting consequences are becoming clear.

You can read more in an article on The Conversation website at: https://tinyurl.com/4bf62k5h