Let PROSAR-AIDA sort and analyse your incoming mail!
Content-based processing of freeform documents: start your automatic document classification and data extraction with PROSAR-AIDA.
While others are still talking about content management and knowledge management, PROSAR-AIDA is already responsible for classifying 50% of the incoming documents in the German private health insurance sector.
AIDA stands for Artificial Intelligence for Document Analysis. PROSAR-AIDA is a software module based on the OCR engine PROSAR, and is used for content-based processing of formatted and freeform documents.
After an initial full-text OCR pass, PROSAR-AIDA provides two main functions:
1. Recognising the type of document through content-based analysis
2. Finding relevant index data without having to know the position or the layout of the data on the page.
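The second function can be illustrated with a minimal Python sketch. It is not PROSAR-AIDA's actual implementation or API; the field names, labels and patterns are hypothetical and only show the general idea of searching the full OCR text for index data instead of reading a fixed zone on the page.

```python
import re

# Hypothetical field patterns: the labels and value formats are assumptions
# chosen for illustration, not taken from PROSAR-AIDA.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number)\s*[:#]?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"date\s*:?\s*(\d{1,2}[./]\d{1,2}[./]\d{4})", re.IGNORECASE
    ),
}


def extract_index_data(full_text):
    """Search the whole OCR text for each field, wherever it appears."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(full_text)
        if match:
            result[name] = match.group(1)
    return result


# Two layouts with the same data in different places.
layout_a = "ACME GmbH\nDate: 12.03.2015\nDear customer\nInvoice No. RE-2015-0042\n"
layout_b = "Invoice number: 77 Total 10.00\n\n\nDate 1/2/2016"
print(extract_index_data(layout_a))
print(extract_index_data(layout_b))
```

Because the search runs over the recognised text rather than over page coordinates, the same patterns apply to both layouts without any per-layout template.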
Customer benchmarks have demonstrated that its processing speed is significantly higher than that of any other freeform technology, allowing same-day processing of documents in high-volume installations.
PROSAR-AIDA is specially designed for high performance in environments with complex data requirements and high document volumes. Even with several hundred different document types and extensive data extraction (including tables), the processing time is typically around one second per document on standard PC hardware. This allows a single computer to process between 2,000 and 3,000 images per hour when the documents are scanned at 200 dpi instead of 300 dpi, which also lets scanners be better utilised and reduces the archive storage required.
Best of both worlds approach: full-text OCR with PROSAR is the basis of the system. PROSAR is an OCR/ICR program that recognises text in digital images. Its architecture is based on artificial neural networks, which combines maximum recognition with maximum performance.
Document type identification is a hierarchical process, based on rules which assess the presence or absence of defined key texts. Together, this forms a basis for high performance, transparency and changeability, even if the set of rules is very complex.
Document type recognition is administered through a graphical user interface, which presents the hierarchical structure to the user in an intuitive way. The defined hierarchy of categories is shown as a tree, so that the user can open the folder requiring administration and access the information at that hierarchical level (much as directory trees are presented in a file system).
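The hierarchical, rule-based identification described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's rule engine: the `Category` class, the `classify` function and the example tree with its key texts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a hypothetical category tree (names are illustrative)."""
    name: str
    required: list = field(default_factory=list)   # key texts that must be present
    forbidden: list = field(default_factory=list)  # key texts that must be absent
    children: list = field(default_factory=list)

    def matches(self, text):
        return (all(key in text for key in self.required)
                and not any(key in text for key in self.forbidden))


def classify(node, full_text, path=()):
    """Return the deepest matching path in the tree, or None if the node fails."""
    text = full_text.lower()  # key texts below are written in lower case
    if not node.matches(text):
        return None
    path = path + (node.name,)
    for child in node.children:
        deeper = classify(child, full_text, path)
        if deeper:
            return deeper
    return path


# Assumed example hierarchy; real installations define their own categories.
root = Category("mail", children=[
    Category("invoice", required=["invoice"], children=[
        Category("dental", required=["dentist"]),
        Category("hospital", required=["hospital"], forbidden=["reminder"]),
    ]),
    Category("prescription", required=["prescription"]),
])

print(classify(root, "INVOICE from Dr. Smith, Dentist"))
print(classify(root, "Hospital invoice, payment reminder"))
```

Since each level only tests a handful of key texts and whole branches are skipped as soon as a parent rule fails, a rule set of this shape stays readable and cheap to evaluate even when the tree grows large.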
proven capacity of up to 20,000 pages per hour in production
powerful module for data extraction from freeform tables
These enterprises, among others, use PROSAR-AIDA successfully: