ActivePDF DocSight OCR
This release adds a character filter, automatic language detection, a file mask, and a document confidence level.
Features
- Character Filter: Specify which characters are searchable in the resulting output PDF by using the Character Filter option in the OCR Profiles General tab for the Searchable PDF (Image over Text) OCR Type. You can make all characters searchable (the default), or limit the search to numbers only, case-sensitive words, or punctuation.
- Auto Detect Language: OCR automatically detects the language used for word recognition. Select the Auto Detect Language check box in the OCR Profiles Character Recognition tab to have OCR recognize the language in your input document.
Note: Install the corresponding language font locally for auto detect to work. For example, if OCR detects the document's language as Japanese, OCR requires a Japanese font to correctly process the document characters.
- File Mask: Create a filter to ignore a file during processing. Enter a file name, such as Thumbs.db, and OCR ignores that file during conversion, but processes all other files in the Input folder.
Note: The text box is for a specific, single file name; for example, generic syntax such as *.txt does not mask all .txt files.
- Document Confidence Level: When Debug is enabled, the logging results now display the confidence level for the entire document as a percentage. It also includes the number of suspicious characters out of the total number of characters in the document.
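The File Mask behavior described above (one literal file name, no wildcard expansion) can be illustrated with a small Python sketch. This is not DocSight code: the function name apply_file_mask and the sample file names are hypothetical, and the exact, case-sensitive comparison is an assumption made for illustration only.

```python
def apply_file_mask(file_names, file_mask):
    """Return the names that would still be processed.

    Hypothetical helper: the mask is compared as one literal file name,
    so wildcard syntax such as "*.txt" is not expanded.
    Case-sensitive matching is an assumption, not documented behavior.
    """
    return [name for name in file_names if name != file_mask]


# Sample Input folder contents (made up for this example).
input_files = ["scan1.tif", "Thumbs.db", "notes.txt"]

# A literal file name masks exactly that file.
masked_by_name = apply_file_mask(input_files, "Thumbs.db")

# A wildcard is treated as a literal name, so nothing is masked.
masked_by_glob = apply_file_mask(input_files, "*.txt")
```

In this sketch, masking Thumbs.db leaves scan1.tif and notes.txt, while masking *.txt leaves all three files, because no file is literally named *.txt.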
Fixes
- Fixed - 18002 - Arabic letters now display correctly.
- Fixed - 19326 - OCR remote conversion for .NET works as expected.