A simple OCR action for Alfresco. OCR is a very useful feature for any Alfresco-based enterprise content management system. This post shows how to configure it in Alfresco Community Edition.
Configuring OCR in Alfresco

OCR (Optical Character Recognition) is the recognition of printed or written text characters by a computer. It extracts the characters from images or scanned documents, which makes images that contain text searchable. OCR is a very useful feature for any ECM product. In this blog, we will see how to configure it in Alfresco Community Edition. We have tested this with Alfresco versions 5.1.f and 5.2.e; it should also work with other nearby versions.

Read the blog: OCR in Alfresco [Video]

Prerequisites:
1. Alfresco Community / Enterprise Edition installed and running
2. Basic knowledge of Alfresco administration

Steps to Configure Tesseract:
Note: Some of these 7 steps include code sections. For those, see the original source: Configuring OCR in Alfresco
1. Download and install Tesseract.
2. Stop the Alfresco Tomcat server.
3. Download the Linux/Windows context file and place it at
4. Place ocr.bat (Windows) or ocr.sh (Linux) at <ALFRESCO-HOME>/
5. If the current user lacks read or execute permission on ocr.sh, grant it.
6. Add the following properties to the alfresco-global.properties file located at
7. Start the Tomcat server.

Note: Existing files in Alfresco will not be OCRed; you have to upload new image files to test.

Important:
1. Make sure you are passing the correct arguments in the context file (the entries differ between Windows and Linux).
2. Check that your .bat or .sh commands work properly.
3. Verify that Tesseract creates a text file for the image file. To verify, go to the directory where Tesseract is installed and run the following command:
tesseract ./<image file-name> ./<text file-name> -l eng
If the text file is created with content in it, your Tesseract installation is working.
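The permission check from step 5 and the Tesseract verification above can be scripted. The following is only a minimal sketch, not part of the original setup: the function names check_script_perms and check_ocr are hypothetical, and it assumes that tesseract is on the PATH, that the English language data is installed, and that Tesseract appends .txt to the output name it is given.

```shell
#!/bin/sh
# Hypothetical helpers for checking an OCR setup (not part of Alfresco or Tesseract).

# check_script_perms SCRIPT
# Succeeds only if the current user can both read and execute SCRIPT.
check_script_perms() {
    script="$1"
    if [ -r "$script" ] && [ -x "$script" ]; then
        echo "OK: $script is readable and executable"
    else
        echo "FAIL: $script lacks read or execute permission (try: chmod u+rx $script)" >&2
        return 1
    fi
}

# check_ocr IMAGE OUTPUT_BASE [LANG]
# Runs tesseract on IMAGE and checks that OUTPUT_BASE.txt exists and is non-empty.
check_ocr() {
    image="$1"
    out_base="$2"
    lang="${3:-eng}"   # assumption: default to English, as in the command above

    if ! command -v tesseract >/dev/null 2>&1; then
        echo "FAIL: tesseract not found on PATH" >&2
        return 2
    fi

    # Same invocation as the manual check: tesseract <image> <output base> -l <lang>
    tesseract "$image" "$out_base" -l "$lang" || return 1

    if [ -s "${out_base}.txt" ]; then
        echo "OK: ${out_base}.txt created with content"
    else
        echo "FAIL: ${out_base}.txt is missing or empty" >&2
        return 1
    fi
}

# Example usage (paths are placeholders):
#   check_script_perms "<ALFRESCO-HOME>/ocr.sh"
#   check_ocr ./scan.png ./scan
```

Both functions return a non-zero status on failure, so they can be chained with && in a larger check script before restarting Tomcat.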
Comment here if your content is still not searchable. We are happy to hear about your ECM challenges, as we love solving them. Contact us! Call: India +91 9925144200, USA +1 (732) 927-5544; Email: sales@contcentric.com