Tesseract-OCR/Grails installation on Amazon EC2.

by Vivek Mishra



Before we proceed:

Before starting the installation, the first thing is to choose the right AMI. There are a number of images available. I prefer an EBS-backed AMI (so that I can stop the instance and keep its state), and one that already provides:

1)      Java 6

2)      A 64-bit CentOS machine.

I tried several EC2 images and finally settled on ami-4d42a924.
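Once you are logged in, it is worth confirming that the image really provides what was listed above. The following is only a sketch: the function names is_64bit_arch and report_instance are my own, and it assumes a Red Hat style system where /etc/redhat-release exists. It only reports and does not change anything.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given machine string denotes 64-bit x86.
is_64bit_arch() {
    case "$1" in
        x86_64|amd64) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print architecture, release and Java information.
report_instance() {
    arch=$(uname -m)
    if is_64bit_arch "$arch"; then
        echo "architecture: $arch (64-bit)"
    else
        echo "architecture: $arch (not 64-bit x86)"
    fi
    # /etc/redhat-release is present on CentOS and other Red Hat style systems.
    if [ -r /etc/redhat-release ]; then
        echo "release: $(cat /etc/redhat-release)"
    else
        echo "release: /etc/redhat-release not found"
    fi
    # java -version writes to stderr, so redirect it before taking the first line.
    if command -v java >/dev/null 2>&1; then
        echo "java: $(java -version 2>&1 | head -n 1)"
    else
        echo "java: not on PATH"
    fi
}

report_instance
```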

What do I need to install?

To set up Grails and Tesseract-OCR on your machine, you need to install:

1)      Leptonica

2)      Tesseract-ocr

3)      Grails

I will cover each of them in detail in the installation steps.

Installation steps

Svn:

1)      Connect to your cloud instance as root.

2)      Check whether svn is installed by typing “svn” at the command prompt. If you get something like “svn is not found or installed”, go to step 3.

3)      Type yum install svn to install it. (On CentOS the package is normally named subversion, so use yum install subversion if yum cannot find svn.)
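Steps 2 and 3 can be folded into one small check. This is a sketch, not the only way to do it: missing_pkg is a name I made up, and the package name subversion is an assumption about the CentOS repositories. The snippet only prints the yum command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print the package name ($2) when the command ($1)
# is not on PATH; print nothing when the command is already available.
missing_pkg() {
    cmd=$1
    pkg=$2
    command -v "$cmd" >/dev/null 2>&1 || echo "$pkg"
}

# Assumption: on CentOS the svn client is shipped in the "subversion" package.
pkg=$(missing_pkg svn subversion)
if [ -n "$pkg" ]; then
    echo "svn not found; as root run: yum install -y $pkg"
else
    echo "svn is already installed"
fi
```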

Java:

1)      Type java -version to check the version of the installed Java. If the Sun JDK is not installed (to find out, type echo $JAVA_HOME), go to step 2.

2)      Download jdk-6u27-ea-bin-b03-linux-amd64-27_may_2011-rpm.bin (for more details, please refer here).

3)      Once the download is complete, execute the commands below. The two rm commands are only needed if the soft links already exist.

  • chmod +x jdk-6u27-ea-bin-b03-linux-amd64-27_may_2011-rpm.bin
  • ./jdk-6u27-ea-bin-b03-linux-amd64-27_may_2011-rpm.bin
  • rm -f /usr/bin/java
  • rm -f /usr/bin/javac
  • ln -s /usr/java/jdk1.6.0_27/bin/java /usr/bin/java
  • ln -s /usr/java/jdk1.6.0_27/bin/javac /usr/bin/javac
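The link handling can also be done in one go, without caring whether an old link is present. The function below is a sketch under my own naming (link_jdk_tools is hypothetical). It relies on ln -sfn, which replaces an existing link, and it refuses to link anything if the JDK binaries are not where it expects them.

```shell
#!/bin/sh
# Hypothetical helper: point <bin_dir>/java and <bin_dir>/javac at the
# binaries under <jdk_home>/bin, replacing any links that already exist.
link_jdk_tools() {
    jdk_home=$1
    bin_dir=$2
    for tool in java javac; do
        if [ ! -x "$jdk_home/bin/$tool" ]; then
            echo "missing $jdk_home/bin/$tool" >&2
            return 1
        fi
        # -s: symbolic link, -f: replace an existing target,
        # -n: do not follow an existing link that points to a directory.
        ln -sfn "$jdk_home/bin/$tool" "$bin_dir/$tool"
    done
}

# For the layout used in this post (run as root):
# link_jdk_tools /usr/java/jdk1.6.0_27 /usr/bin
```

Running it a second time is harmless, since the links are simply replaced.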

Leptonica:

To install Leptonica, execute the commands below in order. The yum packages come first so that the compiler and the image libraries are in place before ./configure and make run:

  • yum list installed (to see which packages are already installed)
  • yum install gcc gcc-c++ make (installs the C/C++ compiler and make if they are missing)
  • yum install automake (also provides aclocal; there is no separate aclocal package)
  • yum install libtool (also provides libtoolize)
  • yum install libjpeg-devel libpng-devel libtiff-devel zlib-devel
  • mkdir leptonica
  • cd leptonica
  • wget http://www.leptonica.com/source/leptonlib-1.67.tar.gz
  • tar -zxvf leptonlib-1.67.tar.gz
  • cd leptonlib-1.67
  • ./configure
  • make
  • make install
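The image libraries are not commands, so they cannot be checked by typing their name at the prompt. A rough way to see which of the -devel packages are still missing is to ask rpm. This is a sketch: missing_packages and the PKG_QUERY variable are names I invented, and PKG_QUERY exists only so that the function can be tried on a machine without rpm.

```shell
#!/bin/sh
# Hypothetical helper: print every named package that the query command
# reports as not installed. The query command defaults to "rpm -q" and can
# be overridden through the (hypothetical) PKG_QUERY variable.
missing_packages() {
    for pkg in "$@"; do
        # PKG_QUERY is left unquoted on purpose so "rpm -q" splits into two words.
        ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

missing=$(missing_packages libjpeg-devel libpng-devel libtiff-devel zlib-devel)
if [ -n "$missing" ]; then
    # $missing is unquoted so the names end up on one line.
    echo "still missing; as root run: yum install" $missing
else
    echo "all image libraries are installed"
fi
```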

Tesseract-ocr:

  • mkdir tesseract
  • cd tesseract/
  • svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr
  • cd tesseract-ocr
  • ./runautoconf
  • mkdir m4
  • ./configure
  • make
  • make install
  • tesseract (if it displays output other than “command not found”, Tesseract has been installed successfully)
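A slightly stronger check than typing tesseract is to run it on an image. The command takes an image name and an output base name, and writes the recognized text to the output base name with .txt appended. The wrapper below is a sketch; ocr_to_text is my own name, and page.tif is a placeholder for any image you have at hand.

```shell
#!/bin/sh
# Hypothetical helper: run tesseract on an image and print the recognized text.
# $1 is the image file, $2 is the output base name (tesseract appends ".txt").
ocr_to_text() {
    image=$1
    out_base=$2
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract is not on PATH" >&2
        return 127
    fi
    if [ ! -r "$image" ]; then
        echo "cannot read $image" >&2
        return 1
    fi
    # Silence tesseract's own messages; only the recognized text is printed.
    tesseract "$image" "$out_base" >/dev/null 2>&1 || return 1
    cat "$out_base.txt"
}

# Example with a placeholder image name:
# ocr_to_text page.tif /tmp/page
```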

Grails:

References: