Fred's ImageMagick Scripts



    Licensing:

    Copyright © Fred Weinhaus

    My scripts are available free of charge for non-commercial (non-profit) use, ONLY.

    For use of my scripts in commercial (for-profit) environments or non-free applications, please contact me (Fred Weinhaus) for licensing arrangements. My email address is fmw at alink dot net.

    If you: 1) redistribute, 2) incorporate any of these scripts into other free applications or 3) reprogram them in another scripting language, then you must contact me for permission, especially if the result might be used in a commercial or for-profit environment.

    Usage, whether stated or not in the script, is restricted to the above licensing arrangements. It is also subject, in a subordinate manner, to the ImageMagick license, which can be found at: http://www.imagemagick.org/script/license.php

    Please read the Pointers For Use on my home page to properly install and customize my scripts.

TEXTDESKEW


Unrotates (deskews) an image containing text.

last modified: September 11, 2021



USAGE: textdeskew [-n number] [-r radius] [-m metric] [-a attenuate] [-l logval] [-c color] [-t trimval] [-p padval] infile outfile1 [outfile2]
USAGE: textdeskew [-h|-help]

-n ... number ...... stopping number of maxima; integer>1; default=20
-r ... radius ...... radius for masking out peaks; integer>0; default=7
-m ... metric ...... compare metric; any valid IM metric; default=rmse
-a ... attenuate ... attenuation percent for the brightness near the
.................... center of the FFT spectrum; 0<integer<=100; default=85
-l ... logval ...... value for log when converting the FFT magnitude to
.................... spectrum; integer>0; default=1000000
-c ... color ....... background color for deskewed image; any valid IM
.................... color is allowed; default=white
-t ... trimval ..... trim fuzz value; 0<=float<=100; default=no trim
-p ... padval ...... padding added to output; integer>=0; default=no padding

An optional output image may be specified which will be the masked FFT spectrum showing the locations of the peaks that define the line from which the unrotation angle is determined.

PURPOSE: To unrotate (deskew) an image containing text.

DESCRIPTION: TEXTDESKEW unrotates (deskews) an image containing many rows of text. If the metadata contain orientation information, then the image will be auto-oriented first. The image is converted to its FFT spectrum image and a list of peaks orthogonal to the rows of text is located. From the list of peak coordinates, a least squares fit to a line is computed and the orientation of the line is used to get the unrotation angle. The approach cannot determine whether the text is right side up or upside down. However, it should properly correct for rotations in the range -90<=rotation<90. Note that a rotation of +90 will come out upside down, since lines rotated by 180 degrees are indistinguishable. The more lines of text, the better the script works. If there are too few lines of text, the script may fail.
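
The 180 degree ambiguity can be illustrated with a small sketch. It is not taken from the script, and the function name fold_to_half_turn is hypothetical. It folds any angle into the range -90<=angle<90, which shows why a rotation of +90 is treated the same as -90 and so ends up upside down.

```python
def fold_to_half_turn(angle_deg):
    """Fold an angle in degrees into the interval [-90, 90).

    A line has no direction, so angles that differ by 180 degrees
    describe the same line orientation.
    """
    return (angle_deg + 90.0) % 180.0 - 90.0


# A 6 degree skew and a 186 degree skew give the same line orientation.
print(fold_to_half_turn(6.0), fold_to_half_turn(186.0))

# +90 folds onto -90, the lower end of the supported range.
print(fold_to_half_turn(90.0))
```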

ARGUMENTS:

-n number ... NUMBER of peaks to be found. Values are integers>1. The default=20.

-r radius ... RADIUS value for masking out peaks. Values are integers>0. The default=7.

-m metric ... METRIC is the compare metric used to locate the maxima. Any valid IM metric may be used. The default=rmse. This option is only needed for ImageMagick versions older than 6.8.6-10.

-a attenuate ... ATTENUATE is the attenuation percent for the brightness near the center of the FFT spectrum. This is used since the center of the FFT spectrum is brighter than the outer regions. Attenuation linearly ramps from the center to no attenuation at the edges of the spectrum. Attenuating the center allows a longer linear distribution of found peaks, which helps to better define the orientation line. However, it may introduce more off-linear false peaks. Shorter lengths of text may require a different attenuation. Values are 0<integer<=100. 100 means no center attenuation. The default=85.
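
The linear ramp can be sketched as follows. This is an illustration only, under the assumption that the attenuation acts as a brightness multiplier of attenuate/100 at the center that rises linearly to 1 at the edge; the function name center_gain and the normalized radius t are hypothetical and do not come from the script.

```python
def center_gain(t, attenuate=85):
    """Brightness multiplier at normalized radius t.

    t is 0 at the center of the spectrum and 1 at its edge (assumed
    normalization). attenuate is a percent with 0 < attenuate <= 100.
    """
    if not 0 < attenuate <= 100:
        raise ValueError("attenuate must satisfy 0 < attenuate <= 100")
    t = min(max(t, 0.0), 1.0)        # clamp to the valid radius range
    floor = attenuate / 100.0        # multiplier at the exact center
    return floor + (1.0 - floor) * t  # linear ramp up to 1 at the edge


# Multiplier at the center, halfway out, and at the edge for the default.
print([round(center_gain(t), 4) for t in (0.0, 0.5, 1.0)])
```

Under this assumption, attenuate=100 leaves every multiplier at 1, which matches the statement that 100 means no center attenuation.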

-l logval ... LOGVAL is the value for log when converting the FFT magnitude to the spectrum. Values are integers>0. The default=1000000.
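
For reference, ImageMagick's -evaluate log operator maps a normalized value u (0 to 1) to log(value*u+1)/log(value+1). The sketch below assumes that LOGVAL is used as that value; this is an assumption, so see the script for the exact command. The function name log_scale is hypothetical.

```python
import math


def log_scale(u, logval=1000000):
    """Log-scale a normalized magnitude u in [0, 1].

    Mirrors the documented formula of ImageMagick's -evaluate log:
    log(value * u + 1) / log(value + 1).
    """
    if logval <= 0:
        raise ValueError("logval must be > 0")
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must be in [0, 1]")
    return math.log(logval * u + 1.0) / math.log(logval + 1.0)


# A magnitude of one thousandth of full scale is lifted to roughly mid-gray,
# which is why weak peaks become visible in the spectrum.
print(log_scale(0.001))
```

A larger log value brightens weak FFT magnitudes more strongly relative to the strong central peak.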

-c color ... COLOR is the background color for deskewed image. Any valid IM color is allowed. The default=white.

-t trimval ... TRIMVAL is the trim fuzz value (percent) for trimming the output to its minimum bounding box. Values are 0<=floats<=100. The default=no trimming.

-p padval ... PADVAL is the padding added to the output. Values are integers>=0. The default=no padding.

REQUIREMENTS: IM 6.5.4-7 or higher and the FFTW delegate library.

NOTE: Some versions of ImageMagick around 6.8.8-5 may not work due to a bug.

CAVEAT: No guarantee that this script will work on all platforms, nor that trapping of inconsistent parameters is complete and foolproof. Use At Your Own Risk.


EXAMPLES

  • Original; Arguments: (defaults); Rotation=-2.0065 deg; spectrum image
  • Original (source); Arguments: (defaults); Rotation=6.0031 deg
  • Original (source); Arguments: (defaults); Rotation=7.9739 deg
  • Original (source); Arguments: (defaults); Rotation=-1.2059 deg
  • Original; Arguments: (defaults); Rotation=89.2605 deg

What the script does is as follows:

  • Auto-orients the image, if possible
  • Creates an FFT spectrum image
  • Locates the major peaks in the spectrum
  • Does a least squares fit to a line from the coordinates of the peaks
  • Throws out outliers
  • Does another least squares fit to a line from the coordinates of the remaining peaks
  • Computes the needed rotation angle to rectify the image based on the angle of the line
  • Rotates the image to deskew/unrotate it
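
The fit, outlier rejection and refit steps above can be sketched in a few lines. This is not the script's code. It swaps in an orthogonal least squares fit (the principal axis of the peak coordinates) because the line of peaks is close to vertical for small skews, where an ordinary y-on-x fit is poorly conditioned. The outlier rule (k times the median distance, with a floor of min_dist pixels) and all function names are hypothetical. The sign of the final correction depends on the image coordinate convention and is not asserted here.

```python
import math
from statistics import median


def fold_to_half_turn(angle_deg):
    """Fold an angle in degrees into the interval [-90, 90)."""
    return (angle_deg + 90.0) % 180.0 - 90.0


def fit_line(points):
    """Orthogonal least squares fit; returns (angle_deg, cx, cy)."""
    if len(points) < 2:
        raise ValueError("need at least two peaks")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    # Orientation of the principal axis of the point scatter.
    angle = 0.5 * math.degrees(math.atan2(2.0 * sxy, sxx - syy))
    return angle, cx, cy


def distances_to_line(points, angle_deg, cx, cy):
    """Perpendicular distance of each point from the fitted line."""
    t = math.radians(angle_deg)
    nx, ny = -math.sin(t), math.cos(t)  # unit normal of the line
    return [abs((x - cx) * nx + (y - cy) * ny) for x, y in points]


def peak_line_deviation(points, k=2.5, min_dist=1.0):
    """Fit, drop outliers, refit; return the line's deviation from vertical."""
    angle, cx, cy = fit_line(points)
    dists = distances_to_line(points, angle, cx, cy)
    limit = max(k * median(dists), min_dist)  # hypothetical outlier rule
    kept = [p for p, d in zip(points, dists) if d <= limit]
    if len(kept) >= 2:
        angle, cx, cy = fit_line(kept)
    return fold_to_half_turn(angle - 90.0)


# Synthetic peaks on a line 88 degrees from the x axis, plus one false peak.
dx, dy = math.cos(math.radians(88.0)), math.sin(math.radians(88.0))
peaks = [(256 + t * dx, 256 + t * dy) for t in range(-100, 101, 20)]
peaks.append((316.0, 256.0))
print(peak_line_deviation(peaks))  # approximately -2.0
```

In this synthetic case the false peak pulls the first fit slightly off 88 degrees; after it is rejected, the second fit recovers the true orientation.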

See the script for details.