Difference between revisions of "Tesseract"

From Free Software Directory
(Improved description, added categories.)
m (category)
 
(6 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
{{Entry
 
{{Entry
 
|Name=Tesseract
 
|Name=Tesseract
|Short description=Optical Character Recognition: turn an image to text
+
|Short description=Optical character recognition engine
|Full description=OCR can be used to e.g. scan books and turn them into text, which is more flexible and smaller in terms of file size.
+
|Full description='''Tesseract''' is an optical character recognition (OCR) engine with very high accuracy. It supports many languages, output text formatting, hOCR positional information and page layout analysis. Several image formats are supported through the [[Leptonica|Leptonica library]]. It can also detect whether text is monospaced or proportional.
|Homepage URL=http://code.google.com/p/tesseract-ocr
+
 
 +
This package contains an OCR engine - <code>libtesseract</code> and a command line program - <code>tesseract</code>.
 +
|Homepage URL=https://github.com/tesseract-ocr/tesseract
 +
|Is High Priority Project=No
 +
|VCS checkout command=git clone git://github.com/tesseract-ocr/tesseract.git
 
|Computer languages=C, C++
 
|Computer languages=C, C++
 +
|Documentation note=OCR can be used to e.g. scan books and turn them into text, which is more flexible and smaller in terms of file size.
 +
|Decommissioned or Obsolete=No
 
|Related projects=Clara OCR, Ocre, Hocr, GOCR, WeOCR, GNU Ocrad
 
|Related projects=Clara OCR, Ocre, Hocr, GOCR, WeOCR, GNU Ocrad
 
|Keywords=text, graphics, ocr
 
|Keywords=text, graphics, ocr
|Last review by=Calinou
+
|Version identifier=4.1.1
|Last review date=2014/06/07
+
|Version date=2019/12/26
|Submitted by=mviinama
+
|Version status=stable
 +
|Version download=https://github.com/tesseract-ocr/tesseract/releases/tag/4.1.1
 +
|Last review by=Genium
 +
|Last review date=2020/04/11
 
|Submitted date=2013-04-11
 
|Submitted date=2013-04-11
|Status=
+
|Accepts cryptocurrency donations=No
 +
|Test entry=No
 
|Is GNU=No
 
|Is GNU=No
|Paid Support=
 
|Software categories=optical character recognition
 
 
}}
 
}}
 
{{Project license
 
{{Project license
|License=Apache License v2.0
+
|License=Apache2.0
 +
|License verified by=Genium
 +
|License verified date=2020/04/11
 +
|License note=https://github.com/tesseract-ocr/tesseract/blob/master/LICENSE
 +
}}
 +
{{Resource
 +
|Resource audience=R (Ref)
 +
|Resource URL=https://cran.r-project.org/web/packages/tesseract
 +
}}
 +
{{Resource
 +
|Resource audience=Python (Ref)
 +
|Resource URL=https://pypi.org/project/tesseract
 +
}}
 +
{{Resource
 +
|Resource audience=Ruby (Ref)
 +
|Resource URL=https://rubygems.org/gems/tesseract
 +
}}
 +
{{Resource
 +
|Resource audience=Debian (Ref)
 +
|Resource URL=https://tracker.debian.org/pkg/tesseract
 +
}}
 +
{{Resource
 +
|Resource audience=GitHub
 +
|Resource kind=VCS Repository Webview
 +
|Resource URL=https://github.com/tesseract-ocr/tesseract
 
}}
 
}}
 
{{Software category
 
{{Software category
 
|Business=productivity
 
|Business=productivity
 +
|Interface=command-line, library
 
|Programming-language=C, C++
 
|Programming-language=C, C++
|Text-creation=documentation-tool
+
|Runs-on=Windows, BSD, OS X, Android, GNU/Linux, GNU/Hurd, iOS
 
|Use=reading, science, text-creation
 
|Use=reading, science, text-creation
 +
|Works-with=images
 
}}
 
}}
 
{{Featured}}
 
{{Featured}}

Latest revision as of 18:08, 11 April 2020



Tesseract

https://github.com/tesseract-ocr/tesseract
Optical character recognition engine

Tesseract is an optical character recognition (OCR) engine with very high accuracy. It supports many languages, output text formatting, hOCR positional information and page layout analysis. Several image formats are supported through the Leptonica library. It can also detect whether text is monospaced or proportional.

This package contains an OCR engine, libtesseract, and a command-line program, tesseract.
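
As a rough illustration of how the command-line program might be driven from a script, the following Python sketch builds a tesseract invocation and runs it only if the executable is found on PATH. The helper names (build_tesseract_command, run_ocr) and the file names in the usage note are hypothetical and not part of the package. The sketch assumes the usual "tesseract IMAGE OUTPUTBASE -l LANG [CONFIG]" calling form and that the "hocr" config writes OUTPUTBASE.hocr, as recent 4.x releases do; check the documentation of the installed version before relying on either.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="eng", hocr=False):
    """Return the argument list for one tesseract run (hypothetical helper)."""
    cmd = ["tesseract", str(image), str(output_base), "-l", lang]
    if hocr:
        # Assumption: the "hocr" config file ships with the installation.
        cmd.append("hocr")
    return cmd


def run_ocr(image, output_base, lang="eng", hocr=False):
    """Run tesseract and return the path it is expected to write."""
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image, output_base, lang, hocr),
                   check=True)
    # Assumption: plain-text output ends in .txt and hOCR output in .hocr.
    suffix = ".hocr" if hocr else ".txt"
    return Path(str(output_base) + suffix)
```

For example, run_ocr("page.png", "page", lang="deu", hocr=True) would request German recognition with hOCR positional output, provided the corresponding language data is installed.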





Licensing

License: Apache2.0
Verified by: Genium
Verified on: 11 April 2020




Leaders and contributors

Resources and communication

Audience | Resource type | URI
GitHub | VCS Repository Webview | https://github.com/tesseract-ocr/tesseract
Python (Ref) | | https://pypi.org/project/tesseract
Ruby (Ref) | | https://rubygems.org/gems/tesseract
Debian (Ref) | | https://tracker.debian.org/pkg/tesseract
R (Ref) | | https://cran.r-project.org/web/packages/tesseract


Software prerequisites











Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.

The copyright and license notices on this page only apply to the text on this page. Any software or copyright-licenses or other similar notices described in this text has its own copyright notice and license, which can usually be found in the distribution or license text itself.