SwiftyTesseract
public class SwiftyTesseract
A class that performs optical character recognition with the open-source Tesseract library
-
Sets a String of characters to which recognition is restricted. This does not filter values.
Example: setting a whiteList of “ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz” with an image containing digits may result in “1” being recognized as “I” and “2” being recognized as “Z”. Set this value only if it is certain that the image contains no characters other than those defined.
Characters not defined in whiteList may cause unpredictable recognition results if they are present. If removal and not replacement is desired, filtering the recognition string is a better option.
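The "filter the recognition string" alternative can be sketched in plain Swift. The helper names keepOnly(_:in:) and removing(_:from:) are hypothetical and not part of SwiftyTesseract; they only illustrate removing unwanted characters after recognition instead of constraining the recognizer.

```swift
/// Returns `text` with every character that is not in `allowed` removed.
/// Hypothetical helper, not part of SwiftyTesseract.
func keepOnly(_ allowed: String, in text: String) -> String {
    let allowedSet = Set(allowed)
    return text.filter { allowedSet.contains($0) }
}

/// Returns `text` with every character that is in `disallowed` removed.
/// Hypothetical helper, not part of SwiftyTesseract.
func removing(_ disallowed: String, from text: String) -> String {
    let disallowedSet = Set(disallowed)
    return text.filter { !disallowedSet.contains($0) }
}

// Stand-in for a string produced by recognition.
let recognized = "Room 12B"

let lettersOnly = keepOnly(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz",
    in: recognized
)
let withoutDigits = removing("0123456789", from: recognized)

print(lettersOnly)   // RoomB
print(withoutDigits) // Room B
```

Because the filtering happens after recognition, a digit is dropped rather than being replaced by a similar-looking letter.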
Declaration
Swift
public var whiteList: String? { get set }
-
Sets a String of characters that will not be recognized. This does not filter values.
Example: setting a blackList of “0123456789” with an image containing digits may result in “1” being recognized as “I” and “2” being recognized as “Z”. Set this value only if it is 100% certain that the characters defined will not be present during recognition.
This may cause unpredictable recognition results if characters defined in blackList are present. If removal and not replacement is desired, filtering the recognition string is a better option.
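A minimal sketch of setting this property, assuming a tessdata folder reference containing English language data is present in the main bundle:

```swift
import SwiftyTesseract

// Assumption: "tessdata" with English data is bundled with the app.
let tesseract = SwiftyTesseract(language: .english)

// Only appropriate when the images are known to contain no digits;
// otherwise digits may be replaced by similar-looking letters.
tesseract.blackList = "0123456789"
```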
Declaration
Swift
public var blackList: String? { get set }
-
Preserve multiple interword spaces
Declaration
Swift
public var preserveInterwordSpaces: Bool? { get set }
-
Minimum character height
Declaration
Swift
public var minimumCharacterHeight: Int? { get set }
-
The current version of the underlying Tesseract library
Declaration
Swift
lazy public private(set) var version: String? { get set }
-
Creates an instance of SwiftyTesseract using standard RecognitionLanguages. The tessdata folder MUST be in your Xcode project as a folder reference (blue folder icon, not yellow) and be named “tessdata”.
Declaration
Swift
public convenience init(
  languages: [RecognitionLanguage],
  dataSource: LanguageModelDataSource = Bundle.main,
  engineMode: EngineMode = .lstmOnly
)
Parameters
languages: Languages of the text to be recognized
dataSource: The LanguageModelDataSource that contains the tessdata folder; default is Bundle.main
engineMode: The Tesseract engine mode; default is .lstmOnly
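As an illustration of this initializer, the sketch below creates an instance for two languages. It assumes that the bundled tessdata folder contains trained data for both, and that RecognitionLanguage has a .french case (only .english appears in this reference).

```swift
import Foundation
import SwiftyTesseract

// Relies on the defaults: dataSource = Bundle.main, engineMode = .lstmOnly.
// Assumption: `.french` is an available RecognitionLanguage case.
let tesseract = SwiftyTesseract(languages: [.english, .french])

// Equivalent call with every parameter spelled out.
let explicit = SwiftyTesseract(
    languages: [.english, .french],
    dataSource: Bundle.main,
    engineMode: .lstmOnly
)
```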
-
Convenience initializer for creating an instance of SwiftyTesseract with one language to avoid having to input an array with one value (e.g. [.english]) for the languages parameter
Declaration
Swift
public convenience init(
  language: RecognitionLanguage,
  dataSource: LanguageModelDataSource = Bundle.main,
  engineMode: EngineMode = .lstmOnly
)
Parameters
language: The language of the text to be recognized
dataSource: The LanguageModelDataSource that contains the tessdata folder; default is Bundle.main
engineMode: The Tesseract engine mode; default is .lstmOnly
-
Declaration
Swift
public convenience init(
  language: RecognitionLanguage,
  bundle: Bundle = .main,
  engineMode: EngineMode = .lstmOnly
)
-
Declaration
Swift
public convenience init(
  languages: [RecognitionLanguage],
  bundle: Bundle = .main,
  engineMode: EngineMode = .lstmOnly
)
-
Declaration
Swift
public enum Error : Swift.Error
-
Takes a UIImage and passes the resulting recognized UTF-8 text to the completion handler
Declaration
Swift
@available(*, deprecated, message: "use performOCR(on:) or performOCRPublisher(on:)")
public func performOCR(on image: UIImage, completionHandler: (String?) -> ())
Parameters
image: The image to perform recognition on
completionHandler: The action to be performed on the recognized string
-
Creates a cold publisher that performs OCR on a provided image upon subscription
Declaration
Swift
@available(iOS 13.0, *)
public func performOCRPublisher(on image: UIImage) -> AnyPublisher<String, Swift.Error>
Parameters
image: The image to perform recognition on
Return Value
A cold publisher that emits a single String on success or an Error on failure.
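One way to consume the publisher is sketched below. The TextRecognizer type is hypothetical and exists only for illustration. Because the publisher is cold, no recognition work starts until sink subscribes. The sketch assumes that subscribing on a background queue is desirable so that OCR does not block the main thread.

```swift
import Combine
import Foundation
import SwiftyTesseract
import UIKit

/// Hypothetical wrapper type, not part of SwiftyTesseract.
@available(iOS 13.0, *)
final class TextRecognizer {
    // Assumption: "tessdata" with English data is in the main bundle.
    private let tesseract = SwiftyTesseract(language: .english)
    private var cancellables = Set<AnyCancellable>()

    func recognize(_ image: UIImage) {
        tesseract.performOCRPublisher(on: image)
            // Run the OCR work off the main thread.
            .subscribe(on: DispatchQueue.global(qos: .userInitiated))
            // Deliver the result back on the main thread.
            .receive(on: DispatchQueue.main)
            .sink(
                receiveCompletion: { completion in
                    if case .failure(let error) = completion {
                        print("OCR failed: \(error)")
                    }
                },
                receiveValue: { text in
                    print("Recognized text: \(text)")
                }
            )
            // Keep the subscription alive until it completes.
            .store(in: &cancellables)
    }
}
```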
-
Takes an array of UIImages and returns the PDF as a Data object. If using PDFKit, introduced in iOS 11, this will produce a valid PDF document.
Throws
SwiftyTesseractError
Declaration
Swift
public func createPDF(from images: [UIImage]) throws -> Data
Parameters
images: Array of UIImages to perform OCR on
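A possible way to call this method and hand the result to PDFKit is sketched below. The function name makePDFDocument(from:using:) is hypothetical. The error is caught generically because this reference does not list the specific cases thrown.

```swift
import PDFKit
import SwiftyTesseract
import UIKit

/// Hypothetical helper: runs OCR over `pages` and wraps the resulting
/// PDF data in a PDFDocument, or returns nil if either step fails.
func makePDFDocument(from pages: [UIImage], using tesseract: SwiftyTesseract) -> PDFDocument? {
    do {
        let data = try tesseract.createPDF(from: pages)
        // PDFDocument(data:) is failable and returns nil for invalid data.
        return PDFDocument(data: data)
    } catch {
        print("PDF creation failed: \(error)")
        return nil
    }
}
```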
Return Value
PDF Data object
-
This method must be called after performOCR(on:). Otherwise, calling this method will result in failure.
Declaration
Swift
public func recognizedBlocks(for level: ResultIteratorLevel) -> Result<[RecognizedBlock], Swift.Error>
Parameters
level: The level which corresponds to the granularity of the desired recognized block
Return Value
On success, an array of RecognizedBlocks in the coordinate space of the image.
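Handling the returned Result might look like the sketch below. The function name printRecognizedBlocks(from:at:) is hypothetical, and the instance passed in is assumed to have already performed OCR on an image. The level is taken as a parameter, and each block is printed whole, because the cases of ResultIteratorLevel and the properties of RecognizedBlock are not listed in this reference.

```swift
import SwiftyTesseract

/// Hypothetical helper: prints every block recognized at `level`.
/// `tesseract` is expected to have already run OCR on an image;
/// otherwise the failure branch is taken.
func printRecognizedBlocks(from tesseract: SwiftyTesseract, at level: ResultIteratorLevel) {
    switch tesseract.recognizedBlocks(for: level) {
    case .success(let blocks):
        for block in blocks {
            print(block)
        }
    case .failure(let error):
        print("Could not read recognized blocks: \(error)")
    }
}
```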