SwiftyTesseract
public class SwiftyTesseract
A class that performs optical character recognition with the open-source Tesseract library
-
Sets a String containing the only characters that will be recognized. This does not filter values. Example: setting a whiteList of “ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz” with an image containing digits may result in “1” being recognized as “I” and “2” being recognized as “Z”. Set this value only if it is 100% certain that only the defined characters will be present during recognition.
This may cause unpredictable recognition results if characters not defined in whiteList are present. If removal and not replacement is desired, filtering the recognition string is a better option.
Declaration
Swift
public var whiteList: String? { get set }
-
Sets a String of characters that will not be recognized. This does not filter values. Example: setting a blackList of “0123456789” with an image containing digits may result in “1” being recognized as “I” and “2” being recognized as “Z”. Set this value only if it is 100% certain that the defined characters will not be present during recognition.
This may cause unpredictable recognition results if characters defined in blackList are present. If removal and not replacement is desired, filtering the recognition string is a better option.
Declaration
Swift
public var blackList: String? { get set }
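The whiteList and blackList notes both recommend filtering the recognition string when characters should be removed rather than replaced. A minimal sketch of that post-processing in plain Swift follows. The helper names `removingCharacters(in:from:)` and `keepingCharacters(in:from:)` are hypothetical and not part of SwiftyTesseract, and `raw` stands in for a string returned by recognition.

```swift
import Foundation

/// Hypothetical helper (not part of SwiftyTesseract):
/// drops every character of `unwanted` from `recognized`.
func removingCharacters(in unwanted: String, from recognized: String) -> String {
    let blocked = Set(unwanted)
    return String(recognized.filter { !blocked.contains($0) })
}

/// Hypothetical helper (not part of SwiftyTesseract):
/// keeps only the characters of `recognized` that appear in `allowed`.
func keepingCharacters(in allowed: String, from recognized: String) -> String {
    let permitted = Set(allowed)
    return String(recognized.filter { permitted.contains($0) })
}

// Stand-in for a recognition result.
let raw = "Room 101, Floor 2"

let withoutDigits = removingCharacters(in: "0123456789", from: raw)
print(withoutDigits) // "Room , Floor "

let digitsOnly = keepingCharacters(in: "0123456789", from: raw)
print(digitsOnly) // "1012"
```

Because the filtering happens after recognition, unexpected characters are simply dropped instead of being substituted with look-alikes.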
-
Preserve multiple interword spaces
Declaration
Swift
public var preserveInterwordSpaces: Bool? { get set }
-
Minimum character height
Declaration
Swift
public var minimumCharacterHeight: Int? { get set }
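The recognition properties documented so far are settable vars, so configuring an instance might look like the sketch below. The values are illustrative assumptions only (12 for minimumCharacterHeight is arbitrary), and the instance is created with the `init(language:dataSource:engineMode:)` initializer documented further down.

```swift
import Foundation
import SwiftyTesseract

// Create an instance; the dataSource label is written out explicitly
// so the dataSource-based initializer is the one selected.
let tesseract = SwiftyTesseract(language: .english, dataSource: Bundle.main)

// Restrict recognition to digits. Per the whiteList notes, do this only
// when the input is known to contain nothing but these characters.
tesseract.whiteList = "0123456789"

// Keep runs of multiple spaces between words in the output.
tesseract.preserveInterwordSpaces = true

// Illustrative value, chosen arbitrarily for this sketch.
tesseract.minimumCharacterHeight = 12
```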
-
The current version of the underlying Tesseract library
Declaration
Swift
lazy public private(set) var version: String? { get set }
-
Creates an instance of SwiftyTesseract using standard RecognitionLanguages. The tessdata folder MUST be in your Xcode project as a folder reference (blue folder icon, not yellow) and be named “tessdata”.
Declaration
Swift
public convenience init( languages: [RecognitionLanguage], dataSource: LanguageModelDataSource = Bundle.main, engineMode: EngineMode = .lstmOnly )
Parameters
languages
Languages of the text to be recognized
dataSource
The LanguageModelDataSource that contains the tessdata folder - default is Bundle.main
engineMode
The tesseract engine mode - default is .lstmOnly
-
Convenience initializer for creating an instance of SwiftyTesseract with one language to avoid having to input an array with one value (e.g. [.english]) for the languages parameter
Declaration
Swift
public convenience init( language: RecognitionLanguage, dataSource: LanguageModelDataSource = Bundle.main, engineMode: EngineMode = .lstmOnly )
Parameters
language
The language of the text to be recognized
dataSource
The LanguageModelDataSource that contains the tessdata folder - default is Bundle.main
engineMode
The tesseract engine mode - default is .lstmOnly
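A short sketch of the two dataSource-based initializers, assuming a "tessdata" folder reference is present in the main bundle as required above. Only `.english` is used because it is the one RecognitionLanguage case named in this page.

```swift
import Foundation
import SwiftyTesseract

// Single-language form: avoids wrapping one value in an array.
// engineMode falls back to its default, .lstmOnly.
let single = SwiftyTesseract(language: .english, dataSource: Bundle.main)

// Array form with every parameter written out.
let explicit = SwiftyTesseract(
    languages: [.english],
    dataSource: Bundle.main,
    engineMode: .lstmOnly
)

// version is an optional String, so provide a fallback when printing.
print(single.version ?? "unknown Tesseract version")
```

Writing the `dataSource:` label explicitly also distinguishes these calls from the `bundle:`-labelled initializers listed below, which otherwise accept the same arguments.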
-
Declaration
Swift
public convenience init( language: RecognitionLanguage, bundle: Bundle = .main, engineMode: EngineMode = .lstmOnly )
-
Declaration
Swift
public convenience init( languages: [RecognitionLanguage], bundle: Bundle = .main, engineMode: EngineMode = .lstmOnly )
-
Declaration
Swift
public enum Error : Swift.Error
-
Takes a UIImage and passes the recognized UTF-8 text to the completion handler.
Declaration
Swift
@available(*, deprecated, message: "use performOCR(on:﹚ or performOCRPublisher(on:﹚") public func performOCR(on image: UIImage, completionHandler: (String?) -> ())
Parameters
image
The image to perform recognition on
completionHandler
The action to be performed on the recognized string
-
Creates a cold publisher that performs OCR on a provided image upon subscription
Declaration
Swift
@available(iOS 13.0, *) public func performOCRPublisher(on image: UIImage) -> AnyPublisher<String, Swift.Error>
Parameters
image
The image to perform recognition on
Return Value
A cold publisher that emits a single String on success or an Error on failure.
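Because the publisher is cold, no OCR work starts until something subscribes. The sketch below shows one way to subscribe with Combine. The function name `recognizeText(in:using:)` is hypothetical, and the queue choices are assumptions intended to keep recognition off the main thread.

```swift
import Combine
import Foundation
import SwiftyTesseract
import UIKit

/// Hypothetical wrapper (not part of SwiftyTesseract).
/// The caller must keep the returned AnyCancellable alive;
/// releasing it cancels the subscription.
@available(iOS 13.0, *)
func recognizeText(in image: UIImage, using tesseract: SwiftyTesseract) -> AnyCancellable {
    return tesseract.performOCRPublisher(on: image)
        // Subscribe on a background queue so the work triggered by
        // subscription does not run on the main thread.
        .subscribe(on: DispatchQueue.global(qos: .userInitiated))
        // Deliver the result on the main queue, e.g. for UI updates.
        .receive(on: DispatchQueue.main)
        .sink(
            receiveCompletion: { completion in
                if case .failure(let error) = completion {
                    print("OCR failed: \(error)")
                }
            },
            receiveValue: { text in
                print("Recognized text: \(text)")
            }
        )
}
```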
-
Takes an array of UIImages and returns the PDF as a Data object. If using PDFKit introduced in iOS 11, this will produce a valid PDF Document.
Throws
SwiftyTesseractError
Declaration
Swift
public func createPDF(from images: [UIImage]) throws -> Data
Parameters
images
Array of UIImages to perform OCR on
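A minimal sketch of calling createPDF(from:) and handing the resulting Data to PDFKit. The function name `makePDFDocument(from:using:)` is hypothetical, and returning nil on failure is a choice made for this example rather than library behavior.

```swift
import PDFKit
import SwiftyTesseract
import UIKit

/// Hypothetical helper (not part of SwiftyTesseract).
/// Returns nil if PDF creation throws or PDFKit cannot parse the data.
@available(iOS 11.0, *)
func makePDFDocument(from pages: [UIImage], using tesseract: SwiftyTesseract) -> PDFDocument? {
    do {
        // createPDF(from:) is a throwing call, so it needs try.
        let data = try tesseract.createPDF(from: pages)
        // PDFDocument(data:) is a failable initializer.
        return PDFDocument(data: data)
    } catch {
        print("PDF creation failed: \(error)")
        return nil
    }
}
```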
Return Value
PDF Data object
-
This method must be called after performOCR(on:). Otherwise calling this method will result in failure.
Declaration
Swift
public func recognizedBlocks(for level: ResultIteratorLevel) -> Result<[RecognizedBlock], Swift.Error>
Parameters
level
The level which corresponds to the granularity of the desired recognized block
Return Value
On success, an array of RecognizedBlocks in the coordinate space of the image.
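The required call order can be sketched as follows. Several names here are assumptions not confirmed by this page: that the non-deprecated performOCR(on:) returns a Result, that ResultIteratorLevel has a `.word` case, and that RecognizedBlock exposes `text` and `boundingBox`. The function name `printWords(in:using:)` is hypothetical.

```swift
import SwiftyTesseract
import UIKit

/// Hypothetical helper (not part of SwiftyTesseract).
func printWords(in image: UIImage, using tesseract: SwiftyTesseract) {
    // Recognition must run first; recognizedBlocks(for:) fails otherwise.
    // Assumption: performOCR(on:) returns a Result.
    guard case .success = tesseract.performOCR(on: image) else {
        print("OCR failed, no blocks available")
        return
    }

    // Assumption: .word is a valid ResultIteratorLevel.
    switch tesseract.recognizedBlocks(for: .word) {
    case .success(let blocks):
        for block in blocks {
            // Assumption: RecognizedBlock has `text` and `boundingBox`.
            // The bounding box is in the coordinate space of the image.
            print("\(block.text) at \(block.boundingBox)")
        }
    case .failure(let error):
        print("Could not read blocks: \(error)")
    }
}
```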