# compose-ios
s
Hi everyone, I'm working on implementing OCR functionality in a Kotlin Multiplatform Mobile (KMM) app, targeting iOS. I'm using the Vision framework's `VNRecognizeTextRequest` to recognize text from camera frames captured via `AVCaptureSession`. However, I'm having trouble accessing the recognized text from the `VNRecognizedText` objects returned by the Vision framework in Kotlin Native.

What I'm trying to achieve:
• Capture camera frames using `AVCaptureVideoDataOutputSampleBufferDelegateProtocol`.
• Process frames with `VNRecognizeTextRequest` to perform OCR.
• Extract the recognized text from the results to display or process further.

The issue: I can't seem to access the recognized text from the `VNRecognizedText` objects. Here's what I've tried:
1. Using the `.string` property: `val recognizedText = topCandidate?.string`
2. Using the `.toString()` method: `val recognizedText = topCandidate?.toString()`
3. Casting to `NSString`: `val recognizedText = topCandidate as? NSString`

Output: returns object instance references like `<VNRecognizedText: 0x303aabb60>`, which isn't helpful.

Here's the relevant portion of my code:
@OptIn(ExperimentalForeignApi::class)
private fun processSampleBuffer(didOutputSampleBuffer: CMSampleBufferRef) {
    val pixelBuffer = CMSampleBufferGetImageBuffer(didOutputSampleBuffer) ?: return

    val request = VNRecognizeTextRequest { vnRequest, error ->
        if (error != null) {
            println("Failed to perform OCR: ${error.localizedDescription}")
            return@VNRecognizeTextRequest
        }

        val results = vnRequest.results as? List<VNRecognizedTextObservation> ?: emptyList()
        if (results.isEmpty()) {
            println("No text observations found.")
            return@VNRecognizeTextRequest
        }

        val recognizedStrings = results.map { observation ->
            val topCandidate = observation.topCandidates(1u).firstOrNull()
           
            val recognizedText = topCandidate?.string  // Unresolved reference
            // Also tried topCandidate?.text, topCandidate?.description, topCandidate?.toString()
            recognizedText ?: ""
        }

        val joinedText = recognizedStrings.joinToString("\n")
        Napier.d{"Recognized Text: $joinedText"}
    }

    request.recognitionLevel = VNRequestTextRecognitionLevelAccurate

    val handler = VNImageRequestHandler(
        cVPixelBuffer = pixelBuffer,
        orientation = 1u,
        options = emptyMap<Any?, Any?>()
    )

    try {
        handler.performRequests(listOf(request), error = null)
    } catch (e: Exception) {
        Napier.d{"Failed to perform OCR: ${e.message}"}
    }
}
How can I access the recognized text from a `VNRecognizedText` object in Kotlin Native?
f
Hello, if the Kotlin way isn't working, try going through Swift: https://github.com/ttypic/swift-klib-plugin
Write your code in Swift (ObjC-compatible) and add it to your iosMain code.
Anyway, `VNRecognizedTextObservation.topCandidates(1u)` returns a list of `VNRecognizedText`: https://developer.apple.com/documentation/vision/vnrecognizedtextobservation/topcandidates(_:)?language=objc
val topCandidate = observation.topCandidates(1u).firstOrNull() as? VNRecognizedText
val recognizedText = topCandidate?.string // Unresolved reference
`recognizedText` shouldn't be null. Does `topCandidate?.string` throw an exception, or is it just null?
s
I got it working. I decided to use ObjC types instead of Kotlin ones:
val request = VNRecognizeTextRequest { vnRequest, error ->
    if (error != null) {
        Napier.d { "Failed to perform OCR: ${error.localizedDescription}" }
        scanningState.value = false
        isProcessingFrame = false
        return@VNRecognizeTextRequest
    }

    val resultsArray = vnRequest?.results as? NSArray

    if (resultsArray == null || resultsArray.count.toInt() == 0) {
        Napier.d { "No text observations found." }
        scanningState.value = false
        isProcessingFrame = false
        return@VNRecognizeTextRequest
    }

    val recognizedStrings = mutableListOf<String>()

    for (i in 0 until resultsArray.count.toInt()) {
        val observation =
            resultsArray.objectAtIndex(i.toULong()) as? VNRecognizedTextObservation
        if (observation != null) {
            val topCandidatesArray = observation.topCandidates(1u) as NSArray
            if (topCandidatesArray.count > 0uL) {
                val candidate = topCandidatesArray.objectAtIndex(0uL) as? NSObject

                val recognizedText = candidate?.valueForKey("string") as? String ?: ""

                recognizedStrings.add(recognizedText)
            }
        }
    }
    val joinedText = recognizedStrings.joinToString("\n")

    dispatch_async(dispatch_get_main_queue()) {
        capturedTexts.value = joinedText
        isProcessingFrame = false
        scanningState.value = false
    }
}
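For comparison, the typed cast suggested earlier in the thread could be wrapped in a small helper. This is an untested sketch, not a confirmed fix: `topCandidateText` is a hypothetical name, and it assumes the `platform.Vision` bindings expose `string` on `VNRecognizedText` once the list element has been cast (the thread never confirms whether that resolves).

```kotlin
import kotlinx.cinterop.ExperimentalForeignApi
import platform.Vision.VNRecognizedText
import platform.Vision.VNRecognizedTextObservation

// Hypothetical helper, untested on device.
// Assumption: topCandidates() comes back as an untyped list, so the
// element has to be cast before its `string` property can be read.
@OptIn(ExperimentalForeignApi::class)
fun topCandidateText(observation: VNRecognizedTextObservation): String? {
    val candidate = observation.topCandidates(1u).firstOrNull() as? VNRecognizedText
    return candidate?.string
}
```

If the property does not resolve in a given Kotlin version, the `valueForKey("string")` lookup shown above remains the approach reported to work in this thread.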
f
Oh, great. I guess it means the Kotlin cinterop has failed in this case.
f
Sorry, I got confused here, @Solomon Opoku. You wrote:
"i decided to use objc types instead of kotlin"
Does that mean there are two ways to access iOS Objective-C types? 🤔
f
Yes, there are. And there are more 😄.
It's a kind of introspection: accessing an attribute of a class by its name.
s
@Fernando Yeah, you can cast the values directly to ObjC types if you know what they would return in Swift, and then work with them. You can refer to the code sample I shared as an example.
f
Wow! Got you. Thanks Solomon, thanks François.