feat: support react-native-vision-camera v3 #32

Open
wants to merge 58 commits into
base: main
Commits
568b366
feat: update ios side to v3 of camera vision
Oct 7, 2023
f5aee7c
feat: update android side to camera vision v3
Oct 9, 2023
11eead5
feat: update react-native side
Oct 9, 2023
3e52622
feat: update the WritableNativeMap to HashMap
Oct 9, 2023
e887b20
feat: bump dependencies
Oct 10, 2023
15f4baf
chore: change word's preview
Oct 10, 2023
2454a78
chore: move projects
Oct 10, 2023
60c1e50
Bump version
Oct 10, 2023
1309557
fix: @types/react-native
Oct 10, 2023
5bfd212
chore: update readme
Oct 10, 2023
afbba74
chore: type scanOCR return
Oct 10, 2023
630972c
refactor: use createRunInJsFn
Oct 11, 2023
94dc64b
chore: unassign team
Oct 11, 2023
72c3289
feat: change package name
ismaelsousa Oct 25, 2023
1ddeff7
fix: put version back
ismaelsousa Oct 25, 2023
7a4af0f
fix: pkg
ismaelsousa Oct 25, 2023
887b193
fix: visibility for public
ismaelsousa Oct 25, 2023
d32cfcc
feat: add access
ismaelsousa Oct 25, 2023
f21364c
fix: put my nickname
ismaelsousa Oct 25, 2023
b13e600
Release 2.0.0
ismaelsousa Oct 25, 2023
eb79ab0
fix: include VisionCameraOcr.podspec file to npm package
ismaelsousa Nov 7, 2023
fed2baf
Release 2.0.1
ismaelsousa Nov 7, 2023
907b80f
chore: bump vision camera to 3.6.4
ismaelsousa Nov 7, 2023
d6896dd
fix: add deps to external package.json
ismaelsousa Nov 7, 2023
376576e
Release 2.1.0
ismaelsousa Nov 7, 2023
fc79a91
fix: FrameProcessorPlugin constructor
ismaelsousa Nov 7, 2023
7edddbd
Release 2.1.1
ismaelsousa Nov 7, 2023
5211de9
fix: registration module
ismaelsousa Dec 22, 2023
6f2544e
feat: add symbols
ismaelsousa Dec 22, 2023
9571748
Release 2.1.2-0
ismaelsousa Dec 22, 2023
c2ce46a
Release 2.1.2
ismaelsousa Dec 22, 2023
fbd96b6
feat: add type to symbols
ismaelsousa Jan 3, 2024
7df46f5
Merge branch 'v2' into feat/get-symbols-from-element
ismaelsousa Jan 3, 2024
607dcb4
Merge pull request #1 from ismaelsousa/feat/get-symbols-from-element
ismaelsousa Jan 3, 2024
9efcce6
Release 2.2.0
ismaelsousa Jan 3, 2024
bad0120
type fix
kevinranks Jan 11, 2024
c83c5b9
Merge pull request #1 from joincarbon/kevinranks-patch-1
kevinranks Jan 11, 2024
968459b
Merge pull request #2 from joincarbon/v2
ismaelsousa Jan 12, 2024
86d6ca0
bump vision camera to 3.7.1
ismaelsousa Jan 12, 2024
e8ac0dc
Release 2.2.1
ismaelsousa Jan 12, 2024
7ec9124
iOS - Fix incompatible block type and 'init' method unavailability in…
7emretelli Jan 21, 2024
56091b2
Merge pull request #3 from 7emretelli/vision-camera-3.7.1-ios
ismaelsousa Jan 21, 2024
b685849
Update VisionCameraOcr to version 2.2.1
ismaelsousa Jan 21, 2024
c333869
Release 2.2.2-0
ismaelsousa Jan 21, 2024
0223e42
fix: android registration plugin for vision camera v3.8
danieloprado Jan 22, 2024
54aab6c
Merge pull request #4 from danieloprado/fix-android-vision-camera-3.8
ismaelsousa Jan 22, 2024
d1eee93
Update VisionCameraOcr to version 2.2.2-0
ismaelsousa Jan 22, 2024
c3cabc9
Release 2.2.2
ismaelsousa Jan 22, 2024
75a673a
Update VisionCameraOcrPackage.kt
danieloprado Jan 24, 2024
53f6a58
Merge pull request #5 from danieloprado/fix-android-vision-camera-3.8
ismaelsousa Jan 30, 2024
efda84c
Release 2.2.3
ismaelsousa Jan 31, 2024
81bf21c
Fix constructor argument in OCRFrameProcessorPlugin
ismaelsousa Feb 22, 2024
97fd1f8
Refactor OCRFrameProcessorPlugin initialization
ismaelsousa Feb 22, 2024
8f8f653
Release 2.2.4
ismaelsousa Feb 22, 2024
ab66fd1
Update vision camera and remove unused code
ismaelsousa Feb 25, 2024
50ecd0c
Release 2.3.0
ismaelsousa Feb 25, 2024
8be34e9
Refactor OCRFrameProcessorPlugin initialization
ismaelsousa Feb 25, 2024
a57b3cb
Release 2.3.1
ismaelsousa Feb 25, 2024
20 changes: 5 additions & 15 deletions README.md
@@ -1,11 +1,10 @@
<div align="right">
<img align="right" src="docs/demo.gif">
</div>

# vision-camera-ocr

A [VisionCamera](https://github.com/mrousavy/react-native-vision-camera) Frame Processor Plugin to preform text detection on images using [**MLKit Vision** Text Recognition](https://developers.google.com/ml-kit/vision/text-recognition).

<img style='width:200px;' src="docs/demo.gif">

## Installation

```sh
@@ -17,14 +16,7 @@ Add the plugin to your `babel.config.js`:

```js
module.exports = {
plugins: [
[
'react-native-reanimated/plugin',
{
globals: ['__scanOCR'],
},
],

plugins: [['react-native-worklets-core/plugin']],
// ...
```

@@ -33,7 +25,7 @@ module.exports = {
## Usage

```js
import { labelImage } from "vision-camera-image-labeler";
import {scanOCR} from 'vision-camera-ocr';

// ...
const frameProcessor = useFrameProcessor((frame) => {
@@ -56,7 +48,7 @@ const frameProcessor = useFrameProcessor((frame) => {
```

The text object closely resembles the object documented in the MLKit documents.
https://developers.google.com/ml-kit/vision/text-recognition#text_structure
<https://developers.google.com/ml-kit/vision/text-recognition#text_structure>

```
The Text Recognizer segments text into blocks, lines, and elements. Roughly speaking:
@@ -68,8 +60,6 @@ a Line is a contiguous set of words on the same axis, and
an Element is a contiguous set of alphanumeric characters ("word") on the same axis in most Latin languages, or a character in others
```



## Contributing

See the [contributing guide](CONTRIBUTING.md) to learn how to contribute to the repository and the development workflow.
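
The README hunk above describes the result as MLKit's block, line, element hierarchy. As a minimal sketch of walking that hierarchy on the JavaScript side, assuming the field names match the map keys built in `OCRFrameProcessorPlugin.kt` in this PR: the type names (`OcrElement`, `OcrLine`, `OcrBlock`, `OcrResult`) and the helper `collectWords` are hypothetical and are not taken from the package's published typings.

```typescript
// Hypothetical types; field names follow the HashMap keys in OCRFrameProcessorPlugin.kt.
interface OcrElement {
  text: string;
}

interface OcrLine {
  text: string;
  elements: OcrElement[];
}

interface OcrBlock {
  text: string;
  lines: OcrLine[];
}

interface OcrResult {
  text: string;
  blocks: OcrBlock[];
}

// Walk blocks -> lines -> elements and collect each element's text in order.
function collectWords(result: OcrResult): string[] {
  const words: string[] = [];
  for (const block of result.blocks) {
    for (const line of block.lines) {
      for (const element of line.elements) {
        words.push(element.text);
      }
    }
  }
  return words;
}

// Hand-written sample data for illustration, not real recognizer output.
const sample: OcrResult = {
  text: "Hello world\nagain",
  blocks: [
    {
      text: "Hello world\nagain",
      lines: [
        { text: "Hello world", elements: [{ text: "Hello" }, { text: "world" }] },
        { text: "again", elements: [{ text: "again" }] },
      ],
    },
  ],
};

const words = collectWords(sample);
```

The other keys the plugin emits (`cornerPoints`, `frame`, `boundingBox`, `recognizedLanguages`) are left out of these types to keep the sketch short.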
4 changes: 2 additions & 2 deletions vision-camera-ocr.podspec → VisionCameraOcr.podspec
@@ -3,7 +3,7 @@ require "json"
package = JSON.parse(File.read(File.join(__dir__, "package.json")))

Pod::Spec.new do |s|
s.name = "vision-camera-ocr"
s.name = "VisionCameraOcr"
s.version = package["version"]
s.summary = package["description"]
s.homepage = package["homepage"]
@@ -16,5 +16,5 @@ Pod::Spec.new do |s|
s.source_files = "ios/**/*.{h,m,mm,swift}"

s.dependency "React-Core"
s.dependency "GoogleMLKit/TextRecognition", "2.2.0"
s.dependency "GoogleMLKit/TextRecognition", "3.1.0"
end
14 changes: 9 additions & 5 deletions android/build.gradle
@@ -3,12 +3,16 @@ buildscript {
def kotlin_version = rootProject.ext.has('kotlinVersion') ? rootProject.ext.get('kotlinVersion') : project.properties['VisionCameraOcr_kotlinVersion']

repositories {
maven {
url "https://plugins.gradle.org/m2/"
}
google()
mavenCentral()

}

dependencies {
classpath 'com.android.tools.build:gradle:3.2.1'
classpath "com.android.tools.build:gradle:8.0.1"
// noinspection DifferentKotlinGradleVersion
classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
}
@@ -45,8 +49,8 @@ android {
disable 'GradleCompatible'
}
compileOptions {
sourceCompatibility JavaVersion.VERSION_1_8
targetCompatibility JavaVersion.VERSION_1_8
sourceCompatibility JavaVersion.VERSION_11
targetCompatibility JavaVersion.VERSION_11
}
}

@@ -129,6 +133,6 @@ dependencies {
implementation "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version"

implementation project(':react-native-vision-camera')
implementation 'com.google.android.gms:play-services-mlkit-text-recognition:18.0.0'
implementation "androidx.camera:camera-core:1.1.0-alpha08"
implementation "com.google.mlkit:text-recognition:16.0.0-beta4"
implementation "androidx.camera:camera-core:1.1.0"
}
4 changes: 2 additions & 2 deletions android/gradle.properties
@@ -1,5 +1,5 @@
VisionCameraOcr_kotlinVersion=1.5.30
VisionCameraOcr_kotlinVersion=1.7.20
VisionCameraOcr_compileSdkVersion=31
VisionCameraOcr_buildToolsVersion=30.0.0
VisionCameraOcr_buildToolsVersion=33.0.0
VisionCameraOcr_targetSdkVersion=31
VisionCameraOcr_ndkVersion=21.4.7075529
263 changes: 139 additions & 124 deletions android/src/main/java/com/visioncameraocr/OCRFrameProcessorPlugin.kt
@@ -1,124 +1,139 @@
package com.visioncameraocr

import android.annotation.SuppressLint
import android.graphics.Point
import android.graphics.Rect
import android.media.Image
import androidx.camera.core.ImageProxy
import com.facebook.react.bridge.WritableNativeArray
import com.facebook.react.bridge.WritableNativeMap
import com.google.android.gms.tasks.Task
import com.google.android.gms.tasks.Tasks
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.Text
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions
import com.mrousavy.camera.frameprocessor.FrameProcessorPlugin

class OCRFrameProcessorPlugin: FrameProcessorPlugin("scanOCR") {

private fun getBlockArray(blocks: MutableList<Text.TextBlock>): WritableNativeArray {
val blockArray = WritableNativeArray()

for (block in blocks) {
val blockMap = WritableNativeMap()

blockMap.putString("text", block.text)
blockMap.putArray("recognizedLanguages", getRecognizedLanguages(block.recognizedLanguage))
blockMap.putArray("cornerPoints", block.cornerPoints?.let { getCornerPoints(it) })
blockMap.putMap("frame", getFrame(block.boundingBox))
blockMap.putArray("lines", getLineArray(block.lines))

blockArray.pushMap(blockMap)
}
return blockArray
}

private fun getLineArray(lines: MutableList<Text.Line>): WritableNativeArray {
val lineArray = WritableNativeArray()

for (line in lines) {
val lineMap = WritableNativeMap()

lineMap.putString("text", line.text)
lineMap.putArray("recognizedLanguages", getRecognizedLanguages(line.recognizedLanguage))
lineMap.putArray("cornerPoints", line.cornerPoints?.let { getCornerPoints(it) })
lineMap.putMap("frame", getFrame(line.boundingBox))
lineMap.putArray("elements", getElementArray(line.elements))

lineArray.pushMap(lineMap)
}
return lineArray
}

private fun getElementArray(elements: MutableList<Text.Element>): WritableNativeArray {
val elementArray = WritableNativeArray()

for (element in elements) {
val elementMap = WritableNativeMap()

elementMap.putString("text", element.text)
elementMap.putArray("cornerPoints", element.cornerPoints?.let { getCornerPoints(it) })
elementMap.putMap("frame", getFrame(element.boundingBox))
}
return elementArray
}

private fun getRecognizedLanguages(recognizedLanguage: String): WritableNativeArray {
val recognizedLanguages = WritableNativeArray()
recognizedLanguages.pushString(recognizedLanguage)
return recognizedLanguages
}

private fun getCornerPoints(points: Array<Point>): WritableNativeArray {
val cornerPoints = WritableNativeArray()

for (point in points) {
val pointMap = WritableNativeMap()
pointMap.putInt("x", point.x)
pointMap.putInt("y", point.y)
cornerPoints.pushMap(pointMap)
}
return cornerPoints
}

private fun getFrame(boundingBox: Rect?): WritableNativeMap {
val frame = WritableNativeMap()

if (boundingBox != null) {
frame.putDouble("x", boundingBox.exactCenterX().toDouble())
frame.putDouble("y", boundingBox.exactCenterY().toDouble())
frame.putInt("width", boundingBox.width())
frame.putInt("height", boundingBox.height())
frame.putInt("boundingCenterX", boundingBox.centerX())
frame.putInt("boundingCenterY", boundingBox.centerY())
}
return frame
}

override fun callback(frame: ImageProxy, params: Array<Any>): Any? {

val result = WritableNativeMap()

val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

@SuppressLint("UnsafeOptInUsageError")
val mediaImage: Image? = frame.getImage()

if (mediaImage != null) {
val image = InputImage.fromMediaImage(mediaImage, frame.imageInfo.rotationDegrees)
val task: Task<Text> = recognizer.process(image)
try {
val text: Text = Tasks.await<Text>(task)
result.putString("text", text.text)
result.putArray("blocks", getBlockArray(text.textBlocks))
} catch (e: Exception) {
return null
}
}

val data = WritableNativeMap()
data.putMap("result", result)
return data
}
}
package com.visioncameraocr

import android.annotation.SuppressLint
import android.graphics.Point
import android.graphics.Rect
import android.media.Image
import com.google.android.gms.tasks.Task
import com.google.android.gms.tasks.Tasks
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.Text
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions


import com.mrousavy.camera.frameprocessor.Frame
import com.mrousavy.camera.frameprocessor.FrameProcessorPlugin
import com.mrousavy.camera.types.Orientation

class OCRFrameProcessorPlugin(options: MutableMap<String, Any>?) : FrameProcessorPlugin(options) {

private fun getBlockArray(blocks: MutableList<Text.TextBlock>): List<HashMap<String, Any?>> {
val blockArray = mutableListOf<HashMap<String, Any?>>()

for (block in blocks) {
val blockMap = HashMap<String, Any?>()

blockMap["text"] = block.text
blockMap["recognizedLanguages"] = getRecognizedLanguages(block.recognizedLanguage)
blockMap["cornerPoints"] = block.cornerPoints?.let { getCornerPoints(it) }
blockMap["frame"] = block.boundingBox?.let { getFrame(it) }
blockMap["boundingBox"] = block.boundingBox?.let { getBoundingBox(it) }
blockMap["lines"] = getLineArray(block.lines)

blockArray.add(blockMap)
}
return blockArray
}

private fun getLineArray(lines: MutableList<Text.Line>): List<HashMap<String, Any?>> {
val lineArray = mutableListOf<HashMap<String, Any?>>()

for (line in lines) {
val lineMap = hashMapOf<String, Any?>()

lineMap["text"] = line.text
lineMap["recognizedLanguages"] = getRecognizedLanguages(line.recognizedLanguage)
lineMap["cornerPoints"] = line.cornerPoints?.let { getCornerPoints(it) }
lineMap["frame"] = line.boundingBox?.let { getFrame(it) }
lineMap["boundingBox"] = line.boundingBox?.let { getBoundingBox(it) }
lineMap["elements"] = getElementArray(line.elements)

lineArray.add(lineMap)
}
return lineArray
}

private fun getElementArray(elements: MutableList<Text.Element>): List<HashMap<String, Any?>> {
val elementArray = mutableListOf<HashMap<String, Any?>>()

for (element in elements) {
val elementMap = hashMapOf<String, Any?>()

elementMap["text"] = element.text
elementMap["cornerPoints"] = element.cornerPoints?.let { getCornerPoints(it) }
elementMap["frame"] = element.boundingBox?.let { getFrame(it) }
elementMap["boundingBox"] = element.boundingBox?.let { getBoundingBox(it) }
elementArray.add(elementMap)

}
return elementArray
}

private fun getRecognizedLanguages(recognizedLanguage: String): List<String> {
return listOf(recognizedLanguage)
}

private fun getCornerPoints(points: Array<Point>): List<HashMap<String, Int>> {
val cornerPoints = mutableListOf<HashMap<String, Int>>()

for (point in points) {
val pointMap = hashMapOf<String, Int>()
pointMap["x"] = point.x
pointMap["y"] = point.y
cornerPoints.add(pointMap)
}
return cornerPoints
}

private fun getFrame(boundingBox: Rect?): HashMap<String, Any> {
val frame = hashMapOf<String, Any>()

if (boundingBox != null) {
frame["x"] = boundingBox.exactCenterX().toDouble()
frame["y"] = boundingBox.exactCenterY().toDouble()
frame["width"] = boundingBox.width()
frame["height"] = boundingBox.height()
frame["boundingCenterX"] = boundingBox.centerX()
frame["boundingCenterY"] = boundingBox.centerY()
}
return frame
}

private fun getBoundingBox(boundingBox: Rect?): HashMap<String, Any> {
val box = hashMapOf<String,Any>()

if (boundingBox != null) {
box["left"] = boundingBox.left
box["top"] = boundingBox.top
box["right"] = boundingBox.right
box["bottom"] = boundingBox.bottom
}

return box
}

override fun callback(frame: Frame, params: Map<String, Any>?): Any? {
val result = hashMapOf<String, Any>()

val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

@SuppressLint("UnsafeOptInUsageError")
val mediaImage: Image? = frame.image
val orientation = Orientation.fromUnionValue(frame.orientation)

if (mediaImage != null && orientation!= null) {
val image = InputImage.fromMediaImage(mediaImage, orientation.toDegrees())
val task: Task<Text> = recognizer.process(image)
try {
val text: Text = Tasks.await(task)
result["text"] = text.text
result["blocks"] = getBlockArray(text.textBlocks)
} catch (e: Exception) {
return null
}
}

return hashMapOf("result" to result)
}
}
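
In the new Kotlin code, `frame.x` and `frame.y` hold the centre of the bounding box (`exactCenterX()` / `exactCenterY()`), not its top-left corner, while the added `boundingBox` map carries the raw `left`, `top`, `right`, `bottom` edges. The sketch below shows how the two relate, assuming integer pixel coordinates as in `android.graphics.Rect`. The type names and `boundingBoxToFrame` are hypothetical helpers, not part of this PR, and the iOS side is not shown here, so this only reflects the Android maps.

```typescript
// Hypothetical types mirroring the "boundingBox" and "frame" maps built on the Android side.
interface BoundingBox {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

interface Frame {
  x: number; // exact (fractional) centre, like Rect.exactCenterX()
  y: number; // exact (fractional) centre, like Rect.exactCenterY()
  width: number;
  height: number;
  boundingCenterX: number; // integer centre, like Rect.centerX()
  boundingCenterY: number; // integer centre, like Rect.centerY()
}

// Derive the "frame" values from the raw edges, following android.graphics.Rect semantics.
function boundingBoxToFrame(box: BoundingBox): Frame {
  return {
    x: (box.left + box.right) / 2,
    y: (box.top + box.bottom) / 2,
    width: box.right - box.left,
    height: box.bottom - box.top,
    // Rect.centerX()/centerY() use an integer shift, so odd sums round down.
    boundingCenterX: (box.left + box.right) >> 1,
    boundingCenterY: (box.top + box.bottom) >> 1,
  };
}

const exampleFrame = boundingBoxToFrame({ left: 10, top: 20, right: 45, bottom: 60 });
```

When drawing an overlay from `frame`, the top-left corner would be `x - width / 2`, `y - height / 2`; using `boundingBox.left` and `boundingBox.top` avoids that subtraction.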