Hi i am attempting to create an android application that use kotlinlang #android

Tower Guidev2

08/21/2023, 2:47 PM

Hi, I am attempting to create an Android application that uses ML Kit to extract text from the camera and detect article and book identifiers using regular expressions. The identifiers I am interested in are DOI, ISBN, arXiv ID, ISSN, etc. I managed to get DOI detection working fine using this code:

Tower Guidev2

08/21/2023, 2:47 PM

private const val doiRegex = "10.\\d{4,9}/[-._;()/:A-Z0-9]+" +
    "|10.1002/[^\\s]+" +
    "|10.\\d{4,9}/[-._;()/:A-Z0-9]+\$" +
    "|10.\\d{4}/\\d+-\\d+X?(\\d+)\\d+<[\\d\\w]+:[\\d\\w]*>\\d+.\\d+.\\w+;\\d" +
    "|10.1021/\\w\\w\\d++" +
    "|10.1207/[\\w\\d]+\\&\\d+_\\d+"

private val digitalObjectIdentifierPattern: Pattern = Pattern.compile(doiRegex, Pattern.CASE_INSENSITIVE)

Tower Guidev2

08/21/2023, 2:49 PM

textRecognizer.process(inputImage)
    .addOnSuccessListener { visionText: Text ->
        val detectedText: String = visionText.text
        if (detectedText.isNotBlank()) {
            val digitalObjectIdentifierMatcher: Matcher = digitalObjectIdentifierPattern.matcher(detectedText)
            when {
                digitalObjectIdentifierMatcher.find() -> {
                    val doi = digitalObjectIdentifierMatcher.group()
                    if (doi.length > 0L) onDetectedTextUpdated(doi)
                }
            }
        }
    }
    .addOnCompleteListener { continuation.resume(Unit) }

Tower Guidev2

08/21/2023, 2:50 PM

However, I cannot get ISBN detection to work using this regex and a similar approach:

Tower Guidev2

08/21/2023, 2:51 PM

private const val isbnRegex = "^(?:ISBN(?:-1[03])?:? )?(?=[0-9X]{10}$|(?=(?:[0-9]+[- ]){3})[- 0-9X]{13}$|97[89][0-9]{10}$|(?=(?:[0-9]+[- ]){4})[- 0-9]{17}$)(?:97[89][- ]?)?[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9X]$"

private val isbnPattern: Pattern = Pattern.compile(isbnRegex, Pattern.CASE_INSENSITIVE)

Tower Guidev2

08/21/2023, 2:55 PM

The visionText.text can be multiline, containing newline characters. The regex very rarely matches anything, and when it does work it misses the trailing digits. What I would like to achieve is to detect the ISBNs with the ISBN-1X(:) prefix but extract only the actual ISBN "number". Is this possible in one step? What is wrong with my approach?
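One likely contributor to the rare matches is the use of `^` and `$`: in `java.util.regex`, these anchors match only at the start and end of the whole input unless `Pattern.MULTILINE` is set, so an ISBN in the middle of multiline OCR text cannot satisfy them. The following is a minimal sketch of that behaviour, not a diagnosis of the full pattern above. The pattern `anchoredIsbn13` and the helper `findsIsbn` are hypothetical names introduced for illustration, and the pattern is deliberately simplified to a bare 13-digit ISBN on its own line.

```kotlin
import java.util.regex.Pattern

// Hypothetical, simplified pattern (not the one from the thread):
// a bare 13-digit ISBN that must span from ^ to $.
const val anchoredIsbn13 = "^97[89][0-9]{10}$"

/**
 * Returns true if the simplified pattern is found anywhere in [text]
 * when compiled with the given java.util.regex [flags].
 */
fun findsIsbn(text: String, flags: Int): Boolean =
    Pattern.compile(anchoredIsbn13, flags).matcher(text).find()
```

For the input `"Some Title\n9780306406157\nPublisher"`, `findsIsbn(text, 0)` returns false, while `findsIsbn(text, Pattern.MULTILINE)` returns true, because with that flag `^` and `$` also match at line boundaries.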

Chrimaeon

08/21/2023, 4:06 PM

Your question is still not Kotlin related but more of a “how do regular expressions work” question. A small hint to get the “number”: look for “capturing groups in regular expressions”.
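As an illustration of the capturing-group hint, here is a minimal sketch that matches the prefix but returns only the number via `Matcher.group(1)`. The pattern is a simplified assumption, not the regex from the thread: it requires an "ISBN", "ISBN-10" or "ISBN-13" prefix, accepts single hyphens or spaces as separators, and does not validate the check digit. The names `prefixedIsbnPattern` and `extractIsbn` are hypothetical.

```kotlin
import java.util.regex.Pattern

// Hypothetical simplified pattern: the prefix sits outside the parentheses,
// so only the digits (with optional separators) land in capturing group 1.
val prefixedIsbnPattern: Pattern = Pattern.compile(
    "\\bISBN(?:-1[03])?:?\\s*((?:97[89][- ]?)?(?:[0-9][- ]?){9}[0-9X])\\b",
    Pattern.CASE_INSENSITIVE
)

/**
 * Returns the first ISBN found in [text] without its prefix,
 * or null if no prefixed ISBN is present.
 */
fun extractIsbn(text: String): String? {
    val matcher = prefixedIsbnPattern.matcher(text)
    // group(1) is the parenthesised part only; group() would include the prefix.
    return if (matcher.find()) matcher.group(1) else null
}
```

Because the pattern uses `\b` rather than `^` and `$`, it can match inside multiline text: `extractIsbn("Title\nISBN-13: 978-0-306-40615-7\nmore")` returns `"978-0-306-40615-7"`. Separators could be stripped afterwards if only the digits are wanted.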
