Anyone know a faster and more efficient way to che...
# codereview
t
Anyone know a faster and more efficient way to check if a given word is a shuffled version of another given word? Here's my current code:
Copy code
fun String.checkIfShuffledVersionOfAns(ans: String): Boolean {
    val listOfChars = this.toCharArray()
    val shuffledListOfStrings = mutableListOf<String>()

    var iterations = 0

    while (iterations < 100000) {
        listOfChars.shuffle()
        if (!shuffledListOfStrings.distinct().contains(listOfChars.concatToString())) {
            shuffledListOfStrings.add(listOfChars.concatToString())
        }
        iterations++
    }

    for (string in shuffledListOfStrings.distinct()) {
        if (string == this) return true
    }

    return false
}
e
Copy code
sorted() == ans.sorted()
🤯 1
5
p
Lol, and here I was thinking my solution was slick
Copy code
fun String.checkIfShuffledVersionOfAns2(ans: String): Boolean {
    return this == ans || this.groupingBy { it }.eachCount() == ans.groupingBy { it }.eachCount()
}
e
well, it's the anagram finder that everybody should have learned in CS 101, I can't really take credit for the idea. https://rosettacode.org/wiki/Anagrams
t
@ephemient what I was thinking of doing was first to check whether the two strings are equivalent in size, and if they are, I would create a dictionary with the letters from A-Z and checking whether the two strings have the exact values for each letter. Great solution though! Much simpler than what I was intending on doing.
p
that's essentially what my code does -
groupingBy
returns a
Group
object, and
eachCount
gives you a
Map<Char, Int>
e
kotlin.collections.sorted isn't an extension on String, but it is on iterable containers including CharArray
t
k thanks
e
for reference,
Copy code
toCharArray().apply { sort() } contentEquals ans.toCharArray().apply { sort() }
would be more efficient as it doesn't need to construct
List<Char>
but honestly that's a super minor compared to the improvement from original bogosort-like implementation
😂 2
m
To anyone who feels baffled by how simple solutions to such problems can be I can only strongly recommend to participate in https://adventofcode.com. 🙂
c
i wonder how much faster a version that does not sort the whole string would be
t
And if you ARE going to participate in Advent of Code, there’s an #advent-of-code room here where some of us get together and discuss solutions each day.
e
if you know the inputs only have a small number of symbols, you can use radix sort instead of comparison sort
or as a super simplified example, if you can assume that only the letters a-z are used, and strings are never longer than 255, then
Copy code
val counts = ByteArray(26)
for (char in this) counts[char - 'a']++
for (char in ans) count[char - 'a']--
return counts.all { it == 0 }
would do the job with less overhead than either the sorting or counting implementations in many cases
1
it really depends on the shape of the input, whether different trade offs make sense
c
for a general purpose solution that is faster than sorting you would need a sorting algorithm that sorts both strings at the same time and can detect that they differ without finishing the sorting. I really don’t know enough about sorting algorithms to know what sorting algorithm would work best for that
hmm ok your counting solution is probably hard to beat
ok thinking of it sorting is a very cpu expensive way of counting characters in a string so it may be the solution that needs the least custom code, but its never the simplest solution nor the fastest solution
t
This is why I love threads where @ephemient is in them. I always learn something. 🙂
j
I expect you might shave off a smidge by splurging on a bigger byte array so you don't need to subtract 'a' every time 😁
c
or use a map to make it work for as many chars as you want
e
my point was that ByteArray, or some other packed data structure, will work efficiently if you have a compact set of symbols. of course Map<Char, Int> works, but the constant factors will make it worse than in-place sorting for "word"-sized inputs
c
I don’t think thats true. sorting is more work than counting chars
but sure it will probably depend on your map implementation. but with a primitives map like trove my intuition is its faster than sorting
e
benchmark:
results on my machine:
Copy code
main summary:
Benchmark          Mode  Cnt   Score   Error  Units
Bench.arrayCount  thrpt    5  19.111 ± 0.452  ops/s
Bench.mapCount    thrpt    5   1.300 ± 0.006  ops/s
Bench.sorting     thrpt    5   9.599 ± 0.056  ops/s
Bench.troveCount  thrpt    5   4.074 ± 0.018  ops/s
it will depend on the data, but at least for word-like text (relatively short) it's clear that sorting wins over map-baseed solutions, even primitive maps, although counting with a fixed small alphabet is faster yet. full project attached for reproducibility
c
pretty cool. you are right! I added an optimized version of the trove version but its still slower:
Copy code
object TroveCountMutableInt {
    fun String.isPermutationOf(other: String): Boolean {
        val map = MutableMap()
        for (c in this) map.add(c, 1)
        for (c in other) map.add(c, -1)
        val iter = map.valueCollection().iterator()
        while (iter.hasNext()) {
            if (iter.next() != 0)
                return false
        }
        return true
    }
}

class MutableMap : TCharIntHashMap() {
    fun add(key: Char, value: Int) {
        var idx = insertKey(key)

        if (idx < 0) {
            idx = -idx - 1
            _values[idx] += value
        } else {
            _values[idx] = value
            postInsertHook(consumeFreeSlot)
        }
    }
}