Would someone be able to explain this code line by line for kotlinlang #announcements

Dominick

02/26/2021, 7:20 AM

Would someone be able to explain this code line by line (for the first two lines, the findAll line and the return lines) for me?

private const val WORD_PATTERN = """[A-Za-z][A-Za-z']*"""
private val WORD_REGEX = Regex(WORD_PATTERN)

fun top3(s: String): List<String> {
    val words = WORD_REGEX.findAll(s).map{ it.groupValues[0].toLowerCase() }
    val occurrences = mutableMapOf<String, Int>()
    for (word in words) {
        occurrences[word] = (occurrences[word] ?: 0) + 1
    }
    return occurrences.toList()
        .sortedByDescending{ it.second }
        .map{ it.first }
        .take(3)
}

Zach Klippenstein (he/him) [MOD]

02/26/2021, 7:48 AM

This probably isn’t how I’d write this – some of the choices obscure (what I believe to be) the intent unnecessarily. This snippet calculates the frequency of each word in a string and returns the three most repeated words.

private const val WORD_PATTERN = """[A-Za-z][A-Za-z']*"""
private val WORD_REGEX = Regex(WORD_PATTERN)

This uses a triple-quote string to quote a regular expression. Triple quotes are usually used for strings that include a lot of backslashes, double quotes, or newlines since they don’t need to be escaped. A more idiomatic way to write this in Kotlin would be to simply call `WORD_PATTERN.toRegex()`.
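A minimal sketch of both points, assuming a made-up decimal-number pattern (the constant names `RAW_PATTERN` and `ESCAPED_PATTERN` are hypothetical and not part of the original snippet). It shows what the triple quotes save you, and that `Regex(...)` and `toRegex()` build equivalent regexes:

```kotlin
// Hypothetical constants, chosen only for this illustration.
const val RAW_PATTERN = """\d+\.\d+"""      // raw string: backslashes are literal
const val ESCAPED_PATTERN = "\\d+\\.\\d+"   // regular string: each backslash is doubled

fun main() {
    // Both literals denote the same pattern text.
    println(RAW_PATTERN == ESCAPED_PATTERN)                  // true

    // Regex(...) and String.toRegex() produce equivalent regexes.
    val viaConstructor = Regex(RAW_PATTERN)
    val viaExtension = RAW_PATTERN.toRegex()
    println(viaConstructor.pattern == viaExtension.pattern)  // true
    println(viaExtension.matches("3.14"))                    // true
}
```

The word pattern in the question contains no backslashes or double quotes, so a plain `"[A-Za-z][A-Za-z']*"` would work there too.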

val words = WORD_REGEX.findAll(s).map{ it.groupValues[0].toLowerCase() }

`findAll` returns a lazy `Sequence` of `MatchResult`s, one for each successive match of the regular expression in the string. A single regex match can contain multiple “groups”. If the regex uses parentheses, each parenthesized part will be a group (e.g. in `(a)b`, the `(a)` will form a group). However, every match has at least one group: the entire match itself. So `groupValues[0]` just means the whole matched substring. It’s converting it to lowercase so that the same words with different cases are counted as the same word.
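A small sketch of how groups show up in practice, using the `(a)b` pattern mentioned above plus the word pattern from the question. The input strings are made up, and `lowercase()` is used here on the assumption of Kotlin 1.5 or later (the original snippet's `toLowerCase()` is its older spelling):

```kotlin
fun main() {
    // One parenthesized group: index 0 is the whole match, index 1 is "(a)".
    val match = Regex("(a)b").find("xab")!!
    println(match.value)         // ab
    println(match.groupValues)   // [ab, a]

    // The word pattern has no parentheses, so groupValues[0] is all there is.
    val words = Regex("""[A-Za-z][A-Za-z']*""")
        .findAll("Don't stop, DON'T")           // Sequence<MatchResult>
        .map { it.groupValues[0].lowercase() }
    println(words.toList())      // [don't, stop, don't]
}
```

Since index 0 is always the whole match, `it.value` is an equivalent and arguably clearer way to write `it.groupValues[0]`.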

val occurrences = mutableMapOf<String, Int>()

This creates a read-only variable of type `MutableMap<String, Int>`. The variable can’t be reassigned to point at a different map, but the contents of the map itself can change. It’s typical to use `val` with `Mutable*` collections.
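To make the distinction concrete, here is a short sketch (the keys and the `readOnlyView` name are made up for illustration). The commented-out lines are the ones the compiler would reject:

```kotlin
fun main() {
    val counts = mutableMapOf<String, Int>()
    counts["kotlin"] = 1           // allowed: mutates the map the variable refers to
    counts["kotlin"] = 2           // allowed: overwrites the existing entry
    // counts = mutableMapOf()     // would not compile: a val cannot be reassigned

    // The same instance seen through the read-only Map interface.
    val readOnlyView: Map<String, Int> = counts
    // readOnlyView["java"] = 1    // would not compile: Map has no set operator
    println(readOnlyView)          // {kotlin=2}
}
```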

for (word in words) {
    occurrences[word] = (occurrences[word] ?: 0) + 1
}

Iterates over the sequence of all the words and counts occurrences. For each word, it looks up the current count for that word in the map, adds 1, and puts the new count back in the map. If the word does not exist in the map yet, 0 is used as the initial value. A slightly more concise way to write this would be to use the `fold` operator over the words, although there might be an even more concise way to do it using one of the grouping operators.
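Both alternatives could be sketched roughly as follows. The function names `countWithFold` and `countWithGrouping` are hypothetical, and the grouping version assumes the standard library's `groupingBy` and `eachCount`:

```kotlin
// Counts words with fold: the accumulator is the map built so far.
fun countWithFold(words: Sequence<String>): Map<String, Int> =
    words.fold(mutableMapOf<String, Int>()) { counts, word ->
        counts[word] = (counts[word] ?: 0) + 1
        counts   // hand the same map on to the next step
    }

// Counts words with the grouping operators from the standard library.
fun countWithGrouping(words: Sequence<String>): Map<String, Int> =
    words.groupingBy { it }.eachCount()

fun main() {
    val words = sequenceOf("the", "cat", "the", "hat", "the", "cat")
    println(countWithFold(words))       // {the=3, cat=2, hat=1}
    println(countWithGrouping(words))   // {the=3, cat=2, hat=1}
}
```

The `fold` version is the same loop in disguise, while `groupingBy { it }.eachCount()` states the intent (count each distinct word) directly.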

return occurrences.toList()

Since `occurrences` is a `Map<String, Int>`, this converts it into a `List<Pair<String, Int>>`. For every key/value pair in the map, the returned list contains a `Pair` holding that key and value.
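As a quick illustration with made-up counts, the key ends up in `first` and the value in `second`:

```kotlin
fun main() {
    val occurrences = mapOf("the" to 3, "cat" to 2)
    val pairs: List<Pair<String, Int>> = occurrences.toList()
    println(pairs)             // [(the, 3), (cat, 2)]
    println(pairs[0].first)    // the
    println(pairs[0].second)   // 3
}
```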

.sortedByDescending{ it.second }

This takes the list of pairs and returns a new list of pairs, but the new list will be sorted by the second value in the pair (the `Int` word count). It’s sorted in descending order, so the largest value comes first.

.map { it.first }

This transforms the `List<Pair<String, Int>>` to a `List<String>`: the new list only contains the first value of each pair, which is the lowercase word.

.take(3)

This takes the incoming list and returns a new list that is at most 3 elements in size, containing up to the first 3 elements of the original list. Since the list is sorted, these are the words with the three biggest frequencies.
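The last three steps can be traced on a small, made-up list of pairs. The sketch also shows two details the walkthrough implies: `sortedByDescending` is a stable sort, so words with equal counts keep their original relative order, and `take(3)` simply returns a shorter list when fewer than three elements exist:

```kotlin
fun main() {
    // Hypothetical counts, listed in the order the words were first seen.
    val pairs = listOf("hat" to 1, "the" to 3, "cat" to 2, "sat" to 2)

    val sorted = pairs.sortedByDescending { it.second }
    println(sorted)    // [(the, 3), (cat, 2), (sat, 2), (hat, 1)]

    val words = sorted.map { it.first }
    println(words)     // [the, cat, sat, hat]

    println(words.take(3))            // [the, cat, sat]
    println(listOf("only").take(3))   // [only]
}
```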

❤️ 2

nkiesel

02/26/2021, 8:36 AM

nice explanation. slightly shorter version:

fun String.topNwords(n: Int = 3) = Regex("""[a-zA-Z][a-zA-Z']*""")
    .findAll(this)
    .map { it.value.toLowerCase() }
    .groupingBy { it }
    .eachCount()
    .entries
    .sortedByDescending { it.value }
    .take(n)
    .map { it.key }
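For readers on Kotlin 1.5 or later, where `toLowerCase()` is deprecated in favor of `lowercase()`, the same pipeline could be sketched like this. The explicit return type and the sample sentence are additions for illustration:

```kotlin
// Same pipeline as above, with lowercase() in place of toLowerCase().
fun String.topNwords(n: Int = 3): List<String> =
    Regex("""[a-zA-Z][a-zA-Z']*""")
        .findAll(this)
        .map { it.value.lowercase() }
        .groupingBy { it }
        .eachCount()
        .entries
        .sortedByDescending { it.value }
        .take(n)
        .map { it.key }

fun main() {
    val text = "The cat and the hat. The cat sat!"
    println(text.topNwords())    // [the, cat, and]
    println(text.topNwords(1))   // [the]
}
```

In this sample "and", "hat" and "sat" each occur once; "and" wins the third slot because it was seen first and the sort is stable.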

☝️ 2

❤️ 1

Zach Klippenstein (he/him) [MOD]

02/26/2021, 3:18 PM

I thought there was something like `eachCount` but it was too late and I was too lazy to google it last night 😂

Dominick

02/26/2021, 3:23 PM

Thank you guys for explaining it to me! It was a codewars solution and I didn't really understand it much. I appreciate the help 🙂
