kotlinlang #announcements

Alina Dolgikh [JB]

02/17/2025, 1:11 PM

🤖 Curious about how AI models handle Kotlin? We put DeepSeek-R1, several OpenAI models, and others to the test using Kotlin-specific benchmarks. Here’s what we discovered:
✅ How well different models respond to Kotlin-related questions
✅ How the models compare in accuracy and reasoning
✅ Where they succeed – and where they struggle
👉 Read the full analysis: https://kotl.in/ai-models

kodee happy 27

👍 5

eygraber

02/17/2025, 4:45 PM

Why wasn't Grok included? I used ChatGPT, Claude, Gemini, and Grok concurrently for a few months, and Grok was consistently the best, especially with Kotlin-related prompts. I still check in on the others every once in a while to see if there are any improvements, but I've switched to using Grok most of the time.

👀 1

Vera

02/18/2025, 3:35 PM

Thank you for sharing your opinion! On Kotlin_QA, Grok does get a pretty good 8.43/10. But on KotlinHumanEval, it scores only 70.19%. It fails tests on 14.91% of tasks and fails compilation on 11.8%, most often because of the deprecated toLowerCase()/toUpperCase() and forgotten imports.

👍 1
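
For anyone unfamiliar with the failure mode mentioned above, here is a minimal sketch of it. Since Kotlin 1.5, `toLowerCase()`/`toUpperCase()` have been deprecated in favor of `lowercase()`/`uppercase()`, and depending on the compiler version and settings the old calls produce a warning or a compilation error. The helper names `normalizeTag` and `shout` are hypothetical, made up for illustration only, and are not taken from KotlinHumanEval.

```kotlin
import java.util.Locale // needed for the Locale overload; omitting it is a typical "forgotten import"

// Hypothetical helper: trims and lowercases a tag.
// lowercase() replaces the deprecated toLowerCase() and uses locale-invariant rules.
fun normalizeTag(raw: String): String = raw.trim().lowercase()

// Hypothetical helper: uppercases with an explicit locale (JVM only).
// uppercase(Locale.ROOT) replaces the deprecated toUpperCase(Locale.ROOT).
fun shout(raw: String): String = raw.uppercase(Locale.ROOT)

fun main() {
    println(normalizeTag("  Kotlin ")) // prints "kotlin"
    println(shout("grok"))             // prints "GROK"
}
```

The no-argument `lowercase()`/`uppercase()` do not depend on the default locale, which is the main behavioral difference from the old no-argument functions.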

eygraber

02/20/2025, 5:08 AM

Well, it looks like Grok 3 (released today) fixes the toLowerCase/toUpperCase issue 😅

darkmoon_uk

03/08/2025, 9:52 AM

I'd sooner see Mistral's models given attention than Grok, but that's perhaps more about the association than the models themselves. 🇪🇺 💪 The recent release of their claimed state-of-the-art Codestral model would make this interesting.
