Roberto Leinardi
12/16/2024, 1:25 PM
I'm trying to use platform.linux.gettext in Kotlin/Native, but I always get a corrupted string back: the first 8 characters are replaced with unexpected data. Has anyone successfully used gettext, or noticed any issues with it?
I’ve opened an issue to track this problem: KT-73948: Corrupted String Returned by gettext in Kotlin/Native.
Any insights would be greatly appreciated!
Alexander Hinze
12/16/2024, 1:28 PM
Alexander Hinze
12/16/2024, 1:29 PM
Roberto Leinardi
12/16/2024, 1:32 PM
I tried toKStringFromUtf8(), but unfortunately it doesn't fix it. I will update the ticket with this info.
Alexander Hinze
12/16/2024, 1:33 PM
Roberto Leinardi
12/16/2024, 1:34 PM
Alexander Hinze
12/16/2024, 3:33 PM
Alexander Hinze
12/16/2024, 3:43 PM
Roberto Leinardi
12/16/2024, 3:47 PM
fun main() {
    // Note: passing null only queries the current locale; pass "" to adopt the user's environment locale
    setlocale(LC_ALL, null)
    // You might need to call bindtextdomain if you had translations, but here it's just for demonstration
    bindtextdomain("messages", ".")
    textdomain("messages")
    val rawString = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
    val ptr = gettext(rawString)
    // Dump the raw bytes as hex to see what we actually got from gettext
    if (ptr != null) {
        print("Raw bytes from gettext: ")
        var i = 0
        while (true) {
            val c = ptr.get(i)
            if (c.toInt() == 0) break
            // Print each byte as hex (single-digit values are not zero-padded)
            print("\\x${c.toUByte().toString(16).uppercase()}")
            i++
        }
        println()
    }
    val translated = ptr?.toKString() ?: rawString
    println("gettext result = \"$translated\"")
}
output:
> Task :samples:playground:runDebugExecutableLinuxX64
Raw bytes from gettext: \xDC\xC\xDF\x6E\x87\x7A\x10\xC5\x73\x75\x6D\x20\x64\x6F\x6C\x6F\x72\x20\x73\x69\x74\x20\x61\x6D\x65\x74\x2C\x20\x63\x6F\x6E\x73\x65\x63\x74\x65\x74\x75\x72\x20\x61\x64\x69\x70\x69\x73\x63\x69\x6E\x67\x20\x65\x6C\x69\x74\x2E
gettext result = "��n�z�sum dolor sit amet, consectetur adipiscing elit."
Alexander Hinze
12/16/2024, 5:48 PM
Roberto Leinardi
12/19/2024, 8:44 AM
ephemient
12/20/2024, 7:46 PM
$ ./gradlew linkLinuxX64DebugExecutable
$ gdb build/bin/linuxX64/debugExecutable/gettext.kexe
(gdb) break dcgettext
Breakpoint 1 at 0x2f1030
(gdb) run
Thread 1 "gettext.kexe" hit Breakpoint 1, __GI___dcgettext (domainname=0x0, msgid=0x353988 "Lorem ipsum dolor sit amet, consectetur adipiscing elit.", category=5) at ./intl/dcgettext.c:47
(gdb) finish
Value returned is $1 = 0x353988 "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
(gdb) watch *0x353988
(gdb) continue
Thread 1 "gettext.kexe" hit Hardware watchpoint 2: *0x353988
Old value = 1701998412
New value = 1463964853
tcache_put (tc_idx=3, chunk=0x353970) at ./malloc/malloc.c:3177
(gdb) where
#0 tcache_put (tc_idx=3, chunk=0x353970) at ./malloc/malloc.c:3177
#1 _int_free (av=0x7ffff7e25c60 <main_arena>, p=0x353970, have_lock=have_lock@entry=0) at ./malloc/malloc.c:4477
#2 0x00007ffff7cebf1f in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3385
#3 0x0000000000260c0a in kfun:kotlinx.cinterop.nativeMemUtils#freeRaw(kotlin.native.internal.NativePtr){} (_this=0x30d530 <__unnamed_2033>, mem=0x353980)
at /mnt/agent/work/b5c630f73501b353/kotlin/kotlin-native/Interop/Runtime/src/native/kotlin/kotlinx/cinterop/NativeMem.kt:127
#4 0x0000000000260a3c in kfun:kotlinx.cinterop.nativeMemUtils#free(kotlin.native.internal.NativePtr){} (_this=0x30d530 <__unnamed_2033>, mem=0x353980)
at /mnt/agent/work/b5c630f73501b353/kotlin/kotlin-native/Interop/Runtime/src/native/kotlin/kotlinx/cinterop/NativeMem.kt:115
#5 0x000000000025e0b5 in kfun:kotlinx.cinterop.nativeHeap#free(kotlin.native.internal.NativePtr){} (_this=0x30d3c8 <__unnamed_1994>, mem=0x353980)
at /mnt/agent/work/b5c630f73501b353/kotlin/kotlin-native/Interop/Runtime/src/main/kotlin/kotlinx/cinterop/Utils.kt:33
#6 0x00000000002afa79 in kfun:kotlinx.cinterop.NativeFreeablePlacement#free(kotlin.native.internal.NativePtr){}-trampoline ()
at /mnt/agent/work/b5c630f73501b353/kotlin/kotlin-native/Interop/Runtime/src/main/kotlin/kotlinx/cinterop/Utils.kt:21
#7 0x000000000025df6c in kfun:kotlinx.cinterop#free__at__kotlinx.cinterop.NativeFreeablePlacement(kotlinx.cinterop.NativePointed){} (_this=0x30d3c8 <__unnamed_1994>, pointed=0x353980)
at /mnt/agent/work/b5c630f73501b353/kotlin/kotlin-native/Interop/Runtime/src/main/kotlin/kotlinx/cinterop/Utils.kt:27
#8 0x000000000025e9d0 in kfun:kotlinx.cinterop.ArenaBase#clearImpl(){} (_this=0x7ffff6bcc0a0)
at /mnt/agent/work/b5c630f73501b353/kotlin/kotlin-native/Interop/Runtime/src/main/kotlin/kotlinx/cinterop/Utils.kt:94
So it's getting freed at the end of the arena.
ephemient
12/20/2024, 7:47 PM
• Kotlin converts rawString to a temporary C string for the duration of the gettext call
• gettext doesn't find a match in its catalog, so it returns the input string (the same pointer it was given)
• Kotlin frees the temporary C string
• now you have a dangling pointer
ephemient
12/20/2024, 7:50 PM
The gettext APIs cannot be safe to use with automatic string conversion because of this. If you make your own bindings with noStringConversion, then you'd have control over when it gets freed.
Alexander Hinze
12/22/2024, 1:16 PM
ephemient
12/22/2024, 1:50 PM
noStringConversion. But if you're using platform.linux, then the decision has already been made for you.
Alexander Hinze
12/22/2024, 1:51 PM
Roberto Leinardi
12/23/2024, 4:02 PM
I tried my own bindings with noStringConversion (noStringConversion = g_dcgettext g_dgettext g_dngettext g_dpgettext), and it seems to work fine:
fun main() = memScoped {
    // Set the locale to the user's default environment locale
    // (an empty string selects it; null would only query the current locale)
    setlocale(LC_ALL, "")
    // Bind text domain to current directory
    bindtextdomain("messages", ".")
    textdomain("messages")
    // Prepare your raw string in a stable C buffer that lives until memScoped exits
    val rawString = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
    val cString = rawString.cstr.ptr
    // Also prepare the domain name as a C string
    val domainName = "messages".cstr.ptr
    // Call dgettext with noStringConversion
    val resultPtr = g_dgettext(domainName, cString)
    // Convert the raw pointer to a Kotlin String while the C buffer is still alive
    val translated = resultPtr?.toKString() ?: rawString
    println("dgettext result = \"$translated\"")
}
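The pattern in the snippet above (keep the C buffer alive in a memScoped arena, and copy the result into a Kotlin String before the arena is cleared) could be factored into a small helper. This is only a sketch under assumptions: translate and its rawGettext parameter are hypothetical names, and rawGettext stands in for whatever raw binding you generate with noStringConversion. The lambda in main simply echoes its argument, which mimics what gettext does when it finds no catalog entry.

```kotlin
@file:OptIn(ExperimentalForeignApi::class)

import kotlinx.cinterop.*

// Hypothetical helper: rawGettext represents a binding generated with
// noStringConversion, so it takes and returns plain C pointers.
fun translate(
    msgid: String,
    rawGettext: (CPointer<ByteVar>) -> CPointer<ByteVar>?
): String = memScoped {
    // The C copy of msgid stays valid until this memScoped block exits.
    val msgidPtr = msgid.cstr.ptr
    // The returned pointer may alias msgidPtr (no catalog match), so copy it
    // into a Kotlin String here, before the arena frees the buffer.
    rawGettext(msgidPtr)?.toKString() ?: msgid
}

fun main() {
    // Stand-in for a raw binding: echoes the input pointer, as gettext does
    // when no translation is found.
    println(translate("Lorem ipsum") { it }) // prints: Lorem ipsum
}
```

With real bindings, the lambda would forward to the raw function. The important part is that no CPointer escapes the memScoped block.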
However, I’m wondering if the current behavior of platform.linux.gettext is still considered a bug. As it stands, it provides a broken binding for this very popular library. Should this be addressed in Kotlin/Native directly?
ephemient
12/23/2024, 5:02 PM
ephemient
12/23/2024, 5:06 PM