u
The new `kotlinx.fuzz`. The example provided is that they used it to keep `kotlinx.datetime` and `java.time` in sync, which is a rather exotic case. How would you use fuzzing in a typical app? Or is it a library thing? Asking mostly because I know of fuzzing only in the context of memory-unsafe languages, to surface vulnerabilities. The JVM is memory safe, so does it matter?
e
it's useful for discovering unexpected states of your program, whether that be memory-related or not
u
what do you mean by unexpected? do people expect, i.e. assert, something? or is the goal simply to crash?
e
could be anything you want to test
u
I don't know yet 😀 I'm looking at whether it would be useful for my app, which I guess is "normal"
but I did like the monkey tester on Android, i.e. the concept of running everything to see if it breaks
e
in the example, they want
for all string inputs, kotlin.time.Duration parsing is the same as java.time.Duration parsing
and it is unreasonable to actually test "all string inputs", so they let the framework try to probe for examples
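a minimal sketch of that idea using only the standard library, with plain random strings standing in for what a real fuzzer would generate. the helper names (`kotlinSeconds`, `javaSeconds`, `parsersAgree`, `findDisagreements`) and the alphabet are made up for illustration, this is not the kotlinx.fuzz API, and comparing whole seconds is a simplification that is only exact for inputs without fractional parts:

```kotlin
import java.time.format.DateTimeParseException
import kotlin.random.Random
import kotlin.time.Duration

// Hypothetical helper: parse with kotlin.time, null means "rejected".
fun kotlinSeconds(text: String): Long? =
    Duration.parseIsoStringOrNull(text)?.inWholeSeconds

// Hypothetical helper: parse with java.time, null means "rejected".
fun javaSeconds(text: String): Long? =
    try {
        java.time.Duration.parse(text).seconds
    } catch (e: DateTimeParseException) {
        null
    }

// The property: both parsers accept the same strings and yield the same value.
fun parsersAgree(text: String): Boolean = kotlinSeconds(text) == javaSeconds(text)

// Random strings over a small ISO-8601-like alphabet (no '.', so no fractions).
fun randomDurationLikeString(random: Random): String {
    val alphabet = "PTDHMS0123456789-+"
    val length = random.nextInt(1, 9)
    return buildString {
        repeat(length) { append(alphabet[random.nextInt(alphabet.length)]) }
    }
}

// Collect every generated input on which the two parsers disagree.
fun findDisagreements(seed: Int, attempts: Int): List<String> {
    val random = Random(seed)
    return List(attempts) { randomDurationLikeString(random) }.filterNot(::parsersAgree)
}
```

anything returned by `findDisagreements` is a candidate bug report or a documented difference. a coverage-guided fuzzer would replace `randomDurationLikeString` with inputs it mutates based on which branches were reached.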
u
but in practice it did a lot of nothing, so I ditched it, and this sounds similar
e
in that case, it's more akin to property testing, for which I've used https://github.com/JetBrains/jetCheck (but it's not multiplatform)
u
yeah, but keeping 2 APIs in sync is a very, very niche case
e
a coverage-based fuzzer will be able to hit more cases
u
so you'd simply provide it with an entry point to the whole app? (say, a mobile app)
e
no
one particular case that jetCheck is specialized for is making sure that any sequence of commands that moves between states in your program always results in a state that maintains invariants (whatever you define)
but perhaps you also want to test "should always return success or error and never throw an exception on any input" for some specific function in your program
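to make the "sequence of commands" idea concrete, here is a hand-rolled sketch with plain `kotlin.random.Random`. `BackStack`, `Push`, `Pop` and `findInvariantViolation` are all hypothetical names, and this is not how jetCheck's API looks, only the shape of the check:

```kotlin
import kotlin.random.Random

// Hypothetical system under test: a navigation back stack that must never be empty.
class BackStack(private val maxDepth: Int = 5) {
    private val screens = ArrayDeque(listOf("home"))
    val depth: Int get() = screens.size

    fun push(screen: String) { if (screens.size < maxDepth) screens.addLast(screen) }
    fun pop() { if (screens.size > 1) screens.removeLast() }

    // The invariant we define: bounded depth, and "home" always at the bottom.
    fun invariantHolds(): Boolean = screens.size in 1..maxDepth && screens.first() == "home"
}

sealed interface Command
data class Push(val screen: String) : Command
object Pop : Command

fun randomCommands(random: Random, count: Int): List<Command> =
    List(count) { if (random.nextBoolean()) Push("screen${random.nextInt(100)}") else Pop }

// Returns the shortest prefix of a generated sequence that breaks the invariant,
// or null if no violation was found within the budget.
fun findInvariantViolation(seed: Int, runs: Int, commandsPerRun: Int): List<Command>? {
    val random = Random(seed)
    repeat(runs) {
        val stack = BackStack()
        val commands = randomCommands(random, commandsPerRun)
        commands.forEachIndexed { index, command ->
            when (command) {
                is Push -> stack.push(command.screen)
                Pop -> stack.pop()
            }
            if (!stack.invariantHolds()) return commands.take(index + 1)
        }
    }
    return null
}
```

a `null` result only means nothing was found within `runs` attempts, not that the invariant is proven.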
u
okay but the fuzzer, that's a class/function-level thing then?
I'm assuming jetCheck is not a fuzzer
e
depending on who you ask, fuzzers and property-based testing are similar, overlapping, or the same thing
the goal is to find inputs that exercise edge cases in your code
most (although not all) property tests use random inputs, and many (although not all) are able to take a failing test case and continue mutating it to find the smallest failing test case (to make it easier to debug)
most (although not all) fuzzers use random inputs, and many (although not all) use code coverage to guide which inputs are more interesting to continue testing
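the "find the smallest failing test case" step can be sketched as a greedy loop. this is a deliberately naive version, real shrinkers are smarter about which reductions they try, and `shrink` and `hypotheticalBug` are illustrative names only:

```kotlin
// Greedy shrinker: keep deleting single characters while the input still fails.
fun shrink(failing: String, fails: (String) -> Boolean): String {
    require(fails(failing)) { "shrink() needs an input that actually fails" }
    var current = failing
    var progress = true
    while (progress) {
        progress = false
        for (i in current.indices) {
            val candidate = current.removeRange(i, i + 1)
            if (fails(candidate)) {
                // Smaller input that still fails: keep it and start over.
                current = candidate
                progress = true
                break
            }
        }
    }
    return current
}

// Hypothetical bug: the code under test breaks on any input containing "%%".
fun hypotheticalBug(input: String): Boolean = "%%" in input
```

for example, `shrink("abc%%def%", ::hypotheticalBug)` reduces the noisy input to `"%%"`, which is much easier to turn into a unit test.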
u
say I have some sort of deeplink router that takes a deeplink that comes from outside the app. would that be a good candidate for fuzzing? how should such a fuzz test look, given the router is basically a giant when statement? just that no exceptions are thrown? or rather, how would you define unexpected behavior here, other than not resulting in the desired destination, or a no-op if unknown (but I guess that's what unit tests are for?)
-- okay so that implies I figure out my compute budget, and once it's spent I assume I'm good?
e
it's not suitable for everything. but a few things I'm using jetCheck for:
• testing that our geodesic distance function always returns a reasonable result, even for edge cases that our unit tests don't cover
• simulating a user inserting and deleting text, and testing that our TextWatchers (which are used to modify a TextEdit while the user is typing) do not end up in invalid states
the first one is something that kotlinx.fuzz could do pretty well too, although the search space is small enough that it's a bit overkill (in practice you can focus 90% of your energy on the edge cases that are known to be tricky, like going across the poles or across the antimeridian)
u
in the first case, does that mean you assert that the resulting distance is, say, < earth circumference, or something like that?
2) is invalid an explicit state type/enum? or just a combination of properties that should never happen? so again simply an assert? (since I like to throw `error("Should never happen")` in such cases)
e
1 includes things like "is not too much longer than a linear distance (which is easy to correctly calculate)" and "follows triangle inequality" (so we can use other values to judge the accuracy of unknown ones)
2 includes things like "doesn't delete the very thing the user is typing as long as it's allowed by the filters" and "doesn't end up with text that would be rejected by filters"
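for the distance case (1), a rough sketch of what such properties could look like. the haversine function is only a stand-in for the real function under test, the names and the 1 m tolerance are assumptions, and "linear distance" is interpreted here as the straight chord through the sphere:

```kotlin
import kotlin.math.PI
import kotlin.math.asin
import kotlin.math.cos
import kotlin.math.sin
import kotlin.math.sqrt
import kotlin.random.Random

val EARTH_RADIUS_KM = 6371.0

data class LatLon(val latDeg: Double, val lonDeg: Double)

// Stand-in for the function under test: haversine distance on a sphere.
fun haversineKm(a: LatLon, b: LatLon): Double {
    val lat1 = Math.toRadians(a.latDeg)
    val lat2 = Math.toRadians(b.latDeg)
    val dLat = lat2 - lat1
    val dLon = Math.toRadians(b.lonDeg - a.lonDeg)
    val h = sin(dLat / 2) * sin(dLat / 2) +
        cos(lat1) * cos(lat2) * sin(dLon / 2) * sin(dLon / 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h.coerceIn(0.0, 1.0)))
}

// Straight-line (chord) distance through the sphere, simple to compute correctly.
fun chordKm(a: LatLon, b: LatLon): Double {
    fun xyz(p: LatLon): DoubleArray {
        val lat = Math.toRadians(p.latDeg)
        val lon = Math.toRadians(p.lonDeg)
        return doubleArrayOf(cos(lat) * cos(lon), cos(lat) * sin(lon), sin(lat))
    }
    val p = xyz(a)
    val q = xyz(b)
    return EARTH_RADIUS_KM * sqrt((0..2).sumOf { (p[it] - q[it]) * (p[it] - q[it]) })
}

fun randomPoint(random: Random) =
    LatLon(random.nextDouble(-90.0, 90.0), random.nextDouble(-180.0, 180.0))

fun distancePropertiesHold(a: LatLon, b: LatLon, c: LatLon, toleranceKm: Double = 1e-3): Boolean {
    val ab = haversineKm(a, b)
    val bc = haversineKm(b, c)
    val ac = haversineKm(a, c)
    val chord = chordKm(a, c)
    val triangle = ac <= ab + bc + toleranceKm             // triangle inequality
    val notShorterThanChord = ac >= chord - toleranceKm    // surface path >= straight line
    val notMuchLonger = ac <= chord * PI / 2 + toleranceKm // at most pi/2 times the chord
    return triangle && notShorterThanChord && notMuchLonger
}
```

the assertion itself is then just `check(distancePropertiesHold(a, b, c))` over generated points, plus hand-picked ones near the poles and the antimeridian.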
u
and technically it's as easy as `assertThat(distance).isLessThan(..)`?
since I was reading up on jazzer and their asserts are like "doesn't do IO" etc
which is somehow magically available
e
cases where you're parsing things with unbounded cardinality are more interesting to fuzz
u
I see. but the actual way to define a goal is via the same asserts I'd use in unit tests, right? So say in my deeplinking toy case, where a no-op or some back stack is expected and it never throws, unexpected would only mean that I threw, right? so simply `assertThat(true).isTrue()` at the end?
e
well, unexpected could mean many different things
for example, perhaps you have a part of your route that should be numeric, and your parser allows some value there that it shouldn't (non-numeric, integer overflow, whatever). so it shouldn't have returned success, but does. a fuzzer or property tester could help find that, if you define the expectations correctly
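a sketch of what "define the expectations correctly" could mean for that numeric segment. the `myapp://item/<id>` route, `parseDeeplink`, and the generator are all invented for illustration. the point is that the expectation uses an independent oracle (`BigInteger`) instead of repeating the parser's own logic:

```kotlin
import kotlin.random.Random

// Hypothetical route: "myapp://item/<numeric id>". Everything here is illustrative.
sealed interface Route
data class Item(val id: Int) : Route
object Unknown : Route

fun parseDeeplink(link: String): Route {
    val prefix = "myapp://item/"
    if (!link.startsWith(prefix)) return Unknown
    val segment = link.removePrefix(prefix)
    // toIntOrNull rejects empty, non-numeric, and overflowing values.
    val id = segment.toIntOrNull() ?: return Unknown
    return if (id >= 0 && segment.all { it in '0'..'9' }) Item(id) else Unknown
}

// Expectation: success implies the segment was a plain decimal number equal to the id.
// An exception thrown by parseDeeplink would also count as a failure of the property.
fun expectationHolds(link: String): Boolean {
    val route = parseDeeplink(link)
    if (route !is Item) return true
    val segment = link.substringAfter("myapp://item/")
    return segment.isNotEmpty() && segment.all { it in '0'..'9' } &&
        segment.toBigInteger() == route.id.toBigInteger()
}

// Random links, biased towards the interesting prefix.
fun randomLink(random: Random): String {
    val alphabet = "0123456789-+ax/"
    val tail = buildString {
        repeat(random.nextInt(0, 13)) { append(alphabet[random.nextInt(alphabet.length)]) }
    }
    return if (random.nextInt(4) == 0) tail else "myapp://item/$tail"
}

fun findCounterexample(seed: Int, attempts: Int): String? {
    val random = Random(seed)
    return generateSequence { randomLink(random) }
        .take(attempts)
        .firstOrNull { !expectationHolds(it) }
}
```

if the parser were changed to use `toInt()` or to accept a leading `+`, either an exception or a counterexample from `findCounterexample` would show up, which is the kind of finding a unit test with fixed inputs could miss.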
u
okay so maybe to generalize: unit tests check that desired behavior happens, and fuzzers are the inverse, checking that undesired behavior doesn't happen?
e
I would put it more that unit tests are typically written to cover a fixed set of inputs, and fuzzers are used to discover interesting inputs that were not covered by unit tests
if you do find a bug with a fuzzer, you should probably convert that case into a unit test to prevent recurrence in the future
if you look in any compiler's test suite, you'll see that they're full of those types of test cases
u
btw OSS-Fuzz, they claim so many security vulnerabilities surfaced. I mean, I do see how a fuzzer could stumble upon such a combination of inputs, but how would one even notice that? I'd need to run some sort of attestation or.. idk?
I also see mentions of log4shell being preventable, but how do you even check for the remote execution opportunity?
seems to me that the check is the more difficult bit, rather than the input search space exploration
c
but how would one even notice that?
For example, "no matter what the input is, there is never an internet connection". You could do that at the OS-level by removing the right for internet access, which would throw an exception in your code, and then you run the fuzzer looking for that particular exception
u
How would you do that? Docker?
c
For example, yeah. But generally, any kind of firewall I guess
e
or security managers inside the JVM, before they were deprecated…
but in the log4j case, it turns out that certain tags in the text would cause classloading to happen and replace the text. so either checking that loaded classes are within the expected set, or that the log text is exactly what was sent in, for all fuzzed text, could have caught this
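a toy version of the second check ("the log text is exactly what was sent in"). `renderWithLookups` is a made-up stand-in that expands `${upper:...}` tags, it is not log4j's API, and a short list of inputs stands in for fuzzer-generated text:

```kotlin
// Hypothetical logger that interprets "${upper:...}" tags inside the message
// instead of printing them verbatim.
fun renderWithLookups(message: String): String =
    Regex("""\$\{upper:([^}]*)\}""").replace(message) { it.groupValues[1].uppercase() }

// Hypothetical safe logger: prints the message as-is.
fun renderVerbatim(message: String): String = message

// The property: for every input, the rendered text equals the input.
// Returns the inputs that violate it.
fun inputsNotLoggedVerbatim(render: (String) -> String, inputs: List<String>): List<String> =
    inputs.filter { render(it) != it }
```

with inputs like `"hello"` and `"${upper:abc}"`, the verbatim renderer yields no violations, while the lookup-expanding one flags the second input. the hard part in practice is that the fuzzer has to generate the tag syntax in the first place, which is where coverage guidance helps.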