# arrow
k
Hi, could someone please help me find the Arrow 'way' of doing Applicative `*>` and `<*`? -> https://hackage.haskell.org/package/base-4.12.0.0/docs/Prelude.html#v:-42--62- I'm trying to implement/follow https://www.youtube.com/watch?v=N9RUqGYuGfw in Arrow.
πŸ‘€ 1
j
We only have `*>` in Monad for some reason. It's called `followedBy` there. Imo we really need to move it to `Apply` instead. There is no version of `<*` in Arrow afaik. https://hackage.haskell.org/package/base-4.12.0.0/docs/src/GHC.Base.html#%2A%3E has the implementation in terms of `ap`, or `<*>` as it's called in Haskell land, and it should be easy to port them to Kotlin I think.
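The Haskell defaults are straightforward to mirror. A minimal sketch, using Kotlin's `List` as a stand-in applicative (the names `followedBy`/`apLeft` and the `List` instance are illustrative, not Arrow's actual API):

```kotlin
// Hypothetical port of the Haskell defaults, using List as a stand-in applicative.
// liftA2 combines two effectful values with a binary function.
fun <A, B, C> liftA2(fa: List<A>, fb: List<B>, f: (A, B) -> C): List<C> =
    fa.flatMap { a -> fb.map { b -> f(a, b) } }

// Haskell's *> : run both, keep the right result.
fun <A, B> followedBy(fa: List<A>, fb: List<B>): List<B> = liftA2(fa, fb) { _, b -> b }

// Haskell's <* : run both, keep the left result.
fun <A, B> apLeft(fa: List<A>, fb: List<B>): List<A> = liftA2(fa, fb) { a, _ -> a }
```

Both effects still run in both cases; only which result is kept differs.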
m
I think you can do this kind of stuff with `Alternative`, because that is what is used in Haskell to do applicative parsing
But we have problems in Arrow with laziness in Applicative
k
Thanks! I'll try to implement it via Alternative πŸ˜„
j
It depends on the operator you want. Haven't seen the video, but I guess you need both. And yes applicative is pretty broken if you want to use recursive code πŸ˜•
j
You definitely do need Applicative there. It's probably best to re-implement those methods given the implementation in Haskell. We do need these methods on the `Apply` typeclass...
Sadly parsers won't be a strength of Arrow any time soon (at least not in the style of Haskell, because of laziness). I've come pretty far writing something equal to https://github.com/mrkkrp/megaparsec but later failed because of laziness and also stack overflows πŸ˜•
k
Do you have a pointer to where I can find the Haskell implementation? πŸ˜„ Or some sort of doc on how to implement these?
j
https://hackage.haskell.org/package/base-4.12.0.0/docs/src/GHC.Base.html#%2A%3E This? ^^ They are implemented in terms of `liftA2`, which also has a default implementation. In short, just follow the functions from `*>` down to `liftA2`; they are all in a way implemented with `<*>`. If you need help with it just ask, Haskell code full of operators is not easy to read πŸ˜„
k
thanks, I'll try my best πŸ˜„
j
Created an issue so that we eventually implement it in arrow πŸ™‚ https://github.com/arrow-kt/arrow/issues/1839
k
Cool thanks πŸ™‚
I did it like this (stole the liftA2 from Scala: https://scalaz.github.io/scalaz/scalaz-2.9.1-6.0.4/doc.sxr/scalaz/Applicative.scala.html#39541) but frankly I don't know if this is correct :D It seems to work with the json parser
```kotlin
fun <A, B, C> Kind<ForParser, A>.liftA2(a: Kind<ForParser, A>, b: Kind<ForParser, B>, f: (A, B) -> C): Parser<C> {
    return b.ap(a.fix().map(f.curry())).fix()
}

fun <A, B> Kind<ForParser, A>.leftWins(right: Kind<ForParser, B>): Parser<A> = liftA2(this, right) { a, _ -> a }

fun <A, B> Kind<ForParser, A>.rightWins(right: Kind<ForParser, B>): Parser<B> = liftA2(this, right) { _, b -> b }
```
j
that is an alias for `Apply.map` btw ^^
And yes this will work and is actually a better implementation than the Haskell one (at least for Kotlin and Scala, that is)
k
Ah, so liftA2 == Apply.map?
j
Would be great if you could add a PR for that. `leftWins` is `followedBy`, and `rightWins` needs a good name
only in scala and kotlin
the haskell one is different afaik
actually nvm
it's the same
Haskell just has laziness optimizations for `*>` and `<*`, so they are implemented a little differently there. That is something Kotlin and Scala cannot have anyway
k
I wanted to PR it after I finished the jsonParser, but I'm totally stuck now since `Alternative.many` and `Alternative.some` recurse infinitely. I guess I have to provide either of those for it to work. But I can't get the many implementation to work... 😞
j
Yes, and that needs to be documented somewhere. I've fallen into that trap once as well
The reason we cannot get both implemented in a default way is that ap always evaluates its arguments, so we'd need a lazy ap variant.
For the parser it's probably best to define many using generateSequence or something similar
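A sketch of that idea, with a minimal hand-rolled parser type (not Arrow's) so the shape is visible; `charParser` and the `Pair`-based result here are assumptions for illustration:

```kotlin
// Minimal parser: consumes input, returns remaining input + result, or null on failure.
data class Parser<A>(val runParser: (String) -> Pair<String, A>?)

fun charParser(c: Char): Parser<Char> = Parser { s ->
    if (s.firstOrNull() == c) s.drop(1) to c else null
}

// `many` defined iteratively with generateSequence instead of recursion,
// so it cannot recurse infinitely on construction.
fun <A> Parser<A>.many(): Parser<List<A>> = Parser { input ->
    var rest = input
    val results = generateSequence { runParser(rest)?.also { (r, _) -> rest = r } }
        .map { it.second }
        .toList()
    rest to results
}
```

`generateSequence` stops at the first `null`, which is exactly the "re-run until failure, collect results" behaviour of `many`, but without a recursive definition.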
k
I've pushed the code to a repo but I'm still stuck at the many implementation. I'm doing it wrong and am totally lost now πŸ˜„ https://github.com/kartoffelsup/haskell-inspired-arrow-json-parser/blob/master/src/main/kotlin/io/github/kartoffelsup/json/typeclass.kt#L41
Can I traverse this:
```kotlin
val seq: SequenceK<Tuple2<String, A>>
```
into
```kotlin
val seq2: Tuple2<String, SequenceK<A>>
```
? I guess I need an Applicative for A, which I can't provide 😞 doesn't seem to work this way
j
Hmm:
```kotlin
runParser.fold({
  (input toT emptySequence()).some()
}, { (r, a) -> many().runParser(r).fold({ r toT sequenceOf(a) }, { (r2, xs) -> r2 toT (sequenceOf(a) + xs) }) })
```
This might work. Wrote this here in Slack, so no idea about types etc, but the idea here is to skip the call to `some` and only recurse with `many`. This is still not great because it will always fully evaluate the whole thing till failure, but it might work. ^^ What `many` does is basically re-run a parser until it fails and collect the results along the way. The first fold checks if the current iteration succeeds, and the second one if the recursive rest succeeded or failed
Yes you could traverse the sequence, but that is not actually what you need because it would use the tuple applicative instance and that combines the first part of the tuple
Also note: traversing sequences always fully evaluates them, never do that on infinite sequences
Which is another short-coming of not having a lazy ap
k
Oh god πŸ˜„ This is really killing my OOP 'brain' here
Thanks for the implementation suggestion, I'll see if I can get that to work
j
All of this threw me off as well when I messed around with a parser encoding. I know the struggles πŸ˜„ The Haskell one is so concise and easy to read, but that doesn't translate too well to Kotlin because of laziness. Raul and I were talking about an idea to auto-gen lazy variants for functions without changing source code a while back, and that might come to arrow-meta someday (not planned atm because there are more important things first; also I honestly don't know if that's even possible πŸ˜„)
Btw this will blow the stack for longer parsers and there is no easy way around it. For simple things it'll be alright, but as soon as you get complicated/long input, or you use a recursive/long parser with many/some, it will stop working. Both the laziness and the stackoverflows can be solved with eval and trampolines, but using those manually is painful, so I chose to wait for meta with my parser experiments after reaching the point where everything else worked πŸ˜•
k
Mh, that's some bad news 😞 Up to this point everything worked so well πŸ˜„
j
You can get it to work, but there are hard limits on how much input it can handle or how complicated the parser can be 😞 It's a good learning experience anyway πŸ‘
k
Thanks for the fix πŸ™‚ just tested it with your many implementation and if I understood correctly it should work for small sequences πŸ˜„
j
Btw there is a small mistake in your implementation πŸ˜‰ left a comment for that. Github is pretty quick in sending out emails for every mention πŸ˜„
k
Oh maybe that's what's giving me stackoverflows? πŸ˜„
Thanks! Now I'm stuck with StackOverflows πŸ˜„ I'm starting to lose my mind.
j
Fun ^^ You have just discovered how unbelievably good laziness actually is
πŸ˜„ 1
k
```
at io.github.kartoffelsup.json.JsonParserKt.stringParser(JsonParser.kt:56)
	at io.github.kartoffelsup.json.JsonParserKt.jsonNull(JsonParser.kt:71)
	at io.github.kartoffelsup.json.JsonParserKt.jsonValue(JsonParser.kt:115)
	at io.github.kartoffelsup.json.JsonParserKt.jsonArray(JsonParser.kt:108)
```
Guess it's time to throw the towel 😒
It overflows because I call jsonValue() in jsonArray() apparently
Thanks for your help, this was fun πŸ™‚ Maybe if I study more I can also solve the SOEs πŸ˜„
j
how far along is this?
k
j
that video makes it undeniably clear that not having a haskell background disqualifies one from simply improving their kotlin FP
text wrangler brings me back to Commodore 64 on-disk assembler
i was under the impression that a c++ and STL background made me somehow challenged in outlook but clearly i never had to worry if my c++ or java library could only be satisfied by double penetration
πŸ™Š 3
k
I found the video very intriguing, although I didn't like the 'penetration' explanations of fmap and <*> :D
πŸ’― 1
j
im running your repo and i see the stack overflow, but you might be able to benefit from the incredible instantiation gymnastics of the betterparse json parser to double check your attempt
that was an education (for me) in the kotlin type system well beyond anything published on the web right now
there are some potentially runnable candidates in the forks of betterparse https://github.com/h0tk3y/better-parse/network
j
Btw the stackoverflow is caused by constructing the parser, not even by running it yet. That stackoverflow can be removed by writing something like a custom ap or flatMap that takes an Eval or a function to a parser, rather than a parser directly. That will lead to annoying code, but will make constructing the parser stack-safe. The next problem is that running it will end up blowing the stack. Solving that is a lot harder.
arrow 1
k
Do you have more info on how to do it? I can't wrap my head around it :'(
j
Not atm. This is the commit in which I removed the parser I wrote. It has `lazyAp` as an operator. You can find the parser code in the KParsec file and a parser with it in the Parser file. The point of this was to parse kotlin data class show output into a generic format, diff it with other show output, and pretty print it for awesome looking test failure reports. I'll redo this when I have a good solution for a stacksafe execution of that parser. https://github.com/1Jajen1/propCheck/pull/26/commits/2f45b8d6ee33c222f1e3547bd714a3c07e7b2a96
The parser works a bit differently tho
It's "slightly" more complicated because it has different goals than the example parser from the video
I can look through your code tomorrow and see if I can come up with a better example if that's still relevant then πŸ™‚
If you have questions on how this works, just ask πŸ˜„ It's basically a port of megaparsec from Haskell, and that library is pretty advanced. I tried porting it to better understand it; just looking at it made me very confused in the beginning, especially because of the CPS encoding
But if you substitute `I` with `String` and `M<A>` with `Option<A>` you have almost the same thing
This failed because it builds up a large function chain and that overflows quickly for complex input. When I get around to it I'll rework this with either `Trampoline` calls, explicit stack jumps by counting stack depth, or, if arrow-meta has some options, by using those, but this parser code is definitely coming back πŸ’―
I am also using quite a few tricks with monad comprehensions to make code lazy, but for your parser, since it's not monadic, a lazy ap method should work. Don't get overwhelmed by all the code; focus on just that
k
Thanks! I'll take a stab at it tomorrow. Really appreciate your help. :)
for those who were interested, I've managed to at least fix the SOE when constructing the parsers and finished implementing it. It works now (for small objects πŸ™‚)
πŸŽ‰ 1
the concept of `ap` is really hard to grasp for me
j
ap is definitely weird. The concept of Applicative is definitely easier to understand by looking at `liftA2`/`Applicative.map`. You have a `Kind<F, A>` and a `Kind<F, B>` and a function to combine them, `(A) -> (B) -> C`, and then create a `Kind<F, C>`. Looking back at `ap`, it is basically the same thing, but the function is already partially applied inside `F`. As for semantics, it is again best to look at `Applicative.map`: Applicative allows combining independent `F`'s, something `Functor` can't do.
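A concrete illustration of `liftA2` vs `ap`, using Kotlin nullable types as a stand-in for an Option-like `F` (this is a sketch for intuition, not Arrow's encoding):

```kotlin
// liftA2: combine two independent F-values with a binary function.
fun <A, B, C> liftA2(fa: A?, fb: B?, f: (A, B) -> C): C? =
    if (fa != null && fb != null) f(fa, fb) else null

// ap: the same idea, but the function already sits inside F, partially applied.
fun <A, B> ap(ff: ((A) -> B)?, fa: A?): B? =
    if (ff != null && fa != null) ff(fa) else null
```

If either side is "missing" (null here, None in Arrow), the combined result is missing too; that is the "combine independent F's" behaviour Functor alone cannot express.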
πŸ’― 1
k
FYI this doesn't even get to SOE with large JSON objects, it runs out of memory first πŸ˜„ Since the charParser parses each char separately and returns a new string without the parsed char (head). I tested it with a 180mb json object and it almost immediately OOMs (even with 10gb heap :D). But despite it being very slow and not able to parse bigger objects, it was a fun exercise. Thanks again for your help πŸ™‚
j
Yikes. Out of memory is unexpected πŸ˜„ My parser encoding worked using a continuation passing style encoding, which rarely allocates, if at all, so it's really efficient, just not stack safe on the jvm. Building parsers and writing combinator libs is always lots of fun, and especially learning how to use applicative and alternative to full effect ^^
Filling out a 10gb heap is weird tho, like why? πŸ˜„
I mean even a few strings should not cause that, that must be thousands
k
I think it's due to the charParser always creating a new string after parsing 1 char at a time
and 180mb worth of json is a lot of chars
and then it explodes quite quickly πŸ˜„
j
Did you look at a heap profile? This seems odd πŸ˜„
k
https://github.com/kartoffelsup/haskell-inspired-arrow-json-parser/blob/master/src/main/kotlin/io/github/kartoffelsup/json/JsonParser.kt#L53 I think this is the culprit. I will probably check memory tomorrow, need to sleep soon πŸ˜„
j
Or is it literally every char == a new string which is old string minus 1?
k
exactly ^
j
well there goes my idea of dropping the cps style encoding to use a concrete one πŸ˜… Trading stackoverflows for out of memory errors seems impractical, as the former can actually be solved without major changes to the actual encoding...
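A zero-copy view over the input avoids that quadratic allocation: dropping a char only advances an index while the backing string is shared. A minimal sketch of the StringView idea that shows up later in this thread (field and method names here are illustrative):

```kotlin
// Shares the backing string; drop() only advances an index, no copying.
data class StringView(val source: String, val pos: Int = 0) {
    fun isEmpty(): Boolean = pos >= source.length
    fun head(): Char = source[pos]
    fun drop(n: Int): StringView = StringView(source, pos + n)
    override fun toString(): String = source.substring(pos)
}
```

With this, parsing n chars allocates O(n) small view objects instead of O(n^2) characters' worth of strings.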
k
yeah I think the haskell version will probably oom as well with large objects, right? So I guess my deed is done πŸ˜„ I might also solve the todos that the original video added as extra fun exercises (error reporting, parsing floats as well, escaped string support), but I guess as long as the charParser is used, it will eventually run out of memory.
j
Well probably not, because Haskell's strings are immutable lists of chars and dropping one there will not allocate at all; it's likely just a pointer forward. If you do the same using Haskell Text (which you should use in almost all cases!) then probably
I'd skip parsing floats, the reason it's left is because getting it right is hard. (In general actually writing a spec adhering json parser is hard, not even ones like helios, jackson etc do the full spec properly, but no-one cares about edge cases anyway :D)
k
πŸ˜„ I think error reporting might be a fun task, replacing Option with Either or something similar
j
That reminds me, if you use sequences instead of strings you get a similar effect, and since you won't do random access it might even be fine performance-wise, maybe worth a try
k
ooh, that sounds interesting. I will definitely check that out tomorrow. Thanks! πŸ™‚
j
ByteBuffer.wrap(String.toByteArray()) might get you more immutable memory from source input
j
Just keep one thing in mind: calling drop gives you a dropping sequence, which means each time you iterate it, it starts again from the beginning of the underlying sequence, so it might get very slow ^^
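That re-iteration cost can be seen directly; a small sketch counting how often the underlying sequence gets pulled (the counter is only there for demonstration):

```kotlin
var pulls = 0
// Multi-iterable sequence; the map side effect counts elements pulled from upstream.
val base = generateSequence(0) { it + 1 }.map { pulls++; it }.take(10)
val dropped = base.drop(5)

dropped.first()  // skips 5 elements, pulls 6 from upstream
dropped.first()  // iterates again from the start: 6 more pulls
```

Every fresh iteration of `dropped` re-walks (and re-discards) the dropped prefix.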
k
it's already very slow πŸ˜„
j
@jimn would that mean having to track the current position or can you slice and dice that buffer without allocation?
j
bb.slice() creates a new bb with position 0..adjusted limit
it is a view of the parent
j
Can you also get a different starting point? If so, then that is just the right thing to use...
j
this lets you mmap the json input using RandomAccessFile .. assuming all is jdk of course
j
Never used those before (haven't done much low level memory stuff at all, since haskell usually never needs it and arrow as well is pretty high-level)
j
bb.position(n).slice().limit(x) is what i use a lot for bounding a fixed width field
j
that's cool. @kartoffelsup that might save your parser πŸ˜„
k
great πŸ™‚ I will try that out thanks!
but once I 'fix' the memory issue I will probably have to tackle those pesky stack overflows πŸ˜„
j
i appreciate that you are building a strictly arrow json parser, that is the most practical possible before and after learning aid
j
They will probably come soon after πŸ˜„ I think raul mentioned that sooner or later arrow-meta is going to get a compose/andThen operator which at compile time tries to predict call depth and will attempt to guarantee stack safety. That is what I am waiting for with my parser attempt.
j
i have built a currying bytebuffer java library based on enums being squeezed into varargs https://github.com/0xCopy/bbcursive/blob/master/src/main/java/bbcursive/Cursive.java#L21
j
@jimn parsing using applicatives/alternatives leads to super easy to use combinator libraries. Continuation style ones are also pretty fast and relatively easy to write if it weren't for stacksafety problems on the jvm. For now the jvm really hurts those encodings in both performance and feasibility. And I really don't want to go down to low level hacking ...
Btw my java is not strong enough to parse that ^-^
j
i wake up every day and stare at the monad tutorial and it gels a little bit more. as near as i can tell ByteBuffer is from a sun c++ programmer who very much missed STL iterators and wrote something comparable to bypass java gc/heap. you can write very close pure C in java using just ByteBuffer and writing your own malloc/free
when i did this while profiling an OO poker engine the speedup was 30x
actually writing free is a waste, it is very conducive to haskell to just consume a slab of ByteBuffer from a single forward iterator and burn from front-to-back and then do it again
j
I usually try the exact opposite: stay away from c-like stuff as far as possible πŸ˜„ Again, I haven't really worked with anything that needed low-level code, and usually in haskell performance is solved by assisting the compiler, not bypassing it (yields much better code anyway). Also performance is very much overrated in most application code; unless you are doing libraries, real-time stuff or large data analysis, most high-level code will perform just fine.
Different problems lead to different solutions and approaches I guess πŸ˜‰
j
and then there was EJB
for me this was just an accidental discovery from profiling, and maybe a hint of previous work in image formats.
k
Finally made it to the real StackOverflowErrors: https://github.com/kartoffelsup/haskell-inspired-arrow-json-parser/commit/9db4c9702cb881ff8dbf6ad5e6e85b88802cc154 took me way longer than I'd like to admit :D
j
@kartoffelsup today I stumbled over https://github.com/arrow-kt/arrow/blob/7d9228afc16e451e666c563e9670f6422000a3f7/modules/core/arrow-core-data/src/main/kotlin/arrow/core/AndThen.kt randomly. That can actually solve both our parsers' stacksafety problems. Just write AndThen<String, Option<Tuple2<String, A>>> and composition should avoid stack overflows the same way IO does. I will definitely try this on my parsec port
πŸ‘ 1
πŸ’― 1
k
Nice, thanks! Will check it out tomorrow :)
How do I chain these correctly?
```kotlin
val runParser: AndThen<StringView, Option<Tuple2<StringView, A>>>

fun <B> ap(ff: ParserOf<(A) -> B>): Parser<B> = Parser(
    ff.fix().runParser.map { op ->
        op.flatMap { (input2: StringView, f: (A) -> B) ->
            // I need to use input2 here or I'm stuck in an infinite loop,
            // but calling the AndThen here leads to the same stackoverflow
            fix().runParser(input2).map { (input3: StringView, g: A) ->
                Tuple2(input3, f(g))
            }
        }
    }
)
```
vs
```kotlin
fun <B> ap(ff: ParserOf<(A) -> B>): Parser<B> = Parser(
    ff.fix().runParser.flatMap { op ->
        fix().runParser.map { op2 ->
            op.flatMap { (input2: StringView, f: (A) -> B) ->
                // infinite loop due to the result of ff (input2) being ignored
                op2.map { (input3: StringView, g: A) ->
                    Tuple2(input3, f(g))
                }
            }
        }
    }
)
```
j
Only on my phone so I can't test things. Firstly, you are running the ff first; that will be a bit weird as your ap will now work right to left. Anyway, in StateT this gets done in a similar way to this: `Parser(this.runParser.map { (str, a) -> val (out, f) = ff.runParser.invoke(str); out toT f(a) })` why does slack also pull its bs code block stuff on mobile. Also this ignores the option, but I think that's easy enough to add :)
k
Thanks! But that's just a left-to-right version of my first sample. This also leads to SOE 😞 e: left-to-right ap unfortunately breaks the stringParser, due to it traversing from right to left (I don't know how traverse works tbh :D)
```kotlin
internal fun stringParser(string: String): Parser<String> {
    val parser: Parser<ListK<Char>> = string
        .map(::charParser).k()
        .traverse(ParserApplicativeInstance) { it.fix() }.fix()

    return parser.map { it.s() }
}
```
j
That should be fixed in the next arrow version.
arrow 1
Btw for traverse the order always only depends on the applicative. However there was a bug which switched the arguments in `map2`. That has been fixed. Also there are now both tests which check the order of traverse against a left-to-right applicative and tests which check if the order of ap is consistent with monad-derived ones (although the pr to fix all of those instances is still open). If you want to proceed without using the snapshot build of arrow then just redefine `traverse` like this (for lists, and AP is any applicative): `foldRight(Eval.now(AP.just(emptyList()))) { v, acc -> Eval.later { f(v).ap(AP.run { acc.value().map { xs -> { x: A -> listOf(x) + xs } } }) } }`. That is almost the definition that is currently on master; the only difference is that it uses `lazyAp` to avoid iterating the entire list if the left argument to ap fails (if it is Either.Left for example). Or alternatively you could change the dep to the snapshot build and use `traverse` from there I think
πŸ‘ 1
To clarify: the order in which a collection is traversed is always left to right. However, the order of your results depends on what `ap` evaluates first.
k
Thanks for the explanation :) I will try the snapshot version arrow
j
I'm quite interested to see if it'll work out πŸ™‚ Also with latest master you don't need the explicit many definition anymore. I am trying to convert the parser I wrote now, hopefully this approach will work ^^
k
Traverse does work now with left-to-right ap (on snapshot) but I still get stackoverflows. I think I need to somehow be able to map without actually invoking the AndThen. I think AndThen#invoke should only happen once at the end of the program.
```kotlin
fun <B> lazyAp(ff: () -> ParserOf<(A) -> B>): Parser<B> = Parser(
    fix().runParser.map { op ->
        op.flatMap { (input, a) ->
            ff().fix().runParser(input).map { (out, f) ->
                out toT f(a)
            }
        }
    })
```
I need to replace runParser(input) but I don't know with what :D
j
Looking at your parser, are you sure it is not a stackoverflow on construction? Because in `jsonArray`, for example, you call `jsonValue` outside of a lazy code block, which invokes it, which then calls `jsonArray` again, etc. Invoking in `lazyAp` could be fine for most things. I got my parser running now (had to switch the encoding + usage of `AndThen`). It parses long-ish input and kept the exact same api. Not sure if it's stacksafe yet; probably need to do an extreme test (maybe with writing a json parser with it ^-^). I'll probably move it to a repo in a few days so I can work on it independent from propCheck; for now it is here: https://github.com/1Jajen1/propCheck/tree/rewrite-structure/src/main/kotlin/propCheck/pretty/parse The encoding is now a bit easier to read, but it is still a lot of code + nice error handling makes the code complex in a few places.
arrow 1
k
According to the stacktrace it does not seem to be the case: https://gist.github.com/kartoffelsup/fa7130b53ef1ae4b6c8c0b9a7ef0474e Maybe the 180mb Json is a bit rough πŸ˜„ but I think it is due to lazyAp invoking the AndThen, which imho should only happen once in the entire app? But not sure, haven't used the AndThen construct before and am not really familiar with it. I'm currently on this branch: https://github.com/kartoffelsup/haskell-inspired-arrow-json-parser/tree/soe
j
True... Do you have a link to that json? Would love to test my version against it...
k
https://github.com/zemirco/sf-city-lots-json/blob/master/citylots.json (I would recommend rightclick+save instead of opening it in the browser :D) e: nvm, github is smart :)
πŸ‘ 1
j
Ran out of heap-space first ^-^ Probably need something similar to your StringView ... Good thing the input is abstract
πŸ˜„ 1
Well it does run, it is probably even stacksafe. However it fails for some input after consuming like 300 chars
I did also add some support for floats and negative numbers, but I am not sure if it works correctly...
It does parse, the parser works and it is stacksafe (I think after consuming 4k chars that is safe to assume). Now if it would finish it would be great πŸ˜„ It's way too slow for that amount of json; the problem is probably gc pressure, or the encoding does not play nice on the jvm. For your parser, I still think you are in some construction overflow, where you run your parser, it reaches the json array parser and directly recurses without consuming input. With lazyAp and AndThen this may not be immediate, but it still overflows. Thing is, your encoding is equal to `StateT<ForOption, StringView, A>`, which (according to arrow's tests) is stacksafe; that is why I think your parser should be able to do that as well. `StateT<M, S, A> = (S) -> Kind<M, Tuple2<S, A>> => (StringView) -> Option<Tuple2<StringView, A>>`. So technically you could just have `data class Parser<A>(val unParser: StateT<ForOption, StringView, A>)` and just use all the methods from `StateT` (I think it even has an `Alternative` implementation). But still, this won't help if your parser recurses before consuming input, because it will never short-circuit in that case.
Well nvm, solved the slowness problem and back are the stackoverflows. It's now super fast, but also overflows. Since our datatypes are isomorphic to `StateT` (even the uses of `AndThen`), this could mean that long `StateT` computations are also stack-unsafe with the wrong monad. Because my parser is monadic I can use a stacksafe monad to make it stacksafe again. `Eval`, `Trampoline`, `IO` all work. Your parser is a bit lost in that regard. All you can do is model it as `(S) -> Kind<M, Option<Tuple2<S, A>>>` where `M` is some stacksafe monad. This also means that `State` in arrow is stack-unsafe because it is defined with `Id`. Ticket time ^^
k
Thanks for the analysis! This makes me wonder how Haskell does it :D
j
by optimizing everything to a minimal runtime (no needless wrappers, just functions and values); ghc also executes code at runtime differently. Basically when you call a function you only get a thunk that will eventually be evaluated; this creates no stackframe and does no work at all. When evaluating a thunk ghc does indeed use a call stack, but since this usually continues lazily that won't be deep at all. You can actually overflow the stack in haskell afaik, but you have to try really really hard. So in short, laziness + smart execution of thunks = stacksafety in haskell. Sadly this is a concept that comes at a perf cost when used on the jvm πŸ˜• Arrow does stacksafety mostly by doing two things: reifying whatever function call you have to data that's on the heap, and counting stackframes to 127 (and then putting the rest on the heap). That will then be evaluated inside a loop, which basically never blows the stack. Sadly this is not quite as flexible
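The "reify calls to the heap and run them in a loop" part can be sketched as a bare-bones trampoline (the names here are illustrative, not Arrow's actual types):

```kotlin
// Reify a computation as heap data: either a finished value or a thunk for the next step.
sealed class Trampoline<A> {
    data class Done<A>(val value: A) : Trampoline<A>()
    data class More<A>(val next: () -> Trampoline<A>) : Trampoline<A>()
}

// Interpret in a loop: constant stack depth regardless of recursion depth.
fun <A> Trampoline<A>.runT(): A {
    var cur: Trampoline<A> = this
    while (true) {
        when (val c = cur) {
            is Trampoline.Done -> return c.value
            is Trampoline.More -> cur = c.next()
        }
    }
}

// Mutually recursive parity check; would blow the stack if written directly.
fun isEven(n: Int): Trampoline<Boolean> =
    if (n == 0) Trampoline.Done(true) else Trampoline.More { isOdd(n - 1) }

fun isOdd(n: Int): Trampoline<Boolean> =
    if (n == 0) Trampoline.Done(false) else Trampoline.More { isEven(n - 1) }
```

A million-deep recursion runs fine because each "call" is just an allocation plus one loop iteration, never a stack frame.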
πŸ’― 1
k
I'm trying it with Eval now, it's been running for 90 minutes πŸ˜„ but at least no stackoverflows yet. e: alright, I canceled it. Way too slow :D
j
I was using trampoline because I think that's slightly faster, but it still took too long. It's for two reasons tho: trampoline/eval is slow, but more importantly, building general-purpose parser combinators and parsing huge amounts of json is not the right thing to do. Specialized parsers will always be faster in that regard. Combinators are mostly useful for complex small-to-large datasets and, most importantly, easy-to-write parsers. Huge datasets and crazy speeds are hard to do with combinators
k
Yeah, I mean who sends 180mb of json around anyway :D parsing 11mb of json with almost no deep nesting took 1 minute
j
You can probably optimize the json parser itself a bit for some cheap speedups, but tbh for such a small library that is pretty decent I'd say.
Btw if you are interested: https://github.com/1Jajen1/Kparsec is the full thing now; the combinators are basically done. It just lacks a few small things to be called ready. Atm I am doing error reporting (it accumulates errors just fine atm, I am just working on a pretty printer for it, but https://github.com/1Jajen1/kotlin-pretty has some weird issues around larger docs ^^) Since I won't get this done this holiday week, I am aiming for end of February to get this to a presentable state, same with the pretty printer, and after that is a 1.0 release of propCheck ^^ Also the whole thing is like 800 loc, and that is mostly kotlin/arrow boilerplate. Crazy how little code an almost fully featured parser-combinator library can be; haskell megaparsec is even less.
arrow 1