escape analysis has gotten good enough in recent years that some heap allocations get turned into very cheap stack allocations, but if you're talking about a standard stack like hadoop for mapreduce, you wind up with a lot of boilerplate configs and setup that feels more like production code. my friend said i should check out
h2o.ai and rapidminer, but when i gave him the datasets he couldn't close the loop on any value-add from those tools without additional work. my use cases are pivots, grouping, and time resampling; for keras the last mile has to be numpy, so the sweet spot is doing the transforms with efficient zero-copy operations and parallelization, then structuring the DL models in tensorflow at the end, roughly like the sketch below.
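a minimal sketch of that pipeline in pandas, assuming a long-format dataframe with hypothetical columns `ts`, `sensor`, and `value` (the column names, the toy target, and the tiny model are all placeholders, not a real workload):

```python
import numpy as np
import pandas as pd
import tensorflow as tf

# hypothetical long-format data: one row per (timestamp, sensor) reading
df = pd.DataFrame({
    "ts": pd.date_range("2024-01-01", periods=1000, freq="min").repeat(3),
    "sensor": ["a", "b", "c"] * 1000,
    "value": np.random.rand(3000),
})

# pivot: one column per sensor, indexed by timestamp
wide = df.pivot_table(index="ts", columns="sensor", values="value")

# time resampling / grouping: hourly means per sensor
hourly = wide.resample("1h").mean()

# last mile for keras: hand the block over as a contiguous float32 array
# (to_numpy avoids a copy only when the dtypes already match)
X = hourly.to_numpy(dtype=np.float32)
y = (X.mean(axis=1) > 0.5).astype(np.float32)  # toy binary target

# small dense model in tensorflow/keras on top of the transformed array
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X.shape[1],)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```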