# science
y
Hey @altavir! We’ve been talking a little bit on Discourse about DataForge.
a
Hi, I was just writing to you with a proposal to discuss it here.
What specific tasks do you have for remote execution? Is it filtering or something more complicated?
Just for context, we are currently finalizing the results of various work done by JBR interns this summer. The prototype allows passing functions as parameters without interpreting the code or passing compiled bytecode, just by means of a reverse call. I am not sure it is relevant in this case.
y
Sorry for my low latency 🙂 I’m answering between assignments.
The simplest transformation is linear scaling, but I already know from users of previous iterations of this project that it is not enough. Event-wise mathematical transformations also seem relevant, as do windowed operations (looking back at the last X events, or at events in the last Y seconds). For more complex usages I found state machines to be very useful, but this seems to be out of scope. Step functions, on the other hand, will probably also be useful, and maybe later we’ll want to record readings and train classifiers based on them, but that is far in the future ATM.
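A minimal sketch of the three simpler kinds mentioned above (linear scaling, a step function, and a count-based window), assuming readings are plain `Double` values. All names here (`scale`, `step`, `WindowedMean`) are hypothetical, and the time-based window ("last Y seconds") is not covered.

```kotlin
// Hypothetical names throughout; readings are modeled as plain Doubles.

// Linear scaling: multiply every reading by a constant factor.
fun scale(factor: Double): (Double) -> Double = { it * factor }

// Step function: 0.0 below the threshold, 1.0 at or above it.
fun step(threshold: Double): (Double) -> Double = { if (it >= threshold) 1.0 else 0.0 }

// Windowed operation: mean of the last `size` readings (fewer while the window fills).
class WindowedMean(private val size: Int) {
    private val window = ArrayDeque<Double>()

    init {
        require(size > 0) { "window size must be positive" }
    }

    fun push(value: Double): Double {
        window.addLast(value)
        if (window.size > size) window.removeFirst()
        return window.average()
    }
}

fun main() {
    val mean = WindowedMean(size = 3)
    val readings = listOf(1.0, 2.0, 3.0, 4.0)
    println(readings.map(scale(2.0)))        // [2.0, 4.0, 6.0, 8.0]
    println(readings.map(step(2.5)))         // [0.0, 0.0, 1.0, 1.0]
    println(readings.map { mean.push(it) })  // [1.0, 1.5, 2.0, 3.0]
}
```

Scaling and the step function are stateless, while the window keeps state between events, so the two groups may need different handling on the controller side.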
a
But the question is whether those transformations have to be done on-site. I will try to draw a diagram of how we are going to solve it right now for devices.
y
I don’t understand what you mean by “on-site”, can you clarify or give me an example?
a
Sorry, a bit tied up with the beginning of a new teaching year. I will try to produce some pictures by the end of the week.
y
I’ve read about it a little bit. By on-site, do you mean the physical location where the devices are? In that case, yes, it’s both on-site and “real-time”.
a
OK, I finally got some time to draw pictures. Here is what you are trying to do: a stateful device controller with an additional channel for configuration. The communication between the device and the controller is on-site and therefore super-fast. Still, you need to create a custom communication channel for each device.
Now this is the general scheme we are currently aiming at. The idea is that we have a scalable central event bus, and each device, analyzer and user has an input stream and an output stream. So when an event leaves the device, it goes to an analyzer, which has custom logic and can be replaced as a whole, and the analyzer sends the event to the user. This way we won't get real-time analyzer data: throughput is high, but so is latency. In exchange, it is stateless and scalable.
The main question is how real-time "real-time" needs to be, and what the requirements are.
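The event-bus scheme described above can be sketched roughly as follows. This is an in-memory stand-in only, assuming a single process and synchronous delivery. The names `EventBus`, `Event`, `attachAnalyzer` and the topic strings `device.out` / `user.in` are hypothetical and not taken from DataForge.

```kotlin
// Hypothetical in-memory stand-in for the central event bus; a real bus would be networked.
data class Event(val topic: String, val value: Double)

class EventBus {
    private val subscribers = mutableMapOf<String, MutableList<(Event) -> Unit>>()

    fun subscribe(topic: String, handler: (Event) -> Unit) {
        subscribers.getOrPut(topic) { mutableListOf() }.add(handler)
    }

    fun publish(event: Event) {
        // toList() takes a snapshot so handlers may subscribe while we iterate.
        subscribers[event.topic].orEmpty().toList().forEach { it(event) }
    }
}

// Stateless analyzer: reads the device's output stream, writes to the user's input stream.
// It can be replaced as a whole by attaching a different `transform`.
fun attachAnalyzer(bus: EventBus, transform: (Double) -> Double) {
    bus.subscribe("device.out") { event ->
        bus.publish(Event("user.in", transform(event.value)))
    }
}

fun main() {
    val bus = EventBus()
    val received = mutableListOf<Double>()
    bus.subscribe("user.in") { received.add(it.value) }
    attachAnalyzer(bus) { it * 2.0 }
    bus.publish(Event("device.out", 21.0))
    println(received) // [42.0]
}
```

In a networked version every hop through the bus adds latency, which is why the latency budget matters for choosing between this scheme and an on-site controller.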
y
ok, I see. So we’re working on tangible interactions, meaning that our latency needs to be low enough that users won’t feel it. I don’t have exact numbers and I guess it might be different from device to device (our auditory system, for example, is more sensitive to varied latency), but anything under 50ms will be good, and under 20ms will be f*ing amazing. For some interactions higher latency might not be a problem, but probably not much higher than 100ms.
a
Well, the standard network latency is about 20 ms, and a cold script start could be worse. So I do not think that sending scripts to be compiled on-site is a good idea. You probably need some kind of serializable transformations.
y
Will scripts be compiled on every execution, or could there be a way to cache that? The script will change very rarely during an interaction.
a
Scripts could be pre-compiled, and then the startup time will be small. Still, I would use that only if everything else fails.
y
I see. So the DataForge approach is to write modules that can be wired together through the user application using some serialized representation?
a
Yes, since we are building a highly customizable and scalable framework, we can't rely on scripts. We use an event-based approach. Actually, the same goes for the polyglot communication solutions we are working on right now. Passing bytecode or scripts is just not reliable.
y
ok! Thanks, I’m convinced! And I hope that I’ll get the time to test DataForge soon.
a
It is still at the prototype stage, which is why we need all the input we can get.
Right now I recommend creating a serializable form for your transformation, like
sealed class Transformation

// Linear scaling: multiply each reading by `scale`.
class Scale(val scale: Double) : Transformation()
And pass it via regular JSON to your device controller.
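A minimal sketch of what that JSON round trip might look like. The wire format `{"type":"scale","scale":2.0}` and the helpers `toJson` / `fromJson` are assumptions made for illustration. The encoding is hand-rolled to keep the example dependency-free, and a real project would more likely use a JSON library.

```kotlin
// Hypothetical wire format: {"type":"scale","scale":2.0}
sealed class Transformation

data class Scale(val scale: Double) : Transformation()

// Encode a transformation as a small JSON object tagged with its type.
fun toJson(t: Transformation): String = when (t) {
    is Scale -> """{"type":"scale","scale":${t.scale}}"""
}

// Accepts only the exact shape produced by toJson (whitespace aside); finite numbers only.
private val scalePattern = Regex(
    """\{\s*"type"\s*:\s*"scale"\s*,\s*"scale"\s*:\s*(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)\s*\}"""
)

// Decode on the device-controller side; unknown shapes are rejected rather than guessed.
fun fromJson(json: String): Transformation {
    val match = scalePattern.matchEntire(json.trim())
        ?: throw IllegalArgumentException("Unsupported transformation: $json")
    return Scale(match.groupValues[1].toDouble())
}

fun main() {
    val wire = toJson(Scale(2.0))
    println(wire)            // {"type":"scale","scale":2.0}
    println(fromJson(wire))  // Scale(scale=2.0)
}
```

Only data crosses the wire here, never code, so the controller decides what each transformation type actually does.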
y
and somewhere else I will have a parser that builds computable expressions from that?
a
Yes, something like that. If you have only a limited number of transformations, it works best. Of course, if you need to introduce new types of transformations without changing the client code, it will be much more complicated.
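A rough sketch of the "build computable expressions" step, assuming the transformations have already been decoded into objects. `Offset`, `Chain`, `evaluate` and `compile` are hypothetical names added for illustration, and only `Scale` comes from the discussion above.

```kotlin
sealed class Transformation

data class Scale(val scale: Double) : Transformation()
data class Offset(val offset: Double) : Transformation()              // hypothetical second kind
data class Chain(val steps: List<Transformation>) : Transformation()  // hypothetical composition

// Interpret a transformation against one reading. The `when` is exhaustive over the
// sealed hierarchy, so adding a new subclass fails to compile until it is handled here.
fun evaluate(t: Transformation, x: Double): Double = when (t) {
    is Scale -> x * t.scale
    is Offset -> x + t.offset
    is Chain -> t.steps.fold(x) { acc, step -> evaluate(step, acc) }
}

// Wrap the interpreter into a reusable function value.
fun compile(t: Transformation): (Double) -> Double = { x -> evaluate(t, x) }

fun main() {
    val pipeline = compile(Chain(listOf(Scale(2.0), Offset(1.0))))
    println(pipeline(3.0)) // 7.0
}
```

The exhaustive `when` is what makes a closed, limited set of transformations easy to handle, and it is also why new transformation types cannot be introduced without touching this code.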
y
but I think for now it would be enough