Filipe Duarte
04/16/2021, 4:06 PM
altavir
04/16/2021, 5:32 PM
Filipe Duarte
04/16/2021, 5:34 PM
altavir
04/16/2021, 5:35 PM
Filipe Duarte
04/16/2021, 5:36 PM
Ролан
04/16/2021, 6:21 PM
Filipe Duarte
04/16/2021, 6:32 PM
altavir
04/16/2021, 6:33 PM
Filipe Duarte
04/16/2021, 6:35 PM
altavir
04/16/2021, 6:36 PM
Ролан
04/16/2021, 7:07 PM
kdb. If you don't want to learn q, for the kind of data you are talking about, you can go a long way with python combining numpy, pandas, numba and pytorch. For data storage you will be fine with h5 or pt from pytorch. Spark was not designed to deal with this kind of data. You should avoid databases like cassandra as well.
Filipe Duarte
04/16/2021, 8:32 PM
Ролан
04/16/2021, 9:32 PM
Filipe Duarte
04/16/2021, 9:35 PM
Ролан
04/16/2021, 9:49 PM
q to be honest, but you can indeed work with pyq to call q from python, or vice versa use embedpy to call python from q.
.h5 is only 2 times faster than using pytorch .pt, but reading from pytorch .pt is approximately 5 times faster than reading .h5 files
Filipe Duarte
04/16/2021, 9:55 PM
kdb+ and store data using h5 .. than using pytorch to load this data?
Ролан
04/16/2021, 9:56 PM
kdb will be much faster than all of this; I was talking about pure python/C++ here
Filipe Duarte
04/16/2021, 9:58 PM
Ролан
04/16/2021, 9:59 PM
numpy is so useful in data analysis
Filipe Duarte
04/16/2021, 10:03 PM
Ролан
04/16/2021, 10:07 PM
Filipe Duarte
04/16/2021, 10:17 PM
Ролан
04/17/2021, 6:49 AM
Filipe Duarte
04/17/2021, 1:35 PM
altavir
04/17/2021, 1:40 PM
Ролан
04/17/2021, 1:51 PM
altavir
04/17/2021, 1:55 PM
Ролан
04/17/2021, 1:57 PM
altavir
04/17/2021, 1:59 PM
Ролан
04/17/2021, 2:43 PM
altavir
04/17/2021, 2:50 PM
Filipe Duarte
04/18/2021, 3:26 AM
altavir
04/18/2021, 5:58 AM
Filipe Duarte
04/19/2021, 1:56 PM
Ролан
04/19/2021, 4:52 PM
kdb website. That stack was specifically designed to help people solve the problems you are working on.
Filipe Duarte
04/19/2021, 6:28 PM
kdb+
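
The h5 versus pytorch .pt speed figures quoted earlier in the thread depend on the data and the machine, so it may be worth measuring them yourself. Below is a minimal sketch of how that could be done, not the setup anyone in the thread used. It assumes h5py and torch are installed (missing libraries are skipped), and the dataset name "ticks", the array shape, and all function names are made up for illustration.

```python
import os
import tempfile
import time


def time_roundtrip(write, read, path, repeats=3):
    """Return the best (write_seconds, read_seconds) over `repeats` runs."""
    best_write = best_read = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        write(path)
        best_write = min(best_write, time.perf_counter() - start)

        start = time.perf_counter()
        read(path)
        best_read = min(best_read, time.perf_counter() - start)
    return best_write, best_read


def h5_backend(data):
    """Write/read callables for HDF5 via h5py ("ticks" is a hypothetical name)."""
    import h5py

    def write(path):
        with h5py.File(path, "w") as f:
            f.create_dataset("ticks", data=data)

    def read(path):
        with h5py.File(path, "r") as f:
            return f["ticks"][:]

    return write, read


def pt_backend(data):
    """Write/read callables for PyTorch's .pt format."""
    import torch

    tensor = torch.from_numpy(data)

    def write(path):
        torch.save(tensor, path)

    def read(path):
        return torch.load(path)

    return write, read


def compare_backends(rows=200_000, cols=4):
    """Time each installed backend on synthetic data; skip missing libraries."""
    import numpy as np

    data = np.random.default_rng(0).standard_normal((rows, cols))
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, factory in (("h5", h5_backend), ("pt", pt_backend)):
            try:
                write, read = factory(data)
            except ImportError:
                continue  # library not installed; skip this backend
            path = os.path.join(tmp, "ticks." + name)
            results[name] = time_roundtrip(write, read, path)
    return results


if __name__ == "__main__":
    try:
        for name, (w, r) in compare_backends().items():
            print(f"{name}: write {w:.3f}s, read {r:.3f}s")
    except ImportError:
        print("numpy is not installed; nothing to compare")
```

The timings come from synthetic random data on a local disk, so treat them only as a rough guide; real tick data, compression settings, and chunking can shift the ratio in either direction.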