<@U65FK6QNB> Not to bloat the thread. There are se...
# datascience
a
@jimn Not to bloat the thread. There are several qualities of a data storage and data processing format: readability, size, indexed access, operation performance (row-based, column-based, tree-based) etc. It is not possible to have them all simultaniously. If you want fast random access especially in dynamic languages like python (where it is hard to do fixed memory width), you will have a lot of additional indeces. That is why it is so large. You just should not use python pandas for such tasks.