Pandas has a DataFrame.to_msgpack() method for serialising a dataframe to the MessagePack format.
It accepts a file path or a 'buffer-like' object. If neither is provided, it returns the data in a string representation.
My question is: how do I properly save this data to a buffer-like object without saving it as a string first?
#1
string_data = df.to_msgpack() # returns data as string
#2
memory_buffer = memoryview(df.to_msgpack()) # creates a memory view from string
#3
df.to_msgpack('filename.msg') # writes data to a binary file
#4
memory_buffer = memoryview(b'')
df.to_msgpack(memory_buffer, append=True) # would this work?
In scenario 4, df.to_msgpack() requires a buffer-like object, whereas memoryview() requires an input parameter. So one would have to create an 'empty' memory view and then pass this to the to_msgpack() method. Then append the data. Though I wonder if this will lead to artefacts when unpacking the data.
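To make the concern in scenario 4 concrete, here is a minimal sketch using only the standard library. It assumes that a 'buffer-like' object here means something with a write() method, which is my reading rather than something the documentation spells out. The payload bytes are a hypothetical stand-in for the serialised dataframe, not real output from pandas.

```python
import io

# Hypothetical stand-in for the bytes that df.to_msgpack() would produce.
payload = b"\x93\x01\x02\x03"

# A memoryview over an empty bytes object is read-only and fixed at length 0,
# and it has no write() method, so there is nothing to append into.
empty_view = memoryview(b"")
view_is_readonly = empty_view.readonly
view_length = len(empty_view)
view_has_write = hasattr(empty_view, "write")

# io.BytesIO is an in-memory, growable binary buffer with a file-like API.
buffer = io.BytesIO()
written = buffer.write(payload)

# getvalue() copies the contents out as bytes;
# getbuffer() exposes the same contents as a memoryview without copying.
roundtrip = buffer.getvalue()
with buffer.getbuffer() as zero_copy_view:
    view_matches = bytes(zero_copy_view) == payload
```

If that assumption holds, passing an io.BytesIO instance instead of an 'empty' memory view would avoid the intermediate string, and nothing needs to be appended afterwards.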
With scenario 2, is it correct to think that a memory view of a string would be equivalent to a byte-array?
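For scenario 2, the following sketch shows how a memory view and a byte-array differ in plain Python 3. It assumes the returned 'string' is a bytes object; the value of data is a hypothetical placeholder, not real msgpack output.

```python
# Hypothetical placeholder for the bytes returned by df.to_msgpack().
data = b"abc"

# A memoryview is a window onto the existing bytes object: no copy is made,
# and because bytes is immutable the view is read-only.
view = memoryview(data)
view_readonly = view.readonly

# A bytearray is a separate, mutable copy of the same content.
mutable = bytearray(data)
same_content = bytes(view) == bytes(mutable)

# A memoryview over a bytearray is writable and writes through to it.
mutable_view = memoryview(mutable)
mutable_view[0] = ord("x")
mutable_view.release()

# A Python 3 str is text, not bytes-like, so memoryview() rejects it.
try:
    memoryview("abc")
    str_accepted = True
except TypeError:
    str_accepted = False
```

So the two hold the same content, but they are not equivalent: the view over bytes cannot be modified or grown, while the byte-array can.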