
I have two files. The first contains the data schema; the second contains the data.

The schema describes the elements in a row:

+--------------------------+-------------+-------------+...+-------------+-------------+
| Count of elements in row | Type (el#1) | Size (el#1) |   | Type (el#N) | Size (el#N) |
+--------------------------+-------------+-------------+...+-------------+-------------+
|           u32            |    i32      |     u32     |   |     i32     |     u32     |
+--------------------------+-------------+-------------+...+-------------+-------------+
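
To make the layout above concrete, here is a minimal sketch that reads it with the standard library only. It assumes little-endian byte order (which, as far as I know, is what the top-level bincode 1.x functions use for fixed-width integers). SchemaEntry and parse_schema are hypothetical names, not part of the original code.

```rust
use std::convert::TryInto;

/// One schema entry: a type code followed by the element size in bytes.
#[derive(Debug, PartialEq)]
struct SchemaEntry {
    type_code: i32,
    size: u32,
}

/// Reads a u32 element count, then that many (i32 type, u32 size) pairs.
/// Assumption: all integers are little-endian.
/// Returns None if the buffer is shorter than the count implies.
fn parse_schema(bytes: &[u8]) -> Option<Vec<SchemaEntry>> {
    let count = u32::from_le_bytes(bytes.get(..4)?.try_into().ok()?) as usize;
    let mut entries = Vec::new();
    for i in 0..count {
        // Each entry occupies 8 bytes after the 4-byte count.
        let off = 4 + i * 8;
        let type_code = i32::from_le_bytes(bytes.get(off..off + 4)?.try_into().ok()?);
        let size = u32::from_le_bytes(bytes.get(off + 4..off + 8)?.try_into().ok()?);
        entries.push(SchemaEntry { type_code, size });
    }
    Some(entries)
}
```

Returning Option keeps a truncated schema file from panicking on an out-of-range slice.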

The data file contains a two-dimensional array:

+-----------+--------------+--------------+...+--------------+--------------+...
| Row count | (row 1) el#1 | (row 1) el#2 |   | (row 1) el#N | (row 2) el#1 |
+-----------+--------------+--------------+...+--------------+--------------+...
|    u32    |     sfc      |     sfc      |   |     sfc      |     sfc      |   
+-----------+--------------+--------------+...+--------------+--------------+...
*sfc - size from schema 
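
Since every element has a fixed size taken from the schema, a row has a fixed size too, so the whole buffer can be validated once and then cut into slices. The sketch below is one way to do that under the same little-endian assumption; split_rows is a hypothetical helper, and sizes stands for the per-element sizes read from the schema.

```rust
use std::convert::TryInto;

/// Splits a data buffer into rows of raw element slices.
/// `sizes` holds the per-element byte sizes taken from the schema.
/// Assumption: the leading u32 row count is little-endian.
fn split_rows<'a>(bytes: &'a [u8], sizes: &[usize]) -> Option<Vec<Vec<&'a [u8]>>> {
    let row_count = u32::from_le_bytes(bytes.get(..4)?.try_into().ok()?) as usize;
    let row_size: usize = sizes.iter().sum();

    // Check once that the buffer holds every row, so the indexing below cannot panic.
    let body_len = row_count.checked_mul(row_size)?;
    let body = bytes.get(4..)?.get(..body_len)?;

    let mut rows = Vec::new();
    let mut offset = 0;
    for _ in 0..row_count {
        let mut row = Vec::with_capacity(sizes.len());
        for &size in sizes {
            row.push(&body[offset..offset + size]);
            offset += size;
        }
        rows.push(row);
    }
    Some(rows)
}
```

Each inner slice can then be decoded according to the type code of the matching schema entry.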

I wrote this code and it works, but perhaps there is a more elegant way to solve the problem.

bincode::deserialize(bytes_slice).unwrap() is repeated several times, which looks odd to me.

use serde::Deserialize;

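// bincode reads struct fields in declaration order, so the order here
// follows the schema layout described above: type first, then size.
#[derive(Deserialize)]
struct SchemaUnit {
    m_type: i32,
    size: u32
}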

enum SchemaType {
    U16,
    U32,
    U64,
    F32,
    F64,
    String
}

enum Datum {
    U16(u16),
    U32(u32),
    U64(u64),
    F32(f32),
    F64(f64),
    String(String)
}

impl SchemaType {
    fn from(num: i32) -> Option<Self> {
        match num {
            1 => Some(SchemaType::U16),
            2 => Some(SchemaType::U32),
            3 => Some(SchemaType::U64),
            7 => Some(SchemaType::F32),
            8 => Some(SchemaType::F64),
            19 => Some(SchemaType::String),
            _ => None
        }
    }
}

struct Data;

impl Data {
    fn from(bytes: Vec<u8>, schema: Vec<SchemaUnit>) -> Vec<Vec<Datum>> {
        let mut result: Vec<Vec<Datum>> = vec![];
        let row_count: u32 = bincode::deserialize(&bytes[..4]).unwrap();
        
        let size_of_row: u32 = schema.iter().map(|s| s.size).sum();
        
        let mut start = 4;

        for _ in 0..row_count {
            // Must be mutable: elements are pushed into it below.
            let mut tmp: Vec<Datum> = vec![];
            for schema_unit in schema.iter() {
                let element_size = schema_unit.size as usize;
                let bytes_slice = &bytes[start..start + element_size];
                // TODO: handle types that don't exist
                let m_type = SchemaType::from(schema_unit.m_type).unwrap();
                tmp.push(match m_type {
                    SchemaType::U16 => Datum::U16(bincode::deserialize(bytes_slice).unwrap()),
                    SchemaType::U32 => Datum::U32(bincode::deserialize(bytes_slice).unwrap()),
                    SchemaType::U64 => Datum::U64(bincode::deserialize(bytes_slice).unwrap()),
                    SchemaType::F32 => Datum::F32(bincode::deserialize(bytes_slice).unwrap()),
                    SchemaType::F64 => Datum::F64(bincode::deserialize(bytes_slice).unwrap()),
                    // bytes_to_str is a helper defined elsewhere (not shown here).
                    SchemaType::String => Datum::String(bytes_to_str(bytes_slice))
                });
                start += element_size;
            }
            result.push(tmp);
        }
        result
    }
}
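
One possible way to cut the repetition, sketched under some assumptions: for fixed-width numbers, bincode 1.x's top-level deserialize reads plain little-endian bytes, so the standard from_le_bytes constructors should give the same values without the crate. A small helper converts the slice to an array and the ? operator replaces each unwrap. DecodeError, array and decode are hypothetical names, and treating the string bytes as UTF-8 is an assumption, since bytes_to_str is not shown.

```rust
use std::convert::TryInto;

#[derive(Debug, PartialEq)]
enum Datum { U16(u16), U32(u32), U64(u64), F32(f32), F64(f64), String(String) }

#[derive(Debug, PartialEq)]
enum DecodeError { UnknownType(i32), BadLength, BadUtf8 }

/// Copies a slice into a fixed-size array, failing if the length differs.
fn array<const N: usize>(bytes: &[u8]) -> Result<[u8; N], DecodeError> {
    bytes.try_into().map_err(|_| DecodeError::BadLength)
}

/// Decodes one element from its schema type code and raw bytes.
/// Assumption: numbers are little-endian and strings are UTF-8.
fn decode(type_code: i32, bytes: &[u8]) -> Result<Datum, DecodeError> {
    Ok(match type_code {
        1 => Datum::U16(u16::from_le_bytes(array(bytes)?)),
        2 => Datum::U32(u32::from_le_bytes(array(bytes)?)),
        3 => Datum::U64(u64::from_le_bytes(array(bytes)?)),
        7 => Datum::F32(f32::from_le_bytes(array(bytes)?)),
        8 => Datum::F64(f64::from_le_bytes(array(bytes)?)),
        19 => Datum::String(
            std::str::from_utf8(bytes)
                .map_err(|_| DecodeError::BadUtf8)?
                .to_owned(),
        ),
        other => return Err(DecodeError::UnknownType(other)),
    })
}
```

Matching on the type code directly also covers the TODO about unknown types, because an unrecognised code becomes an error value instead of a panic. If the string fields are padded (for example with NUL bytes), the String arm would need to trim them.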