Let's say we have this structure:

[StructLayout(LayoutKind.Explicit, Size=8)] // using System.Runtime.InteropServices;
public struct AirportHeader {
    [FieldOffset(0)]
    [MarshalAs(UnmanagedType.I4)]
    public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }
    [FieldOffset(4)]
    [MarshalAs(UnmanagedType.I4)]
    public int Offset;
}

What I want: direct access to both the string and the int value of the Ident field, without breaking the 8-byte size of the structure and without having to compute the string from the int on every access.

Keeping Ident as an int is useful because I can quickly check whether it matches other idents. Those may come from data unrelated to this structure, but they are stored in the same int format.
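
A minimal sketch of that comparison, assuming idents are ASCII and space-padded to 4 characters. `IdentCodec.Pack` is a hypothetical helper introduced only for illustration; it packs the characters in the machine's native byte order, so both sides of a comparison must be packed the same way.

```csharp
using System;
using System.Text;

public static class IdentCodec // hypothetical helper, not from the original code
{
    // Packs up to 4 ASCII characters into an int (space-padded, native byte order).
    public static int Pack(string ident)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(ident.PadRight(4).Substring(0, 4));
        return BitConverter.ToInt32(bytes, 0);
    }
}

public static class Program
{
    public static void Main()
    {
        int a = IdentCodec.Pack("FIMP");
        int b = IdentCodec.Pack("FIMP");
        int c = IdentCodec.Pack("FMEE");
        Console.WriteLine(a == b); // True: one integer comparison, no string involved
        Console.WriteLine(a == c); // False
    }
}
```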

Question: is there a way to define a field that is not part of the structure layout? Something like:

[StructLayout(LayoutKind.Explicit, Size=8)]
public struct AirportHeader {
    [FieldOffset(0)]
    [MarshalAs(UnmanagedType.I4)]
    public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }
    [FieldOffset(4)]
    [MarshalAs(UnmanagedType.I4)]
    public int Offset;
    
    [NoOffset()] // <- hypothetical attribute: is there anything like this?
    string _identStr;
    public string IdentStr {
        get { // EDIT ! missed the getter on this property
            if (string.IsNullOrEmpty(_identStr)) _identStr =
                System.Text.Encoding.ASCII.GetString(Ident.GetBytes());
            // do the above only once. May use an extra private bool field to go faster.
            return _identStr;
        }
    }
}

PS: I use pointers ('*' and '&', unsafe) because I need to deal with endianness (local system, binary files/file formats, network), fast type conversions, and fast array filling. I also use many flavours of Marshal methods (fixing structures on byte arrays), and a little P/Invoke and COM interop. Too bad some assemblies I'm dealing with don't have a .NET counterpart yet.


TL;DR: details only

The question above is all there is to it; I just don't know the answer. The following should answer most questions like "other approaches" or "why not do this instead", but it can be ignored if the answer is straightforward. Anyway, I'm putting everything up front so it's clear from the start what I am trying to do. :)

Options/workarounds I'm currently using (or considering):

  1. Create a getter (not a field) that computes the string value each time:

    public string IdentStr {
        get { return System.Text.Encoding.ASCII.GetString(Ident.GetBytes()); }
        // where GetBytes() is an extension method that converts an int to byte[]
    }
    

    This approach does the job but performs poorly. The GUI displays aircraft from a database of default flights and injects other flights from the network, with a refresh rate of one second (I should increase that to 5 seconds). I have around 1200 flights within an area, relating to 2400 airports (departure and arrival), which means 2400 calls to the above code each second to display the ident in a DataGrid.

  2. Create another struct (or class) whose only purpose is to manage data on the GUI side, when not reading from or writing to a stream or file. That means: read the data with the explicit-layout struct, create another struct with the string version of the field, then work with the GUI. That would perform better overall, but in the process of defining structures for the game binaries I'm already at 143 structures of this kind (just for older versions of the game data; there are a bunch I haven't written yet, and I plan to add structures for the newest data types). At the moment, more than half of them require one or more extra fields to be of meaningful use. That would be fine if I were the only one using the assembly, but other users will probably get lost among AirportHeader, AirportHeaderEx, AirportEntry, AirportEntryEx, AirportCoords, AirportCoordsEx... I would rather avoid that.

  3. Optimize option 1 so the computation runs faster (thanks to SO, there are a bunch of ideas to look at; I'm currently working on this). For the Ident field, I guess I could use pointers (and I will). I'm already doing that for fields I must display in little endian and read/write in big endian. There are other values, like 4x4 grid information packed into a single 64-bit integer (ulong), that need bit shifting to expose the actual values. The same goes for GUIDs and object pitch/bank/yaw.

  4. Try to take advantage of overlapping fields (under study). That would work for GUIDs. It might work for the Ident example if MarshalAs can constrain the value to an ASCII string; then I just need to specify the same FieldOffset, '0' in this case. But I'm unsure whether setting the field value (entry.FieldStr = "FMEP";) actually applies the marshalling constraint on the managed side. My understanding is that the string is stored as Unicode on the managed side (?). Furthermore, that wouldn't work for packed bits (bytes that contain several values, or consecutive bytes hosting values that have to be bit shifted). I believe it is impossible to specify value position, length and format at bit level.
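
For option 4, a minimal sketch of overlapping value-type fields, assuming the ident is 4 single-byte ASCII characters. The struct name `AirportHeaderOverlay` and the byte fields `C0`..`C3` are hypothetical. This overlaps bytes, not a managed string, so the string is still built on each access; it only shows that extra views of the same storage keep the struct at 8 bytes.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit, Size = 8)]
public struct AirportHeaderOverlay // hypothetical name
{
    [FieldOffset(0)] public int Ident;

    // Four byte fields sharing the same storage as Ident.
    [FieldOffset(0)] public byte C0;
    [FieldOffset(1)] public byte C1;
    [FieldOffset(2)] public byte C2;
    [FieldOffset(3)] public byte C3;

    [FieldOffset(4)] public int Offset;

    // Builds the string from the overlapped bytes (allocates on every call).
    public string IdentStr
    {
        get { return new string(new[] { (char)C0, (char)C1, (char)C2, (char)C3 }); }
    }
}

public static class Program
{
    public static void Main()
    {
        var h = new AirportHeaderOverlay();
        h.C0 = 0x46; h.C1 = 0x49; h.C2 = 0x4D; h.C3 = 0x50; // "FIMP"

        Console.WriteLine(h.IdentStr); // FIMP
        // Writing the bytes also changed Ident, because they share storage.
        Console.WriteLine(h.Ident == BitConverter.ToInt32(new byte[] { 0x46, 0x49, 0x4D, 0x50 }, 0)); // True
        Console.WriteLine(Marshal.SizeOf(typeof(AirportHeaderOverlay))); // 8
    }
}
```

Overlaps like this work at byte granularity only, so they do not help with values packed at bit level.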

Why bother? Context:

I'm defining a bunch of structures to parse game-related binary data from byte arrays (IO.File.ReadAllBytes) or streams, and to write it back. Application logic should use the structures to quickly access and manipulate the data on demand. The assembly is expected to read, validate, edit, create and write, both outside the scope of the game (addon building, control) and inside it (API, live modding or monitoring). Another purpose is to understand the content of the binaries (hex) and use that understanding to build what's missing in the game.

The purpose of the assembly is to provide ready-to-use base components for a C# addon contributor (I don't plan to make the code portable): creating applications for the game, or processing an addon from source to compilation into game binaries. It's nice to have a class that loads the entire content of a file into memory, but some contexts require you not to do that and to retrieve from the file only what is necessary, hence the choice of the struct pattern.

I need to figure out the trust and legal issues (copyrighted data), but that's outside the scope of the main concern. If that matters, Microsoft has over the years provided public, freely accessible SDKs exposing the binary structures of previous versions of the game, for the very purpose of what I'm doing (I'm not the first and probably not the last to do so). Still, I wouldn't dare expose undocumented binaries (for the latest game data, for instance), nor facilitate a copyright breach on copyrighted materials/binaries.

I'm just asking for confirmation of whether or not there is a way to have private fields that are not part of the structure layout. My naive belief at the moment is "that's impossible, but there are workarounds". My C# experience is pretty sparse, so maybe I'm wrong, which is why I ask. Thanks!


As suggested, there are several ways to get the job done. Here are the getters/setters I came up with inside the structure (in these snippets Ident is declared as UInt32 rather than int). I'll measure how each one performs in various scenarios later. The dictionary approach is very appealing because in many scenarios I would need a directly accessible global database of airports (59000 of them) with runways and parking spots (not just the Ident), but a fast check between struct fields is also interesting.

    public string IdentStr_Marshal {
        get {
            var output = "";
            GCHandle pinnedHandle = default(GCHandle); // initialized to avoid CS0165 (C# 5)
            try { // Fast if no exception, (very) slow if exception thrown
                pinnedHandle = GCHandle.Alloc(this, GCHandleType.Pinned);
                IntPtr structPtr = pinnedHandle.AddrOfPinnedObject();
                output = Marshal.PtrToStringAnsi(structPtr, 4);
                // Cannot use UTF8 because the assembly should work in Framework v4.5
            } finally { if (pinnedHandle.IsAllocated) pinnedHandle.Free(); }
            return output;
        }
        set {
            value = value.PadRight(4);  // Must fill the blanks; PadRight returns a new string - initial while loop replaced (Charlieface's)
            IntPtr intValuePtr = IntPtr.Zero;
            // Cannot use UTF8 because some users are on Win7 with FlightSim 2004
            try { // Put a try as a matter of habit, but not convinced it's gonna throw.
                intValuePtr = Marshal.StringToHGlobalAnsi(value);
                Ident = Marshal.ReadInt32(intValuePtr, 0).BinaryConvertToUInt32(); // Extension method to convert type.
            } finally { Marshal.FreeHGlobal(intValuePtr); } // frees the buffer allocated by StringToHGlobalAnsi
        }
    }
    
    public unsafe string IdentStr_Pointer {
        get {
            string output = "";
            fixed (UInt32* ident = &Ident) { // Fixing the field
                sbyte* bytes = (sbyte*)ident;
                output = new string(bytes, 0, 4, System.Text.Encoding.ASCII); // Encoding added (@Charlieface)
            }
            return output;
        }
        set {
            // value must not exceed a length of 4 and must be in Ansi [A-Z,0-9,whitespace 0x20].
            // value validation at this point occurs outside the structure.
            fixed (UInt32* ident = &Ident) { // Fixing the field
                byte* bytes = (byte*)ident;
                byte[] asciiArr = System.Text.Encoding.ASCII.GetBytes(value);
                if (asciiArr.Length >= 4) // (asciiArr.Length == 4) would also work
                    for (Int32 i = 0; i < 4; i++) bytes[i] = asciiArr[i];
                else {
                    for (Int32 i = 0; i < asciiArr.Length; i++) bytes[i] = asciiArr[i];
                    for (Int32 i = asciiArr.Length; i < 4; i++) bytes[i] = 0x20;
                }
            }
        }
    }
    
    static Dictionary<UInt32, string> ps_dict = new Dictionary<UInt32, string>();
    
    public string IdentStr_StaticDict {
        get {
            string output; // logic update with TryGetValue (@Charlieface)
            if (ps_dict.TryGetValue(Ident, out output)) return output;
            output = System.Text.Encoding.ASCII.GetString(Ident.ToBytes(EndiannessType.LittleEndian));
            ps_dict.Add(Ident, output);
            return output;
        }
        set { // input can be "FMEE", "DME" or "DK". length of 2 characters is the minimum.
            var bytes = new byte[4]; // Need to convert value to a 4 byte array
            byte[] asciiArr = System.Text.Encoding.ASCII.GetBytes(value); // should be 4 bytes or less
            // Put the valid ASCII codes in the array.
            if (asciiArr.Length >= 4) // (asciiArr.Length == 4) would also work
                for (Int32 i = 0; i < 4; i++) bytes[i] = asciiArr[i];
            else {
                for (Int32 i = 0; i < asciiArr.Length; i++) bytes[i] = asciiArr[i];
                for (Int32 i = asciiArr.Length; i < 4; i++) bytes[i] = 0x20;
            }
            Ident = BitConverter.ToUInt32(bytes, 0); // Set structure int value
            if (!ps_dict.ContainsKey(Ident)) // Add if missing
                ps_dict.Add(Ident, System.Text.Encoding.ASCII.GetString(bytes));
        }
    }

2 Answers

As mentioned by others, it is not possible to exclude a field from a struct for marshalling.

You also cannot use a pointer as a string in most places.

If the number of different possible strings is relatively small (and it probably will be, given it's only 4 characters), then you could use a static Dictionary<int, string> as a kind of string-interning mechanism.

Then you write a property to add/retrieve the real string.

Note that dictionary access is O(1) on average, and hashing an int just returns the int itself, so it will be very, very fast, but it will take up some memory.

[StructLayout(LayoutKind.Explicit, Size=8)]
public struct AirportHeader
{
    [FieldOffset(0)]
    [MarshalAs(UnmanagedType.I4)]
    public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }

    [FieldOffset(4)]
    [MarshalAs(UnmanagedType.I4)]
    public int Offset;
    

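    // Static fields are not part of the instance layout, so the struct stays 8 bytes.
    // Requires: using System.Collections.Generic; using System.Text;
    static Dictionary<int, string> _identStrings = new Dictionary<int, string>();

    // "out var" needs C# 7 or later; GetBytes() is the asker's int-to-byte[] extension method.
    public string IdentStr =>
        _identStrings.TryGetValue(Ident, out var ret) ? ret :
            (_identStrings[Ident] = Encoding.ASCII.GetString(Ident.GetBytes()));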
}
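
As a hedged check of the idea above, the sketch below confirms that a static cache does not change the marshalled size. It is written in C# 5 syntax, since the question mentions that compiler version. `CachedHeader` is a hypothetical name, and `BitConverter.GetBytes` stands in for the asker's `GetBytes()` extension, which is an assumption about what that extension does.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Text;

[StructLayout(LayoutKind.Explicit, Size = 8)]
public struct CachedHeader // hypothetical name
{
    [FieldOffset(0)] public int Ident;
    [FieldOffset(4)] public int Offset;

    // Static: shared by all instances and stored outside the struct.
    // Dictionary is not thread-safe; add a lock if several threads read IdentStr.
    static readonly Dictionary<int, string> Cache = new Dictionary<int, string>();

    public string IdentStr
    {
        get
        {
            string s;
            if (!Cache.TryGetValue(Ident, out s))
            {
                // Decode once per distinct ident, then reuse the same string.
                s = Encoding.ASCII.GetString(BitConverter.GetBytes(Ident));
                Cache[Ident] = s;
            }
            return s;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var h = new CachedHeader();
        h.Ident = BitConverter.ToInt32(Encoding.ASCII.GetBytes("FIMP"), 0);

        Console.WriteLine(Marshal.SizeOf(typeof(CachedHeader)));    // 8
        Console.WriteLine(h.IdentStr);                              // FIMP
        Console.WriteLine(ReferenceEquals(h.IdentStr, h.IdentStr)); // True: cached instance
    }
}
```

An instance field without a FieldOffset attribute would not compile in an explicit-layout struct (error CS0625), which is why the cache has to be static or live outside the struct.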
Answered 2021-07-05T19:20:00.160

This is not possible because a structure must contain all of its values in a specific order. Usually this order is controlled by the CLR itself. If you want to change the order of the data, you can use StructLayout. However, you cannot exclude a field, or that data would simply not exist in memory.

Instead of a string (which is a reference type) you can use a pointer to point directly to that string and use that in your structure in combination with the StructLayout. To get this string value, you can use a get-only property that reads directly from unmanaged memory.

Answered 2021-07-05T18:34:41.080