Add necessary APIs to interact with synthetic state components #136

russell-islam · 2024-03-27T21:44:29Z

Summary of the PR

*This PR adds necessary structs, APIs, unit tests to save and restore synthetic state components *

Requirements

Before submitting your PR, please make sure you addressed the following
requirements:

All commits in this PR have Signed-Off-By trailers (with
git commit -s), and the commit message has max 60 characters for the
summary and max 75 characters for each description line.
All added/changed functionality has a corresponding unit/integration
test.
All added/changed public-facing functionality has entries in the "Upcoming
Release" section of CHANGELOG.md (if no such section exists, please create one).
Any newly added unsafe code is properly documented.

NunoDasNeves · 2024-03-27T22:02:39Z

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state.
I'm not sure how much code change it would require on the CLH side...

Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

russell-islam · 2024-03-28T00:27:28Z

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side...

Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

I see your point. That would unify the API but for readability and API differentiation I would prefer the current approach. I was thinking the same as yours then I looked into hvlite code. hvlite also has different api and struct.

For now we have three states that are similar (xsave, simp, and sief). For synthetic timers we do want keep it separate.

@liuw @jinankjain any thoughts. I am open to anything that we can agree.

mshv-ioctls/src/ioctls/vcpu.rs

mshv-bindings/src/regs.rs

mshv-ioctls/src/ioctls/vcpu.rs

jinankjain · 2024-03-28T16:27:01Z

In general, I had one more question with this serialization/deserialization logic? Is there a way to versionize these structs, if some of these fields in these structs changes in future? Is there a way to downgrade and upgrade between these versions?

russell-islam · 2024-03-28T18:46:03Z

In general, I had one more question with this serialization/deserialization logic? Is there a way to versionize these structs, if some of these fields in these structs changes in future? Is there a way to downgrade and upgrade between these versions?

No way at the moment to versionalize these structs. Even KVM does not have anything.

liuw · 2024-03-28T22:41:56Z

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side...
Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

I see your point. That would unify the API but for readability and API differentiation I would prefer the current approach. I was thinking the same as yours then I looked into hvlite code. hvlite also has different api and struct.

For now we have three states that are similar (xsave, simp, and sief). For synthetic timers we do want keep it separate.

@liuw @jinankjain any thoughts. I am open to anything that we can agree.

I don't have a clear answer to this. The downside of the fixed size buffers is it unnecessarily increases the amount of data transferred a lot, since most of the states will be far smaller than 4K. Imagine in the future you want to migrate a VM with hundreds VCPUs. That adds up quickly.

russell-islam · 2024-03-28T23:20:17Z

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side...
Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

I see your point. That would unify the API but for readability and API differentiation I would prefer the current approach. I was thinking the same as yours then I looked into hvlite code. hvlite also has different api and struct.
For now we have three states that are similar (xsave, simp, and sief). For synthetic timers we do want keep it separate.
@liuw @jinankjain any thoughts. I am open to anything that we can agree.

I don't have a clear answer to this. The downside of the fixed size buffers is it unnecessarily increases the amount of data transferred a lot, since most of the states will be far smaller than 4K. Imagine in the future you want to migrate a VM with hundreds VCPUs. That adds up quickly.

There will be no unnecessary data. Nuno's suggestion is to unify three state components, all will be 4k buffer. API will have an extra parameter that indicates the type of the state(xsave, simp or sief). API will accept/return a 4k buffer always. @NunoDasNeves Please correct me if I am wrong.

liuw · 2024-03-28T23:24:23Z

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side...
Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

I see your point. That would unify the API but for readability and API differentiation I would prefer the current approach. I was thinking the same as yours then I looked into hvlite code. hvlite also has different api and struct.
For now we have three states that are similar (xsave, simp, and sief). For synthetic timers we do want keep it separate.
@liuw @jinankjain any thoughts. I am open to anything that we can agree.

I don't have a clear answer to this. The downside of the fixed size buffers is it unnecessarily increases the amount of data transferred a lot, since most of the states will be far smaller than 4K. Imagine in the future you want to migrate a VM with hundreds VCPUs. That adds up quickly.

There will be no unnecessary data. Nuno's suggestion is to unify three state components, all will be 4k buffer. API will have an extra parameter that indicates the type of the state(xsave, simp or sief). API will accept/return a 4k buffer always. @NunoDasNeves Please correct me if I am wrong.

Okay, I misread. Sorry for the noise.

I don't have a strong opinion then.

NunoDasNeves · 2024-03-28T23:45:25Z

To sum it up - the IOCTL API needs a page-aligned, page-multiple-sized buffer, but the VP state data doesn't have to be stored that way. It could even be concatenated into one big buffer on the heap, so no space is wasted.

Here's some 'pseudorust':

struct AllTheVpState {
    offsets: [u64; MSHV_VP_STATE_COUNT], // each value is an index into buffer. The sizes can be worked out from this.
    buffer: Vec<u8>,
};

fn get_all_the_vp_state() -> Result<AllTheVpState> {
    mut buffer = Buffer::new(HV_PAGE_SIZE, HV_PAGE_SIZE); // assume this is big enough for any VP state
    mut ret = AllTheVpState{
        ..Default::default()
    };
    for i in 0..MSHV_VP_STATE_COUNT {
        mut vp_state = mshv_get_set_vp_state{
            buf_ptr: buffer.buf as u64,
            buf_sz: buffer.size() as u32,
            type_: i as u8,
            ..Default::default()
        };
        self.get_vp_state_ioctl(&mut vp_state)?;
        let actual_size = vp_state.buf_sz;
        ret.offsets[i] = ret.buffer.len();
        // idk if this is a real function...
        ret.buffer.append_from_pointer(buffer.buf, actual_size);
    }
    Ok(ret)
}

fn set_all_the_vp_state(state: &AllTheVpState) -> Result<()> {
    mut buffer = Buffer::new(HV_PAGE_SIZE, HV_PAGE_SIZE); // assume this is big enough for any VP state
    for i in 0..MSHV_VP_STATE_COUNT {
        let pointer_to_data = state.buffer.as_ptr().offset(ret.offsets[i]);
        let data_size = if (i < MSHV_VP_STATE_COUNT - 1) ret.offsets[i + 1] else ret.buffer.len() - ret.offsets[i];
        memcpy(pointer_to_data, buffer.buf, data_size);
        mut vp_state = mshv_get_set_vp_state{
            buf_ptr: buffer.buf as u64,
            buf_sz: buffer.size() as u32,
            type_: i as u8,
            ..Default::default()
        };
        self.set_vp_state_ioctl(&mut vp_state)?;
        let actual_size = vp_state.buf_sz;
        assert(actual_size == data_size);
    }
    Ok(ret)
}

mshv-bindings/src/regs.rs

NunoDasNeves · 2024-04-01T21:54:39Z

Here's another option. I really think storing it like this will simplify the whole thing and be less error prone than this huge amount of manual (de)serialization:

struct AllTheVpState {
    xsave: [u8; 4096],
    lapic: [u8; 1024],
    simp: [u8, 4096],
    siefp: [u8, 4096],
    stimers: hv_synthetic_timers_state, // or another byte array of the correct size if you don't want to expose this to CLH
};

You can still retrieve the LAPIC state separately with the existing code, of course.

mshv-ioctls/src/ioctls/vcpu.rs

russell-islam · 2024-04-01T22:10:00Z

To sum it up - the IOCTL API needs a page-aligned, page-multiple-sized buffer, but the VP state data doesn't have to be stored that way. It could even be concatenated into one big buffer on the heap, so no space is wasted.

Here's some 'pseudorust':

struct AllTheVpState {
    offsets: [u64; MSHV_VP_STATE_COUNT], // each value is an index into buffer. The sizes can be worked out from this.
    buffer: Vec<u8>,
};

fn get_all_the_vp_state() -> Result<AllTheVpState> {
    mut buffer = Buffer::new(HV_PAGE_SIZE, HV_PAGE_SIZE); // assume this is big enough for any VP state
    mut ret = AllTheVpState{
        ..Default::default()
    };
    for i in 0..MSHV_VP_STATE_COUNT {
        mut vp_state = mshv_get_set_vp_state{
            buf_ptr: buffer.buf as u64,
            buf_sz: buffer.size() as u32,
            type_: i as u8,
            ..Default::default()
        };
        self.get_vp_state_ioctl(&mut vp_state)?;
        let actual_size = vp_state.buf_sz;
        ret.offsets[i] = ret.buffer.len();
        // idk if this is a real function...
        ret.buffer.append_from_pointer(buffer.buf, actual_size);
    }
    Ok(ret)
}

fn set_all_the_vp_state(state: &AllTheVpState) -> Result<()> {
    mut buffer = Buffer::new(HV_PAGE_SIZE, HV_PAGE_SIZE); // assume this is big enough for any VP state
    for i in 0..MSHV_VP_STATE_COUNT {
        let pointer_to_data = state.buffer.as_ptr().offset(ret.offsets[i]);
        let data_size = if (i < MSHV_VP_STATE_COUNT - 1) ret.offsets[i + 1] else ret.buffer.len() - ret.offsets[i];
        memcpy(pointer_to_data, buffer.buf, data_size);
        mut vp_state = mshv_get_set_vp_state{
            buf_ptr: buffer.buf as u64,
            buf_sz: buffer.size() as u32,
            type_: i as u8,
            ..Default::default()
        };
        self.set_vp_state_ioctl(&mut vp_state)?;
        let actual_size = vp_state.buf_sz;
        assert(actual_size == data_size);
    }
    Ok(ret)
}

We want to keep LAPIC and SynicTimers excluded form this. Right? Merging all the states to one buffer seems overkill and make the reading complex while these states and APIs are too simple. I am not sure what benefits give us if we use just use one big buffer.

russell-islam · 2024-04-01T22:38:05Z

Here's another option. I really think storing it like this will simplify the whole thing and be less error prone than this huge amount of manual (de)serialization:
struct AllTheVpState {
    xsave: [u8; 4096],
    lapic: [u8; 1024],
    simp: [u8, 4096],
    siefp: [u8, 4096],
    stimers: hv_synthetic_timers_state, // or another byte array of the correct size if you don't want to expose this to CLH
};
You can still retrieve the LAPIC state separately with the existing code, of course.

I am not sure though how can I serialize and deserialize it?

russell-islam · 2024-04-02T23:26:21Z

Here's another option. I really think storing it like this will simplify the whole thing and be less error prone than this huge amount of manual (de)serialization:
struct AllTheVpState {
    xsave: [u8; 4096],
    lapic: [u8; 1024],
    simp: [u8, 4096],
    siefp: [u8, 4096],
    stimers: hv_synthetic_timers_state, // or another byte array of the correct size if you don't want to expose this to CLH
};
You can still retrieve the LAPIC state separately with the existing code, of course.

We are minimizing the API Call to mshv-ioctls not the hypercall, right? If we use this structure we will endup copying the buffer to each this array, and in a single API we will do 5 IOCTLS, doesn't it look messy?

NunoDasNeves · 2024-04-03T17:44:37Z

We are minimizing the API Call to mshv-ioctls not the hypercall, right? If we use this structure we will endup copying the buffer to each this array, and in a single API we will do 5 IOCTLS, doesn't it look messy?

It's not about minimizing the API calls, but the amount of code. For what we need, I think having a separate get_*() function and data structures for each type of state is not necessary.

We could also change the IOCTL interface if you like.

russell-islam · 2024-04-05T20:42:39Z

We are minimizing the API Call to mshv-ioctls not the hypercall, right? If we use this structure we will endup copying the buffer to each this array, and in a single API we will do 5 IOCTLS, doesn't it look messy?

It's not about minimizing the API calls, but the amount of code. For what we need, I think having a separate get_*() function and data structures for each type of state is not necessary.

We could also change the IOCTL interface if you like.

@NunoDasNeves After thinking a bit more I have decided to use one large buffer. It is more cleaner that way. Making this PR as draft and I will let you know once ready to review.

mshv-bindings/src/regs.rs

mshv-ioctls/src/ioctls/vcpu.rs

mshv-bindings/src/regs.rs

praveen-pk · 2024-04-10T20:13:36Z

Apart from above comments, I don't have much to add. LGTM!

Defining a struct with a large buffer to hold all VP state components. Data is stored sequentially with lowest value of state type first. Signed-off-by: Muminul Islam <muislam@microsoft.com>

Implement some helper function for the VP state components struct. These functions will be used in the API to get/set VP state components. Signed-off-by: Muminul Islam <muislam@microsoft.com>

mshv-bindings/src/regs.rs

Signed-off-by: Muminul Islam <muislam@microsoft.com>

NunoDasNeves

LGTM

russell-islam requested review from liuw, NunoDasNeves and jinankjain as code owners March 27, 2024 21:44

russell-islam force-pushed the muislam/state-components branch from 77de36b to 8369156 Compare March 27, 2024 21:48

russell-islam changed the title ~~Add necessary APIs to interact with sythetic state components~~ Add necessary APIs to interact with synthetic state components Mar 27, 2024

praveen-pk reviewed Mar 28, 2024

View reviewed changes

russell-islam force-pushed the muislam/state-components branch from 8369156 to 75fdd24 Compare March 28, 2024 23:16

russell-islam force-pushed the muislam/state-components branch from 75fdd24 to 3d221d3 Compare March 29, 2024 23:41

NunoDasNeves reviewed Apr 1, 2024

View reviewed changes

mshv-bindings/src/regs.rs Outdated Show resolved Hide resolved

NunoDasNeves reviewed Apr 1, 2024

View reviewed changes

mshv-bindings/src/regs.rs Outdated Show resolved Hide resolved

NunoDasNeves reviewed Apr 1, 2024

View reviewed changes

mshv-ioctls/src/ioctls/vcpu.rs Outdated Show resolved Hide resolved

russell-islam changed the title ~~Add necessary APIs to interact with synthetic state components~~ [WIP] Add necessary APIs to interact with synthetic state components Apr 5, 2024

russell-islam force-pushed the muislam/state-components branch 4 times, most recently from 414dfb7 to f1ac961 Compare April 8, 2024 22:59