
Restore interfaces on restart in case of external VPP #512

Open
glazychev-art opened this issue Feb 18, 2022 · 18 comments
@glazychev-art
Contributor

Description

When we create VPP on the forwarder pod, we assume that the forwarder's death means VPP's death. But this is not true for an external VPP socket (e.g. the Calico-VPP case). If the forwarder restarts, it can leave network interfaces behind there.

In this issue we need to prepare possible solutions.

@glazychev-art
Contributor Author

Possible solution [DRAFT]

Use restoreInterface and expireInterface

  1. Improve the tag chain element to store not only the connID but also the nsm prefix, client or server identification, and the expiration time. For example, it can look like:
    nsm-c-1644245337-6434d7fe-a03c-40fe-8a38-321e5b7967c4, where
  • nsm - NSM prefix,
  • c (or s) - means client (or server),
  • 1644245337 - expiration time in seconds,
  • 6434d7fe-a03c-40fe-8a38-321e5b7967c4 - connection ID
    After each refresh we need to update the expiration time of the corresponding interfaces (see the tag sketch after this list).
  2. On forwarder startup we need to dump the nsm interfaces and store them in a list of:
type interfaceInfo struct {
	clientIfIndex  uint32
	serverIfIndex  uint32
	connID         string
	expirationTime uint32
}
  3. Implement a restoreInterface chain element.
    It receives a list of interfaceInfo on creation and restores the connection metadata in case of a healing Request (or refresh).

  4. Implement an expireInterface chain element.
    It will monitor the interfaces for their timely removal. It also receives a list of interfaceInfo on creation to start monitoring the old interfaces.
    When the time runs out, it deletes clientInterface, serverInterface and all possible cross-connects.
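
A minimal sketch of how such a tag could be encoded and parsed, using only the Go standard library; the helper names (encodeTag, parseTag) are hypothetical and not part of the existing tag chain element:

// encodeTag builds the proposed tag: nsm-<c|s>-<expiration unix seconds>-<connID>.
func encodeTag(isClient bool, expiration time.Time, connID string) string {
	side := "s"
	if isClient {
		side = "c"
	}
	return fmt.Sprintf("nsm-%s-%d-%s", side, expiration.Unix(), connID)
}

// parseTag recovers the fields from a dumped interface tag and reports
// whether the tag belongs to NSM at all.
func parseTag(tag string) (isClient bool, expiration time.Time, connID string, ok bool) {
	parts := strings.SplitN(tag, "-", 4)
	if len(parts) != 4 || parts[0] != "nsm" {
		return false, time.Time{}, "", false
	}
	sec, err := strconv.ParseInt(parts[2], 10, 64)
	if err != nil {
		return false, time.Time{}, "", false
	}
	return parts[1] == "c", time.Unix(sec, 0), parts[3], true
}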

@glazychev-art glazychev-art self-assigned this Feb 18, 2022
@denis-tingaikin
Member

denis-tingaikin commented Feb 18, 2022

@glazychev-art Can we go with something simpler?

Basically, we need to do this:

  1. Forwarder starts -> dump VPP interfaces and remember all tagged interfaces.
  2. Start an async delayed job for each tagged interface that will simply call Close via the begin chain element. The delay can be equal to maxTokenLifeTime.
  3. For each refresh request from the client, cancel the job from step 2.

I guess the tag can be: ${forwarderName}-${connectionId}. It would allow using multiple forwarder instances on the node.
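
A rough sketch of this delayed-job bookkeeping, using only the standard library; the type and method names are illustrative, and closeConnection stands in for whatever triggers Close through the begin chain element:

// expireScheduler keeps one delayed Close job per tagged connection.
type expireScheduler struct {
	mu     sync.Mutex
	timers map[string]*time.Timer // keyed by connection ID
}

// schedule starts a delayed job for a dumped interface (step 2);
// delay can be maxTokenLifeTime.
func (s *expireScheduler) schedule(connID string, delay time.Duration, closeConnection func(connID string)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.timers[connID] = time.AfterFunc(delay, func() { closeConnection(connID) })
}

// cancel is called on a refresh Request for connID (step 3) and stops the pending job.
func (s *expireScheduler) cancel(connID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if t, ok := s.timers[connID]; ok {
		t.Stop()
		delete(s.timers, connID)
	}
}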

@glazychev-art
Contributor Author

I partly agree.

Probably we can use a tag like this: ${forwarderName}-${connectionId}.
But what if the forwarder restarted and ${forwarderName} was changed?

Concerning step 2:
After a forwarder restart, we have no information about previous connections in begin:
https://github.com/networkservicemesh/sdk/blob/main/pkg/networkservice/common/begin/server.go#L87
And we have no metadata to delete something on Close.
For example, let's take a look at tap mechanism: https://github.com/networkservicemesh/sdk-vpp/blob/main/pkg/networkservice/mechanisms/kernel/kerneltap/common.go#L145-L146
When maxTokenLifeTime expires:

  1. After dumping, we don't know what mechanism was used.
  2. We don't have swIfIndex in the ifindex metadata. We need to fill all the metadata somehow before the Close call.

The basic idea of #512 (comment) is:

  1. Forwarder starts -> dump VPP interfaces and remember all tagged interfaces.
  2. Start an async delayed job for each tagged interface that will just remove the interface via the VPP API, regardless of mechanisms or metadata, after the expiration time (e.g. maxTokenLifeTime). (I think we can also remove the L2xc or L3xc.)
  3. For each refresh request from the client, cancel the job from step 2 and fill the metadata.

@glazychev-art
Contributor Author

glazychev-art commented Feb 21, 2022

Additional thoughts.

Suppose we want new forwarders to be responsible for the previous ones.

We need to remember, that:

  1. The forwarder can be restarted and its name can be saved.
  2. The forwarder can be restarted and its name can be changed.
  3. We can have many forwarders on the same node.

So, I think we need a more complex tag.
Like this: nsm-${client|server_identifier}-${refresh_identifier}-${forwarderName}-${connectionId}

Why do we need 'nsm-'?
If the forwarder restarted and its name was changed, this prefix is a reliable way to find nsm interfaces.

Why do we need '${client|server_identifier}-'?
If we want to restore the previous connection we need to distinguish client and server interfaces to store their swIfIndexes in ifindex metadata.

Why do we need '${forwarderName}'?
We need to figure out whether this interface belongs to our forwarder or not.

Why do we need '${connectionId}'?
We need to know which connection this interface belongs to.

Why do we need '${refresh_identifier}'?
If we know that this interface doesn't belong to our forwarder, we can keep track of it until it expires. Before deleting, we can check whether '${refresh_identifier}' has changed. If it has, that means the interface is used by another forwarder (the identifier was updated on refresh) and we don't need to delete it.
This identifier can be a simple counter.

Another question arises - the tag length is limited to 64 characters - https://github.com/FDio/vpp/blob/master/src/vnet/interface.api#L363
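
To make the length concern concrete, here is a sketch that builds the composite tag and checks the 64-character VPP limit; the helper name buildTag is hypothetical:

// buildTag sketches the proposed layout:
// nsm-<c|s>-<refresh counter>-<forwarder name>-<connection ID>.
func buildTag(isClient bool, refreshCounter uint32, forwarderName, connID string) (string, error) {
	side := "s"
	if isClient {
		side = "c"
	}
	tag := fmt.Sprintf("nsm-%s-%d-%s-%s", side, refreshCounter, forwarderName, connID)
	// VPP limits the interface tag to 64 characters (interface.api),
	// so a long forwarder name plus a UUID connection ID may simply not fit.
	if len(tag) > 64 {
		return "", fmt.Errorf("tag %q exceeds the 64-character VPP limit", tag)
	}
	return tag, nil
}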

@glazychev-art
Contributor Author

glazychev-art commented Feb 22, 2022

We discussed another solution internally.

Probably we need to manage only the interfaces created by this forwarder or by the previous one (i.e. its name was kept after the restart).
In this case it is enough to use a tag like: ${forwarderName}-${client|server_identifier}-${connectionId}.

In other cases:

  • If it was a graceful deletion - we can try to call Close for all of the connections before closing the application. We can get the connections, for example, using the monitor.
  • If it was a forced deletion - we have no guarantee that the interfaces will be removed, and the new forwarders will not be responsible for the previous ones.

@denis-tingaikin
Member

@glazychev-art

We've internally discussed this problem with @edwarnicke.

Discussion results:

We're seeing two basic cases:

  1. Forwarder restart
  2. Forwarder delete (or upgrade)

Case 1 can currently be handled with a tag-based recovery strategy, as we discussed in the PR meeting.

Case 2 is currently blocked. We need to verify that folks are using the k8s deployment upgrade feature. If so, we can consider a solution that clears resources when the application stops.

@glazychev-art
Contributor Author

@denis-tingaikin @edwarnicke

So, I think now it will look like this:

Tag: ${forwarderName}-${client|server_identifier}-${connID}

We have 2 main steps:

Step 1
Forwarder starts -> dump VPP interfaces, parse them, and store all tagged interfaces to map[connID]*ConnectionInterfaces.

type DeleteFn func() error

// IfaceInfo keeps a single metadata value for a dumped interface together with
// the function that deletes the corresponding VPP resource.
type IfaceInfo struct {
	delete DeleteFn
	value  interface{}
}

// ConnectionInterfaces groups the restored client and server metadata for one
// connection, keyed by metadata name (ifindex, links, peers, ...).
type ConnectionInterfaces struct {
	clientMeta map[string]*IfaceInfo
	serverMeta map[string]*IfaceInfo
}

The ConnectionInterfaces.clientMeta/serverMeta maps store values for the metadata. For example, clientMeta["ifindex"].value = 1 becomes swIfIndex = 1 in the ifindex metadata for the client. We have several such metadata maps (links, peers, ...).

We can determine which particular interface type is in use from InterfaceDevType ("memif", "VXLAN" and so on), so we can choose the correct delete function (e.g. "memif" -> DeleteFn); see the sketch below.
So, after that step, we have the parsed interfaces with the corresponding delete functions.
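
A sketch of choosing the delete function from the dumped InterfaceDevType, reusing the DeleteFn type above; deleteMemif and deleteVxlan are hypothetical placeholders for the mechanism-specific VPP calls:

// deleteFnForDevType maps a dumped dev type to the function that removes the
// interface. Unknown types are reported so nothing is deleted blindly.
func deleteFnForDevType(devType string, swIfIndex uint32) (DeleteFn, error) {
	switch devType {
	case "memif":
		return func() error { return deleteMemif(swIfIndex) }, nil
	case "VXLAN":
		return func() error { return deleteVxlan(swIfIndex) }, nil
	default:
		return nil, fmt.Errorf("unknown interface dev type: %s", devType)
	}
}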

Step 2
Add restore chain element.

  • NewServer receives the dumped map[connID]*ConnectionInterfaces. It also runs a goroutine for each connID to delete the VPP interfaces if they are no longer in use (calling the DeleteFns). The delay can be equal to maxTokenLifeTime.
  • If we get a Request or Close with a connID that we already have in our map - stop the goroutine and restore the metadata for this connection (see the sketch below).
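
A sketch of the bookkeeping such a restore element could use, reusing ConnectionInterfaces and IfaceInfo from above; the names newRestoreServer and claim are illustrative, and the NetworkServiceServer wiring is omitted:

// restoreServer holds the dumped interfaces and one pending delayed delete per connID.
type restoreServer struct {
	mu        sync.Mutex
	dumped    map[string]*ConnectionInterfaces // connID -> dumped interfaces
	cancelFns map[string]func()                // connID -> stop the delayed delete
}

// newRestoreServer schedules a delayed delete (delay can be maxTokenLifeTime)
// for every dumped connection.
func newRestoreServer(dumped map[string]*ConnectionInterfaces, delay time.Duration) *restoreServer {
	s := &restoreServer{dumped: dumped, cancelFns: map[string]func(){}}
	for connID, ifaces := range dumped {
		ifaces := ifaces
		t := time.AfterFunc(delay, func() {
			// The connection was never refreshed: call the stored DeleteFns.
			for _, info := range ifaces.clientMeta {
				_ = info.delete()
			}
			for _, info := range ifaces.serverMeta {
				_ = info.delete()
			}
		})
		s.cancelFns[connID] = func() { t.Stop() }
	}
	return s
}

// claim is called from Request/Close handling: it stops the delayed delete and
// hands back the dumped interfaces so the metadata can be restored.
func (s *restoreServer) claim(connID string) (*ConnectionInterfaces, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	cancel, ok := s.cancelFns[connID]
	if !ok {
		return nil, false
	}
	cancel()
	delete(s.cancelFns, connID)
	ifaces := s.dumped[connID]
	delete(s.dumped, connID)
	return ifaces, true
}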

@edwarnicke
Member

@glazychev-art Didn't we discuss adding an expire time to the tag in VPP so we could simply time things out?

@glazychev-art
Contributor Author

@edwarnicke
Yes, but:

  1. The tag will be longer (its length is limited - https://github.com/FDio/vpp/blob/master/src/vnet/interface.api#L363)
  2. I think it's much easier if we always use maxTokenLifeTime as the expiration time. I don't think this will be a problem...

@edwarnicke
Member

@glazychev-art I think you are correct: using maxTokenLifeTime should work, since it will eventually clean up the connections if they are not renewed.

@edwarnicke
Member

@glazychev-art Might it be possible to solve this in a more localized fashion?

Right now, the understanding of how to handle particular interfaces is 'localized' in a particular mechanism chain element.

What you are suggesting in step 1 would require having a second place that understands all of the mechanisms.

Might it perhaps not be simpler to have each mechanism chain element dump interfaces and keep a map[string]*interfaces.SwInterfaceDetails? When the mechanism element gets a connection request for which it cannot find a swIfIndex in the context, it can simply look in the map for an interface for that connID, and if it finds the connID, delete the entry from the map and carry on with that swIfIndex as if it had created it (presuming the interface is of the correct type).

This localizes the issue to a particular mechanism, and so we don't 'smear' localized knowledge into the global sphere.

After maxExpireTime from launch, you can just delete anything remaining in the map.
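
A sketch of that per-mechanism map, where interfaces.SwInterfaceDetails is the binapi type mentioned above and deleteIface stands in for the mechanism's own delete call:

// dumpedInterfaces is kept by a single mechanism chain element and holds the
// interfaces of its own type found in the startup dump, keyed by connID.
type dumpedInterfaces struct {
	mu     sync.Mutex
	byConn map[string]*interfaces.SwInterfaceDetails
}

// reclaim returns the dumped interface for connID (if any) and removes it from
// the map, so the element can carry on with that swIfIndex as if it had created it.
func (d *dumpedInterfaces) reclaim(connID string) (*interfaces.SwInterfaceDetails, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	details, ok := d.byConn[connID]
	if ok {
		delete(d.byConn, connID)
	}
	return details, ok
}

// expireRemaining deletes whatever is still in the map after maxExpireTime,
// using a caller-provided delete function (hypothetical).
func (d *dumpedInterfaces) expireRemaining(deleteIface func(*interfaces.SwInterfaceDetails) error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for connID, details := range d.byConn {
		_ = deleteIface(details)
		delete(d.byConn, connID)
	}
}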

Thoughts?

@edwarnicke
Member

One other question... have you given thought to how this would intersect with VNI allocation?

@glazychev-art
Contributor Author

@edwarnicke
Very good solution! But in this case, we are missing one of the most important parts - how do we remove expired interfaces from the VPP after maxExpireTime?

VNI... It's good that you pointed it out. Now it is stored in the local map.

@glazychev-art
Contributor Author

glazychev-art commented Mar 3, 2022

So, VNI. There is a problem with case 2A:

  1. If the client VxLAN side failed - vni.Client doesn't work with the metadata at all - https://github.com/networkservicemesh/sdk/blob/main/pkg/networkservice/common/mechanisms/vxlan/vni/client.go#L56-L71
  2. If the server VxLAN side failed:
    Case A: We receive the Request from the old client that we want to restore - vni.Server takes the VNI from the mechanism. We can't use the mechanism, because we clean it before healing - https://github.com/networkservicemesh/sdk/blob/main/pkg/networkservice/common/begin/event_factory.go#L100
    Case B: We receive the Request from a new client, but we have a VxLAN tunnel that is not restored yet (we didn't receive the Request from the "old" client) and the generated VNI matches the existing one - I think creating the new VxLAN tunnel will fail, a retry will happen, and we will try again with a new random VNI.

@glazychev-art
Contributor Author

We also have a problem with wireguard - we can't get the wireguard peer index using dump - https://github.com/FDio/vpp/blob/master/src/plugins/wireguard/wireguard.api#L98-L109
We have logic based on the peer index - for example HERE and HERE

@glazychev-art
Contributor Author

glazychev-art commented Mar 5, 2022

So, for the cases where the forwarder keeps its previous name after a restart and we use an interface dump, we have the following problems:

  1. VxLAN VNI - Restore interfaces on restart in case of external VPP #512 (comment)
  2. Wireguard peer dump - Restore interfaces on restart in case of external VPP #512 (comment)
    Edited: the Wireguard patch was merged - https://gerrit.fd.io/r/c/vpp/+/35624

I guess this mostly applies to cases where the container is restarted, not the entire pod.

@glazychev-art
Contributor Author

glazychev-art commented Mar 5, 2022

We know that when users update the forwarder, the pod restarts without saving the name. In this case, the correct solution is to remove the interfaces before terminating.

Moved to a separate issue - #516

@glazychev-art
Contributor Author

@edwarnicke @denis-tingaikin
I've prepared a draft PR with an idea of how restoring can be implemented - #542
There, each element is only responsible for dumping its own interfaces.
Could you have a look?
