
Restore interfaces on restart in case of external VPP #512

Open
glazychev-art opened this issue Feb 18, 2022 · 18 comments
@glazychev-art
Contributor

Description

When we create VPP on the forwarder pod, we assume that the forwarder's death means VPP's death. But this is not true for an external VPP socket (e.g. the Calico-VPP case). If the forwarder restarts, it can leave network interfaces behind there.

In this issue we need to prepare possible solutions.

@glazychev-art
Contributor Author

Possible solution [DRAFT]

Use restoreInterface and expireInterface

  1. Improve the tag chain element to store not only the connID but also the nsm prefix, client or server identification, and the expiration time. For example, it can look like:
    nsm-c-1644245337-6434d7fe-a03c-40fe-8a38-321e5b7967c4, where
  • nsm - NSM prefix,
  • c (or s) - means client (or server),
  • 1644245337 - expiration time in seconds,
  • 6434d7fe-a03c-40fe-8a38-321e5b7967c4 - connection ID
    After each refresh we need to update the expiration time of the corresponding interfaces (see the tag sketch after this list).
  2. On forwarder startup we need to dump the nsm interfaces and store them in a list of:
type interfaceInfo struct {
	clientIfIndex  uint32
	serverIfIndex  uint32
	connID         string
	expirationTime uint32
}
  3. Implement a restoreInterface chain element.
    It receives a list of interfaceInfo on creation and restores the connection metadata in case of a healing Request (or refresh).

  4. Implement an expireInterface chain element.
    It will monitor the interfaces for their timely removal. It also receives a list of interfaceInfo on creation to start monitoring the old interfaces.
    When the time runs out, it deletes clientInterface, serverInterface and all possible cross-connects.
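
A minimal sketch of how such a tag could be encoded and parsed, using only the Go standard library; the helper names (encodeTag, parseTag) are hypothetical and not part of the existing tag chain element:

// encodeTag builds the proposed tag: nsm-<c|s>-<expiration unix seconds>-<connID>.
func encodeTag(isClient bool, expiration time.Time, connID string) string {
	side := "s"
	if isClient {
		side = "c"
	}
	return fmt.Sprintf("nsm-%s-%d-%s", side, expiration.Unix(), connID)
}

// parseTag recovers the fields from a dumped interface tag and reports
// whether the tag belongs to NSM at all.
func parseTag(tag string) (isClient bool, expiration time.Time, connID string, ok bool) {
	parts := strings.SplitN(tag, "-", 4)
	if len(parts) != 4 || parts[0] != "nsm" {
		return false, time.Time{}, "", false
	}
	sec, err := strconv.ParseInt(parts[2], 10, 64)
	if err != nil {
		return false, time.Time{}, "", false
	}
	return parts[1] == "c", time.Unix(sec, 0), parts[3], true
}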

@glazychev-art glazychev-art self-assigned this Feb 18, 2022
@denis-tingaikin
Member

denis-tingaikin commented Feb 18, 2022

@glazychev-art Can we go with something simpler?

Basically, we need to do this:

  1. Forwarder starts -> dump VPP interfaces and remember all tagged interfaces.
  2. Start an async delayed job for each tagged interface that will simply call Close via the begin chain element. The delay can be equal to maxTokenLifeTime.
  3. For each refresh request from the client, cancel the job from step 2.

I guess the tag can be: ${forwarderName}-${connectionId}. It would allow using multiple forwarder instances on the node.
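
A rough sketch of this delayed-job bookkeeping, using only the standard library; the type and method names are illustrative, and closeConnection stands in for whatever triggers Close through the begin chain element:

// expireScheduler keeps one delayed Close job per tagged connection.
type expireScheduler struct {
	mu     sync.Mutex
	timers map[string]*time.Timer // keyed by connection ID
}

// schedule starts a delayed job for a dumped interface (step 2);
// delay can be maxTokenLifeTime.
func (s *expireScheduler) schedule(connID string, delay time.Duration, closeConnection func(connID string)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.timers[connID] = time.AfterFunc(delay, func() { closeConnection(connID) })
}

// cancel is called on a refresh Request for connID (step 3) and stops the pending job.
func (s *expireScheduler) cancel(connID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if t, ok := s.timers[connID]; ok {
		t.Stop()
		delete(s.timers, connID)
	}
}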

@glazychev-art
Contributor Author

I partly agree.

Probably we can use a tag like this: ${forwarderName}-${connectionId}.
But what if the forwarder restarted and ${forwarderName} was changed?

Concerning step 2:
After a forwarder restart, we have no information about previous connections in begin:
https://github.com/networkservicemesh/sdk/blob/main/pkg/networkservice/common/begin/server.go#L87
And we have no metadata to delete something on Close.
For example, let's take a look at tap mechanism: https://github.com/networkservicemesh/sdk-vpp/blob/main/pkg/networkservice/mechanisms/kernel/kerneltap/common.go#L145-L146
When maxTokenLifeTime expires:

  1. After dumping, we don't know what mechanism was used.
  2. We don't have swIfIndex in the ifindex metadata. We need to fill all the metadata somehow before the Close call.

The basic idea of #512 (comment) is:

  1. Forwarder starts -> dump VPP interfaces and remember all tagged interfaces.
  2. Start an async delayed job for each tagged interface that will just remove the interface via the VPP API, regardless of mechanisms or metadata, after the expiration time (e.g. maxTokenLifeTime). (I think we can also remove the L2xc or L3xc.)
  3. For each refresh request from the client, cancel the job from step 2 and fill the metadata.

@glazychev-art
Contributor Author

glazychev-art commented Feb 21, 2022

Additional thoughts.

Suppose we want new forwarders to be responsible for the previous ones.

We need to remember, that:

  1. The forwarder can be restarted and its name can be saved.
  2. The forwarder can be restarted and its name can be changed.
  3. We can have many forwarders on the same node.

So, I think we need a more complex tag.
Like this: nsm-${client|server_identifier}-${refresh_identifier}-${forwarderName}-${connectionId}

Why do we need 'nsm-'?
If the forwarder restarted and its name was changed, this prefix is a reliable way to find nsm interfaces.

Why do we need '${client|server_identifier}-'?
If we want to restore the previous connection we need to distinguish client and server interfaces to store their swIfIndexes in ifindex metadata.

Why do we need '${forwarderName}'?
We need to figure out whether this interface belongs to our forwarder or not.

Why do we need '${connectionId}'?
We need to know which connection this interface belongs to.

Why do we need '${refresh_identifier}'?
If we know that this interface doesn't belong to our forwarder, we can keep track of it until it expires. Before deleting, we can check whether '${refresh_identifier}' has changed. If it has, that means the interface is used by another forwarder (the identifier was updated on refresh) and we don't need to delete it.
This identifier can be a simple counter.

Another question arises - the tag length is limited to 64 characters - https://github.com/FDio/vpp/blob/master/src/vnet/interface.api#L363
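
To make the length concern concrete, here is a sketch that builds the composite tag and checks the 64-character VPP limit; the helper name buildTag is hypothetical:

// buildTag sketches the proposed layout:
// nsm-<c|s>-<refresh counter>-<forwarder name>-<connection ID>.
func buildTag(isClient bool, refreshCounter uint32, forwarderName, connID string) (string, error) {
	side := "s"
	if isClient {
		side = "c"
	}
	tag := fmt.Sprintf("nsm-%s-%d-%s-%s", side, refreshCounter, forwarderName, connID)
	// VPP limits the interface tag to 64 characters (interface.api),
	// so a long forwarder name plus a UUID connection ID may simply not fit.
	if len(tag) > 64 {
		return "", fmt.Errorf("tag %q exceeds the 64-character VPP limit", tag)
	}
	return tag, nil
}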

@glazychev-art
Contributor Author

glazychev-art commented Feb 22, 2022

We discussed another solution internally.

Probably we need to manage only the interfaces created by this forwarder or by the previous one (i.e. its name was kept after the restart).
In this case it is enough to use a tag like: ${forwarderName}-${client|server_identifier}-${connectionId}.

In other cases:

  • If it was a graceful deletion - we can try to call Close for all of the connections before closing the application. We can get the connections, for example, using the monitor.
  • If it was a forced deletion - we have no guarantee that the interfaces will be removed, and the new forwarders will not be responsible for the previous ones.

@denis-tingaikin
Member

@glazychev-art

We've internally discussed this problem with @edwarnicke.

Discussion results:

We're seeing two basic cases:

  1. Forwarder restart
  2. Forwarder delete (or upgrade)

Case 1 can currently be handled with a tag-based recovery strategy, as we discussed in the PR meeting.

Case 2 is currently blocked. We need to verify that folks are using the k8s deployment upgrade feature. If so, we can consider a solution that clears resources when the application stops.

@glazychev-art
Contributor Author

@denis-tingaikin @edwarnicke

So, I think now it will look like this:

Tag: ${forwarderName}-${client|server_identifier}-${connID}

We have 2 main steps:

Step 1
Forwarder starts -> dump VPP interfaces, parse them, and store all tagged interfaces to map[connID]*ConnectionInterfaces.

type DeleteFn func() error

// IfaceInfo keeps a single metadata value for a dumped interface together with
// the function that deletes the corresponding VPP resource.
type IfaceInfo struct {
	delete DeleteFn
	value  interface{}
}

// ConnectionInterfaces groups the restored client and server metadata for one
// connection, keyed by metadata name (ifindex, links, peers, ...).
type ConnectionInterfaces struct {
	clientMeta map[string]*IfaceInfo
	serverMeta map[string]*IfaceInfo
}

The ConnectionInterfaces.clientMeta/serverMeta maps store values for the metadata. For example, clientMeta["ifindex"].value = 1 becomes swIfIndex = 1 in the ifindex metadata for the client. We have several such metadata maps (links, peers, ...).

We can determine which particular interface type is in use from InterfaceDevType ("memif", "VXLAN" and so on), so we can choose the correct delete function (e.g. "memif" -> DeleteFn); see the sketch below.
So, after that step, we have the parsed interfaces with the corresponding delete functions.
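
A sketch of choosing the delete function from the dumped InterfaceDevType, reusing the DeleteFn type above; deleteMemif and deleteVxlan are hypothetical placeholders for the mechanism-specific VPP calls:

// deleteFnForDevType maps a dumped dev type to the function that removes the
// interface. Unknown types are reported so nothing is deleted blindly.
func deleteFnForDevType(devType string, swIfIndex uint32) (DeleteFn, error) {
	switch devType {
	case "memif":
		return func() error { return deleteMemif(swIfIndex) }, nil
	case "VXLAN":
		return func() error { return deleteVxlan(swIfIndex) }, nil
	default:
		return nil, fmt.Errorf("unknown interface dev type: %s", devType)
	}
}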

Step 2
Add restore chain element.

  • NewServer receives the dumped map[connID]*ConnectionInterfaces. It also runs a goroutine for each connID to delete the VPP interfaces if they are no longer in use (calling the DeleteFns). The delay can be equal to maxTokenLifeTime.
  • If we get a Request or Close with a connID that we already have in our map - stop the goroutine and restore the metadata for this connection (see the sketch below).
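
A sketch of the bookkeeping such a restore element could use, reusing ConnectionInterfaces and IfaceInfo from above; the names newRestoreServer and claim are illustrative, and the NetworkServiceServer wiring is omitted:

// restoreServer holds the dumped interfaces and one pending delayed delete per connID.
type restoreServer struct {
	mu        sync.Mutex
	dumped    map[string]*ConnectionInterfaces // connID -> dumped interfaces
	cancelFns map[string]func()                // connID -> stop the delayed delete
}

// newRestoreServer schedules a delayed delete (delay can be maxTokenLifeTime)
// for every dumped connection.
func newRestoreServer(dumped map[string]*ConnectionInterfaces, delay time.Duration) *restoreServer {
	s := &restoreServer{dumped: dumped, cancelFns: map[string]func(){}}
	for connID, ifaces := range dumped {
		ifaces := ifaces
		t := time.AfterFunc(delay, func() {
			// The connection was never refreshed: call the stored DeleteFns.
			for _, info := range ifaces.clientMeta {
				_ = info.delete()
			}
			for _, info := range ifaces.serverMeta {
				_ = info.delete()
			}
		})
		s.cancelFns[connID] = func() { t.Stop() }
	}
	return s
}

// claim is called from Request/Close handling: it stops the delayed delete and
// hands back the dumped interfaces so the metadata can be restored.
func (s *restoreServer) claim(connID string) (*ConnectionInterfaces, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	cancel, ok := s.cancelFns[connID]
	if !ok {
		return nil, false
	}
	cancel()
	delete(s.cancelFns, connID)
	ifaces := s.dumped[connID]
	delete(s.dumped, connID)
	return ifaces, true
}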

@edwarnicke
Member

@glazychev-art Didn't we discuss adding an expire time to the tag in VPP so we could simply time things out?

@glazychev-art
Contributor Author

@edwarnicke
Yes, but:

  1. The tag will be longer (its length is limited - https://github.com/FDio/vpp/blob/master/src/vnet/interface.api#L363)
  2. I think it's much easier if we always use maxTokenLifeTime as the expiration time. I don't think this will be a problem...

@edwarnicke
Member

@glazychev-art I think you are correct: using maxTokenLifeTime should work, since it will eventually clean up the connections if they are not renewed.

@edwarnicke
Member

@glazychev-art Might it be possible to solve this in a more localized fashion?

Right now, the understanding of how to handle particular interfaces is 'localized' in a particular mechanism chain element.

What you are suggesting in step 1 would require having a second place that understands all of the mechanisms.

Might it perhaps not be simpler to have each mechanism chain element dump interfaces and keep a map[string]*interfaces.SwInterfaceDetails? When the mechanism element gets a connection request for which it cannot find a swIfIndex in the context, it can simply look in the map for an interface for that connID, and if it finds the connID, delete the entry from the map and carry on with that swIfIndex as if it had created it (presuming the interface is of the correct type).

This localizes the issue to a particular mechanism, and so we don't 'smear' localized knowledge into the global sphere.

After maxExpireTime from launch, you can just delete anything remaining in the map.
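
A sketch of that per-mechanism map, where interfaces.SwInterfaceDetails is the binapi type mentioned above and deleteIface stands in for the mechanism's own delete call:

// dumpedInterfaces is kept by a single mechanism chain element and holds the
// interfaces of its own type found in the startup dump, keyed by connID.
type dumpedInterfaces struct {
	mu     sync.Mutex
	byConn map[string]*interfaces.SwInterfaceDetails
}

// reclaim returns the dumped interface for connID (if any) and removes it from
// the map, so the element can carry on with that swIfIndex as if it had created it.
func (d *dumpedInterfaces) reclaim(connID string) (*interfaces.SwInterfaceDetails, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	details, ok := d.byConn[connID]
	if ok {
		delete(d.byConn, connID)
	}
	return details, ok
}

// expireRemaining deletes whatever is still in the map after maxExpireTime,
// using a caller-provided delete function (hypothetical).
func (d *dumpedInterfaces) expireRemaining(deleteIface func(*interfaces.SwInterfaceDetails) error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for connID, details := range d.byConn {
		_ = deleteIface(details)
		delete(d.byConn, connID)
	}
}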

Thoughts?

@edwarnicke
Member

One other question... have you given thought to how this would intersect with VNI allocation?

@glazychev-art
Contributor Author

@edwarnicke
Very good solution! But in this case, we are missing one of the most important parts - how do we remove expired interfaces from the VPP after maxExpireTime?

VNI... It's good that you pointed it out. Now it is stored in the local map.

@glazychev-art
Contributor Author

glazychev-art commented Mar 3, 2022

So, VNI. There is a problem with case 2A:

  1. If the client VxLAN side failed - vni.Client doesn't work with the metadata at all - https://github.com/networkservicemesh/sdk/blob/main/pkg/networkservice/common/mechanisms/vxlan/vni/client.go#L56-L71
  2. If the server VxLAN side failed:
    Case A: We receive the Request from the old client that we want to restore - vni.Server takes the VNI from the mechanism. We can't use the mechanism, because we clean it before healing - https://github.com/networkservicemesh/sdk/blob/main/pkg/networkservice/common/begin/event_factory.go#L100
    Case B: We receive the Request from a new client, but we have a VxLAN tunnel that is not restored yet (we didn't receive the Request from the "old" client) and the generated VNI matches the existing one - I think creating the new VxLAN tunnel will fail, a retry will happen, and we will try again with a new random VNI.

@glazychev-art
Contributor Author

We also have a problem with wireguard - we can't get the wireguard peer index using dump - https://github.com/FDio/vpp/blob/master/src/plugins/wireguard/wireguard.api#L98-L109
We have logic based on the peer index - for example HERE and HERE

@glazychev-art
Contributor Author

glazychev-art commented Mar 5, 2022

So, for the cases where the forwarder keeps its previous name after a restart and we use an interface dump, we have the following problems:

  1. VxLAN VNI - Restore interfaces on restart in case of external VPP #512 (comment)
  2. Wireguard peer dump - Restore interfaces on restart in case of external VPP #512 (comment)
    Edited: the Wireguard patch was merged - https://gerrit.fd.io/r/c/vpp/+/35624

I guess this mostly applies to cases where the container is restarted, not the entire pod.

@glazychev-art
Contributor Author

glazychev-art commented Mar 5, 2022

We know that when users update the forwarder, the pod restarts without saving the name. In this case, the correct solution is to remove the interfaces before terminating.

Moved to a separate issue - #516

@glazychev-art
Contributor Author

@edwarnicke @denis-tingaikin
I've prepared a draft PR with an idea of how restoring can be implemented - #542
There, each element is only responsible for dumping its own interfaces.
Could you have a look?
