Update the az.create-aks task #967

Merged
merged 2 commits into from
Jul 29, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -29,6 +29,7 @@ Pulumi.*.yaml

# Kubeconfig files generated by invoke tasks
*-kubeconfig.yaml
*-config.yaml

# JetBrains IDE
.idea/
12 changes: 11 additions & 1 deletion .test_infra_config.yaml.example
@@ -23,6 +23,16 @@ configParams:
# would be the github team name

teamTag: ""

# azure configuration
azure:
# azure account where resources will be created on
# defaults to "agent-sandbox"
account: "agent-sandbox"

# local path to your public ssh key
publicKeyPath: ""

# agent related config
agent:
# Datadog API key
@@ -31,7 +41,7 @@ configParams:

apiKey: "00000000000000000000000000000000"
# Raw stack parameters for Pulumi, passed as-is to the so called ConfigMap
# There is no validation over these values, usefull to pass parameters not yet documented
# There is no validation over these values, useful to pass parameters not yet documented

stackParams:
# # namespace
52 changes: 31 additions & 21 deletions README.md
@@ -7,7 +7,7 @@ This repository contains IaC code based on Pulumi to provision dynamic test infr
To run scripts and code in this repository, you will need:

* [Go](https://golang.org/doc/install) 1.19 or later. You'll also need to set your `$GOPATH` and have `$GOPATH/bin` in your path.
* Python 3.7+ along with development libraries for tooling.
* Python 3.9+ along with development libraries for tooling.
* `account-admin` role on AWS `agent-sandbox` account. Ensure it by running

```bash
@@ -54,22 +54,32 @@ inv setup
Invoke tasks help deploying most common environments - VMs, Docker, ECS, EKS. Run `inv -l` to learn more.

```bash
inv -l
❯ inv -l
Available tasks:
create-aks Create a new AKS environment. It lasts around 5 minutes.
create-docker Create a docker environment.
create-ecs Create a new ECS environment.
create-eks Create a new EKS environment. It lasts around 20 minutes.
aws.create-vm Create a new virtual machine on aws.
az.create-vm Create a new virtual machine on azure.
destroy-aks Destroy a AKS environment created with invoke create-aks.
destroy-docker Destroy an environment created by invoke create_docker.
destroy-ecs Destroy a ECS environment created with invoke create-ecs.
destroy-eks Destroy a EKS environment created with invoke create-eks.
aws.destroy-vm Destroy a virtual machine on aws.
az.destroy-vm Destroy a virtual machine on azure.
check-s3-image-exists Verify if an image exists in the s3 repository to create a vm
retry-job Retry gitlab pipeline job
aws.create-docker Create a docker environment.
aws.create-ecs Create a new ECS environment.
aws.create-eks Create a new EKS environment. It lasts around 20 minutes.
aws.create-installer-lab
aws.create-kind Create a kind environment.
aws.create-vm Create a new virtual machine on aws.
aws.destroy-docker Destroy an environment created by invoke aws.create-docker.
aws.destroy-ecs Destroy a ECS environment created with invoke aws.create-ecs.
aws.destroy-eks Destroy a EKS environment created with invoke aws.create-eks.
aws.destroy-installer-lab
aws.destroy-kind Destroy an environment created by invoke aws.create-kind.
aws.destroy-vm Destroy a new virtual machine on aws.
az.create-aks Create a new AKS environment. It lasts around 5 minutes.
az.create-vm Create a new virtual machine on azure.
az.destroy-aks Destroy a AKS environment created with invoke az.create-aks.
az.destroy-vm Destroy a new virtual machine on azure.
ci.create-bump-pr-and-close-stale-ones-on-datadog-agent
setup.debug Debug E2E and test-infra-definitions required tools and configuration
setup.debug-keys Debug E2E and test-infra-definitions SSH keys
setup.setup (setup) Setup a local environment, interactively by default
test.check-xslt Checks the XSLT transformations in the scenarios/aws/microVMs/microvms/resources path
```
Run any `-h` on any of the available tasks for more information
@@ -111,7 +121,7 @@ In this example, we're going to create an ECS Cluster:
```
# You need to have a DD APIKey in variable DD_API_KEY
aws-vault exec sso-agent-sandbox-account-admin -- pulumi up -c scenario=aws/ecs -c ddinfra:aws/defaultKeyPairName=<your_exisiting_aws_keypair_name> -c ddinfra:env=aws/agent-sandbox -c ddagent:apiKey=$DD_API_KEY -s <your_name>-ecs-test
pulumi up -c scenario=aws/ecs -c ddinfra:aws/defaultKeyPairName=<your_exisiting_aws_keypair_name> -c ddinfra:env=aws/agent-sandbox -c ddagent:apiKey=$DD_API_KEY -s <your_name>-ecs-test
```
In case of failure, you may update some parameters or configuration and run the command again.
@@ -122,10 +132,10 @@ Note that all `-c` parameters have been set in your `Pulumi.<stack_name>.yaml` f
### Destroying a stack
Once you're finished with the test environment you've created, you can safely delete it.
To do this, we'll use the `destroy` operation referecing our `Stack` file:
To do this, we'll use the `destroy` operation referencing our `Stack` file:
```
aws-vault exec sso-agent-sandbox-account-admin -- pulumi destroy -s <your_name>-ecs-test
pulumi destroy -s <your_name>-ecs-test
```
Note that we don't need to use `-c` again as the configuration values were put into the `Stack` file.
@@ -140,21 +150,21 @@ pulumi stack rm <your_name>-ecs-test
```
# You need to have a DD APIKey in variable DD_API_KEY
aws-vault exec sso-agent-sandbox-account-admin -- pulumi up -c scenario=aws/dockervm -c ddinfra:aws/defaultKeyPairName=<your_exisiting_aws_keypair_name> -c ddinfra:env=aws/agent-sandbox -c ddagent:apiKey=$DD_API_KEY -c ddinfra:aws/defaultPrivateKeyPath=$HOME/.ssh/id_rsa -s <your_name>-docker
pulumi up -c scenario=aws/dockervm -c ddinfra:aws/defaultKeyPairName=<your_exisiting_aws_keypair_name> -c ddinfra:env=aws/agent-sandbox -c ddagent:apiKey=$DD_API_KEY -c ddinfra:aws/defaultPrivateKeyPath=$HOME/.ssh/id_rsa -s <your_name>-docker
```
## Quick start: Create an ECS EC2 (Windows/Linux) + Fargate (Linux) Cluster
```
# You need to have a DD APIKey in variable DD_API_KEY
aws-vault exec sso-agent-sandbox-account-admin -- pulumi up -c scenario=aws/ecs -c ddinfra:aws/defaultKeyPairName=<your_exisiting_aws_keypair_name> -c ddinfra:env=aws/agent-sandbox -c ddagent:apiKey=$DD_API_KEY -s <your_name>-ecs
pulumi up -c scenario=aws/ecs -c ddinfra:aws/defaultKeyPairName=<your_exisiting_aws_keypair_name> -c ddinfra:env=aws/agent-sandbox -c ddagent:apiKey=$DD_API_KEY -s <your_name>-ecs
```
## Quick start: Create an EKS (Linux/Windows) + Fargate (Linux) Cluster + Agent (Helm)
```
# You need to have a DD APIKey AND APPKey in variable DD_API_KEY / DD_APP_KEY
aws-vault exec sso-agent-sandbox-account-admin -- pulumi up -c scenario=aws/eks -c ddinfra:aws/defaultKeyPairName=<your_exisiting_aws_keypair_name> -c ddinfra:env=aws/agent-sandbox -c ddagent:apiKey=$DD_API_KEY -c ddagent:appKey=$DD_APP_KEY -s <your_name>-eks
pulumi up -c scenario=aws/eks -c ddinfra:aws/defaultKeyPairName=<your_exisiting_aws_keypair_name> -c ddinfra:env=aws/agent-sandbox -c ddagent:apiKey=$DD_API_KEY -c ddagent:appKey=$DD_APP_KEY -s <your_name>-eks
Review comment (Contributor): 👏 praise
```
## Troubleshooting
6 changes: 3 additions & 3 deletions integration-tests/invoke_test.go
@@ -60,12 +60,12 @@ func TestInvokes(t *testing.T) {
testAwsInvokeVM(t, tmpConfigFile, *workingDir)
})

t.Run("invoke-docker-vm", func(t *testing.T) {
t.Run("aws.invoke-docker-vm", func(t *testing.T) {
t.Parallel()
testInvokeDockerVM(t, tmpConfigFile, *workingDir)
})

t.Run("invoke-kind", func(t *testing.T) {
t.Run("aws.invoke-kind", func(t *testing.T) {
t.Parallel()
testInvokeKind(t, tmpConfigFile, *workingDir)
})
@@ -76,7 +76,7 @@ func testAzureInvokeVM(t *testing.T, tmpConfigFile string, workingDirectory stri

stackName := fmt.Sprintf("az-invoke-vm-%s", os.Getenv("CI_PIPELINE_ID"))
t.Log("creating vm")
createCmd := exec.Command("invoke", "az.create-vm", "--no-interactive", "--stack-name", stackName, "--config-path", tmpConfigFile, "--account", "agent-qa")
Review comment (Contributor): ❓ question: Doesn't it work with agent-qa? Or are we setting it at setup?

Reply (Member Author): I moved it to the config file: https://github.com/DataDog/test-infra-definitions/pull/967/files#diff-5b0d7f0d4cd5bb98edb0f1e6b5ac0747cae4be958137b00d9785e4863a90acdcR12. Sorry, it was not super explicit. We already do the same thing with AWS, so I just centralised everything.

createCmd := exec.Command("invoke", "az.create-vm", "--no-interactive", "--stack-name", stackName, "--config-path", tmpConfigFile)
createCmd.Dir = workingDirectory
createOutput, err := createCmd.Output()
assert.NoError(t, err, "Error found creating vm: %s", string(createOutput))
1 change: 1 addition & 0 deletions integration-tests/testfixture/config.yaml
@@ -9,6 +9,7 @@ configParams:
teamTag: test-ci
azure:
publicKeyPath: PUBLIC_KEY_PATH
account: ACCOUNT
pulumi:
logLevel: 1
logToStdErr: true
4 changes: 2 additions & 2 deletions resources/azure/environmentDefaults.go
@@ -76,7 +76,7 @@ func agentSandboxDefault() environmentDefault {
defaultVNet: "/subscriptions/9972cab2-9e99-419b-a683-86bfa77b3df1/resourceGroups/dd-agent-sandbox/providers/Microsoft.Network/virtualNetworks/dd-agent-sandbox",
defaultSubnet: "/subscriptions/9972cab2-9e99-419b-a683-86bfa77b3df1/resourceGroups/dd-agent-sandbox/providers/Microsoft.Network/virtualNetworks/dd-agent-sandbox/subnets/dd-agent-sandbox-private",
defaultSecurityGroup: "/subscriptions/9972cab2-9e99-419b-a683-86bfa77b3df1/resourceGroups/dd-agent-sandbox/providers/Microsoft.Network/networkSecurityGroups/appgategreen",
defaultInstanceType: "Standard_D2a_v4", // Allows nested virtualization for kata runtimes
defaultInstanceType: "Standard_D4s_v5", // Allows nested virtualization for kata runtimes
defaultARMInstanceType: "Standard_D4ps_v5", // No azure arm instance supports nested virtualization
aks: ddInfraAks{
linuxKataNodeGroup: true,
@@ -97,7 +97,7 @@ func agentQaDefault() environmentDefault {
defaultVNet: "/subscriptions/c767177d-c6fc-47d3-a87e-3ab195f5b99e/resourceGroups/dd-agent-qa/providers/Microsoft.Network/virtualNetworks/dd-agent-qa",
defaultSubnet: "/subscriptions/c767177d-c6fc-47d3-a87e-3ab195f5b99e/resourceGroups/dd-agent-qa/providers/Microsoft.Network/virtualNetworks/dd-agent-qa/subnets/dd-agent-qa-private",
defaultSecurityGroup: "/subscriptions/c767177d-c6fc-47d3-a87e-3ab195f5b99e/resourceGroups/dd-agent-qa/providers/Microsoft.Network/networkSecurityGroups/appgategreen",
defaultInstanceType: "Standard_D2a_v4", // Allows nested virtualization for kata runtimes
defaultInstanceType: "Standard_D4s_v5", // Allows nested virtualization for kata runtimes
defaultARMInstanceType: "Standard_D4ps_v5", // No azure arm instance supports nested virtualization
aks: ddInfraAks{
linuxKataNodeGroup: true,
39 changes: 28 additions & 11 deletions tasks/azure/aks.py
@@ -4,9 +4,12 @@
import pyperclip
import yaml
from invoke.context import Context
from invoke.exceptions import Exit
from invoke.tasks import task
from pydantic_core._pydantic_core import ValidationError

from tasks import doc, tool
from tasks import config, doc, tool
from tasks.config import get_full_profile_path
from tasks.deploy import deploy
from tasks.destroy import destroy

@@ -28,13 +31,23 @@ def create_aks(
install_agent: Optional[bool] = True,
install_workload: Optional[bool] = True,
agent_version: Optional[str] = None,
config_path: Optional[str] = None,
account: Optional[str] = None,
interactive: Optional[bool] = True,
):
"""
Create a new AKS environment. It lasts around 5 minutes.
"""

extra_flags = {}
extra_flags["ddinfra:env"] = "az/sandbox"
try:
cfg = config.get_local_config(config_path)
except ValidationError as e:
raise Exit(f"Error in config {get_full_profile_path(config_path)}") from e

extra_flags = {
"ddinfra:env": f"az/{account if account else cfg.get_azure().account}",
"ddinfra:az/defaultPublicKeyPath": cfg.get_azure().publicKeyPath,
}

full_stack_name = deploy(
ctx,
@@ -46,22 +59,26 @@ def create_aks(
install_workload=install_workload,
agent_version=agent_version,
extra_flags=extra_flags,
config_path=config_path,
)

tool.notify(ctx, "Your AKS cluster is now created")
if interactive:
tool.notify(ctx, "Your AKS cluster is now created")

_show_connection_message(ctx, full_stack_name)
_show_connection_message(ctx, full_stack_name, interactive)


@task(help={"stack_name": doc.stack_name, "yes": doc.yes})
def destroy_aks(ctx: Context, stack_name: Optional[str] = None, yes: Optional[bool] = False):
def destroy_aks(
ctx: Context, stack_name: Optional[str] = None, yes: Optional[bool] = False, config_path: Optional[str] = None
):
"""
Destroy a AKS environment created with invoke az.create-aks.
"""
destroy(ctx, scenario_name=scenario_name, stack=stack_name, force_yes=yes)
destroy(ctx, scenario_name=scenario_name, stack=stack_name, force_yes=yes, config_path=config_path)


def _show_connection_message(ctx: Context, full_stack_name: str):
def _show_connection_message(ctx: Context, full_stack_name: str, copy_to_clipboard: bool | None):
outputs = tool.get_stack_json_outputs(ctx, full_stack_name)
kubeconfig_output = yaml.safe_load(outputs["dd-Cluster-az-aks"]["kubeConfig"])
kubeconfig_content = yaml.dump(kubeconfig_output)
@@ -73,6 +90,6 @@ def _show_connection_message(ctx: Context, full_stack_name: str):
command = f"KUBECONFIG={kubeconfig} kubectl get nodes"

print(f"\nYou can run the following command to connect to the AKS cluster\n\n{command}\n")

input("Press a key to copy command to clipboard...")
pyperclip.copy(command)
if copy_to_clipboard:
input("Press a key to copy command to clipboard...")
pyperclip.copy(command)
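
Note on the `interactive` gating above: the change makes the clipboard prompt conditional, so a non-interactive run (for example in CI) no longer blocks on `input()`. A minimal sketch of that behaviour follows, under stated assumptions: `show_connection_message` is a simplified stand-in for `_show_connection_message`, and the `copy` and `prompt` parameters are hypothetical injection points added here only so the sketch runs without `pyperclip` or a terminal. The real task calls `pyperclip.copy` and `input` directly.

```python
from typing import Callable, List


def show_connection_message(
    command: str,
    copy_to_clipboard: bool,
    copy: Callable[[str], None],
    prompt: Callable[[str], str] = input,
) -> None:
    """Print the command; prompt and copy only when running interactively."""
    print(f"\nYou can run the following command to connect to the AKS cluster\n\n{command}\n")
    if copy_to_clipboard:
        # Interactive path: wait for the user, then copy the command.
        prompt("Press a key to copy command to clipboard...")
        copy(command)


if __name__ == "__main__":
    copied: List[str] = []
    # Non-interactive run: nothing is prompted for and nothing is copied.
    show_connection_message("KUBECONFIG=demo-config.yaml kubectl get nodes", False, copied.append)
    print(copied)
```

With `copy_to_clipboard=False` the function only prints, which is what lets the task be driven with `--no-interactive`.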
7 changes: 4 additions & 3 deletions tasks/azure/vm.py
@@ -52,9 +52,10 @@ def create_vm(
if not cfg.get_azure().publicKeyPath:
raise Exit("The field `azure.publicKeyPath` is required in the config file")

extra_flags = dict()
extra_flags["ddinfra:env"] = f"az/{account if account else cfg.get_azure().account}"
extra_flags["ddinfra:az/defaultPublicKeyPath"] = cfg.get_azure().publicKeyPath
extra_flags = {
"ddinfra:env": f"az/{account if account else cfg.get_azure().account}",
"ddinfra:az/defaultPublicKeyPath": cfg.get_azure().publicKeyPath,
}

if ssh_user:
extra_flags["ddinfra:sshUser"] = ssh_user
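
Note on the account resolution shared by `create_vm` and `create_aks`: an `--account` passed on the command line takes precedence, otherwise the `azure.account` value from the config file is used. The sketch below isolates that logic under stated assumptions: `AzureConfig` is a hypothetical stand-in for whatever `cfg.get_azure()` returns, and `build_extra_flags` is an illustrative helper, not a function in this repository.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AzureConfig:
    """Hypothetical stand-in for the object returned by cfg.get_azure()."""

    account: str = "agent-sandbox"
    publicKeyPath: str = ""


def build_extra_flags(
    azure_cfg: AzureConfig,
    account: Optional[str] = None,
    ssh_user: Optional[str] = None,
) -> Dict[str, str]:
    """Build the Pulumi flags; an explicit account overrides the config file."""
    extra_flags = {
        "ddinfra:env": f"az/{account if account else azure_cfg.account}",
        "ddinfra:az/defaultPublicKeyPath": azure_cfg.publicKeyPath,
    }
    # Optional flags are only added when a value was supplied.
    if ssh_user:
        extra_flags["ddinfra:sshUser"] = ssh_user
    return extra_flags


if __name__ == "__main__":
    cfg = AzureConfig(publicKeyPath="/home/me/.ssh/id_ed25519.pub")
    print(build_extra_flags(cfg))
    print(build_extra_flags(cfg, account="agent-qa", ssh_user="azureuser"))
```

Because the fallback uses a truthiness check, an empty `--account` string behaves like an omitted one and falls back to the config value.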