
Using El Carro Operator on AWS

Introduction

Google recently released a Kubernetes operator for Oracle, El Carro.

The project includes good examples of how it works on GCP and on your local computer (using minikube). Since it is a portable implementation, we wanted to give it a try on AWS by deploying an Oracle XE database.

The changes from the original instructions mainly consist of replacing GCP services with their AWS equivalents.

This post follows the steps we performed, using AWS services, to deploy an Oracle XE database on k8s with the El Carro operator.

Note that this is just an example covering steps similar to those described in the 18c XE quickstart guide for GCP; it may not be the only way to make this work on AWS. Also, the names of the resources we create in AWS are the same as in the GCP guide (for example, GCLOUD as the container database name, gke as the dbDomain, and csi-gce-pd as the storage class name). We kept them to make the guides easy to follow and compare, but you may want to customize them to meet your needs.

Preparation

I used an AWS account still in its first year, with enough free tier quota to cover most of the compute and storage needs of this test.

The first difference I found is that while you can run this test on GCP using Cloud Shell, the equivalent AWS CloudShell doesn’t support Docker containers. Looking to save time, we used a small EC2 instance (t2.micro) with the Amazon Linux 2 AMI, where Docker can be installed easily.
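
If you prefer to create this helper instance from the CLI instead of the console, a minimal sketch looks like the following; the AMI ID, key pair and security group are placeholders you need to replace with values from your own account:

$ aws ec2 run-instances \
    --image-id <amazon-linux-2-ami-id> \
    --instance-type t2.micro \
    --key-name <your-key-pair> \
    --security-group-ids <sg-allowing-ssh> \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=elcarro-builder}]'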

All steps below were executed as the ec2-user from within the EC2 instance:

1) Download the El Carro software:

$ wget https://github.com/GoogleCloudPlatform/elcarro-oracle-operator/releases/download/v0.0.0-alpha/release-artifacts.tar.gz
$ tar -xzvf release-artifacts.tar.gz

That will create the directory v0.0.0-alpha in your working directory.

2) Install git and docker:

$ sudo amazon-linux-extras install docker
$ sudo yum update -y
$ sudo service docker start
$ sudo usermod -a -G docker ec2-user
$ sudo reboot
$ sudo yum install -y git

3) Configure the AWS client:

$ aws configure

Enter the following data from your account, which is available in your AWS console:

AWS Access Key ID [None]: <your AK id>
AWS Secret Access Key [None]: <your SAK>
Default region name [None]: <your region>

Prepare environment variables for values that will be needed frequently: the region (stored in ZONE, derived from the instance’s availability zone) and the account ID:

$ export ZONE=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/\(.*\)[a-z]/\1/') ; echo $ZONE
$ export ACC_ID=$(aws sts get-caller-identity --query "Account" --output text) ; echo $ACC_ID

Creating a Containerized Oracle Database Image

The following steps create a Docker container image and push it into an AWS ECR repository, automating the process with AWS CodeBuild.

1) Create a service role, needed by CodeBuild, as described in the AWS CodeBuild documentation:

$ vi create-service-role.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "codebuild.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

$ vi put-role-policy.json
$ cat put-role-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CloudWatchLogsPolicy",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Sid": "CodeCommitPolicy",
      "Effect": "Allow",
      "Action": [
        "codecommit:GitPull"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Sid": "S3GetObjectPolicy",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Sid": "S3PutObjectPolicy",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Sid": "S3BucketIdentity",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketAcl",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:CompleteLayerUpload",
        "ecr:GetAuthorizationToken",
        "ecr:InitiateLayerUpload",
        "ecr:PutImage",
        "ecr:UploadLayerPart"
      ],
      "Resource": "*",
      "Effect": "Allow"
   }
  ]
}

$ aws iam create-role --role-name CodeBuildServiceRole --assume-role-policy-document file://create-service-role.json
$ aws iam put-role-policy --role-name CodeBuildServiceRole --policy-name CodeBuildServiceRolePolicy --policy-document file://put-role-policy.json
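
Optionally, confirm the role and its inline policy are in place before moving on:

$ aws iam get-role --role-name CodeBuildServiceRole
$ aws iam list-role-policies --role-name CodeBuildServiceRole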

2) Create an ECR repository. We are naming it elcarro:

$ aws ecr create-repository \
>     --repository-name elcarro \
>     --image-scanning-configuration scanOnPush=true \
>     --region ${ZONE}
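
On success the command prints the repository details; they can be listed again at any time with:

$ aws ecr describe-repositories --repository-names elcarro --region ${ZONE}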

3) Create the build instructions file that builds the container image and pushes it to the registry. We also include in that directory the scripts used by those steps: the Dockerfile and the shell scripts called from it:

$ mkdir cloudbuild 
$ cp v0.0.0-alpha/dbimage/* cloudbuild/ 
$ cd cloudbuild/ 
[cloudbuild]$ rm cloudbuild.yaml cloudbuild-18c-xe.yaml image_build.sh 
[cloudbuild]$ vi buildspec.yml 
[cloudbuild]$ cat buildspec.yml
version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...          
      - docker build --no-cache --build-arg=DB_VERSION=18c --build-arg=CDB_NAME=GCLOUD --build-arg=CHARACTER_SET=$_CHARACTER_SET --build-arg=EDITION=xe -t $IMAGE_REPO_NAME:$IMAGE_TAG .
      - docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG 
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker image...
      - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

4) Prepare a zip with all those files:

[cloudbuild]$ zip ../elCarroImage.zip *

5) Create an S3 bucket and copy the zip file:

$ aws s3 mb s3://codebuild-${ZONE}-${ACC_ID}-input-bucket --region $ZONE
$ aws s3 --region $ZONE cp elCarroImage.zip s3://codebuild-${ZONE}-${ACC_ID}-input-bucket --acl public-read
upload: ./elCarroImage.zip to s3://codebuild-<ZONE>-<ACC_ID>-input-bucket/elCarroImage.zip

6) Prepare a build JSON file for this project. Replace the strings <$ZONE> and <$ACC_ID> with their actual values, as this is a static file. Note that buildspec.yml references a $_CHARACTER_SET variable, so we also define it here as an environment variable (AL32UTF8):

$ cat elcarro-ECR.json
{
  "name": "elCarro",
  "source": {
    "type": "S3",
    "location": "codebuild-<$ZONE>-<$ACC_ID>-input-bucket/elCarroImage.zip"
  },
  "artifacts": {
    "type": "NO_ARTIFACTS"
  },
  "environment": {
    "type": "LINUX_CONTAINER",
    "image": "aws/codebuild/standard:4.0",
    "computeType": "BUILD_GENERAL1_SMALL",
    "environmentVariables": [
      {
        "name": "AWS_DEFAULT_REGION",
        "value": "<$ZONE>"
      },
      {
        "name": "AWS_ACCOUNT_ID",
        "value": "<$ACC_ID>"
      },
      {
        "name": "IMAGE_REPO_NAME",
        "value": "elcarro"
      },
      {
        "name": "IMAGE_TAG",
        "value": "latest"
      },
      {
        "name": "_CHARACTER_SET",
        "value": "AL32UTF8"
      }
    ],
    "privilegedMode": true
  },
  "serviceRole": "arn:aws:iam::<$ACC_ID>:role/CodeBuildServiceRole"

7) Create the build project:

$ aws codebuild create-project --cli-input-json file://elcarro-ECR.json

8) Finally, start the build:

$ aws codebuild start-build --project-name elCarro

To check progress from the CLI:

$ aws codebuild list-builds
{
    "ids": [
        "elCarro:59ef7722-06d6-4a1d-b2d5-14d967e9e4df"
    ]
}

We can see build details using the latest ID from the above output:

$ aws codebuild batch-get-builds --ids <elCarro:59ef7722-06d6-4a1d-b2d5-14d967e9e4df>
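
If you only care about the overall state, the same command can be narrowed down to the build status (IN_PROGRESS, SUCCEEDED or FAILED):

$ aws codebuild batch-get-builds --ids <elCarro:59ef7722-06d6-4a1d-b2d5-14d967e9e4df> --query 'builds[0].buildStatus' --output text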

The build takes around 20 minutes. Once completed successfully, we can see the image in our ECR repo:

$ aws ecr list-images --repository-name elcarro
{
    "imageIds": [
        {
            "imageTag": "latest",
            "imageDigest": "sha256:7a2bd504abdf7c959332601b1deef98dda19418a252b510b604779a6143ec809"
        }
    ]
}

Create a Kubernetes cluster

We need to install the CLI utils first:

$ curl -o kubectl https://amazon-eks.s3.us-west-2.amazonaws.com/1.20.4/2021-04-12/bin/linux/amd64/kubectl
$ chmod +x ./kubectl
$ mkdir -p $HOME/bin && cp ./kubectl $HOME/bin/kubectl && export PATH=$PATH:$HOME/bin
$ echo 'export PATH=$PATH:$HOME/bin' >> ~/.bashrc
$ kubectl version --short --client
Client Version: v1.20.4-eks-6b7464
   
$ curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
$ sudo mv /tmp/eksctl /usr/local/bin
$ eksctl version
0.55.0

To keep the upcoming steps clear, we define a few more bash variables for frequently used values, in addition to the ACC_ID and ZONE variables already set:

export PATH_TO_EL_CARRO_AWS=/home/ec2-user/elcarro-aws
export PATH_TO_EL_CARRO_GCP=/home/ec2-user/v0.0.0-alpha
export DBNAME=GCLOUD
export CRD_NS=db
export CLUSTER_NAME=gkecluster

We need to create a new SSH key pair for this test, named myekskeys. It is stored in the account, so there’s no need to repeat this step if you deploy a second test in the future:

$ aws ec2 create-key-pair --region ${ZONE} --key-name myekskeys

Next, we’ll create a local file with the content of the private key, using output from the previous command:

$ vi myekskeys.priv
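
Alternatively, if the key pair hasn’t been created yet, both steps can be combined by extracting the key material directly into the file:

$ aws ec2 create-key-pair --region ${ZONE} --key-name myekskeys \
    --query 'KeyMaterial' --output text > myekskeys.priv
$ chmod 400 myekskeys.priv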

We are ready to create the cluster. It will take about 25 minutes to complete:

$ eksctl create cluster \
    --name $CLUSTER_NAME \
    --region $ZONE \
    --with-oidc \
    --ssh-access \
    --ssh-public-key myekskeys \
    --managed
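
eksctl writes the kubeconfig for the new cluster automatically. If you later reconnect to the EC2 instance with a fresh environment, the kubeconfig can be regenerated with:

$ aws eks update-kubeconfig --region ${ZONE} --name ${CLUSTER_NAME}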

These are basic checks to monitor the progress and resources as they are created:

aws eks list-clusters
aws eks describe-cluster --name $CLUSTER_NAME
eksctl get addons --cluster $CLUSTER_NAME
kubectl get nodes --show-labels --all-namespaces
kubectl get pods -o wide --all-namespaces --show-labels
kubectl get svc --all-namespaces
kubectl describe pod --all-namespaces
kubectl get events --sort-by=.metadata.creationTimestamp

Add the EBS CSI driver

We need to install the AWS EBS CSI driver:

$ mkdir -p ${PATH_TO_EL_CARRO_AWS}
$ cd ${PATH_TO_EL_CARRO_AWS}
$ git clone https://github.com/kubernetes-sigs/aws-ebs-csi-driver.git

The configuration files in the El Carro project are set to use GCP storage; we need to adjust a few of them to use the AWS EBS driver. We will create new configuration files in the directory elcarro-aws/deploy/csi based on the original ones shipped with the release:

$ mkdir -p deploy/csi
$ cp ${PATH_TO_EL_CARRO_GCP}/deploy/csi/gce_pd_storage_class.yaml deploy/csi/aws_pd_storage_class.yaml
$ cp ${PATH_TO_EL_CARRO_GCP}/deploy/csi/gce_pd_volume_snapshot_class.yaml deploy/csi/aws_pd_volume_snapshot_class.yaml

$ vi deploy/csi/aws_pd_storage_class.yaml
$ cat deploy/csi/aws_pd_storage_class.yaml
# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-gce-pd
# provisioner: pd.csi.storage.gke.io
provisioner: ebs.csi.aws.com
parameters:
# type: pd-standard
  type: gp2
volumeBindingMode: WaitForFirstConsumer

$ vi deploy/csi/aws_pd_volume_snapshot_class.yaml
$ cat deploy/csi/aws_pd_volume_snapshot_class.yaml
# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-gce-pd-snapshot-class
# driver: pd.csi.storage.gke.io
driver: ebs.csi.aws.com
deletionPolicy: Delete
$

The policy below only needs to be created once per account, so there’s no need to re-execute this step if you’re repeating the cluster creation:

$ export OIDC_ID=$(aws eks describe-cluster --name $CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | rev | cut -d"/" -f1 | rev)

$ aws iam list-open-id-connect-providers | grep $OIDC_ID
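
If the grep above returns nothing, the cluster has no IAM OIDC provider associated yet (it should, since we created it with --with-oidc); it can be associated with:

$ eksctl utils associate-iam-oidc-provider --cluster $CLUSTER_NAME --approve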

$ aws iam create-policy \
    --policy-name AmazonEKS_EBS_CSI_Driver_Policy \
    --policy-document file://${PATH_TO_EL_CARRO_AWS}/aws-ebs-csi-driver/docs/example-iam-policy.json

The following steps are required for every new cluster where you configure the EBS CSI driver:

$ eksctl create iamserviceaccount \
    --name ebs-csi-controller-sa \
    --namespace kube-system \
    --cluster $CLUSTER_NAME \
    --attach-policy-arn arn:aws:iam::${ACC_ID}:policy/AmazonEKS_EBS_CSI_Driver_Policy \
    --approve \
    --override-existing-serviceaccounts

$ export ROLE_ARN=$(aws cloudformation describe-stacks --stack-name eksctl-${CLUSTER_NAME}-addon-iamserviceaccount-kube-system-ebs-csi-controller-sa --query='Stacks[].Outputs[?OutputKey==`Role1`].OutputValue' --output text)
$ echo $ROLE_ARN
    
$ kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"

This step appears in the AWS docs but doesn’t seem to be needed. It may have been necessary for an older release, so we’re keeping it just to stay consistent with the official docs:

$ kubectl annotate serviceaccount ebs-csi-controller-sa \
    -n kube-system \
    eks.amazonaws.com/role-arn=$ROLE_ARN

We are ready to add our storage class into the cluster: 

$ kubectl create -f ${PATH_TO_EL_CARRO_AWS}/deploy/csi/aws_pd_storage_class.yaml

Next, we complete the storage configuration by installing the volume snapshot CRDs and the snapshot controller:

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml

And now we add the volume snapshot class to the cluster:

$ kubectl create -f ${PATH_TO_EL_CARRO_AWS}/deploy/csi/aws_pd_volume_snapshot_class.yaml
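
A quick check confirms the snapshot class is registered:

$ kubectl get volumesnapshotclass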

At this point, the cluster is ready to deploy the operator. 

A few basic sanity checks of our k8s cluster can be performed now, to compare against later if any problem appears:

$ kubectl get storageclass --all-namespaces
$ kubectl describe storageclass
$ kubectl get pods -o wide --all-namespaces
$ kubectl get events --sort-by=.metadata.creationTimestamp

Deploy El Carro operator and create the DB

We will use the code provided by the GCP project:

$ kubectl apply -f ${PATH_TO_EL_CARRO_GCP}/operator.yaml
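
In our case the operator components land in the operator-system namespace; a quick check confirms the controller manager pod is running:

$ kubectl get pods -n operator-system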

Now we are ready to create an Oracle 18c XE database. 

We took the original YAML file for this and updated it to use the proper URI for the DB image, the one we created a few steps ago (replace the account ID and region in the image URI below with your own). There’s no need to adjust the storageClass name, as we deployed it to the cluster with the same name used by the GCP example, but pointing to the correct AWS driver:

$ vi ${PATH_TO_EL_CARRO_AWS}/samples/v1alpha1_instance_18c_XE_express.yaml
$ cat ${PATH_TO_EL_CARRO_AWS}/samples/v1alpha1_instance_18c_XE_express.yaml
apiVersion: oracle.db.anthosapis.com/v1alpha1
kind: Instance
metadata:
  name: mydb
spec:
  type: Oracle
  version: "18c"
  edition: Express
  dbDomain: "gke"
  disks:
  - name: DataDisk
    storageClass: csi-gce-pd
  - name: LogDisk
    storageClass: csi-gce-pd
  services:
    Backup: true
    Monitoring: false
    Logging: false
    HA/DR: false
  images:
    # Replace below with the actual URIs hosting the service agent images.
    # service: "gcr.io/${PROJECT_ID}/oracle-database-images/oracle-18c-xe-seeded-${DB}"
    service: "<${ACC_ID}>.dkr.ecr.us-east-2.amazonaws.com/elcarro:latest"
  sourceCidrRanges: [0.0.0.0/0]
  # Oracle SID character limit is 8, anything > gets truncated by Oracle
  cdbName: GCLOUD

Now we continue with the steps described in the GCP guide to deploy the 18c XE database:

$ export CRD_NS=db
$ kubectl create ns ${CRD_NS} 
$ kubectl get ns ${CRD_NS}
  
$ kubectl apply -f ${PATH_TO_EL_CARRO_AWS}/samples/v1alpha1_instance_18c_XE_express.yaml -n ${CRD_NS}

To monitor the progress of the deployment, we can check every component. The most relevant information is on the mydb-sts-0 pod:

kubectl get instances -n $CRD_NS
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl get nodes
kubectl get pods -n $CRD_NS
kubectl get pvc -n $CRD_NS
kubectl get svc -n $CRD_NS
kubectl logs mydb-agent-deployment-<id> -c config-agent -n $CRD_NS
kubectl logs mydb-agent-deployment-<id> -c oracle-monitoring -n $CRD_NS
kubectl logs mydb-sts-0 -c oracledb -n $CRD_NS
kubectl logs mydb-sts-0 -c dbinit -n $CRD_NS
kubectl logs mydb-sts-0 -c dbdaemon -n $CRD_NS
kubectl logs mydb-sts-0 -c alert-log-sidecar -n $CRD_NS
kubectl logs mydb-sts-0 -c listener-log-sidecar -n $CRD_NS

kubectl describe pods mydb-sts-0 -n $CRD_NS
kubectl describe pods mydb-agent-deployment-<id> -n $CRD_NS

kubectl describe node ip-<our-ip>.compute.internal

After a few minutes, the database instance (CDB) is created:

$ kubectl get pods -n $CRD_NS
NAME                                     READY   STATUS    RESTARTS   AGE
mydb-agent-deployment-6f7748f88b-2wlp9   1/1     Running   0          11m
mydb-sts-0                               4/4     Running   0          11m

$ kubectl get instances -n $CRD_NS
NAME   DB ENGINE   VERSION   EDITION   ENDPOINT      URL                                                                           DB NAMES   BACKUP ID   READYSTATUS   READYREASON      DBREADYSTATUS   DBREADYREASON
mydb   Oracle      18c       Express   mydb-svc.db   af8b37a22dc304391b51335848c9f7ff-678637856.us-east-2.elb.amazonaws.com:6021                          True          CreateComplete   True            CreateComplete

The last step is to deploy the database (PDB):

$ kubectl apply -f ${PATH_TO_EL_CARRO_GCP}/samples/v1alpha1_database_pdb1_express.yaml -n ${CRD_NS}
database.oracle.db.anthosapis.com/pdb1 created

After this completes, we can see the instance status now includes the PDB:

$ kubectl get instances -n $CRD_NS
NAME   DB ENGINE   VERSION   EDITION   ENDPOINT      URL                                                                           DB NAMES   BACKUP ID   READYSTATUS   READYREASON      DBREADYSTATUS   DBREADYREASON
mydb   Oracle      18c       Express   mydb-svc.db   af8b37a22dc304391b51335848c9f7ff-678637856.us-east-2.elb.amazonaws.com:6021   ["pdb1"]               True          CreateComplete   True            CreateComplete

We can connect to the database either through the listener endpoint (af8b37a22dc304391b51335848c9f7ff-678637856.us-east-2.elb.amazonaws.com:6021/pdb1.gke) or directly inside the container:

$ kubectl  exec -it -n db mydb-sts-0 -c oracledb -- bash -i
$ . oraenv <<<GCLOUD
bash-4.2$ sqlplus / as sysdba

SQL*Plus: Release 18.0.0.0.0 - Production on Wed Jul 7 21:03:28 2021
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle.  All rights reserved.


Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO

SQL> set lines 180 pages 180
SQL> select * from v$session_connect_info;
...
12 rows selected.

SQL> exit
Disconnected from Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0
bash-4.2$
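
For the remote option mentioned above, a client connects with the usual easy-connect syntax against the load balancer endpoint; the user and password below are placeholders for whatever application user you create in the PDB:

$ sqlplus <user>/<password>@//af8b37a22dc304391b51335848c9f7ff-678637856.us-east-2.elb.amazonaws.com:6021/pdb1.gke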

Next steps

This setup is just the first step: provisioning an Oracle database in a k8s cluster on AWS EKS.

It is now ready to use the features implemented by the operator, such as backups using snapshots or RMAN, restores from a backup, or exporting and importing data using Data Pump.

We will explore that outside of GCP in upcoming blog posts.
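
As a quick preview, the release artifacts already ship sample manifests for those features; for example, a snapshot-based backup of our instance would be applied roughly like this (the exact sample file name may differ between releases, so list the samples directory first):

$ ls ${PATH_TO_EL_CARRO_GCP}/samples/ | grep -i backup
$ kubectl apply -f ${PATH_TO_EL_CARRO_GCP}/samples/v1alpha1_backup_snap1.yaml -n ${CRD_NS}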

Deleting resources after testing

After our tests are finished, we should delete the cluster to avoid charges if we don’t plan to continue using it:

$ kubectl get svc --all-namespaces

$ kubectl delete svc mydb-svc -n ${CRD_NS} 

$ eksctl delete cluster --name $CLUSTER_NAME

Even if the output from the last command shows it completed successfully, verify that it did by inspecting the detailed output with the command below. A couple of times we found “The following resource(s) failed to delete: [InternetGateway, SubnetPublicUSEAST2A, VPC, VPCGatewayAttachment, SubnetPublicUSEAST2B].” and had to delete those resources manually from the AWS console:

$ eksctl utils describe-stacks --region=${ZONE} --cluster=$CLUSTER_NAME
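
The ECR repository, the S3 bucket and the CodeBuild project created earlier are not part of the cluster, so if you want to remove everything from this test they must be deleted separately, for example:

$ aws codebuild delete-project --name elCarro
$ aws ecr delete-repository --repository-name elcarro --force --region ${ZONE}
$ aws s3 rb s3://codebuild-${ZONE}-${ACC_ID}-input-bucket --force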

Conclusion

Since El Carro is based on open standards and technologies (such as Kubernetes), porting it to another cloud provider such as AWS isn’t that difficult. Though some minor changes are required, overall it’s not much of a departure from the GCP process. Maybe additional curated quickstart guides for other cloud providers will be created based on experiences like this, which could ease its adoption outside of GCP.
