OpenSearch managed with Terraform

March 21, 2024

Organizations across industries need to search and analyze vast amounts of information effectively, and Amazon OpenSearch Service has emerged as a powerful solution for that need.

This article not only delves into the intricacies of Amazon OpenSearch Service but also offers a practical guide to deploying and managing OpenSearch clusters using Terraform. We’ll navigate through the complexities of infrastructure provisioning, security configurations, and user management, all while adhering to best practices and maximizing efficiency.

Whether you are a DevOps engineer looking to automate your OpenSearch deployments or a newcomer, this article offers a comprehensive roadmap to harness the power of OpenSearch in the most efficient way. We’ll explore Terraform code snippets, explain essential configurations, and guide you through the entire deployment process. By the end, you’ll have a clear understanding of how Terraform can simplify and enhance your experience with AWS OpenSearch Service, enabling you to take full control of your search and analytics infrastructure.

What is OpenSearch?

Amazon OpenSearch Service is a managed service that simplifies deploying, operating, and scaling OpenSearch clusters in the AWS Cloud. It also supports legacy Elasticsearch OSS up to version 7.10, the final open-source release of that software. This gives the service one of its standout features: you can choose between the two search engines when you create a cluster.

Why Use OpenSearch?

OpenSearch is becoming an attractive choice for log analytics, real-time application monitoring, and clickstream analysis. The service comes in two flavors, managed and serverless, and they differ in the following ways:

  1. Managed vs. Serverless: OpenSearch Service offers both managed and serverless options, each with its own set of features. Managed domains are pre-provisioned clusters that provide granular control over node types and capacity management. In contrast, serverless collections offer automatic scaling and provisioning based on capacity usage, simplifying resource management and cost control.
  2. Billing: OpenSearch Service employs an EC2 instance-based billing model, where users pay for compute hours and cumulative storage size. On the other hand, OpenSearch Serverless employs an OCU-hour pricing model, factoring in compute, search/query operations, and S3 storage.
  3. Encryption: While encryption at rest is optional for domains, it’s a mandatory feature for collections in OpenSearch Serverless. This ensures that data remains secure in any scenario.
  4. Data access control: OpenSearch Service relies on IAM policies and fine-grained access control for data access within domains, giving you precise control over who can access what. In OpenSearch Serverless, data access policies let you allow users to access collections and indexes regardless of their access mechanism or network source, and you can grant access to IAM roles and SAML identities.
  5. Supported operations: Both OpenSearch Service and OpenSearch Serverless support specific subsets of OpenSearch API operations and cater to different use cases and needs.
  6. Dashboards sign-in: Sign-in procedures differ between the two options. OpenSearch Service requires a username and password, while OpenSearch Serverless signs you in automatically when you are already logged into the AWS console.
  7. APIs: Developers can interact programmatically with OpenSearch Service and OpenSearch Serverless using distinct APIs tailored to their chosen deployment option.
  8. Network access: Network settings vary between the two, with OpenSearch Service tightly coupling network access for domain and OpenSearch Dashboards endpoints, while OpenSearch Serverless offers more flexibility in configuring network access.
  9. Signing requests: OpenSearch Service and OpenSearch Serverless use different client libraries and service names when signing requests, with specific requirements for each.
  10. Version upgrades: The responsibility for upgrading to newer OpenSearch versions differs between managed and serverless options, with implications for maintenance and compatibility.
  11. Service software updates: Updates to service software are handled differently, with OpenSearch Serverless automatically incorporating bug fixes, features, and performance improvements.
  12. VPC Access: Users can configure VPC access in OpenSearch Service, while OpenSearch Serverless offers its own VPC-related options.
  13. SAML Authentication: OpenSearch Service enables SAML authentication on a per-domain basis, whereas OpenSearch Serverless employs a different approach at the account level.
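
To make the serverless side of this comparison more concrete, here is a minimal Terraform sketch of a collection together with the encryption policy it requires. It assumes a recent AWS provider release that includes the aws_opensearchserverless_* resources. The name example-collection is a placeholder and is not part of the deployment described later in this article.

```hcl
# Hypothetical collection name, used only for this sketch.
locals {
  collection_name = "example-collection"
}

# Encryption at rest is mandatory for collections, so an encryption policy
# that matches the collection has to exist before the collection is created.
resource "aws_opensearchserverless_security_policy" "encryption" {
  name = "example-encryption"
  type = "encryption"
  policy = jsonencode({
    Rules = [
      {
        Resource     = ["collection/${local.collection_name}"]
        ResourceType = "collection"
      }
    ]
    AWSOwnedKey = true # use an AWS owned key instead of a customer managed key
  })
}

resource "aws_opensearchserverless_collection" "example" {
  name = local.collection_name
  type = "TIMESERIES" # collection type suited to log analytics workloads

  depends_on = [aws_opensearchserverless_security_policy.encryption]
}
```

Note that there are no instance types or node counts here, which is the practical meaning of "automatic scaling and provisioning" in item 1.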

OpenSearch managed (domain) vs. serverless:

The comparison below lays out the differences between managed OpenSearch domains and OpenSearch Serverless in full, to help you choose the option that fits your needs.

Feature comparison: OpenSearch Service vs. OpenSearch Serverless

Domains versus collections
  • OpenSearch Service: Indexes are held in domains, which are pre-provisioned OpenSearch clusters. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/createupdatedomains.html.
  • OpenSearch Serverless: Indexes are held in collections, which are logical groupings of indexes that represent a specific workload or use case. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-manage.html.

Node types and capacity management
  • OpenSearch Service: You build a cluster with node types that meet your cost and performance specifications. You must calculate your own storage requirements and choose an instance type for your domain. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/sizing-domains.html.
  • OpenSearch Serverless: OpenSearch Serverless automatically scales and provisions additional compute units for your account based on your capacity usage. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-scaling.html.

Billing
  • OpenSearch Service: You pay for each hour of use of an EC2 instance and for the cumulative size of any EBS storage volumes attached to your instances. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/what-is.html#pricing.
  • OpenSearch Serverless: You’re charged in OCU hours for compute used for data ingestion, compute used for search and query, and storage retained in S3. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-overview.html#serverless-pricing.

Encryption
  • OpenSearch Service: Encryption at rest is optional for domains. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/encryption-at-rest.html.
  • OpenSearch Serverless: Encryption at rest is required for collections. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-encryption.html.

Data access control
  • OpenSearch Service: Access to the data within domains is determined by IAM policies and fine-grained access control (https://docs.aws.amazon.com/opensearch-service/latest/developerguide/fgac.html).
  • OpenSearch Serverless: Access to data within collections is determined by data access policies (https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-data-access.html).

Supported OpenSearch operations
  • OpenSearch Service: OpenSearch Service supports a subset of all of the OpenSearch API operations. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/supported-operations.html.
  • OpenSearch Serverless: OpenSearch Serverless supports a different subset of OpenSearch API operations. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-genref.html.

Dashboards sign-in
  • OpenSearch Service: Sign in with a username and password. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/fgac.html#fgac-dashboards.
  • OpenSearch Serverless: If you’re logged into the AWS console and navigate to your Dashboard URL, you’ll automatically log in. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-manage.html#serverless-dashboards.

APIs
  • OpenSearch Service: Interact programmatically with OpenSearch Service using the OpenSearch Service API (https://docs.aws.amazon.com/opensearch-service/latest/APIReference/Welcome.html).
  • OpenSearch Serverless: Interact programmatically with OpenSearch Serverless using the OpenSearch Serverless API (https://docs.aws.amazon.com/opensearch-service/latest/ServerlessAPIReference/Welcome.html).

Network access
  • OpenSearch Service: Network settings for a domain apply to the domain endpoint as well as the OpenSearch Dashboards endpoint. Network access for both is tightly coupled.
  • OpenSearch Serverless: Network settings for the domain endpoint and the OpenSearch Dashboards endpoint are decoupled. You can choose to not configure network access for OpenSearch Dashboards. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-network.html.

Signing requests
  • OpenSearch Service: Use the OpenSearch high and low-level REST clients to sign requests. Specify the service name as es. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/request-signing.html.
  • OpenSearch Serverless: At this time, OpenSearch Serverless supports a subset of clients that OpenSearch Service supports. When you sign requests, specify the service name as aoss. The x-amz-content-sha256 header is required. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-clients.html#serverless-signing.

OpenSearch version upgrades
  • OpenSearch Service: You manually upgrade your domains as new versions of OpenSearch become available. You’re responsible for ensuring that your domain meets the upgrade requirements, and that you’ve addressed any breaking changes.
  • OpenSearch Serverless: OpenSearch Serverless automatically upgrades your collections to new OpenSearch versions. Upgrades don’t necessarily happen as soon as a new version is available.

Service software updates
  • OpenSearch Service: You manually apply service software updates to your domain as they become available.
  • OpenSearch Serverless: OpenSearch Serverless automatically updates your collections to consume the latest bug fixes, features, and performance improvements.

VPC access
  • OpenSearch Service: You can launch your domain within a VPC (https://docs.aws.amazon.com/opensearch-service/latest/developerguide/vpc.html). You can also create additional VPC interface endpoints (https://docs.aws.amazon.com/opensearch-service/latest/developerguide/vpc-interface-endpoints.html) to access the domain.
  • OpenSearch Serverless: You create one or more VPC endpoints (https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-vpc.html) for your account. Then, you include these endpoints within network policies (https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-network.html).

SAML authentication
  • OpenSearch Service: You enable SAML authentication on a per-domain basis. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/saml.html.
  • OpenSearch Serverless: You configure one or more SAML providers at the account level, then you include the associated user and group IDs within data access policies. For more information, see https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-saml.html.
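
As an illustration of the "Data access control" entry for OpenSearch Serverless, the following sketch shows what a read-only data access policy could look like in Terraform. The variable reader_role_arn and the collection name example-collection are hypothetical, and the permission list is one plausible read-only selection rather than a recommendation.

```hcl
# Hypothetical input: the IAM role that should be allowed to read data.
variable "reader_role_arn" {
  description = "ARN of the IAM role that gets read access to the collection"
  type        = string
}

# Data access policy: grants read-only permissions on the collection and
# on every index inside it to the principal above.
resource "aws_opensearchserverless_access_policy" "read_only" {
  name = "example-read-only"
  type = "data"
  policy = jsonencode([
    {
      Rules = [
        {
          ResourceType = "index"
          Resource     = ["index/example-collection/*"]
          Permission   = ["aoss:DescribeIndex", "aoss:ReadDocument"]
        },
        {
          ResourceType = "collection"
          Resource     = ["collection/example-collection"]
          Permission   = ["aoss:DescribeCollectionItems"]
        }
      ]
      Principal = [var.reader_role_arn]
    }
  ])
}
```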

Questions to ask before deciding which service suits us best:

  • Managed or serverless? Do we want to manage the cluster ourselves or run it as serverless? Does the added value justify the price difference?
  • Users and access control for managed OpenSearch:
    • Do we want to manage users inside OpenSearch's own internal database or outside it?
    • If we manage them outside the cluster, how do we do it?
      • IAM users. With an OpenSearch domain you won’t be able to sign in to the dashboard this way, but it works for CLI access (for example, Fluent Bit).
      • Cognito. The question then becomes how to manage the Cognito users.
      • SAML, Google, Facebook, or another form of federated authentication.

Access control 

Fine-grained access control:

Fine-grained access control offers additional ways of controlling access to your data on Amazon OpenSearch Service. For example, depending on who makes the request, you might want a search to return results from only one index. You might want to hide certain fields in your documents or exclude certain documents altogether.

You can read more at: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/fgac.html
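
For readers who want to keep fine-grained access control in code as well, here is a hedged sketch of a read-only role restricted to one index pattern with one field hidden. It assumes the community opensearch-project/opensearch Terraform provider, which is not used elsewhere in this article. The variables, the logs-* pattern, and the field name are all hypothetical.

```hcl
# Assumption: the opensearch-project/opensearch provider. In a real module,
# merge this entry into the existing required_providers block.
terraform {
  required_providers {
    opensearch = {
      source = "opensearch-project/opensearch"
    }
  }
}

# Hypothetical inputs for this sketch.
variable "opensearch_endpoint" {
  description = "HTTPS endpoint of the domain, e.g. the custom endpoint"
  type        = string
}

variable "limited_role_arn" {
  description = "ARN of the IAM role assumed by limited users"
  type        = string
}

provider "opensearch" {
  url         = var.opensearch_endpoint
  healthcheck = false
}

# Read-only role: searches only see logs-* indexes, and one field is hidden.
resource "opensearch_role" "limited_reader" {
  role_name   = "limited_reader"
  description = "Read-only access to log indexes"

  index_permissions {
    index_patterns       = ["logs-*"]
    allowed_actions      = ["read"]
    field_level_security = ["~client_ip"] # "~" excludes the field
  }
}

# Map the IAM role (seen by OpenSearch as a backend role) to the role above.
resource "opensearch_roles_mapping" "limited_reader" {
  role_name     = opensearch_role.limited_reader.role_name
  backend_roles = [var.limited_role_arn]
}
```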

  • Cognito for OpenSearch (in the Terraform section we import an existing Cognito setup; you can also create it in Terraform if you need to).

This section covers Cognito users only. You can also configure federated identity providers such as Facebook, Google, Amazon, and Apple.

Before configuring Cognito for OpenSearch, we need a few prerequisites:

  • Amazon Cognito user pool
  • Amazon Cognito identity pool
  • IAM role that has the AmazonOpenSearchServiceCognitoAccess policy attached (CognitoAccessForAmazonOpenSearch)
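
If you prefer to create these prerequisites in Terraform rather than import them, a minimal sketch might look like the following. The resource labels and the user pool domain prefix are placeholders. The trust policy assumes the opensearchservice.amazonaws.com service principal. The role path is an assumption chosen to match the role ARN used in main.tf later on.

```hcl
resource "aws_cognito_user_pool" "opensearch" {
  name = "opensearch-demo"
}

# Dashboards sign-in goes through the hosted UI, so the user pool needs a
# domain. The prefix is hypothetical and must be unique within the region.
resource "aws_cognito_user_pool_domain" "opensearch" {
  domain       = "opensearch-demo-example"
  user_pool_id = aws_cognito_user_pool.opensearch.id
}

resource "aws_cognito_identity_pool" "opensearch" {
  identity_pool_name               = "opensearch-demo"
  allow_unauthenticated_identities = false
}

# Role that OpenSearch Service assumes to configure Cognito on our behalf.
resource "aws_iam_role" "cognito_access" {
  name = "CognitoAccessForAmazonOpenSearch"
  path = "/service-role/" # assumption: matches the ARN used in main.tf

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "opensearchservice.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# AWS managed policy named in the prerequisites above.
resource "aws_iam_role_policy_attachment" "cognito_access" {
  role       = aws_iam_role.cognito_access.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonOpenSearchServiceCognitoAccess"
}
```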

Deploying OpenSearch with Terraform:

While the power of OpenSearch is undeniable, the question arises: how can we streamline its deployment and management within the AWS Cloud? Enter Terraform, an Infrastructure as Code tool that lets you define and provision resources in a declarative manner.

Users Setup:

We will create two users: a master user with admin permissions and a limited user with restricted permissions.

Each user needs an IAM role.

The master user's role gets full permissions.

The limited user's role gets read-only access.

When creating each role, we want to change the “Trust relationship” to:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "cognito-identity.amazonaws.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "cognito-identity.amazonaws.com:aud": "<identity-pool-id>"
      },
      "ForAnyValue:StringLike": {
        "cognito-identity.amazonaws.com:amr": "authenticated"
      }
    }
  }]
}

For the limited user's role, attach a permissions policy that allows only HTTP GET requests:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "es:ESHttpGet",
            "Resource": "*"
        }
    ]
}
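
The same two roles can be expressed in Terraform. The sketch below reuses the trust policy and the read-only policy shown above, and relies on var.cognito_identity_pool_id from variable.tf in the Terraform section. The role names follow the ones selected for the Cognito groups in the next step. The es:* policy for the admin role is my interpretation of "full permission", not something the original setup specifies.

```hcl
locals {
  # Trust policy shared by both roles: only authenticated identities from
  # our identity pool may assume them.
  cognito_trust_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = "cognito-identity.amazonaws.com" }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "cognito-identity.amazonaws.com:aud" = var.cognito_identity_pool_id
        }
        "ForAnyValue:StringLike" = {
          "cognito-identity.amazonaws.com:amr" = "authenticated"
        }
      }
    }]
  })
}

resource "aws_iam_role" "admin_user" {
  name               = "IAMMAdminUserRole"
  assume_role_policy = local.cognito_trust_policy
}

resource "aws_iam_role" "limited_user" {
  name               = "IAMLimitedUserRole"
  assume_role_policy = local.cognito_trust_policy
}

# Assumption: "full permission" is modeled as es:* on all resources.
resource "aws_iam_role_policy" "admin_full_access" {
  name = "opensearch-full-access"
  role = aws_iam_role.admin_user.id
  policy = jsonencode({
    Version   = "2012-10-17"
    Statement = [{ Effect = "Allow", Action = "es:*", Resource = "*" }]
  })
}

# Read-only: limited users may only issue HTTP GET requests.
resource "aws_iam_role_policy" "limited_read_only" {
  name = "opensearch-read-only"
  role = aws_iam_role.limited_user.id
  policy = jsonencode({
    Version   = "2012-10-17"
    Statement = [{ Effect = "Allow", Action = "es:ESHttpGet", Resource = "*" }]
  })
}
```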

Now we create two groups, and a user for each group, inside the Cognito user pool that we created.

In the user pool, click Groups and create two groups:

admin: this group holds the users with admin permissions for OpenSearch Dashboards. Select the IAM role IAMMAdminUserRole.

limited: this group holds the users with limited permissions for OpenSearch Dashboards. Select the IAM role IAMLimitedUserRole.
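
The console steps above could also be captured in Terraform, roughly as follows. The variable user_pool_id and the username limited-user are hypothetical. The roles are looked up by the names used above and are assumed to exist already.

```hcl
# Hypothetical input: ID of the existing Cognito user pool.
variable "user_pool_id" {
  description = "ID of the Cognito user pool used for OpenSearch Dashboards"
  type        = string
}

# Look up the two IAM roles by name.
data "aws_iam_role" "admin" {
  name = "IAMMAdminUserRole"
}

data "aws_iam_role" "limited" {
  name = "IAMLimitedUserRole"
}

# One Cognito group per permission level, each tied to its IAM role.
resource "aws_cognito_user_group" "admin" {
  name         = "admin"
  user_pool_id = var.user_pool_id
  description  = "Users with admin permissions for OpenSearch Dashboards"
  role_arn     = data.aws_iam_role.admin.arn
}

resource "aws_cognito_user_group" "limited" {
  name         = "limited"
  user_pool_id = var.user_pool_id
  description  = "Users with limited permissions for OpenSearch Dashboards"
  role_arn     = data.aws_iam_role.limited.arn
}

# Example user placed in the limited group (username is a placeholder).
resource "aws_cognito_user" "limited_example" {
  user_pool_id = var.user_pool_id
  username     = "limited-user"
}

resource "aws_cognito_user_in_group" "limited_example" {
  user_pool_id = var.user_pool_id
  group_name   = aws_cognito_user_group.limited.name
  username     = aws_cognito_user.limited_example.username
}
```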

Terraform:

* This Terraform configuration sets up public access. If you want private access, you will need to configure a VPC (see vpc_options in main.tf below) and reach the domain through a tunnel into the VPC.
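
For the private-access variant, the commented-out vpc_options block in main.tf refers to a security group that the listing does not define. A minimal sketch of such a group, assuming a hypothetical vpc_id variable and HTTPS access from inside the VPC only, could be:

```hcl
# Hypothetical input: the VPC that will host the domain.
variable "vpc_id" {
  description = "ID of the VPC the OpenSearch domain is placed in"
  type        = string
}

data "aws_vpc" "selected" {
  id = var.vpc_id
}

# Allows HTTPS to the domain from addresses inside the VPC only.
resource "aws_security_group" "example" {
  name        = "opensearch-demo"
  description = "HTTPS access to the OpenSearch domain"
  vpc_id      = data.aws_vpc.selected.id

  ingress {
    description = "HTTPS from inside the VPC"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.selected.cidr_block]
  }
}
```

The subnet IDs in vpc_options would still need to come from your own VPC definition. The module.vpc references in main.tf assume a VPC module that is not shown in this article.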

variable.tf

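variable "master-iam-user" {
  description = "Name of the IAM user that acts as the master user"
  type        = string
  default     = "rotem"
}

variable "region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "eu-central-1"
}

variable "domain" {
  description = "Name of the OpenSearch domain"
  type        = string
  default     = "opensearch-demo"
}

variable "url" {
  description = "Base DNS name for the custom endpoint and the ACM certificate lookup"
  type        = string
  default     = "example.com"
}

variable "engine_version" {
  description = "OpenSearch engine version"
  type        = string
  default     = "OpenSearch_2.7"
}

variable "cognito_user_pool_name" {
  description = "Name of the existing Cognito user pool"
  type        = string
  default     = "opensearch-demo"
}

variable "cognito_identity_pool_id" {
  description = "ID of the existing Cognito identity pool"
  type        = string
  default     = "eu-central-1:*********************"
}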

main.tf

locals {
  name = "opensearch-demo"

  # Common tags applied to the resources created below
  tags = {
    name       = local.name
    Made_by    = "terraform"
    Created_by = "rotem"
  }
}

data "aws_caller_identity" "current" {}

data "aws_acm_certificate" "com" {
  domain      = "*.${var.url}" # *.example.com
  types       = ["AMAZON_ISSUED"]
  most_recent = true
}


data "aws_cognito_user_pools" "selected" {
  name = var.cognito_user_pool_name
}

resource "aws_iam_service_linked_role" "example" {
  aws_service_name = "opensearchservice.amazonaws.com"
}

data "aws_region" "current" {}

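# Master user password referenced in advanced_security_options below.
# Uses the hashicorp/random provider, which terraform init installs automatically.
resource "random_password" "password" {
  length      = 16
  min_upper   = 1
  min_lower   = 1
  min_numeric = 1
  min_special = 1
}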

resource "aws_opensearch_domain" "example" {
  domain_name    = local.name
  engine_version = var.engine_version

  domain_endpoint_options {
    custom_endpoint_enabled = true #if we want custom endpoint.
    custom_endpoint         = "opensearch-custom.${var.url}"
    custom_endpoint_certificate_arn = data.aws_acm_certificate.com.arn
    tls_security_policy             = "Policy-Min-TLS-1-2-2019-07"
    enforce_https                   = true
  }

  cluster_config {
    instance_type          = "t3.medium.search" #t3 is not supported by auto-tune.
    zone_awareness_enabled = true
    zone_awareness_config {
      availability_zone_count = 2
    }
    instance_count = 2
    warm_enabled   = false

    dedicated_master_enabled = false

  }
  # Uncomment the block below if you want VPC access.
  # vpc_options {
  #   subnet_ids = [
  #     module.vpc.public_subnets[0],
  #     module.vpc.public_subnets[1],
  #   ]
  #   security_group_ids = [aws_security_group.example.id]
  # }

  advanced_security_options {
    enabled                        = false # set to true to turn on fine-grained access control
    anonymous_auth_enabled         = true  # ignored unless enabled = true
    internal_user_database_enabled = true
    master_user_options {
      master_user_name     = "admin"
      master_user_password = random_password.password.result
    }
  }
  cognito_options {
    enabled          = true
    identity_pool_id = var.cognito_identity_pool_id
    role_arn         = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/service-role/CognitoAccessForAmazonOpenSearch"
    user_pool_id     = data.aws_cognito_user_pools.selected.ids[0]

  }



  ebs_options {
    ebs_enabled = true
    volume_size = 100
    volume_type = "gp3"
  }

  encrypt_at_rest {
    enabled = true
  }

  node_to_node_encryption {
    enabled = true
  }

  #  off_peak_window_options {
  #    enabled = true
  #    off_peak_window {
  #      window_start_time {
  #        hours   = 07
  #        minutes = 00
  #      }
  #    }
  #  }

  snapshot_options {
    automated_snapshot_start_hour = 7 # 07:00 UTC, which is 10:00 Israel daylight time
  }

  tags = local.tags
  depends_on = [aws_iam_service_linked_role.example]
}

resource "aws_opensearch_domain_policy" "opensearch" {
  domain_name = aws_opensearch_domain.example.domain_name

  access_policies = <<POLICIES
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/service-role/opensearch-demo"
      },
      "Action": "es:*",
      "Resource": "${aws_opensearch_domain.example.arn}/*"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::${data.aws_caller_identity.current.account_id}:user/fluent-bit",
          "arn:aws:iam::${data.aws_caller_identity.current.account_id}:user/<YOUR MASTER USER>"
        ]
      },
      "Action": [
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:ESHttpPatch"
      ],
      "Resource": "${aws_opensearch_domain.example.arn}/*"
    }
  ]
}
POLICIES
}

providers.tf

provider "aws" {
  region = var.region
}

terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.47"
    }
  }

  # Update to suit your needs
  # backend "s3" {
  #   bucket = "open-search-demo-tfstate"
  #   region = "eu-central-1"
  #   key    = "state/terraform.tfstate"
  # }
}

Now run:

terraform init
terraform plan
terraform apply

Keep in mind that OpenSearch might take some time to fully deploy, so why not take a break, and enjoy a cup of coffee?
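
Once the apply finishes, it helps to have Terraform print where the domain lives. The following outputs are an optional addition, not part of the original files. They assume the aws_opensearch_domain.example resource from main.tf and could live in a hypothetical outputs.tf.

```hcl
# Endpoint used for API requests (for example by Fluent Bit).
output "opensearch_endpoint" {
  description = "Domain-specific endpoint for index, search, and data upload requests"
  value       = aws_opensearch_domain.example.endpoint
}

# Endpoint for OpenSearch Dashboards, without the https:// scheme.
output "dashboards_endpoint" {
  description = "Domain-specific endpoint for OpenSearch Dashboards"
  value       = aws_opensearch_domain.example.dashboard_endpoint
}
```

Running terraform output after the apply prints both values.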

Conclusion:

As you embark on this journey, remember that OpenSearch’s versatility and Terraform’s declarative approach enable you to create, configure, and maintain your search and analytics infrastructure efficiently. Whether you opt for managed domains or serverless collections, fine-tune access control, or leverage federated identity providers like Cognito, you have the tools at your disposal to build a robust and secure solution tailored to your needs.

By combining the capabilities of Amazon OpenSearch Service and Terraform, you’re not just managing technology; you’re empowering your organization with the data-driven insights and performance it needs to thrive in today’s data-centric world.
