You want to use CHAOSSEARCH for searching and indexing your data, and that's great. You're going to need a few things: a bucket to store data, and a role for CHAOSSEARCH to assume so it can read out of that bucket. If you want to use LiveIndexing, you'll need an SQS queue and a bucket notification policy to notify the system that new objects have arrived. And of course you want everything in the bucket encrypted, so your bucket should have KMS encryption turned on. You could do these steps manually (we have some great documentation), or you could use some automation tooling to make it easier. For this blog post we're going to use Terraform, but if you are interested in a CloudFormation example, you can get that here.
Have you heard the joke about the person who decided to use regular expressions to solve a problem? And how now they have two problems? You could substitute Terraform for regular expressions and the joke would still work. That said, Terraform is pretty awesome, and it has a destroy function. It's a great way to get set up quickly so you can play, and when you're done, a single command tears the environment back down.
Terraform uses your working directory for context. Create a new directory called cs_terraform and make a new file called cs_env.tf:
mkdir cs_terraform
cd cs_terraform
touch cs_env.tf
Edit cs_env.tf with your favorite editor. You will want to fill in your customer_id and the name of the bucket you want to create. If you aren't running in us-east-1, you'll want to modify that value as well.
provider "aws" { alias = "us-east-1" region = "us-east-1" version = "~> 2.10" } module "s3_customer_buckets" { source = "git::https://github.com/ChaosSearch/terraform-modules.git//encrypted-s3-bucket-live-indexing" region = "us-east-1" cs_external_id = "YOUR_CUSTOMER_ID" cs_data_bucket = "YOUR_BUCKET_NAME" sqs_queue = true providers = { aws = "aws.us-east-1" } }
This configuration will create several objects; we'll walk through each of them when we review the plan output below.
Running terraform init will download the AWS provider plugin and pull the Terraform module down to your local machine.
terraform init
Output
Initializing modules...
- module.s3_customer_buckets

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "aws" (2.10.0)...

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Are plans required in the Terraform world? No, not really. If you don't provide a saved plan, Terraform runs one for you before an apply. HOWEVER, you should run plans before you run applies. It makes you look more professional, and it's safer. Don't you want to know what Terraform is going to do before you run a command that may alter your environment?
In this example, we save the plan to a file called cs_test for additional safety and control.
Note: I pipe Terraform's output through scenery to clean it up.
terraform plan -out cs_test | scenery
Output
+ module.s3_customer_buckets.aws_iam_policy.cs_logging_server_side_role_policy
      id:
      arn:
      name: "cs_logging_server_side_role_policy"
      path: "/"
      policy: "{ "Statement": [ { "Action": [ "s3:Get*", "s3:List*", "s3:PutObjectTagging" ], "Effect": "Allow", "Resource": [ "${aws_s3_bucket.cs_data_bucket.arn}", "${aws_s3_bucket.cs_data_bucket.arn}/*" ] }, { "Action": [ "s3:ListAllMyBuckets" ], "Effect": "Allow", "Resource": "*" }, { "Action": "*", "Effect": "Allow", "Resource": [ "arn:aws:s3:::cs-${var.cs_external_id}" ] }, { "Action": "*", "Effect": "Allow", "Resource": [ "arn:aws:s3:::cs-${var.cs_external_id}/*" ] }, { "Action": [ "kms:GenerateDataKey", "kms:Decrypt" ], "Effect": "Allow", "Resource": [ "${aws_kms_key.cs_data_bucket_key.arn}" ] } ], "Version": "2012-10-17" }"

+ module.s3_customer_buckets.aws_iam_role.cs_logging_server_side_role
      id:
      arn:
      assume_role_policy: "{ "Statement": [ { "Action": "sts:AssumeRole", "Effect": "Allow", "Principal": { "Service": "ec2.amazonaws.com" } }, { "Action": "sts:AssumeRole", "Condition": { "StringEquals": { "sts:ExternalId": "98ffdf9d-bc56-4943-b89e-113b982e3ec2" } }, "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::515570774723:root" } } ], "Version": "2012-10-17" }"
      create_date:
      force_detach_policies: "false"
      max_session_duration: "3600"
      name: "cs_logging_server_side_role"
      path: "/"
      unique_id:

+ module.s3_customer_buckets.aws_iam_role_policy_attachment.cs_logging_server_side_role_policy_attach
      id:
      policy_arn: "${aws_iam_policy.cs_logging_server_side_role_policy.arn}"
      role: "cs_logging_server_side_role"

+ module.s3_customer_buckets.aws_kms_alias.cs_data_bucket_key
      id:
      arn:
      name: "alias/cs_cs-blog-test_us-east-1"
      target_key_arn:
      target_key_id: "${aws_kms_key.cs_data_bucket_key.key_id}"

+ module.s3_customer_buckets.aws_kms_key.cs_data_bucket_key
      id:
      arn:
      deletion_window_in_days: "10"
      description: "This key is used to encrypt cs-blog-test-us-east-1"
      enable_key_rotation: "false"
      is_enabled: "true"
      key_id:
      key_usage:
      policy:

+ module.s3_customer_buckets.aws_s3_bucket.cs_data_bucket
      id:
      acceleration_status:
      acl: "private"
      arn:
      bucket: "cs-blog-test-us-east-1"
      bucket_domain_name:
      bucket_regional_domain_name:
      force_destroy: "false"
      hosted_zone_id:
      lifecycle_rule.#: "1"
      lifecycle_rule.0.abort_incomplete_multipart_upload_days: "7"
      lifecycle_rule.0.enabled: "true"
      lifecycle_rule.0.expiration.#: "1"
      lifecycle_rule.0.expiration.2843080737.date: ""
      lifecycle_rule.0.expiration.2843080737.days: "31"
      lifecycle_rule.0.expiration.2843080737.expired_object_delete_marker: ""
      lifecycle_rule.0.id: "cleanup_after_30_days"
      lifecycle_rule.0.transition.#: "1"
      lifecycle_rule.0.transition.1899348542.date: ""
      lifecycle_rule.0.transition.1899348542.days: "30"
      lifecycle_rule.0.transition.1899348542.storage_class: "GLACIER"
      region:
      request_payer:
      server_side_encryption_configuration.#: "1"
      server_side_encryption_configuration.0.rule.#: "1"
      server_side_encryption_configuration.0.rule.0.apply_server_side_encryption_by_default.#: "1"
      server_side_encryption_configuration.0.rule.0.apply_server_side_encryption_by_default.0.kms_master_key_id: "${aws_kms_key.cs_data_bucket_key.arn}"
      server_side_encryption_configuration.0.rule.0.apply_server_side_encryption_by_default.0.sse_algorithm: "aws:kms"
      tags.%: "1"
      tags.Name: "cs-blog-test-us-east-1"
      versioning.#: "1"
      versioning.0.enabled: "true"
      versioning.0.mfa_delete: "false"
      website_domain:
      website_endpoint:

+ module.s3_customer_buckets.aws_s3_bucket_notification.cs_data_bucket_notification
      id:
      bucket: "${aws_s3_bucket.cs_data_bucket.id}"
      queue.#: "1"
      queue.0.events.#: "1"
      queue.0.events.3356830603: "s3:ObjectCreated:*"
      queue.0.id:
      queue.0.queue_arn: "${aws_sqs_queue.cs_s3_bucket_sqs.arn}"

+ module.s3_customer_buckets.aws_sqs_queue.cs_s3_bucket_sqs
      id:
      arn:
      content_based_deduplication: "false"
      delay_seconds: "0"
      fifo_queue: "false"
      kms_data_key_reuse_period_seconds:
      max_message_size: "2048"
      message_retention_seconds: "86400"
      name: "s3-sqs-cs-blog-test-us-east-1"
      policy:
      receive_wait_time_seconds: "0"
      tags.%: "1"
      tags.Bucket: "cs-blog-test-us-east-1"
      visibility_timeout_seconds: "480"

+ module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs
      id:
      policy: "{ "Statement": [ { "Action": "sqs:SendMessage", "Condition": { "ArnEquals": { "aws:SourceArn": "${aws_s3_bucket.cs_data_bucket.arn}" } }, "Effect": "Allow", "Principal": "*", "Resource": "${aws_sqs_queue.cs_s3_bucket_sqs.arn}" }, { "Action": "sqs:*", "Effect": "Allow", "Principal": { "AWS": [ "${aws_iam_role.cs_logging_server_side_role.arn}" ] }, "Resource": "${aws_sqs_queue.cs_s3_bucket_sqs.arn}" } ], "Version": "2012-10-17" }"
      queue_url: "${aws_sqs_queue.cs_s3_bucket_sqs.id}"

Plan: 9 to add, 0 to change, 0 to destroy.
Let's dig into some of the output from that plan.
Create a bucket to stash the data.
+ module.s3_customer_buckets.aws_s3_bucket.cs_data_bucket
Create a KMS key, and alias, to encrypt your s3 objects at rest transparently.
+ module.s3_customer_buckets.aws_kms_alias.cs_data_bucket_key
+ module.s3_customer_buckets.aws_kms_key.cs_data_bucket_key
Create a notification for new objects added to the bucket.
+ module.s3_customer_buckets.aws_s3_bucket_notification.cs_data_bucket_notification
Create an SQS queue for the bucket to send notifications to when new objects are added, along with a queue policy that allows the bucket and the CHAOSSEARCH role to interact with that queue.
+ module.s3_customer_buckets.aws_sqs_queue.cs_s3_bucket_sqs
+ module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs
Create an IAM role (and its attached policy) that CHAOSSEARCH will assume to interact with your bucket and these objects.
+ module.s3_customer_buckets.aws_iam_policy.cs_logging_server_side_role_policy
+ module.s3_customer_buckets.aws_iam_role.cs_logging_server_side_role
+ module.s3_customer_buckets.aws_iam_role_policy_attachment.cs_logging_server_side_role_policy_attach
After reviewing the plan, everything looks good. Everything is being created, nothing deleted, nothing changed. Let's apply the plan.
terraform apply cs_test
Applying the plan got us an error! Let's dig in.
Error: Error applying plan:

1 error(s) occurred:

* module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs: 1 error(s) occurred:

* aws_sqs_queue_policy.cs_s3_bucket_sqs: Error updating SQS attributes: InvalidAttributeValue: Invalid value for the parameter Policy.
      status code: 400, request id: c267dace-4173-5178-8ff4-8fae86fb9713

Terraform does not automatically rollback in the face of errors. Instead, your
Terraform state file has been partially updated with any resources that
successfully completed. Please address the error above and apply again to
incrementally change your infrastructure.
In truth, while testing this blog we didn't hit the error that often, maybe 1 run out of 10. But I've seen errors like this before: they are "eventual consistency" errors. Just because we created a resource in AWS and got back its ID or ARN doesn't mean that all the other AWS services are aware of the new resource yet. These errors are transient and normally don't last more than a few tens of seconds.
In this case, we are trying to attach a policy to the SQS queue that references the ARN of the S3 bucket, which apparently isn't known across all the AWS services yet. Running another plan and apply will (probably) finish successfully, but the first few times you run into eventual consistency it feels pretty jarring.
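If you hit this kind of race often and control the Terraform code yourself, one common workaround is to put an explicit delay between the bucket and the resources that reference it. The sketch below is not part of the CHAOSSEARCH module; it assumes a newer Terraform (0.12+) and the hashicorp/time provider, and the resource names are illustrative:

# Sketch only: wait a bit after the bucket exists before creating dependents.
# Assumes Terraform 0.12+ and the hashicorp/time provider.
resource "time_sleep" "wait_for_bucket" {
  depends_on      = [aws_s3_bucket.cs_data_bucket]
  create_duration = "30s"
}

# The resource that trips over eventual consistency (here, the SQS queue policy)
# would then add: depends_on = [time_sleep.wait_for_bucket]

In practice, though, simply re-running plan and apply is usually all you need.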
Output from running another plan:
+ module.s3_customer_buckets.aws_s3_bucket_notification.cs_data_bucket_notification
      id:
      bucket: "cs-blog-test-us-east-1"
      queue.#: "1"
      queue.0.events.#: "1"
      queue.0.events.3356830603: "s3:ObjectCreated:*"
      queue.0.id:
      queue.0.queue_arn: "arn:aws:sqs:us-east-1:095701894487:s3-sqs-cs-blog-test-us-east-1"

+ module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs
      id:
      policy: "{ "Statement": [ { "Action": "sqs:SendMessage", "Condition": { "ArnEquals": { "aws:SourceArn": "arn:aws:s3:::cs-blog-test-us-east-1" } }, "Effect": "Allow", "Principal": "*", "Resource": "arn:aws:sqs:us-east-1:095701894487:s3-sqs-cs-blog-test-us-east-1" }, { "Action": "sqs:*", "Effect": "Allow", "Principal": { "AWS": [ "arn:aws:iam::095701894487:role/cs_logging_server_side_role" ] }, "Resource": "arn:aws:sqs:us-east-1:095701894487:s3-sqs-cs-blog-test-us-east-1" } ], "Version": "2012-10-17" }"
      queue_url: "https://sqs.us-east-1.amazonaws.com/095701894487/s3-sqs-cs-blog-test-us-east-1"

Plan: 2 to add, 0 to change, 0 to destroy.
Which applies just fine:
module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs: Creating...
  policy: "" => "{\n \"Version\": \"2012-10-17\",\n \"Statement\": [\n {\n \"Effect\": \"Allow\",\n \"Principal\": \"*\",\n \"Action\": \"sqs:SendMessage\",\n \"Resource\": \"arn:aws:sqs:us-east-1:095701894487:s3-sqs-cs-blog-test-us-east-1\",\n \"Condition\": {\n \"ArnEquals\": { \"aws:SourceArn\": \"arn:aws:s3:::cs-blog-test-us-east-1\" }\n }\n },\n {\n \"Effect\": \"Allow\",\n \"Principal\": {\n \"AWS\": [ \"arn:aws:iam::095701894487:role/cs_logging_server_side_role\" ]\n },\n \"Action\": \"sqs:*\",\n \"Resource\": \"arn:aws:sqs:us-east-1:095701894487:s3-sqs-cs-blog-test-us-east-1\"\n }\n ]\n}\n"
  queue_url: "" => "https://sqs.us-east-1.amazonaws.com/095701894487/s3-sqs-cs-blog-test-us-east-1"
module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs: Creation complete after 1s (ID: https://sqs.us-east-1.amazonaws.com/095701894487/s3-sqs-cs-blog-test-us-east-1)
module.s3_customer_buckets.aws_s3_bucket_notification.cs_data_bucket_notification: Creating...
  bucket: "" => "cs-blog-test-us-east-1"
  queue.#: "" => "1"
  queue.0.events.#: "" => "1"
  queue.0.events.3356830603: "" => "s3:ObjectCreated:*"
  queue.0.id: "" => ""
  queue.0.queue_arn: "" => "arn:aws:sqs:us-east-1:095701894487:s3-sqs-cs-blog-test-us-east-1"
module.s3_customer_buckets.aws_s3_bucket_notification.cs_data_bucket_notification: Creation complete after 1s (ID: cs-blog-test-us-east-1)

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
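Before heading to the CHAOSSEARCH console, you'll want that role ARN handy. A couple of ways you might grab it, assuming the role name shown in the plan output above:

# Show the role as Terraform recorded it in state
terraform state show module.s3_customer_buckets.aws_iam_role.cs_logging_server_side_role

# Or ask IAM directly for just the ARN
aws iam get-role --role-name cs_logging_server_side_role --query 'Role.Arn' --output text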
When you first log into CHAOSSEARCH, it asks you for the role to use to access your buckets. After putting in my role ARN (arn:aws:iam::095701894487:role/cs_logging_server_side_role) I got a happy checkbox. This checkbox will appear red unless the role can list your buckets and some of the objects in them.
And here you can see the newly created bucket, visible in the CHAOSSEARCH GUI:
Testing the indexing itself is a very CHAOSSEARCH-specific problem, so I'm not going to focus on it; the regular CHAOSSEARCH docs cover it much better than I could in a blog post. I'm going to call this part a victory and move on to cleanup.
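That said, one quick sanity check that doesn't involve CHAOSSEARCH at all is to drop a test object into the bucket: the s3:ObjectCreated notification should land on the SQS queue we just created, which is what LiveIndexing listens for. The file name here is only a placeholder:

# Upload any sample file; the bucket notification fires for the new object
aws s3 cp sample-logs.json s3://YOUR_BUCKET_NAME/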
Terraform is great at standing up new infrastructure, as well as tearing it down. Sometimes it's too efficient at tearing things down (see: outages caused by applying Terraform changes to the wrong environment). The way this module is written, it will not delete the bucket if there is data in it. You'll have to manually empty it out using some of the standard AWS techniques.
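For example, the AWS CLI can clear out the current objects. Note that this module turns on versioning, so older object versions may need to be removed separately (for instance via the empty-bucket action in the S3 console):

# Delete all current objects from the bucket before running terraform destroy
aws s3 rm s3://YOUR_BUCKET_NAME --recursive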
Once your bucket is empty, you can run a destroy. It will prompt you to say yes before anything is deleted.
terraform destroy
aws_iam_role.cs_logging_server_side_role: Refreshing state... (ID: cs_logging_server_side_role)
aws_kms_key.cs_data_bucket_key: Refreshing state... (ID: db03b87b-b69b-4709-85da-0a6fc6fee6bf)
aws_sqs_queue.cs_s3_bucket_sqs: Refreshing state... (ID: https://sqs.us-east-1.amazonaws.com/095701894487/s3-sqs-cs-blog-test-us-east-1)
aws_s3_bucket.cs_data_bucket: Refreshing state... (ID: cs-blog-test-us-east-1)
aws_kms_alias.cs_data_bucket_key: Refreshing state... (ID: alias/cs_cs-blog-test_us-east-1)
aws_sqs_queue_policy.cs_s3_bucket_sqs: Refreshing state... (ID: https://sqs.us-east-1.amazonaws.com/095701894487/s3-sqs-cs-blog-test-us-east-1)
aws_iam_policy.cs_logging_server_side_role_policy: Refreshing state... (ID: arn:aws:iam::095701894487:policy/cs_logging_server_side_role_policy)
aws_s3_bucket_notification.cs_data_bucket_notification: Refreshing state... (ID: cs-blog-test-us-east-1)
aws_iam_role_policy_attachment.cs_logging_server_side_role_policy_attach: Refreshing state... (ID: cs_logging_server_side_role-20190523162829178700000001)

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  - module.s3_customer_buckets.aws_iam_policy.cs_logging_server_side_role_policy
  - module.s3_customer_buckets.aws_iam_role.cs_logging_server_side_role
  - module.s3_customer_buckets.aws_iam_role_policy_attachment.cs_logging_server_side_role_policy_attach
  - module.s3_customer_buckets.aws_kms_alias.cs_data_bucket_key
  - module.s3_customer_buckets.aws_kms_key.cs_data_bucket_key
  - module.s3_customer_buckets.aws_s3_bucket.cs_data_bucket
  - module.s3_customer_buckets.aws_s3_bucket_notification.cs_data_bucket_notification
  - module.s3_customer_buckets.aws_sqs_queue.cs_s3_bucket_sqs
  - module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs

Plan: 0 to add, 0 to change, 9 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.
Here you need to type yes to destroy the resources.
  Enter a value: yes

module.s3_customer_buckets.aws_s3_bucket_notification.cs_data_bucket_notification: Destroying... (ID: cs-blog-test-us-east-1)
module.s3_customer_buckets.aws_iam_role_policy_attachment.cs_logging_server_side_role_policy_attach: Destroying... (ID: cs_logging_server_side_role-20190523162829178700000001)
module.s3_customer_buckets.aws_kms_alias.cs_data_bucket_key: Destroying... (ID: alias/cs_cs-blog-test_us-east-1)
module.s3_customer_buckets.aws_kms_alias.cs_data_bucket_key: Destruction complete after 0s
module.s3_customer_buckets.aws_iam_role_policy_attachment.cs_logging_server_side_role_policy_attach: Destruction complete after 0s
module.s3_customer_buckets.aws_iam_policy.cs_logging_server_side_role_policy: Destroying... (ID: arn:aws:iam::095701894487:policy/cs_logging_server_side_role_policy)
module.s3_customer_buckets.aws_s3_bucket_notification.cs_data_bucket_notification: Destruction complete after 0s
module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs: Destroying... (ID: https://sqs.us-east-1.amazonaws.com/095701894487/s3-sqs-cs-blog-test-us-east-1)
module.s3_customer_buckets.aws_sqs_queue_policy.cs_s3_bucket_sqs: Destruction complete after 0s
module.s3_customer_buckets.aws_sqs_queue.cs_s3_bucket_sqs: Destroying... (ID: https://sqs.us-east-1.amazonaws.com/095701894487/s3-sqs-cs-blog-test-us-east-1)
module.s3_customer_buckets.aws_iam_role.cs_logging_server_side_role: Destroying... (ID: cs_logging_server_side_role)
module.s3_customer_buckets.aws_iam_policy.cs_logging_server_side_role_policy: Destruction complete after 0s
module.s3_customer_buckets.aws_s3_bucket.cs_data_bucket: Destroying... (ID: cs-blog-test-us-east-1)
module.s3_customer_buckets.aws_sqs_queue.cs_s3_bucket_sqs: Destruction complete after 0s
module.s3_customer_buckets.aws_iam_role.cs_logging_server_side_role: Destruction complete after 1s
module.s3_customer_buckets.aws_s3_bucket.cs_data_bucket: Destruction complete after 1s
module.s3_customer_buckets.aws_kms_key.cs_data_bucket_key: Destroying... (ID: db03b87b-b69b-4709-85da-0a6fc6fee6bf)
module.s3_customer_buckets.aws_kms_key.cs_data_bucket_key: Still destroying... (ID: db03b87b-b69b-4709-85da-0a6fc6fee6bf, 10s elapsed)
module.s3_customer_buckets.aws_kms_key.cs_data_bucket_key: Still destroying... (ID: db03b87b-b69b-4709-85da-0a6fc6fee6bf, 20s elapsed)
module.s3_customer_buckets.aws_kms_key.cs_data_bucket_key: Destruction complete after 20s

Destroy complete! Resources: 9 destroyed.
All cleaned up. If we poke around in the CHAOSSEARCH console now, we'll start getting errors (we deleted all the roles, queues, and buckets), and we'll be redirected to enter a valid role ARN.
There you have it, a 20-minute adventure. A new bucket you can use to test CHAOSSEARCH with, all created, used, and removed. We should be expanding our offering of Terraform and CloudFormation templates for users, so keep an eye out for future blog posts. Do you have any requests?
Hit me up: @platformpatrick