Chef – Ohai in AWS EC2 VPC

This is a quick tip to those of you who are using Chef inside an AWS VPC. The EC2 Ohai plugin does not run by default, which prevents some meaningful node attributes from being collected.

The EC2-specific node attributes I find most useful are:

node['ec2']['instance_id'] # => Instance's ID
node['ec2']['local_ipv4'] # => Instance's IPv4 Address
node['ec2']['placement_availability_zone'] # => Instance's Region & Availability Zone
node['ec2']['ami_id'] # => Instance's Baseline AMI

To get your instances inside a VPC to pick up meaningful node attributes related to EC2, you have to create an Ohai hint file for the EC2 plugin. To do so, simply throw this into your initial bootstrap.

mkdir -p /etc/chef/ohai/hints && touch ${_}/ec2.json

Make sure you don’t do that blindly on non-EC2 instances, as it will significantly increase the execution time of Ohai.  You might want to wrap this in an if statement, and use something like the example below.

if [[ $(dmidecode | grep -i amazon) ]] ; then
 mkdir -p /etc/chef/ohai/hints && touch ${_}/ec2.json 
fi
Advertisements

AWS – Globally Adjusting ELB SSL Policy

A while back, I had to adjust the policy of all Elastic Load Balancers in my organization to disable SSLv3 due to the POODLE exploit. This can be an error-prone task if done by hand, especially if your architecture spans multiple regions and/or more than a handful of ELBs. The nice thing about cloud architecture, is that nearly everything can be automated and/or scripted.  For that reason, I went ahead and wrote a PowerShell script to handle this.

Most other write-ups I have seen do not take into account Stickiness policies, which are also applied to listeners.  If you run the Set-ELBLoadBalancerPolicyOfListener cmdlet with only an SSL policy applied, it will remove any other existing listener policies.  It is important to check the ELB for other policies, and make sure they are reapplied.  There is logic in this script that handles that.

It is important to note that this script makes use of Amazon’s template SSL Negotation Policies, but could be adapted to make use of your own.

As of 4/9/15, you cannot simply set the ELB policy to a newer reference policy, although AWS documentation states otherwise.  For this reason, a new policy must be created which references the AWS Reference-Security-Policy of choice.  You can retrieve a list of available reference policies with the Get-ELBLoadBalancerPolicy cmdlet.

Code also available on my GitHub

# AWS Global ELB SSL Policy
# Brian Dwyer - Intelligent Digital Services - 4/5/15

# Variables
$PolicyName="SSL-POLICY--$(Get-Date -Format yy-MM-ddTHHmmss)"
$ELBReferencePolicy='ELBSecurityPolicy-2015-03'

# Dependencies
Import-Module AWSPowerShell



Write-Host 'Finding AWS Regions containing ELBs...'

$RegionsWithELBs = @{}

ForEach ( $region in (Get-EC2Region).RegionName )
{
    $ELB_Count = (Get-ELBLoadBalancer -Region $region).count
    if ( $ELB_Count -ge 1 )
    {
        $RegionsWithELBs.Add($region, $ELB_Count)
    }
}

# Display ELB Regions & Count
$tformat = @{Expression={$_.Name};Label="Region"}, @{Expression={$_.Value};Label="ELB Count"}
$RegionsWithELBs.GetEnumerator() | Sort-Object Value -Descending | Format-Table $tformat -AutoSize




ForEach ( $region in $RegionsWithELBs.Keys )
{
    # Verify reference policy existence in region
    if ( (Get-ELBLoadBalancerPolicy -Region $region).PolicyName -contains $ELBReferencePolicy )
    {
        Write-Host "`nModifying ELBs in region: '$region' `n"

        # Loop through the ELBs
        ForEach ( $lb in (Get-ELBLoadBalancer -Region $region ).LoadBalancerName )
        {
            # Verify ELB serves HTTPS
            if ( (Get-ELBLoadBalancer -Region $region -LoadBalancerName $lb).ListenerDescriptions.Listener.Protocol -contains 'HTTPS' )
            {

                # Find Existing Policies (App/Cookie Stickiness, etc.)

                $PoliciesToApply = @($PolicyName)

                ForEach ( $currentpolicy in ((Get-ELBLoadBalancer -Region $region -LoadBalancerName $lb).ListenerDescriptions | Where-Object { $_.Listener.Protocol -contains 'HTTPS'}).PolicyNames )
                {
                    if ( (Get-ELBLoadBalancerPolicy -Region $region -LoadBalancerName $lb -PolicyName $currentpolicy).PolicyTypeName -ne 'SSLNegotiationPolicyType' )
                    {
                        $PoliciesToApply += @($currentpolicy)
                    }
                }

                # Configure SSL Policy
                Write-Host "`nCreating '$PolicyName' from '$ELBReferencePolicy' for $lb"
                New-ELBLoadBalancerPolicy -Region $region -LoadBalancerName $lb -PolicyName $PolicyName `
                  -PolicyTypeName SSLNegotiationPolicyType `
                  -PolicyAttribute @{ AttributeName="Reference-Security-Policy";AttributeValue="$ELBReferencePolicy"} `
                  -Force
                Write-Host "Activating policy '$PolicyName' for ELB: $lb"
                Set-ELBLoadBalancerPolicyOfListener -Region "$region" -LoadBalancerName "$lb" -LoadBalancerPort 443 -PolicyName $PoliciesToApply

                # Cleanup Old Policies
                ForEach ($policy in (Get-ELBLoadBalancerPolicy -Region "$region" -LoadBalancerName "$lb" | Where-Object {$_.PolicyTypeName -eq 'SSLNegotiationPolicyType'}).PolicyName)
                {
                    if ( $policy -ne $PolicyName -and $policy -ne $ELBReferencePolicy )
                    {
                        Write-Host "Removing old policy '$policy' from ELB: $lb"
                        Remove-ELBLoadBalancerPolicy -Region "$region" -LoadBalancerName "$lb" -PolicyName $policy -Force
                    }
                }
            }
        }
    }
    Else
    {
        Write-Host "Region $region does not contain policy $ELBReferencePolicy"
    }
}

AWS – Highly-Available NAT in VPC

arch_nat_active

Like most sysadmins, one of my primary responsibilities is ensuring high-availability in our environments. Recently, I’ve been working a lot more with Amazon AWS. Amazon recently began forcing new accounts to make use of VPC. When you create a VPC, an Internet Gateway must be provisioned to route traffic to the Internet. VPC’s utilize subnet constructs for virtual networking. Subnets are assigned a routing table, and in the case of a Public subnet, the default route of this table is pointed at the Internet Gateway. Instances in this public subnet are assigned public, non-RFC1918 Elastic IP addresses. At the moment, only 5 Elastic IP addresses may be requested per account. You can request more via support, but obviously they are trying to ween people away from using them for everything. Consequently, NAT & supporting instances must be in place to facilitate external communication for non-public subnets.

In the case of these subnets, the default-route should be pointed at a NAT instance residing in the Public subnet. This brings about a single point of failure. Should the NAT instance go down, nothing in that subnet can speak to the outside world; the default-route becomes a black-hole. In order to combat this, multiple NAT instances can be provisioned in different availability zones, and with a little magic, configured to take over each others traffic-routing responsibilities on-demand.

Amazon has furnished a document with a workaround for this situation. Essentially, a script running on each NAT instance performs a health check on the other NAT instance, and should the other instance go down, the healthy instance will take over. It does so by adjusting the routing tables via AWS API calls. The script will also attempt to bring the failed instance back online.

UPDATE: The NAT Monitor script outlined by Amazon has a flaw. The ec2-describe-instances call to determine the state of the other NAT instance does not function properly. The documentation references using $5 instead of $4 to set the NAT_STATE variable, however I have found $6 to work best, but test this because your EC2-API-tools version might yield different results. I also highly suggest the –show-empty-fields argument because if the number of fields changes, the awk statement could potentially grab the incorrect field.

NAT_STATE=`/opt/aws/bin/ec2-describe-instances $NAT_ID -U $EC2_URL --show-empty-fields | grep INSTANCE | awk '{print $6;}'`

There is one issue with the configuration outlined by the Amazon document; the IAM roles permissions are too loose. Using the policy defined in the document, the NAT instance is granted permissions to restart every instance belonging to the account. Additionally, the NAT instance could modify any and all routing tables, such as those in other regions, VPC's, etc. You probably don't want your NAT instances in US-West-2 making any modifications whatsoever to US-East-1. The below policy is an attempt to restrict permissions as best as permitted by supported IAM policy conditions. Just substitute/replace the region and VPC information with your own. Also, tag the NAT instances with a 'Type' and 'VPC' field, setting 'Type' to 'NAT' and 'VPC' with the VPC's ID.

Restricted IAM Policy

{
   "Statement":[
      {
         "Sid":"DescribeStuff",
         "Action":[
            "ec2:DescribeInstances"
         ],
         "Effect":"Allow",
         "Resource":"*",
         "Condition":{
            "StringLike":{
               "ec2:Region":"us-west-2",
               "ec2:ResourceTag/VPC":"vpc-abcd1234"
            }
         }
      },
      {
         "Sid":"RoutingTableAccess",
         "Action":[
            "ec2:CreateRoute",
            "ec2:ReplaceRoute"
         ],
         "Effect":"Allow",
         "Resource":"*",
         "Condition":{
            "StringEquals":{
               "ec2:Region":"us-west-2"
            }
         }
      },
      {
         "Sid":"NATInstanceControl",
         "Action":[
            "ec2:StartInstances",
            "ec2:StopInstances"
         ],
         "Effect":"Allow",
         "Resource":"arn:aws:ec2:us-west-2:*",
         "Condition":{
            "StringLike":{
               "ec2:ResourceTag/Type":"NAT",
               "ec2:ResourceTag/VPC":"vpc-abcd1234"
            }
         }
      }
   ]
}