DigitalOcean, Chef and Ohai – Retrieving a Droplet’s Private IP Address

Recently, I attempted to use the Ohai values node['cloud_v2']['local_ipv4'] and node['cloud']['local_ipv4']['ip_address'] to determine the private IP address of my cloud-based nodes in a Chef cookbook. Unfortunately, these values are no longer accurate for DigitalOcean droplets.

According to DigitalOcean's documentation, when Private Networking is enabled the private IP is assigned to eth1. Recently, however, I noticed that a second private IP address is now also being assigned to the eth0 interface. This causes Ohai to populate node['cloud_v2']['local_ipv4'] and node['cloud']['local_ipv4']['ip_address'] with the eth0 secondary (private) address instead.

As you can see below, there is a second private IP address assigned to eth0. I believe this has something to do with the recent release of Floating IP Addresses.

bdwyertech@dummy-droplet:~$ cat /etc/network/interfaces
# This file describes the network interfaces available on your
# system and how to activate them. For more information, see
# interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth1 eth0
      iface eth0 inet static
      address 123.234.123.234
      netmask 255.255.255.0
      gateway 123.234.123.1
      up ip addr add 10.13.0.123/16 dev eth0
      dns-nameservers 8.8.8.8 8.8.4.4
iface eth1 inet static
      address 10.128.123.123
      netmask 255.255.0.0

Initially, I just wrote a simple function that detects the IP address on eth1 via the node hash whenever Ohai identifies DigitalOcean as the cloud provider. However, querying the DigitalOcean metadata service seems to be the more robust solution.
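For reference, here is a minimal sketch of that first approach. It simply walks Ohai's node['network']['interfaces'] data for eth1 and returns the first IPv4 address it finds; the helper name, the provider string check, and the assumption that the private address still lives on eth1 are all mine:

# Rough sketch: pull the private IPv4 address from eth1 via the Ohai node hash.
# Assumes Private Networking is enabled and the private address is on eth1.
def digitalocean_private_ipv4(node)
  return unless node['cloud'] && node['cloud']['provider'] == 'digital_ocean'
  eth1 = node['network']['interfaces']['eth1']
  return unless eth1 && eth1['addresses']
  # The addresses hash is keyed by IP; grab the first IPv4 ('inet') entry
  match = eth1['addresses'].find { |_ip, data| data['family'] == 'inet' }
  match && match.first
end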

DigitalOcean has released a metadata service, similar to AWS, where you can query http://169.254.169.254/metadata/{API_VERSION} for droplet information. DigitalOcean conveniently allows you to query the droplet’s metadata in its entirety and have it returned in JSON format. I’ve gone ahead and written a simple library to query this data and bring it into Ruby as a hash.

# DigitalOcean Metadata Chef Library
# rubocop:disable LineLength

require 'json'
require 'net/http'

# Public: This defines a module to retrieve Metadata from DigitalOcean
module DoMetadata
  DO_METADATA_ADDR = '169.254.169.254' unless defined?(DO_METADATA_ADDR)
  DO_SUPPORTED_VERSIONS = %w( v1 )
  DO_DEFAULT_API_VERSION = 'v1'

  def self.http_client
    Net::HTTP.start(DO_METADATA_ADDR).tap { |h| h.read_timeout = 600 }
  end

  # Get metadata for a given path and API version
  def metadata_get(id, api_version = DO_DEFAULT_API_VERSION, json = false)
    path = "/metadata/#{api_version}/#{id}"
    path = "/metadata/#{api_version}.json" if json
    response = http_client.get(path)
    case response.code
    when '200'
      response.body
    when '404'
      Chef::Log.info("Encountered 404 response retrieving DO metadata path: #{path} ; continuing.")
      nil
    else
      fail "Encountered error retrieving DO metadata (#{path} returned #{response.code} response)"
    end
  end
  module_function :metadata_get

  # Retrieve the JSON metadata, and return it as a Ruby hash
  def parse_json_metadata(api_version = DO_DEFAULT_API_VERSION)
    retrieved_metadata = metadata_get(nil, api_version, true)
    JSON.parse(retrieved_metadata) if retrieved_metadata
  end
  module_function :parse_json_metadata
end

Code also available on my GitHub.

This conveniently allows you to query the resulting Ruby hash and use it in your code.

# => Get the Droplet's Metadata
metadata = DoMetadata.parse_json_metadata

metadata['interfaces']['private'][0]['ipv4']['ip_address'] # => Droplet's Private IP Address
metadata['interfaces']['private'][0]['ipv4']['netmask'] # => Droplet's Private Subnet Mask
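Inside a recipe, the private address can then be consumed directly. For example (the attribute name here is just an illustration):

# Hypothetical recipe usage: stash the droplet's private IP in a node attribute
metadata = DoMetadata.parse_json_metadata
if metadata
  node.default['private_ipv4'] = metadata['interfaces']['private'][0]['ipv4']['ip_address']
end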

Chef – Ohai in AWS EC2 VPC

This is a quick tip for those of you who are using Chef inside an AWS VPC. The EC2 Ohai plugin does not run by default there, which prevents some meaningful node attributes from being collected.

The EC2-specific node attributes I find most useful are:

node['ec2']['instance_id'] # => Instance's ID
node['ec2']['local_ipv4'] # => Instance's IPv4 Address
node['ec2']['placement_availability_zone'] # => Instance's Region & Availability Zone
node['ec2']['ami_id'] # => Instance's Baseline AMI

To get your instances inside a VPC to pick up meaningful node attributes related to EC2, you have to create an Ohai hint file for the EC2 plugin. To do so, simply throw this into your initial bootstrap.

mkdir -p /etc/chef/ohai/hints && touch ${_}/ec2.json

Make sure you don’t do that blindly on non-EC2 instances: the hint causes Ohai to query the EC2 metadata endpoint, and waiting for that request to time out will significantly increase Ohai's execution time. You might want to wrap this in an if statement and use something like the example below.

if [[ $(dmidecode | grep -i amazon) ]] ; then
 mkdir -p /etc/chef/ohai/hints && touch ${_}/ec2.json 
fi
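If you'd rather handle this from within Chef instead of the bootstrap script, a rough equivalent using plain directory and file resources (guarded the same way) might look like the following. Note that the hint only takes effect the next time Ohai runs:

# Sketch: create the EC2 Ohai hint file from a recipe instead of the bootstrap
directory '/etc/chef/ohai/hints' do
  recursive true
  only_if 'dmidecode | grep -qi amazon'
end

file '/etc/chef/ohai/hints/ec2.json' do
  content '{}'
  only_if 'dmidecode | grep -qi amazon'
end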

AWS – Globally Adjusting ELB SSL Policy

A while back, I had to adjust the policy of all Elastic Load Balancers in my organization to disable SSLv3 due to the POODLE exploit. This can be an error-prone task if done by hand, especially if your architecture spans multiple regions and/or more than a handful of ELBs. The nice thing about cloud architecture is that nearly everything can be automated and/or scripted. For that reason, I went ahead and wrote a PowerShell script to handle this.

Most other write-ups I have seen do not take into account Stickiness policies, which are also applied to listeners. If you run the Set-ELBLoadBalancerPolicyOfListener cmdlet with only an SSL policy specified, it will remove any other existing listener policies. It is important to check the ELB for other policies and make sure they are reapplied; there is logic in this script that handles that.
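For example, you can see which policies are currently attached to an ELB's HTTPS listener like so (the region and load balancer name are placeholders):

# List the policies currently attached to an ELB's HTTPS listener
$listener = (Get-ELBLoadBalancer -Region us-east-1 -LoadBalancerName my-elb).ListenerDescriptions | Where-Object { $_.Listener.Protocol -eq 'HTTPS' }
$listener.PolicyNames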

It is important to note that this script makes use of Amazon’s template SSL Negotiation Policies, but it could be adapted to make use of your own.

As of 4/9/15, you cannot simply set the ELB policy to a newer reference policy, although AWS documentation states otherwise. For this reason, a new policy must be created that references the AWS Reference-Security-Policy of choice. You can retrieve a list of available reference policies with the Get-ELBLoadBalancerPolicy cmdlet.
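For instance, something along these lines should list the predefined SSL negotiation policies (the region is just an example):

# List Amazon's predefined SSL negotiation reference policies
(Get-ELBLoadBalancerPolicy -Region us-east-1 | Where-Object { $_.PolicyTypeName -eq 'SSLNegotiationPolicyType' }).PolicyName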

Code also available on my GitHub.

# AWS Global ELB SSL Policy
# Brian Dwyer - Intelligent Digital Services - 4/5/15

# Variables
$PolicyName="SSL-POLICY--$(Get-Date -Format yy-MM-ddTHHmmss)"
$ELBReferencePolicy='ELBSecurityPolicy-2015-03'

# Dependencies
Import-Module AWSPowerShell



Write-Host 'Finding AWS Regions containing ELBs...'

$RegionsWithELBs = @{}

ForEach ( $region in (Get-EC2Region).RegionName )
{
    $ELB_Count = (Get-ELBLoadBalancer -Region $region).count
    if ( $ELB_Count -ge 1 )
    {
        $RegionsWithELBs.Add($region, $ELB_Count)
    }
}

# Display ELB Regions & Count
$tformat = @{Expression={$_.Name};Label="Region"}, @{Expression={$_.Value};Label="ELB Count"}
$RegionsWithELBs.GetEnumerator() | Sort-Object Value -Descending | Format-Table $tformat -AutoSize




ForEach ( $region in $RegionsWithELBs.Keys )
{
    # Verify reference policy existence in region
    if ( (Get-ELBLoadBalancerPolicy -Region $region).PolicyName -contains $ELBReferencePolicy )
    {
        Write-Host "`nModifying ELBs in region: '$region' `n"

        # Loop through the ELBs
        ForEach ( $lb in (Get-ELBLoadBalancer -Region $region ).LoadBalancerName )
        {
            # Verify ELB serves HTTPS
            if ( (Get-ELBLoadBalancer -Region $region -LoadBalancerName $lb).ListenerDescriptions.Listener.Protocol -contains 'HTTPS' )
            {

                # Find Existing Policies (App/Cookie Stickiness, etc.)

                $PoliciesToApply = @($PolicyName)

                ForEach ( $currentpolicy in ((Get-ELBLoadBalancer -Region $region -LoadBalancerName $lb).ListenerDescriptions | Where-Object { $_.Listener.Protocol -contains 'HTTPS'}).PolicyNames )
                {
                    if ( (Get-ELBLoadBalancerPolicy -Region $region -LoadBalancerName $lb -PolicyName $currentpolicy).PolicyTypeName -ne 'SSLNegotiationPolicyType' )
                    {
                        $PoliciesToApply += @($currentpolicy)
                    }
                }

                # Configure SSL Policy
                Write-Host "`nCreating '$PolicyName' from '$ELBReferencePolicy' for $lb"
                New-ELBLoadBalancerPolicy -Region $region -LoadBalancerName $lb -PolicyName $PolicyName `
                  -PolicyTypeName SSLNegotiationPolicyType `
                  -PolicyAttribute @{ AttributeName="Reference-Security-Policy";AttributeValue="$ELBReferencePolicy"} `
                  -Force
                Write-Host "Activating policy '$PolicyName' for ELB: $lb"
                Set-ELBLoadBalancerPolicyOfListener -Region "$region" -LoadBalancerName "$lb" -LoadBalancerPort 443 -PolicyName $PoliciesToApply

                # Cleanup Old Policies
                ForEach ($policy in (Get-ELBLoadBalancerPolicy -Region "$region" -LoadBalancerName "$lb" | Where-Object {$_.PolicyTypeName -eq 'SSLNegotiationPolicyType'}).PolicyName)
                {
                    if ( $policy -ne $PolicyName -and $policy -ne $ELBReferencePolicy )
                    {
                        Write-Host "Removing old policy '$policy' from ELB: $lb"
                        Remove-ELBLoadBalancerPolicy -Region "$region" -LoadBalancerName "$lb" -PolicyName $policy -Force
                    }
                }
            }
        }
    }
    Else
    {
        Write-Host "Region $region does not contain policy $ELBReferencePolicy"
    }
}

AWS – Highly-Available NAT in VPC

[Diagram: highly-available NAT architecture with the active NAT instance]

Like most sysadmins, one of my primary responsibilities is ensuring high availability in our environments. Recently, I’ve been working a lot more with Amazon AWS. Amazon recently began forcing new accounts to make use of VPC. When you create a VPC, an Internet Gateway must be provisioned to route traffic to the Internet. VPCs utilize subnet constructs for virtual networking. Each subnet is assigned a routing table, and in the case of a public subnet, the default route of this table is pointed at the Internet Gateway. Instances in this public subnet are assigned public, non-RFC1918 Elastic IP addresses. At the moment, only 5 Elastic IP addresses may be requested per account. You can request more via support, but obviously Amazon is trying to wean people away from using them for everything. Consequently, NAT and supporting instances must be in place to facilitate external communication for non-public subnets.

In the case of these subnets, the default route should be pointed at a NAT instance residing in the public subnet. This introduces a single point of failure: should the NAT instance go down, nothing in that subnet can speak to the outside world, and the default route becomes a black hole. To combat this, multiple NAT instances can be provisioned in different availability zones and, with a little magic, configured to take over each other’s traffic-routing responsibilities on demand.

Amazon has furnished a document with a workaround for this situation. Essentially, a script running on each NAT instance performs a health check on the other NAT instance, and should the other instance go down, the healthy instance will take over. It does so by adjusting the routing tables via AWS API calls. The script will also attempt to bring the failed instance back online.
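The heart of that takeover is a single ReplaceRoute call made with the EC2 API tools. A stripped-down sketch of the idea (variable names loosely modeled on Amazon's NAT monitor script; values and paths are assumptions) looks roughly like this:

# Sketch of the takeover step: point the other subnet's default route at this instance.
# NAT_RT_ID = route table served by the failed NAT, Instance_ID = this healthy NAT,
# NAT_ID = the failed NAT instance, EC2_URL = the regional API endpoint.
/opt/aws/bin/ec2-replace-route $NAT_RT_ID -r 0.0.0.0/0 -i $Instance_ID -U $EC2_URL
# Then attempt to recover the failed instance with a stop/start cycle
/opt/aws/bin/ec2-stop-instances $NAT_ID -U $EC2_URL
/opt/aws/bin/ec2-start-instances $NAT_ID -U $EC2_URL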

UPDATE: The NAT Monitor script outlined by Amazon has a flaw. The ec2-describe-instances call used to determine the state of the other NAT instance does not function properly. The documentation references using $5 (rather than $4) to set the NAT_STATE variable; however, I have found $6 to work best. Test this yourself, because your EC2 API tools version might yield different results. I also highly suggest the --show-empty-fields argument, because if the number of fields changes, the awk statement could potentially grab the incorrect field.

NAT_STATE=`/opt/aws/bin/ec2-describe-instances $NAT_ID -U $EC2_URL --show-empty-fields | grep INSTANCE | awk '{print $6;}'`

There is one issue with the configuration outlined in the Amazon document: the IAM role's permissions are too loose. Using the policy defined in the document, the NAT instance is granted permission to restart every instance belonging to the account. Additionally, the NAT instance could modify any and all routing tables, such as those in other regions, VPCs, etc. You probably don't want your NAT instances in us-west-2 making any modifications whatsoever to us-east-1. The policy below is an attempt to restrict permissions as tightly as supported IAM policy conditions allow. Just substitute the region and VPC information with your own. Also, tag the NAT instances with 'Type' and 'VPC' keys, setting 'Type' to 'NAT' and 'VPC' to the VPC's ID.

Restricted IAM Policy

{
   "Statement":[
      {
         "Sid":"DescribeStuff",
         "Action":[
            "ec2:DescribeInstances"
         ],
         "Effect":"Allow",
         "Resource":"*",
         "Condition":{
            "StringLike":{
               "ec2:Region":"us-west-2",
               "ec2:ResourceTag/VPC":"vpc-abcd1234"
            }
         }
      },
      {
         "Sid":"RoutingTableAccess",
         "Action":[
            "ec2:CreateRoute",
            "ec2:ReplaceRoute"
         ],
         "Effect":"Allow",
         "Resource":"*",
         "Condition":{
            "StringEquals":{
               "ec2:Region":"us-west-2"
            }
         }
      },
      {
         "Sid":"NATInstanceControl",
         "Action":[
            "ec2:StartInstances",
            "ec2:StopInstances"
         ],
         "Effect":"Allow",
         "Resource":"arn:aws:ec2:us-west-2:*",
         "Condition":{
            "StringLike":{
               "ec2:ResourceTag/Type":"NAT",
               "ec2:ResourceTag/VPC":"vpc-abcd1234"
            }
         }
      }
   ]
}