Switch Security – DHCP Snooping, IP Source Guard and Dynamic ARP Inspection

A few years back, I was designing an Enterprise MDM strategy for iPhone deployments. At some point, I was browsing the Apple Enterprise iOS forum and saw a post about an iOS bug where devices sometimes retain a previously assigned IP (from another network) when joining a new network. In this case, someone joined a network with a retained IP that coincided with the organization’s e-mail server. When this occurred, it apparently disabled e-mail service throughout the organization. From a network engineering standpoint, that sounds impossible if the network was designed correctly. I tried to think of how it could possibly happen, and the only explanation I could come up with is that the e-mail server resided on the same subnet as the clients and the client was responding to ARP requests for the e-mail server’s IP… Sounds like a giant mess.

At the same time, I was studying for the CCNP SWITCH exam and reading about DHCP Snooping, IP Source Guard and Dynamic ARP Inspection. Although these features were intended to mitigate spoofing attacks, they could also help prevent this supposed iOS bug from affecting network services. Whether malicious or accidental, spoofing is usually not a good thing.

DHCP Snooping

To better understand DHCP Snooping’s purpose, it is important to understand what is possible without it. Under normal circumstances, an attacker could easily stand up a rogue DHCP server. When other clients in the same VLAN first plug in and broadcast a DHCP Discover, they would receive, and might accept, a DHCP Offer sent by this rogue DHCP server. The rogue server might set the client’s default gateway to the attacker’s IP, causing all traffic destined for a foreign subnet to be sent to the attacker: a successful Man-in-the-Middle attack.

DHCP Snooping gives a Cisco switch the ability to control where DHCP server replies can come from. Any DHCP server traffic, such as an Offer, ACK or NAK, is only permitted on a trusted port. On untrusted ports, only client messages such as DHCP Requests are permitted. By default, all ports are untrusted, so trusts must be configured manually. Basically, a trusted port is one where a legitimate DHCP server resides or where legitimate DHCP replies would be received, so that traffic should be permitted. If DHCP server traffic arrives on an untrusted port, the switch drops it; the port itself is only placed into the err-disabled state if a configured DHCP rate limit is exceeded.
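The trust decision can be modelled in a few lines of Ruby. This is only a toy sketch of the logic described above, not how a switch implements it; the method name and message symbols are made up for illustration, and the port names are borrowed from the lab later in this post.

```ruby
# Toy model of DHCP Snooping's trust check (illustrative only, not switch code).
SERVER_MESSAGES = %i[offer ack nak].freeze

# Returns :forward or :drop for a DHCP message arriving on a port.
def snooping_verdict(message_type, port, trusted_ports)
  # Trusted ports may carry any DHCP message.
  return :forward if trusted_ports.include?(port)
  # Untrusted ports may only carry client messages.
  SERVER_MESSAGES.include?(message_type) ? :drop : :forward
end

trusted = ['fa0/48', 'po1']
puts snooping_verdict(:offer, 'fa0/48', trusted)    # server reply on a trusted uplink
puts snooping_verdict(:offer, 'fa0/34', trusted)    # rogue server on an access port
puts snooping_verdict(:discover, 'fa0/34', trusted) # ordinary client message
```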

Typically, the trust boundary lies where end-users connect, therefore DHCP Snooping should be enabled at the access layer. Trusts should be configured only on links where a DHCP Reply would be expected such as uplinks to the distribution layer or the port where a local DHCP server resides.

In addition to permitting/denying DHCP server traffic, DHCP Snooping also keeps track of when clients receive a successful DHCP binding. It records information such as the IP address assignment, the lease time, and the requester’s MAC address as well as the port on which the request was received. This information is then stored and can be utilized by other technologies such as IPSG and DAI.

Dynamic ARP Inspection

Dynamic ARP Inspection provides a method to protect the integrity of layer-2 ARP transactions. DAI leverages the DHCP Snooping database to validate the integrity of ARP traffic. ARP is used when a host knows an IP address and needs to determine the corresponding MAC address. As an example, if a client sends an ARP request for the default gateway, an attacker could potentially reply to the request with its own MAC address, causing the client to send traffic destined for the default gateway IP to the attacker. This type of attack is known as ARP Spoofing/Poisoning, and DAI can potentially mitigate its threat by validating the integrity of the ARP traffic against known good bindings.

Like DHCP Snooping, DAI also utilizes the concept of trusted ports. On trusted ports, no ARP Inspection is performed. By default, all ports are considered untrusted, therefore trusted ports must be manually configured. Like DHCP Snooping, the trust boundary should lie at host-connected ports, therefore DAI should be enabled at the access layer. Trusts should only be configured on links to other switches and the distribution layer.

DAI inspects ARP traffic and verifies the source MAC and IP against the known trusted values in the Snooping database. For example, if the database binds the MAC 1111.2222.3333 to 10.1.1.1, the host connected to this port will only be allowed to respond to ARP Requests for 10.1.1.1. ARP Replies found to be invalid are dropped and logged. Hosts using static IPs must be manually permitted via an ARP ACL. This ACL must then be added to a DAI filter in order for DAI to recognize the bindings. Optionally, DAI can be configured to ignore the DHCP Snooping database and use only the ARP ACL for increased security. DAI is enabled on a per-VLAN basis.
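As a rough illustration of that validation, here is a simplified Ruby model. It assumes a binding is matched on MAC, IP, VLAN and port together; the struct, method name and the binding values are placeholders rather than anything read from a real switch.

```ruby
# Toy model of Dynamic ARP Inspection (illustrative only, not switch code).
Binding = Struct.new(:mac, :ip, :vlan, :port)

# Placeholder binding, standing in for a DHCP Snooping database entry.
BINDINGS = [Binding.new('1111.2222.3333', '10.1.1.1', 1, 'fa0/34')].freeze

# ARP on a trusted port is never inspected. On an untrusted port it is
# permitted only if the sender MAC/IP pair matches a binding for that port.
def arp_permitted?(sender_mac, sender_ip, vlan, port, trusted_ports: [])
  return true if trusted_ports.include?(port)
  BINDINGS.any? do |b|
    b.mac == sender_mac && b.ip == sender_ip && b.vlan == vlan && b.port == port
  end
end

puts arp_permitted?('1111.2222.3333', '10.1.1.1', 1, 'fa0/34')   # matches the binding
puts arp_permitted?('1111.2222.3333', '10.1.1.254', 1, 'fa0/34') # spoofed sender IP
```

An ARP ACL for static hosts would simply be a second list consulted in the same way.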

IP Source Guard

IP Source Guard can help ensure that hosts use only their assigned IP addresses. IPSG can leverage the information in the DHCP Snooping database to dynamically create Port ACLs (PACLs) permitting only Layer-3 IP traffic whose source matches the port-IP binding in the DHCP Snooping binding database. This prevents a host from transmitting with a source IP other than the one it was assigned via DHCP. The IPSG PACL also includes a VLAN binding, ensuring that the IP can only be used on its respective VLAN. IPSG can optionally verify the source MAC as well as the source IP for added protection by leveraging the port-security feature. When enabled, this ensures that the source MAC is associated with the source IP in the Snooping database. For statically configured hosts, a static binding must be manually configured to permit traffic flow. IP Source Guard is enabled on a per-port basis.
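The per-port filtering idea can be sketched in Ruby as follows. This is a simplified model under the assumption that a filter is just a list of permitted VLAN/IP pairs per port; the method names are hypothetical, and port fa0/35 is invented for the last check.

```ruby
# Toy model of IP Source Guard's per-port filter (illustrative only).
# Builds port => [[vlan, ip], ...] from a list of binding hashes.
def build_port_filters(bindings)
  bindings.each_with_object({}) do |b, filters|
    (filters[b[:port]] ||= []) << [b[:vlan], b[:ip]]
  end
end

# A packet is forwarded only if its VLAN and source IP match an entry for the
# ingress port. A port with no entries drops everything (deny-all).
def ipsg_forward?(filters, port, vlan, source_ip)
  filters.fetch(port, []).include?([vlan, source_ip])
end

filters = build_port_filters([{ ip: '10.0.0.40', vlan: 1, port: 'fa0/34' }])
puts ipsg_forward?(filters, 'fa0/34', 1, '10.0.0.40')  # bound address
puts ipsg_forward?(filters, 'fa0/34', 1, '10.0.0.111') # unbound static address
puts ipsg_forward?(filters, 'fa0/35', 1, '10.0.0.40')  # port with no binding
```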

Lab Scenario

image

Configuring DHCP Snooping

First, we will enable and configure DHCP Snooping since it makes our lives easier with Dynamic ARP Inspection and IP Source Guard. In addition, we will configure trusts for the interfaces on which a DHCP reply is expected. I have set up DHCP servers in a split-scope fashion on both the 2821 and the 1841. This means that the paths to both the 1841 and 2821 must be trusted by DHCP Snooping, meaning fa0/48 on both switches as well as their interconnecting port-channel po1, which is an LACP-negotiated EtherChannel.

Remember, configuration of a port-channel interface automatically does the same on all port-channel member interfaces, in this case fa0/1 and fa0/2. It is not necessary to manually enter the trust configuration on individual ports which are members of a port-channel.

Above, you can see that DHCP snooping must be enabled globally and then on a per-VLAN basis. In this case, things are simple and there is only one VLAN. Multiple VLANs can be configured concurrently by using a VLAN list such as 1,5,7-10.

Verification of the DHCP Snooping Configuration

Now I will connect a client and record what happens using “debug ip dhcp snooping event”. The client is not getting an IP address. As you can see below, the DHCP packet is being received, relay information is being added, and then it is being sent out other active ports.

image

Seeing only the switch side, our understanding of the situation is precarious at best. In the final step, we can see that the packet is in fact being sent out, so let’s check what is getting to the DHCP server.

image

Aha! Debugging is a beautiful feature. From here, we can see that there is a problem with the relay information. By default, DHCP Snooping adds relay information, or DHCP Option 82. My understanding from the above output is that the switch intercepts DHCP Requests and injects its relay information into them prior to forwarding them on. The problem here seems to be that it leaves the “giaddr” field, which would normally hold the DHCP Relay’s IP address, set to a null value. Under normal circumstances, a DHCP Relay is used when a subnet does not locally contain a DHCP server which could hear the layer-2 broadcast DHCP Requests. The relay listens for DHCP Requests and forwards them past the layer-2 boundary to a configured DHCP server. When enabling DHCP snooping on a VLAN which has a directly connected DHCP server, the relay process is not necessary, and the combination of Option 82 with a null giaddr confuses the DHCP server, which thinks the request came from a relay rather than directly from the requester.

By default, the relay function is enabled. It is turned off by disabling the information option.

Now let’s see what happens.

image

As you can see, the client now received its IP, and the switch recorded the transaction. It now knows the MAC, IP, Lease time, VLAN and Source Interface of the transaction. This information can now be leveraged by Dynamic ARP Inspection and IP Source Guard to help protect our network.

Configuring IP Source Guard

IP Source Guard leverages the DHCP snooping database in a powerful way. It dynamically creates Port ACLs based on the information in the DHCP Snooping database. It references the IP address and the associated interface to create a PACL which permits only traffic from the trusted IP. In this example, IP Source Guard creates a PACL permitting only traffic with a source address of 10.0.0.40 to enter port fa0/34; all other traffic is dropped.

Enable IPSG (Per-Interface)

We can also see what the dynamic PACL contains below.

With ip verify source port-security enabled, the result is the following:

As you can see above, the mac-address field shows permit-all. IPSG relies upon Port-Security to enforce the MAC validation. In order for it to do this, port-security must be enabled on the port using switchport port-security. After adding this command, the result is the following:

As you can see, IP Source Guard is now validating both MAC and IP. It utilizes a PACL to enforce the IP, and Port-Security for MAC. It is important to note that the only reason this worked is because I turned it on AFTER the host already had a MAC and IP binding present in the DHCP Snooping database. You should probably leave the port-security function of IPSG disabled unless you have manually configured bindings. When enabled and a binding is not present, IPSG’s port-security function defaults to a deny-all MAC function making it impossible for the switch to dynamically learn MAC bindings such as via the switchport mac-address sticky function. I would suggest you configure port-security separately unless you have a specific need for this functionality.

Let’s see what happens if a host plugs into a port with IPSG enabled and tries to set a static IP. As seen below, if a host plugs in and no static binding or DHCP Snooping entry exists, a deny-all PACL and a deny-all MAC port-security function are configured on the port.

This configuration makes it impossible for any communication to take place on the port without manually configuring a MAC for port security. All host traffic will be dropped at the switchport because of the deny-all rules.

IPSG & Static IPs

In the case of a host with a static IP, a static source binding must be configured, binding a MAC to a specific VLAN and IP on a specific interface. For example:

image

The result of configuring the static binding entry is shown below.

The ip verify source port-security function plays much nicer with static bindings. Since IPSG is configured on a port-by-port basis, utilize “ip verify source” for DHCP ports and save “ip verify source port-security” functionality for ports with statically configured hosts. This function more or less facilitates automatic port-security MAC configuration so you do not have to enter it in two places if you wish to use the feature.

Configuring Dynamic ARP Inspection

Dynamic ARP Inspection is easy to configure when running DHCP Snooping. DAI inspects ingress ARP Replies and verifies the source-address against the DHCP Snooping database. In this case, DAI would inspect ARP Replies on fa0/34 and ensure that the source information matches the MAC and IP in the DHCP snooping database.

First, let’s remove the static binding from the earlier IPSG config and disable the port-security portion of its configuration, since I want to demonstrate DAI using the DHCP Snooping database.

image

Enable Dynamic ARP Inspection on the appropriate VLAN(s), in this case, VLAN 1.

Configure trust on our links to other switches, in this case, po1.

Now, let’s see the results. What happens when someone sets a static IP? Below I have set the IP of a host to 10.0.0.111. When I brought it online, it could not even resolve the MAC address of the default gateway, 10.0.0.1!

When the host attempts to perform an ARP request, DAI blocks the unknown binding since it cannot validate the source address.

image

Let’s pretend that this static host binding is valid and we need to allow it. In order to do so, we must create an ARP ACL and add it to the DAI filter.

The console cut off the rest of the line. The resulting ACL is shown below.

Now that we have created the ARP ACL, we must add it to the DAI filter.

The static keyword can be added at the end of this line to force DAI to use only the ARP ACL and not utilize the DHCP Snooping bindings. Be careful!

At this point, the statically configured host should be able to successfully perform an ARP Request for the default gateway. Let’s see.

Now, let’s see what happens if we change the IP address of the host to an IP different from the ARP ACL entry.

Success! The host is blocked again. The switch is blocking its communication.

image

Finally, we can verify DAI operation. As seen below, the ACL Match shows that our STATIC_ARP_ACL is being utilized alongside the DHCP Snooping database for VLAN 1. If the filter was configured to utilize only the ACL and not the DHCP Snooping database, the Static ACL field would say Yes.

Summary

After seeing all of these features in action, it should be obvious that using DHCP to assign addresses makes implementation a lot easier. The DHCP Snooping database facilitates the collection of information required for the function of Dynamic ARP Inspection and IP Source Guard. Remember to configure DHCP Snooping Trust on interfaces where a valid DHCP Reply may be received, otherwise the DHCP Server traffic will be blocked.

Remember to configure interface trusts on ports extending to other switches for Dynamic ARP Inspection. Also, remember to create entries in an ARP ACL and apply it to the DAI filter if you are running hosts with static IPs.

If you want to use IP Source Guard in combination with static IP hosts, remember to create static IP source bindings for each host; otherwise the default PACL will deny all traffic.

Remember that IP Source Guard’s port-security integration feature requires port-security to be enabled on the port, otherwise it uses a permit-all function. Additionally, remember that this integration feature is easier used with static host entries than DHCP hosts, unless I missed something. This is no big deal because IPSG is implemented on a per-port basis anyway.

Important: When implementing Dynamic ARP Inspection to secure Wireless networks, make sure to set a minimum of 90 minute DHCP lease time… Wireless devices in standby mode behave erratically with regard to DHCP lease renewal, especially Apple devices. When first implementing DHCP-Snooping and DAI, definitely keep an eye on your switch logs to mitigate mass end-user complaints.

DigitalOcean, Chef and Ohai – Retrieving a Droplet’s Private IP Address

Recently, I attempted to use the Ohai value for node['cloud_v2']['local_ipv4'] and node['cloud']['local_ipv4']['ip_address'] to determine the Private IP address of my Cloud-based nodes in a Chef cookbook.  Unfortunately, it does not work accurately for DigitalOcean instances any longer.

According to DigitalOcean documentation, if Private Networking is enabled, the IP will be assigned to eth1.  Recently, I noticed that a second Private IP address has begun to be assigned to the eth0 interface.  This is/was causing Ohai to assign the eth0 secondary (private) IP address to node['cloud_v2']['local_ipv4'] and node['cloud']['local_ipv4']['ip_address'].

As you can see below, there is a second, private IP address assigned to eth0. I believe this has something to do with the recent release of Floating IP Addresses.

bdwyertech@dummy-droplet:~$ cat /etc/network/interfaces
# This file describes the network interfaces available on your
# system and how to activate them. For more information, see
# interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth1 eth0
      iface eth0 inet static
      address 123.234.123.234
      netmask 255.255.255.0
      gateway 123.234.123.1
      up ip addr add 10.13.0.123/16 dev eth0
      dns-nameservers 8.8.8.8 8.8.4.4
iface eth1 inet static
      address 10.128.123.123
      netmask 255.255.0.0

Initially, I just wrote a simple function to detect the IP address on eth1 via the node hash if DigitalOcean is detected by Ohai as the cloud provider. However, querying DigitalOcean metadata seems to be the more robust solution.
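The original eth1 function is not shown here, so the following is only a guess at its general shape. It assumes the usual Ohai layout in which each interface carries an 'addresses' hash keyed by address, with a 'family' of 'inet' for IPv4. The method name is hypothetical, and the sample hash (including the MAC) is made up to mirror the interfaces file above.

```ruby
# Sketch: return the first IPv4 address on an interface from Ohai-style
# network data. In a recipe, `network` would be node['network'].
def interface_ipv4(network, iface = 'eth1')
  interfaces = network['interfaces'] || {}
  addresses = (interfaces[iface] || {})['addresses'] || {}
  # Each entry is [address, details]; keep the first one flagged as IPv4.
  match = addresses.find { |_addr, details| details['family'] == 'inet' }
  match && match.first
end

# Hypothetical sample data for illustration only.
sample = {
  'interfaces' => {
    'eth1' => {
      'addresses' => {
        '04:01:AA:BB:CC:01' => { 'family' => 'lladdr' },
        '10.128.123.123' => { 'family' => 'inet', 'netmask' => '255.255.0.0' }
      }
    }
  }
}
puts interface_ipv4(sample)
```

The method returns nil when the interface or an IPv4 address is missing, so a caller can fall back to another source such as the metadata service.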

DigitalOcean has released a metadata service, similar to AWS, where you can query http://169.254.169.254/metadata/{API_VERSION} for droplet information.  DigitalOcean conveniently allows you to query the droplet’s metadata in its entirety and return it in JSON format.  I’ve gone ahead and written a simple library to query this data and bring it into Ruby as a hash.

# DigitalOcean Metadata Chef Library
# rubocop:disable LineLength

require 'json'
require 'net/http'

# Public: This defines a module to retrieve Metadata from DigitalOcean
module DoMetadata
  DO_METADATA_ADDR = '169.254.169.254' unless defined?(DO_METADATA_ADDR)
  DO_SUPPORTED_VERSIONS = %w( v1 )
  DO_DEFAULT_API_VERSION = 'v1'

  def self.http_client
    Net::HTTP.start(DO_METADATA_ADDR).tap { |h| h.read_timeout = 600 }
  end

  # Get metadata for a given path and API version
  def metadata_get(id, api_version = DO_DEFAULT_API_VERSION, json = false)
    path = "/metadata/#{api_version}/#{id}"
    path = "/metadata/#{api_version}.json" if json
    response = http_client.get(path)
    case response.code
    when '200'
      response.body
    when '404'
      Chef::Log.info("Encountered 404 response retrieving DO metadata path: #{path} ; continuing.")
      nil
    else
      fail "Encountered error retrieving DO metadata (#{path} returned #{response.code} response)"
    end
  end
  module_function :metadata_get

  # Retrieve the JSON metadata, and return it as a Ruby hash
  def parse_json_metadata(api_version = DO_DEFAULT_API_VERSION)
    retrieved_metadata = metadata_get(nil, api_version, true)
    return unless retrieved_metadata
    JSON.parse(retrieved_metadata)
  end
  module_function :parse_json_metadata
end

Code also available on my Github.

This conveniently allows you to query the resulting Ruby hash, and use it in your code.

# => Get the Droplet's Metadata
metadata = DoMetadata.parse_json_metadata

metadata['interfaces']['private'][0]['ipv4']['ip_address'] # => Droplet's Private IP Address
metadata['interfaces']['private'][0]['ipv4']['netmask'] # => Droplet's Private Subnet Mask

VLAN Security – VLAN Access Control Lists (VACLs)

I’ve been studying to renew my CCNP recently, and I decided to create a refresher blog post about the implementation of VACLs.

VLAN Access Control Lists (VACLs) can be used to implement Access Control at both Layer 2 and Layer 3. Typically, access lists are applied to ingress or egress traffic on a routed, L3 interface. VACLs allow you to apply the filtering to all packets in a VLAN, regardless of direction.

VACLs are created and applied in a similar manner to route-maps and policy-based routing, in the sense that you create a VLAN access-map, and then apply the VLAN access-map to VLANs with a filter statement. Let’s see an example.

I have pre-created VLAN 123 with an L3 interface address of 10.123.123.1/24. I have joined a port to this VLAN, and connected a PC with an IP address of 10.123.123.124/24.

IP-based VACL

For a Layer 3, IP-based VACL, we must first create a regular ACL. This ACL will contain either the IPs to permit or the IPs to block. Remember that by default, an implicit ‘deny all’ is in place, so unless you explicitly permit traffic, the packets will be denied.  The implicit ‘deny all’ can be counteracted with an explicit ‘permit all’ at the tail end of the ACL. In that case, only specifically denied traffic will be denied.

Create the Access List
vacl_1  

Create the VLAN Access Map
vacl_2

Confirm the VLAN Access Map Configuration
vacl_4

Apply the VLAN Access Map to specified VLAN(s)

Confirm the Application of the VLAN Access Map
vacl_5

Let’s test the configuration.  We have specifically allowed only 10.123.123.123 to communicate on the VLAN.

With the IP address set to 10.123.123.123/24, we can successfully ping the L3 interface of the VLAN.
vacl_6

With the IP address set to 10.123.123.124/24, we cannot.
vacl_7

We now have a functional implementation of a Layer 3 VLAN Access Control List!  Now, let’s delve into how similar functionality can be achieved at Layer 2.

MAC-based VACL

Layer 2 filtering simply involves substituting MAC Access Control Lists for IP Access Control Lists.  MAC ACLs are very similar to IP ACLs, and extended MAC ACLs can even make use of wildcard masks.  Wildcard masks might be used if you wanted to restrict traffic to a certain vendor’s MAC OUI. Under typical circumstances, the first 24 bits of a MAC address are known as the Organizationally Unique Identifier, and are assigned to vendors by the IEEE under ISO/IEC 8802 standards.  Anyway, on to the example.
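To make the wildcard idea concrete, here is a small Ruby sketch of how a wildcard mask selects an OUI. It assumes the usual Cisco convention that a 1 bit in the wildcard means "don't care"; the method names and the OUI 0011.22 are made up for illustration.

```ruby
# Sketch of wildcard-mask matching for MAC addresses in dotted notation.
def mac_to_i(mac)
  mac.delete('.:').to_i(16)
end

# Bits set to 1 in the wildcard are ignored; all other bits must match.
def mac_matches?(candidate, pattern, wildcard)
  care_bits = ~mac_to_i(wildcard) & 0xFFFFFFFFFFFF
  (mac_to_i(candidate) & care_bits) == (mac_to_i(pattern) & care_bits)
end

# Wildcarding the last 24 bits leaves only the OUI to be compared.
puts mac_matches?('0011.22ab.cdef', '0011.2200.0000', '0000.00ff.ffff')
puts mac_matches?('0011.23ab.cdef', '0011.2200.0000', '0000.00ff.ffff')
```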

First, clear out the existing VLAN access map with a no vlan access-map FILTER_NAME.

We need to make a MAC ACL.  In this case, I am specifically denying my laptop’s MAC address.  I have blurred out part of the MAC for security purposes.  I didn’t, but you’d probably want to add a permit any any statement if you only want to block specific MAC addresses.

Create the MAC Access Control List
vacl_8

Confirm the MAC ACL Creation

vacl_9

Create the VLAN Access Map
vacl_10

Confirm Application of VLAN Access-Map
vacl_12

vacl_13

Confirm Functionality of MAC-based VLAN Access-Map

vacl_7

There we have it!  We have successfully implemented both IP-based and MAC-based VACLs.

UPDATE: Branch Office Connections – GRE over IPSec VPN

image

It has been almost 3 years since I wrote Branch Office Connections – GRE over IPSec VPN!  For those of you who are familiar with Cisco certification, that means time to renew!  Anyway, coming back and looking at this lab, I noticed that we are preferring the serial link for traffic!  In this day and age, the serial link would likely be the fallback, and a faster, cheaper GRE/IPSec VPN would likely be the preferable route!  To accomplish this, we need to look at EIGRP and how it determines the best route.  EIGRP’s composite metric can take bandwidth, delay, reliability, and load into account (MTU is carried in updates but is not part of the calculation); with the default K values, only bandwidth and delay are used. Those are the two values we are going to play with here; hopefully your ISP is reliable, and not oversubscribed.  Changing bandwidth and delay on an interface changes the metric of routes through it, and in turn which route EIGRP prefers.  These values do not affect the actual throughput or latency of the link; they exist purely to influence routing decisions, as we are doing here.

tun0_bw

When I fired this lab back up, I found that the default bandwidth on the Tunnel interface was 9kbps! Actually, I used bandwidth inherit incorrectly; this configuration does not cause Tunnel0 to inherit the bandwidth of the FastEthernet interface.  Bandwidth inherit would be used on sub-interfaces, to inherit the bandwidth of the primary interface. A tunnel is not a sub-interface of the transmitting/receiving PHY’s. Anyway, we should make sure we declare a higher bandwidth and lower delay than the Serial backup on these Tunnel interfaces.

Cisco uses tens of microseconds as the unit of measure for delay. 1 millisecond = 1000 microseconds, so 1 millisecond = 100 tens of microseconds.  So let’s say you have a 50 millisecond ping between your branch office and HQ; you would use a delay value of 5000.  FastEthernet’s default bandwidth is 100,000kbps, so we’ll use that for our tunnel bandwidth.
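The arithmetic can be checked with a short Ruby sketch. It uses the classic EIGRP composite metric with default K values and treats each path as a single link, which ignores the other hops in this lab. The serial figures are assumed T1 defaults (1544 kbps, 20,000 microseconds), not values read from the lab routers, and the method names are made up.

```ruby
# Convert a latency in milliseconds to IOS delay units (tens of microseconds).
def delay_units(milliseconds)
  (milliseconds * 100).round
end

# Classic EIGRP composite metric with default K values (K1 = K3 = 1):
# 256 * (10^7 / slowest bandwidth in kbps + total delay in tens of microseconds)
def eigrp_metric(min_bandwidth_kbps, total_delay_units)
  256 * (10_000_000 / min_bandwidth_kbps + total_delay_units)
end

tunnel = eigrp_metric(100_000, delay_units(50)) # 100,000 kbps, 50 ms
serial = eigrp_metric(1_544, 2_000)             # assumed T1 defaults
puts "tunnel=#{tunnel} serial=#{serial}"
```

Under these assumptions the tunnel ends up with the lower metric, which is what makes EIGRP prefer it; a real path metric uses the lowest bandwidth and the summed delay of every hop along the path.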

On R4:

R4_bw_delay

R4’s Routing Table – Before & After

Before

R4_before

After

R4_after

At this point, our Branch Office is routing egress traffic destined for the Enterprise Core over the GRE/IPSec tunnel.  Let’s go ahead and see what the routing table looks like on the other side, on R2.

R2_before

Aha!  The Enterprise Core is still routing traffic destined for the Branch Office over the Serial link.  Why is this? Well, R2 still doesn’t realize that the Tunnel interface has a higher bandwidth and lower latency.  Let’s go ahead and fix that.

R2_bw_delay

After

R2_after

Bingo.  We’ve fixed it all up.  It is critical that any interface metric manipulations are performed on both sides of the link, otherwise the result is a non-symmetrical traffic pattern, which may result in confusion and difficult troubleshooting weeks, months, or years down the road.

AWS – Highly-Available NAT in VPC

arch_nat_active

Like most sysadmins, one of my primary responsibilities is ensuring high-availability in our environments. Recently, I’ve been working a lot more with Amazon AWS. Amazon recently began forcing new accounts to make use of VPC. When you create a VPC, an Internet Gateway must be provisioned to route traffic to the Internet. VPCs utilize subnet constructs for virtual networking. Subnets are assigned a routing table, and in the case of a Public subnet, the default route of this table is pointed at the Internet Gateway. Instances in this public subnet are assigned public, non-RFC1918 Elastic IP addresses. At the moment, only 5 Elastic IP addresses may be requested per account. You can request more via support, but obviously they are trying to wean people away from using them for everything. Consequently, NAT & supporting instances must be in place to facilitate external communication for non-public subnets.

In the case of these subnets, the default-route should be pointed at a NAT instance residing in the Public subnet. This brings about a single point of failure. Should the NAT instance go down, nothing in that subnet can speak to the outside world; the default-route becomes a black-hole. In order to combat this, multiple NAT instances can be provisioned in different availability zones, and with a little magic, configured to take over each other’s traffic-routing responsibilities on-demand.

Amazon has furnished a document with a workaround for this situation. Essentially, a script running on each NAT instance performs a health check on the other NAT instance, and should the other instance go down, the healthy instance will take over. It does so by adjusting the routing tables via AWS API calls. The script will also attempt to bring the failed instance back online.

UPDATE: The NAT Monitor script outlined by Amazon has a flaw. The ec2-describe-instances call to determine the state of the other NAT instance does not function properly. The documentation references using $5 instead of $4 to set the NAT_STATE variable; however, I have found $6 to work best. Test this, because your EC2 API tools version might yield different results. I also highly suggest the --show-empty-fields argument, because if the number of fields changes, the awk statement could grab the wrong field.

NAT_STATE=`/opt/aws/bin/ec2-describe-instances $NAT_ID -U $EC2_URL --show-empty-fields | grep INSTANCE | awk '{print $6;}'`

There is one issue with the configuration outlined by the Amazon document: the IAM role's permissions are too loose. Using the policy defined in the document, the NAT instance is granted permissions to restart every instance belonging to the account. Additionally, the NAT instance could modify any and all routing tables, such as those in other regions, VPCs, etc. You probably don't want your NAT instances in US-West-2 making any modifications whatsoever to US-East-1. The below policy is an attempt to restrict permissions as far as the supported IAM policy conditions permit. Just replace the region and VPC information with your own. Also, tag the NAT instances with a 'Type' and 'VPC' field, setting 'Type' to 'NAT' and 'VPC' to the VPC's ID.

Restricted IAM Policy

{
   "Statement":[
      {
         "Sid":"DescribeStuff",
         "Action":[
            "ec2:DescribeInstances"
         ],
         "Effect":"Allow",
         "Resource":"*",
         "Condition":{
            "StringLike":{
               "ec2:Region":"us-west-2",
               "ec2:ResourceTag/VPC":"vpc-abcd1234"
            }
         }
      },
      {
         "Sid":"RoutingTableAccess",
         "Action":[
            "ec2:CreateRoute",
            "ec2:ReplaceRoute"
         ],
         "Effect":"Allow",
         "Resource":"*",
         "Condition":{
            "StringEquals":{
               "ec2:Region":"us-west-2"
            }
         }
      },
      {
         "Sid":"NATInstanceControl",
         "Action":[
            "ec2:StartInstances",
            "ec2:StopInstances"
         ],
         "Effect":"Allow",
         "Resource":"arn:aws:ec2:us-west-2:*",
         "Condition":{
            "StringLike":{
               "ec2:ResourceTag/Type":"NAT",
               "ec2:ResourceTag/VPC":"vpc-abcd1234"
            }
         }
      }
   ]
}

Branch Office Connections – GRE over IPSec VPN

While studying for the CCNP Route exam, I noticed that GRE Tunneling and IPSec were mentioned as topics, however configuration of the two was never really covered in the certification guide.  In addition, I did not see any labs involving this type of scenario, so I decided to create my own.  Hopefully this provides some insight to others about how to configure GRE Tunnels over IPsec.  Don’t feel bad if this takes you a while, setting up this scenario and writing this up took a whole Saturday afternoon.  The configuration files are at the bottom of the post.

IPSec can secure data in transit and provides authentication that it came from a trusted source.

To route from the branch office to the enterprise core, you can use either static routes or a routing protocol. To secure traffic over the internet, IPSec can be used.  To utilize a routing protocol to connect securely to the core with IPsec, it must be combined with a GRE tunnel which supports the broadcasts/multicasts necessary for proper IGP operation.

GRE Tunnels

  • Act like a point-to-point link from a Layer 3 perspective.
  • Supports many passenger protocols, including IPv4.
  • Encapsulates/forwards broadcasts and multicasts, therefore supporting IPv4 IGP’s.
  • GRE tunnels can run through IPsec tunnels.

Implementing Branch Office GRE over IPSec VPN

Branch Office GRE/IPSec VPN (R2-R4) & Private Serial (R3-R4)

Here, we will utilize split-tunnel at the Branch Office to ensure that only traffic destined for the Core will be encrypted over the GRE IPSec tunnel. A default route is configured on R2 and R4 to ISP1. Additionally, ISP1 has been configured with a loopback to test split-tunnel functionality.

All routers (except ISP1 obviously) will participate in EIGRP AS 1. Each participating router shall be configured with the network 10.0.0.0 and no auto-summary commands.

Configure the GRE Tunnels on R2 and R4

The tunnel source is the internet-facing interface, in this case fa0/0 with the IP of 15.1.1.2 on R2 and fa0/1 with the IP of 15.1.2.2 on R4.
When associating a tunnel source with an interface, if the associated interface goes down, so does the tunnel.  If you have redundant connections, use a loopback as a source instead.

R2 – R4 GRE Tunnel Configuration

First, configure the GRE tunnel. After doing this, you should see your EIGRP neighbors come up.

In a real-world situation, you’d probably also want to configure the tunnel with Path MTU Discovery (PMTUD). This mitigates fragmentation issues that could cause your routers to fall back from CEF to process switching. Enable PMTUD at each tunnel endpoint with the tunnel path-mtu-discovery command.

R2
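
A minimal sketch of what R2's side of the tunnel might look like. The source interface and the destination address come from the text above, and Tunnel0 matches the interface name used later in this post. The 10.0.24.0/30 tunnel subnet is an assumption, chosen only so that the tunnel falls inside EIGRP network 10.0.0.0.

! Sketch only: the 10.0.24.x tunnel addressing is assumed, not taken from the lab
R2(config)#interface Tunnel0
R2(config-if)#ip address 10.0.24.1 255.255.255.252
! Internet-facing interface on R2 (15.1.1.2)
R2(config-if)#tunnel source FastEthernet0/0
! R4's Internet-facing address
R2(config-if)#tunnel destination 15.1.2.2
! Optional PMTUD, as discussed above
R2(config-if)#tunnel path-mtu-discovery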

R4
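
A matching sketch for R4, with source and destination reversed. As on R2, the 10.0.24.0/30 tunnel subnet is an assumption rather than a value from the lab. The interface and public addresses are the ones given above.

! Sketch only: the 10.0.24.x tunnel addressing is assumed, not taken from the lab
R4(config)#interface Tunnel0
R4(config-if)#ip address 10.0.24.2 255.255.255.252
! Internet-facing interface on R4 (15.1.2.2)
R4(config-if)#tunnel source FastEthernet0/1
! R2's Internet-facing address
R4(config-if)#tunnel destination 15.1.1.2
! Optional PMTUD, as discussed above
R4(config-if)#tunnel path-mtu-discovery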

R2 GRE Route-Map

Here, we will configure a route-map to prevent a default route from ever being advertised over the GRE tunnel. The only way for R4 to ever learn R2’s default route should be via the serial connection to R3, and it should only be utilized if R4’s fa0/0 interface goes down. We could also utilize this to block other networks from being advertised over the GRE tunnel by modifying the ACL.
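
One way to sketch this filter on R2 is shown below. It assumes the route-map is attached to EIGRP as an outbound distribute-list on Tunnel0, and that the IOS release supports route-map based distribute-lists for EIGRP. The names DEFAULT-ROUTE and NO-DEFAULT-OVER-GRE are hypothetical.

! Hypothetical names; the standard ACL matches only the default route 0.0.0.0
R2(config)#ip access-list standard DEFAULT-ROUTE
R2(config-std-nacl)#permit host 0.0.0.0
! Sequence 10 drops whatever the ACL matches from updates sent over the tunnel
R2(config)#route-map NO-DEFAULT-OVER-GRE deny 10
R2(config-route-map)#match ip address DEFAULT-ROUTE
! Sequence 20 has no match clause, so every other route is still advertised
R2(config)#route-map NO-DEFAULT-OVER-GRE permit 20
R2(config)#router eigrp 1
R2(config-router)#distribute-list route-map NO-DEFAULT-OVER-GRE out Tunnel0

To block additional networks from crossing the tunnel, add more permit lines to the ACL.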

Now you can see that R4 learns a backup default route only through R3. Previously, it also learned one through the tunnel, which would be silly to use and might break split-tunneling if the static default route were lost for some reason.

Encrypt GRE Tunnel traffic using IPSec

We will now encrypt traffic traversing the GRE tunnel using IPsec. We must first define an ACL to catch the traffic we wish to encrypt, which will be GRE traffic from our GRE source-interface destined for the GRE tunnel destination.

R2 IPsec Policy Creation

ACL (Catch GRE to R4)
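
A sketch of such an ACL, using the tunnel endpoint addresses given earlier. The name GRE-TO-R4 is hypothetical.

! Hypothetical ACL name; matches only GRE from R2's public IP to R4's public IP
R2(config)#ip access-list extended GRE-TO-R4
R2(config-ext-nacl)#permit gre host 15.1.1.2 host 15.1.2.2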

ISAKMP Policy
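
A minimal phase 1 policy might look like the following. The policy number, cipher, hash and Diffie-Hellman group are assumptions made for illustration. Pre-shared key authentication is assumed because an ISAKMP key is defined in the next step. Whatever values are used, they must match on both peers.

! Assumed phase 1 parameters, not taken from the lab; both peers must agree
R2(config)#crypto isakmp policy 10
R2(config-isakmp)#encryption aes
R2(config-isakmp)#hash sha
R2(config-isakmp)#authentication pre-share
R2(config-isakmp)#group 2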

ISAKMP Key & IPSec Transform Set (Transport Mode)
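
A sketch of this step, assuming a pre-shared key and an AES/SHA transform set. SECRETKEY and GRE-TRANSPORT are placeholders, and the ESP algorithms are illustrative choices rather than the ones used in the lab.

! SECRETKEY and GRE-TRANSPORT are placeholder names
R2(config)#crypto isakmp key SECRETKEY address 15.1.2.2
R2(config)#crypto ipsec transform-set GRE-TRANSPORT esp-aes esp-sha-hmac
! Transport mode, because the GRE and IPsec endpoints are the same addresses
R2(cfg-crypto-trans)#mode transport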

Keep in mind that transport mode is used here only because we are not traversing NAT: both the IPsec and GRE tunnels run from R2 to R4’s public IP. Tunnel mode is required if the GRE destination is different from the IPsec destination. If the endpoints were each behind a NAT router, the situation would require a combination of NAT-T and tunnel mode. Tunnel mode adds an extra 20 bytes to the overall packet size for the new outer IP header. NAT-T adds a further 8-byte UDP header per packet, encapsulating the IPsec traffic in UDP on port 4500.

Crypto Map Configuration

Here, we configure the Crypto Map to R4. We will use the previously created GRE catching ACL.
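
A sketch of the crypto map on R2. The map name VPN is the one referred to later in this post. The sequence number is an assumption, and GRE-TO-R4 and GRE-TRANSPORT are hypothetical names standing in for the GRE-catching ACL and the transport-mode transform set created in the previous steps.

! GRE-TO-R4 and GRE-TRANSPORT are hypothetical names for the earlier ACL and transform set
R2(config)#crypto map VPN 10 ipsec-isakmp
! R4's public IP is the IPsec peer
R2(config-crypto-map)#set peer 15.1.2.2
R2(config-crypto-map)#set transform-set GRE-TRANSPORT
R2(config-crypto-map)#match address GRE-TO-R4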

Apply Crypto Map to GRE Tunnel

The Crypto Map should be applied to the physical exit interface. Cisco recommends applying it to the logical tunnel interface as well.
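
On R2 this would look roughly as follows, applying the crypto map VPN to both interfaces as described:

! Physical exit interface toward the Internet
R2(config)#interface FastEthernet0/0
R2(config-if)#crypto map VPN
! Logical tunnel interface
R2(config-if)#interface Tunnel0
R2(config-if)#crypto map VPN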

R4 IPsec Policy Creation

We will follow the same procedure utilized on R2 and reverse the IP addresses.
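
Taken together, the mirrored R4 side might be sketched as below. The addresses, the exit interface and the map name VPN come from this post. The ACL name, transform set name, key, policy number and algorithms are assumptions, and the IKE and IPsec parameters must match whatever is configured on R2.

! Hypothetical names and assumed crypto parameters throughout
R4(config)#ip access-list extended GRE-TO-R2
R4(config-ext-nacl)#permit gre host 15.1.2.2 host 15.1.1.2
R4(config)#crypto isakmp policy 10
R4(config-isakmp)#encryption aes
R4(config-isakmp)#hash sha
R4(config-isakmp)#authentication pre-share
R4(config-isakmp)#group 2
R4(config)#crypto isakmp key SECRETKEY address 15.1.1.2
R4(config)#crypto ipsec transform-set GRE-TRANSPORT esp-aes esp-sha-hmac
R4(cfg-crypto-trans)#mode transport
R4(config)#crypto map VPN 10 ipsec-isakmp
R4(config-crypto-map)#set peer 15.1.1.2
R4(config-crypto-map)#set transform-set GRE-TRANSPORT
R4(config-crypto-map)#match address GRE-TO-R2
! Physical exit interface on R4, then the logical tunnel interface
R4(config)#interface FastEthernet0/1
R4(config-if)#crypto map VPN
R4(config-if)#interface Tunnel0
R4(config-if)#crypto map VPN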

ACL (Catch GRE to R2)

ISAKMP Policy

ISAKMP Key & IPSec Transform Set (Transport Mode)

Crypto Map Configuration

Here, we configure the Crypto Map to R2. We will use the previously created GRE catching ACL.

Apply Crypto Map to GRE Tunnel

Verify IPsec and Split-Tunnel Functionality

You can utilize Wireshark to verify only ESP traffic traverses between the GRE endpoints. I found that the Crypto Map did not need to be applied to the logical Tunnel 0 interface to function, but only the physical exit interface. I am not sure why Cisco recommends applying it to both.

You can see that the Crypto traffic for Tunnel 0 and FastEthernet 0/0 is identical. I am not sure if running the crypto map VPN on both the Tunnel 0 and FastEthernet 0/0 is causing excess encryption overhead, or if it only encrypts the traffic once.

After disabling Crypto Map VPN on the Tunnel 0 interface, I confirmed via Wireshark that the GRE traffic continued to be encrypted. A show crypto ipsec sa detail shows that the traffic continues to be encrypted as well.
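
The checks above can be repeated from either tunnel endpoint with the standard show commands. The commands are shown for R2 and the output is omitted, since it depends on the lab.

! Confirm that the IKE phase 1 SA with the peer exists
R2#show crypto isakmp sa
! The encrypt/decrypt packet counters should rise while traffic crosses the tunnel
R2#show crypto ipsec sa detail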

Verify Split-Tunnel

To test this, I will simply ping a simulated public internet address of 1.2.3.4 from the inside interface of R4.

Conclusion

We have implemented a remote branch office with redundant connectivity to the core: a leased line and a secured backup using GRE over IPsec via the public Internet. Our branch office is now able to participate in the EIGRP routing process over either link. If you wanted to use the IPsec VPN as your primary connection to HQ, modify the bandwidth value of the Tunnel 0 interfaces so that EIGRP prefers the tunnel over the serial route. By default, tunnel interfaces seem to be given a bandwidth value of only 9 kbps. The default delay value of tunnel interfaces is high as well. Tweak the bandwidth and delay values to achieve the desired results.
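
As a sketch of that tuning, the values below are examples only and would need adjusting against the serial link's metric in your own topology. Note that bandwidth is entered in kbps and delay in tens of microseconds. The same change would be made on both R2 and R4.

! Example values only, not taken from the lab
R2(config)#interface Tunnel0
! Bandwidth in kbps
R2(config-if)#bandwidth 1544
! Delay in tens of microseconds (1000 = 10 ms)
R2(config-if)#delay 1000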

Normal Conditions

Downed Serial Link to HQ

Configuration Files

ISP1 Configuration

R1 Configuration

R2 Configuration

R3 Configuration

R4 Configuration

Cisco IP SLA

While studying for CCNP ROUTE, I came across the concept of IP SLA and using probes to check the availability of next-hop addresses before installing them in a route-map. The objective was to list two ISPs in a route-map statement and send all traffic originating from the local router to the ISP1 next hop for as long as the IP SLA tracker reported it reachable.

Diagram

The Configuration (Cisco 2691 – IOS ADVENTERPRISE 12.4(15)T14)

Define IP SLA Policy
PolicyRouter(config)#ip sla 1
PolicyRouter(config-ip-sla)#icmp-echo 200.1.1.2 source-interface Serial0/0
PolicyRouter(config-ip-sla-echo)#timeout 1000
PolicyRouter(config-ip-sla-echo)#frequency 3

Start Probe
PolicyRouter(config)#ip sla schedule 1 start-time now life forever

Track Probe Status
PolicyRouter(config)#track 1 rtr 1 reachability

ACL
PolicyRouter(config)#ip access-list extended ROUTER
PolicyRouter(config-ext-nacl)#permit ip any any

Route Map
PolicyRouter(config)#route-map ROUTER-TRAFFIC 10
PolicyRouter(config-route-map)#match ip address ROUTER
PolicyRouter(config-route-map)#set ip next-hop verify-availability 200.1.1.2 10 track 1
PolicyRouter(config-route-map)#set ip next-hop 201.1.1.2

Apply Policy-Based Routing to Locally Generated Traffic
PolicyRouter(config)#ip local policy route-map ROUTER-TRAFFIC

Result

Route Map Status

Traceroute with Both ISPs Reachable

Traceroute with ISP1 Down

The result was good. The next-hop probe was working and traffic was being routed correctly. The fail-over to ISP2 upon failure of Serial0/0 also worked.

The Problem

When Serial0/0 comes back on-line, the PBR sticks with the route through ISP2.  After looking at the route map again after the failover and the restoration of Serial0/0, I saw that the tracker was still registering the next-hop as down.
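
Both the route-map and the tracker state can be inspected with the standard show commands. The output is omitted here.

! Shows each sequence, its match and set clauses, and whether the tracked next hop is up or down
PolicyRouter#show route-map ROUTER-TRAFFIC
! Shows the reachability state of the tracked IP SLA operation
PolicyRouter#show track 1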

I did some testing and noticed that pings to 200.1.1.2 would fail. This is because of the route-map applied to locally generated traffic. The locally generated ICMP echo probes were being given the route-map’s active next hop, which sent the probe incorrectly to ISP2. For the probe to do its job, its traffic must be allowed to travel to the correct next hop of ISP1 regardless of where other traffic is being sent.

Solution

To correct this issue, I created an ACL to catch the probe traffic and inserted a sequence above the existing statement in the route-map. Because this sequence has no set clause, matching probe traffic is forwarded by the normal routing table instead of the fail-over next-hop logic, so it heads for 200.1.1.2 over Serial0/0 whenever that link is up.

ACL
PolicyRouter(config)#ip access-list extended RTR_PING_S0/0
PolicyRouter(config-ext-nacl)#permit icmp host 200.1.1.1 host 200.1.1.2 echo

Route Map
PolicyRouter(config)#route-map ROUTER-TRAFFIC 5
PolicyRouter(config-route-map)#match ip address RTR_PING_S0/0

Result

This fix resulted in next-hop redundancy including preempting when the specified next-hop ISP comes back on-line.  This could be applied to a router with more ISP connections, so long as an ACL and route-map are created with respect to the necessary traffic-flow characteristics.

NOTE: Be very careful applying route-maps to locally-generated traffic.  You must be aware of the implications to router functionality.