Contrail Cloud Solution With MPLS-Over-UDP Overlays

Summary:

MPLS-OVER-UDP Tunnels are used on datacenter environment as overlays. Existing technologies (like MPLS-OVER-GRE) to encapsulate Multi-Protocol Label Switching(MPLS) over IP are not adequate for efficient load balancing of MPLS application traffic, such as Layer3 Virtual Private Network (L3VPN) traffic across IP networks. This document specifies IP-based encapsulation technology, referred to as MPLS-in-User Datagram Protocol (UDP), which can facilitate the load balancing of MPLS application traffic across IP networks. This document also gives details on how to enable MPLS-OVER-UDP encapsulation when MX router interop with contrail controller.

Description:

Figure 1 shows the frame formats of MPLS-OVER-GRE and MPLS-OVER-UDP.There no easy way to load share the traffic between 2 tunnel end points when MPLS-OVER-GRE encapsulation is used.The issue is GRE header is same for all the flows between the 2 Tunnel end points.

But when we use MPLS-OVER-UDP encapsulation the source port value(in the MPLS-OVER-UDP header) can be generated based on the certain fields in the customer packets(in our case based on the packets generated by VMs in the vRouter).With this the routers between the Tunnel end points can load share the packets using hash of five-tuple of UDP packets.This is one of reason why MPLS-OVER-UDP overlays preferred over MPLS-OVER-GRE overlays.

Figure 1: IP over MPLS over UDP Packet Format

Figure 2: IP over MPLS over GRE Packet Format

The picture given above shows the implementation of Contrail cloud solution. In this topology PE2 & PE3 will be acting as local Datacenter gateways.PE1 will be the remote Datacenter gateway.Between PE2/PE3 and compute nodes (or vRouters) MPLS-OVER-UDP overlays will be implemented.So all traffic between compute nodes to PE2/PE3 will be encapsulated with MPLS-OVER-UDP header.Contrail controllers will be managing the vRouters using XMPP.PE2 &PE3 will talk to Contrail controllers with BGP.The vRouters in contrail cloud will act similar to PE router in L3 VPN environment.When vRouter need to send traffic to vRouters in another datacenter it will encapulate packets with MPLS-OVER-UDP header and forward it to PE2 or PE3.PE2 or PE3 will decapsulate the MPLS-OVER-UDP header and then encapsulate it with MPLS headers and forward it to PE1.PE1 decapsulate MPLS header and forward the packets to the node in remote Datacenter.

When a VM in a vRouter need to send traffic to VM in another vRouter in the same datacenter, the source vRouter will encapsulate the packets with MPLS-OVER-UDP/MPLS-OVER-GRE header and then forward it to the IP fabric.After the destination vRouter receives the packet it will decapsulate the packets and forward it to the destination VM.

Configuration on MX gateway:

BGP Configuration :

BGP group contrail:

This section gives details on BGP configuration of PE2 & PE3.Both the PEs will have a similar configuration. BGP group “contrail” (given below) peers to contrail controller. In this configuration, Family route-target knob controls the advertisement of routes from PE2 & PE3 to contrail controller.It makes sure that PE2 & PE3 advertise only required routes based on the route-targets received from the contrail controller.

Contrail controller advertises the route targets(which is applied to the virtual Networks created in contrail controller) to PE2 & PE3.PE2 & PE3 advertise routes with matching route targets. Unmatched routes will not be advertised. This avoids unnecessary routes advertised to contrail controller. The knob external-paths 5 is required when you have 3 controllers in a HA environment.

Before advertising any routes to contrail controller the next-hop of the route (through policy from-remote-pe-vrf1) is changed to PE2 or PE3 loopback address.This avoids the need for the controller to learn remote PE1 loopback address.

This policy also adds encapsulation extended community to the remote datacenter route before advertising it to the contrail controller.This community tells the vRouter(compute nodes) to use MPLS-OVER-UDP encapsulation before it forwards traffic for this route.If this community is not advertised,  then vRouter will use the default encapsulation (which is MPLS-OVER-GRE).

The encapsulation community  is in the format of “members 0x030c:XXX:13”. This identifies it as opaque extended community (type 0x03) of sub-type encapsulation (0x0c).Administrator field is set to 0 since it’s reserved(But JUNOS do not allow 0 so you need to set it to some value recommended is AS NUMBER) and encap value is 13 (for MPLSoUDP).

The configuration given below is used in PE2 and PE3 routers.


{master}[edit]
regress@PE2# show protocols bgp group contrail     
type internal;
local-address 10.255.181.172;
family inet-vpn {
    any;
}
family inet6-vpn {
    any;
}
family route-target {
    external-paths 5;
    advertise-default;
}
export from-remote-pe-vrf1;
vpn-apply-export;
cluster 2.2.2.2;
neighbor 3.3.3.2;

{master}[edit]
regress@PE2# 

master}[edit]
regress@PE2# show policy-options policy-statement from-remote-pe-vrf1
term 1 {
    from {
          protocol bgp;
          route-filter 103.0.4.0/24 orlonger;
             }
    then {
        next-hop self;
        accept;
    }
}

{master}[edit]
regress@PE2# show policy-options community udp         
members 0x030c:64512:13;

{master}[edit]
regress@PE2# 

You can verify encapsulation status in vRouter(in compute node) as shown below.For this first find the tap interface.Then check the route using the vrf number.This output shows the type of encapsulation, source and destination IPs of the tunnel and other details.

root@vm6:~# vif --list
Vrouter Interface Table

Flags: P=Policy, X=Cross Connect, S=Service Chain, Mr=Receive Mirror
       Mt=Transmit Mirror, Tc=Transmit Checksum Offload, L3=Layer 3, L2=Layer 2
       D=DHCP, Vp=Vhost Physical, Pr=Promiscuous, Vnt=Native Vlan Tagged
       Mnp=No MAC Proxy, Dpdk=DPDK PMD Interface, Rfl=Receive Filtering Offload, Mon=Interface is Monitored
       Uuf=Unknown Unicast Flood, Vof=VLAN insert/strip offload

vif0/3      OS: tap6b9bb806-20
            Type:Virtual HWaddr:00:00:5e:00:01:00 IPaddr:0
            Vrf:1 Flags:PL3L2D MTU:9160 Ref:5
            RX packets:15533540  bytes:4717521394 errors:0
            TX packets:15659512  bytes:4721801850 errors:0


root@vm6:~# rt --dump 1 |grep 103.0.8.0/24
103.0.8.0/24           24            P          -             24        -
root@vm6:~# nh --get 24
Id:24         Type:Composite      Fmly: AF_INET  Rid:0  Ref_cnt:6          Vrf:1
              Flags:Valid, Policy, Ecmp, 
              Sub NH(label): 16(17) 12(17)


Id:12         Type:Tunnel         Fmly: AF_INET  Rid:0  Ref_cnt:2          Vrf:0
              Flags:Valid, MPLSoUDP, 
              Oif:0 Len:14 Flags Valid, MPLSoUDP,  Data:02 00 08 00 00 2b 52 54 00 23 13 39 08 00 
              Vrf:0  Sip:60.60.0.6  Dip:10.255.181.172

root@vm6:~# 

The picture given below shows the bgp configuration in Contrail controller.There are 2 nodes shown here.Router Type “BGP Router” shows the PE1 configuration details and Router Type “Control Node” details BGP configuration of Contrail Controller.

 

 

The output from MX router (given below) shows that bgp connection established between MX and Contrail Controller.

 

{master}[edit]
regress@PE2# run show bgp summary 
Groups: 4 Peers: 5 Down peers: 0
Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
bgp.rtarget.0        
                      17         13          0          0          0          0
bgp.l3vpn.0          
                      58         54          0          0          0          0
bgp.l3vpn.2          
                       0          0          0          0          0          0
bgp.l3vpn-inet6.0    
                       0          0          0          0          0          0
bgp.l3vpn-inet6.2    
                       0          0          0          0          0          0
Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
3.3.3.2               64512         40         56       0       0       11:15 Establ
  bgp.rtarget.0: 13/17/17/0
  bgp.l3vpn.0: 4/4/4/0
  bgp.l3vpn-inet6.0: 0/0/0/0
  vrf1.inet.0: 4/4/4/0

BGP Group core:

This BGP Group “core” peers to the remote PE(PE1) connected to remote data center. The policy “change-next” will change the next hop off routes advertised to remote PE1.

 

{master}[edit] 
regress@PE2# show protocols bgp group core 
type internal;
local-address 10.255.181.172;
family inet-vpn {
    any;
}
family inet6-vpn {
    any;
}
export change-next;
vpn-apply-export;
neighbor 10.255.178.174;

{master}[edit]
regress@PE2# 


{master}[edit]
regress@PE2# show policy-options policy-statement change-next 
Dec 07 14:27:33
term 1 {
    from protocol bgp;
  then {
        next-hop self;
        accept;
    }
}

{master}[edit]
regress@PE2#

Dynamic tunnel configuration:

This section explains Dynamic Tunnel Configuration between MX to Contrail Controller and MX to vrouter.

Dynamic Tunnel to contrail controller:

PE2 & PE3 needs to have dynamic tunnel (MPLS-OVER-UDP/MPLS-OVER-GRE) created to the contrail controller. This configuration will create route to the controller ip address in inet.3 table.Without this route MX will not advertise bgp.l3vpn.0 table routes to controller(when you have family route-target enabled on PE2 & PE3).This Tunnel status can be verified as shown below.This tunnel will be in Dn state as there are no routes received from controller with protocol next-hop of contrail controller IP address(in this case it is 3.3.3.2).This is expected as contrail controller will not be advertising any route with 3.3.3.2 as protocol next hop.

 

{master}[edit]
regress@PE2# show routing-options dynamic-tunnels to-controller       
source-address 10.255.181.172;
udp;
destination-networks {
    3.3.3.0/24;
}

{master}[edit]
regress@PE2# 


{master}[edit]
regress@PE2# run show dynamic-tunnels database terse 
Table: inet.3

Destination-network: 3.3.3.2/32
Destination         Source          Next-hop                  Type        Status
3.3.3.2/32          10.255.181.172  0x487e67c nhid 0          udp         Dn    nexthop not installed

master}[edit]
regress@PE2# run show route 3.3.3.0
Dec 07 13:53:43 

inet.0: 43 destinations, 44 routes (42 active, 0 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both

3.3.3.0/24         *[OSPF/150] 00:40:14, metric 0, tag 0
                    > to 1.0.1.1 via ae0.0

inet.3: 18 destinations, 31 routes (18 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

3.3.3.0/24         *[Tunnel/300] 00:37:18
                      Tunnel

{master}[edit]
regress@PE2# 

Dynamic Tunnels to vRouters:

PE2 & PE3 should create Dynamic TUNNELS to all the vROUTERs. In our case it is MPLS-OVER-UDP TUNNEL created between PE2 & PE3 to all vROUTERs.When PE2 or PE3 receives traffic from remote PE1 it uses this MPLS-OVER-UDP tunnel to forward the traffic to the respective vROUTER.The Tunnel status can be verified as given below.The Tunnel will come up only when a route received with the protocol nexthop in the segment 60.60.0.0/16(which is configured in destination-networks as shown below).In this case 60.60.0.0/16 is the IP segment used vrouters.

 

regress@PE2# show routing-options dynamic-tunnels 
contrail-udp {
    source-address 10.255.181.172;
    udp;
    destination-networks {
       60.60.0.0/16;
    }
}
{master}[edit]
regress@PE2# run show dynamic-tunnels database 
Table: inet.3
Destination-network: 60.60.0.6/32
Tunnel to: 60.60.0.6/32
  Reference count: 1
  Next-hop type: UDP
    Source address: 10.255.181.172 Tunnel Id: 1610612742
    Next hop: tunnel-composite, 0x4875634, nhid 710
      Reference count: 2
      State: Up


{master}[edit]
regress@PE2# run show dynamic-tunnels database terse 
Table: inet.3

Destination-network: 60.60.0.6/32
Destination         Source          Next-hop                  Type        Status
60.60.0.6/32        10.255.181.172  0x48793f4 nhid 741        udp         Up   

 

This is the Routing instance configuration is created on PE2 & PE3 to install the routes to vrf table.This vrf configuration is required as it is a L3VPN scenario.

 

{master}[edit]
regress@PE2# show routing-instances vrf1 
instance-type vrf;
interface lo0.1;
route-distinguisher 64512:1;
vrf-import test1-import;
vrf-export test1-export;
vrf-table-label;
}

{master}[edit]
regress@PE2# show policy-options policy-statement test1-export
term 1 {
    from protocol  direct ;
    then {
        community add testtarget1;
        accept;
    }
}

{master}[edit]
regress@PE2# 

regress@PE2# show policy-options policy-statement test1-import 
term 1 {
    from community testtarget1;
    then accept;
}

{master}[edit]
regress@PE2#

Remote datacenter PE configuration:

This configuration is applied on the remote PE router (in this case PE1).This is a simple L3VPN configuration.

[edit]
regress@PE1# show routing-instances vrfs1 
Dec 06 13:21:03
instance-type vrf;
interface xe-1/1/1.1;
interface xe-1/1/1.2;
interface xe-1/1/1.3;
interface xe-1/1/1.4;
route-distinguisher 64512:1;
vrf-import test1-import;
vrf-export test1-export;
vrf-table-label;

[edit]
regress@PE1# show policy-options policy-statement test1-export 
Dec 06 13:21:10
term 1 {
    from protocol direct;
    then {
        community add testtarget1;
        accept;
    }
}

[edit]
regress@PE1# show policy-options policy-statement test1-import    
term 1 {
    from community testtarget1;
    then accept;
}

[edit]
regress@PE1# 

[edit]
regress@PE1# show protocols bgp
precision-timers;
group contrail-1 {
    type internal;
    local-address 10.255.178.174;
    family inet {
        unicast;
    }
    family inet-vpn {
        any;
    }
    family inet6-vpn {
        any;
    }
    cluster 3.3.3.3;
    neighbor 10.255.181.172;
        
[edit]
regress@PE1# 

Conclusion

In this document we have shown the use case of MPLS-OVER-UDP overlays. In addition to this configuration required to integrate MPLS-OVER-UDP overlays between MX and contrail controller is explained in this document.

APPENDIX A

The tunnel details from the FPC can be checked as shown below.The nexthop id can be used to check the tunnel details from the FPC.The Tunnel id,tunnel destination,Tunnel source in the FPC should match with the CLI.

 

{master}[edit]
regress@PE2# run show dynamic-tunnels database 
Dec 06 14:06:45
Table: inet.3

Destination-network: 60.60.0.6/32
Tunnel to: 60.60.0.6/32
  Reference count: 1
  Next-hop type: UDP
    Source address: 10.255.181.172 Tunnel Id: 1610612742
    Next hop: tunnel-composite, 0x4875634, nhid 710
      Reference count: 2
      State: Up

regress@PE2# run request pfe execute target fpc3 command "show nhdb id 710 extensive"       
Dec 07 13:58:26 
================ fpc3 ================
SENT: Ukern command: show nhdb id 710 extensive

   ID      Type      Interface    Next Hop Addr    Protocol       Encap     MTU               Flags  PFE internal Flags
-----  --------  -------------  ---------------  ----------  ------------  ----  ------------------  ------------------
  710    Compst  -              -                      MPLS             -     0  0x0000000000000000  0x0000000000000000

BFD Session Id: 0

Composite NH:
  Function: Tunnel Function
  Hardware Index: 0x0
  Composite flag: 0x0
  Composite pfe flag: 0xe
  Lower-level NH Ids:
  Derived NH Ids:
  Tunnel Data:
      Type     : UDP-V4
      Tunnel ID: 1610612742
      Encap VRF: 0
      Decap VRF: 0
      MTU      : 0
      Flags    : 0x2
      Encap Len: 28
      Encap    : 0x45 0x00 0x00 0x00 0x00 0x00 0x40 0x00
                 0x40 0x2f 0x00 0x00 0xac 0xb5 0xff 0x0a
                 0x06 0x00 0x3c 0x3c 0x00 0x00 0x00 0x00
                 0x00 0x00 0x00 0x00
      Data Len : 8
      Data     : 0x3c 0x3c 0x00 0x06 0x0a 0xff 0xb5 0xac
        Feature List: NH
           [pfe-0]: 0x08ce6d4c00100000;
           [pfe-1]: 0x08c12e3000100000;
          f_mask:0xe000000000000000; c_mask:0xe000000000000000; f_num:3; c_num:3,  inst:0xffffffff
        Idx#0          -:               
           [pfe-0]: 0x2bfffffd5e00a500  
           [pfe-1]: 0x2bfffffd5e00a500  
                                        
        Idx#1          -:               
           [pfe-0]: 0x23fffffc0000000c  
           [pfe-1]: 0x23fffffc0000000c  
                                        
        Idx#2          -:               
           [pfe-0]: 0x08c045d800080000  
           [pfe-1]: 0x08c03d8800080000  
                                        
Tunnel ID 1610612742                    
==============                          
         Ref-count 1                    
TunnelModel:                            
Dynamic Tunnel Model:                   
         Name = MPLSoUDP                
         MTU = 0                        
         VRF = default.0(0)             
         Source Entropy = 1             
         Packets = 0 Bytes = 0          
                                        
Source IP     : 10.255.181.172          
Destination IP: 60.60.0.6     


Ingress:                                
Index:0                                 
  PFE(0): 0x2bfffffd5e006500            
  PFE(1): 0x2bfffffd5e006500            
                                        
Index:1                                 
  PFE(0): 0x8ce6c8c00100000             
  PFE(1): 0x8c12d7000100000             
                                        
Handle JNH
0x8c045d800080000
0x8c03d8800080000
Egress:
Index:0
  PFE(0): 0x2bfffffd5e008500
  PFE(1): 0x2bfffffd5e008500

Index:1
  PFE(0): 0x23fffffc0000020a
  PFE(1): 0x23fffffc0000020a

Index:2
  PFE(0): 0x878b15400100000
  PFE(1): 0x878b2b000100000

Handle JNH
0x8ce6cec00100000
0x8c12dd000100000


  Routing-table id: 0

Full configuration:

PE2:

 

{master}[edit]
regress@PE2# show protocols ospf
area 0.0.0.0 {
    interface all {
        bfd-liveness-detection {
            minimum-interval 300;
            multiplier 3;
        }
    }
    interface fxp0.0 {
        disable;
    }
}

{master}[edit]
regress@PE2# 

regress@PE2# show protocols mpls
ipv6-tunneling;
interface all;
{master}[edit]
regress@PE2# 

egress@PE2# show protocols ldp;
interface ae0.0;
interface ae1.0;
interface ae2.0;
interface lo0.0;
{master}[edit]
regress@PE2#
 
{master}[edit]
regress@PE2# show protocols bgp group contrail   
type internal;
local-address 10.255.181.172;
family inet-vpn {
    any;
}
family inet6-vpn {
    any;
}
family route-target {
 external-paths 5;
 advertise-default;
}
export from-remote-pe-vrf1;
cluster 2.2.2.2;
neighbor 3.3.3.2;
{master}[edit]
regress@PE2# 

{master}[edit]
regress@PE2# show protocols bgp group core        
type internal;
local-address 10.255.181.172;
family inet-vpn {
    any;
}
family inet6-vpn {
    any;
}
export change-next;
vpn-apply-export;
neighbor 10.255.178.174;
{master}[edit]
regress@PE2# 

regress@leopard# show routing-options              
Dec 13 12:19:18
ppm {
    redistribution-timer 120;
}
nonstop-routing;
autonomous-system 64512;
dynamic-tunnels {
    gre next-hop-based-tunnel;
    controller {
        source-address 10.255.181.172;
        udp;
        destination-networks {
            3.3.3.2/32;
        }
    }
contrail-udp {
        source-address 10.255.181.172;
        udp;
        destination-networks {
            60.60.0.0/16;
  }
}
}
regress@PE2# show policy-options policy-statement from-remote-pe-vrf1
term 1 {
    from {
          protocol bgp;
          route-filter 103.0.4.0/24 orlonger;
             }
    then {
        next-hop self;
        accept;
    }
}

{master}[edit]
regress@PE2# show policy-options community udp         
members 0x030c:64512:13;

{master}[edit]
regress@PE2# 
{master}[edit]
regress@PE2# show policy-options policy-statement change-next 
Dec 07 14:27:33
term 1 {
    from protocol bgp;
  then {
        next-hop self;
        accept;
    }
}

{master}[edit]
regress@PE2#