When BGP SoO Site of Origin goes the wrong way

Sunday, 12 Aug 2018

BGP Site of Origin (SoO)

I have a scenario where an "internal" Service Provider (SP) MPLS Network interfaces with a Third Party's MPLS Network, as an IPVPN - rather than a true MP-BGP Handoff; or in other words, "I happen to know it's underpinned by MPLS so I'll call it that, even though technically it's not MPLS Presentation to me" (the same way most Enterprise Network shops refer to their WAN as "MPLS").

Unlike the base assumption of most Cisco articles on SoO, I don't actually control the Provider Edge (PE) Routers on this Third Party (let's say, "BT") MPLS Network; and nor am I the Third Party themselves. What I'd like to do is identify Prefixes I have on my IPVPN "Overlay" MPLS Network, from CE Routers on said IPVPN Overlay Network that I do control, and block them from coming back into my own SP MPLS Network. I thought BGP Site of Origin (SoO) might be my friend here...

The Scenario Topology

BGP SoO Scenario Topology

This is the Scenario Network Topology, showing:

  • 2x My MPLS SP Network Provider Edge (PE) Routers
  • 1x My MPLS SP Network Route Reflector (RR) Router
  • 2x BT MPLS PE Routers (Location Unknown)
  • 2x eBGP Peerings from My <-> BT MPLS PE Routers
  • 2x Repeated BGP SoO Communities (65432:999), applied to VRF (IPVPN) "BLAH"

What I thought would happen

Given the BGP SoO Attribute is applied towards the BT (Third Party MPLS Network) PE Router, I thought I'd be able to jump on one of my MPLS Enterprise Network CE Routers, and see the SoO Attribute 65432:999 applied, as it made sense to me that this configuration would "advertise" the BGP SoO Extended Community from Me -> BT:

PE1#sh run | sec router bgp
router bgp 65432
 address-family ipv4 vrf BLAH
  neighbor 192.168.0.1 remote-as 2856
  neighbor 192.168.0.1 send-community both
  neighbor 192.168.0.1 soo 65432:999

Where I then duly hop onto my CE1 Router and issue the following, expecting to see the SoO Tag for my VRF "BLAH" 99.99.99.99/32 Network:

Expected Routing Behaviour

But, no dice - it's just a normal IPv4 BGP Prefix with no Extended Communities. What gives?

What actually happens

At this point, confused, I start to wonder if I'm misunderstanding what SoO does and is, and getting confused between the simplistic Cisco and Internet examples of SoO (which are aimed at Service Providers, from their perspective, towards a singular Customer Edge/Customer) - so I poke around. The first point I poke around on is a "non-SoO Tagging PE Router', PE99 - which has an attachment to VRF "BLAH" (an ingress/attachment point into My MPLS Network, for VRF "BLAH"; but performs no SoO tagging) - and see what I see:

PE99#sh ip bgp vpnv4 vrf VLAH 99.99.99.99/32

99.99.99.99/32
  ...Extended Community: SoO:65432:999 RT...

Which starts to make it click - the SoO must be applied "the wrong-way-around" from what I thought, and be an "ingress only" behaviour, as otherwise I wouldn't see it this side of the MPLS CE-PE Network Handoff fence. Or more succinctly, this is the direction/Routing Domain the SoO Tag is applied into:

Actual Routing Behaviour

Even though I first looked at this part of the SoO command as an "advertise-out"/egress behaviour; it's actually a "Match Packets from this BT Peer, mark them with this when advertised deeper back to us" behaviour. Because I looked at this part of the config as an "advertise SoO to neighbour" behaviour:

PE1#sh run | sec router bgp
  neighbor 192.168.0.1 soo 65432:999

Which is different to what I first expected ("advertise-out" behaviour), which would have been this:

Suspected Routing Behaviour

What have I learned?

In effect, then, depending on your "directional thinking", based on the Cisco IOS syntax, you might be unpleasantly surprised by how this works - to my mind, it's working the wrong-way-around from what the config syntax would suggest. What actually happens, as a result of just one line of config, is:

  • On PE1# (My<->BT Network 1st CE-PE Handoff)
    • Apply SoO Tag 65432:999 inbound/ingress (BT->Me) for all BT-side Prefixes
    • Advertise this SoO Tag 65432:999 deeper back to My SP Network (not BT's at all)
  • On PE99# (Any other CE-like Router on My Network; Attachment Point into VRF "BLAH")
    • See SoO Tag 65432:999 on BT-native Prefixes
    • Do nothing about it; "advertise on" SoO Prefix (don't strip it out on re-advertise to another PE/Router)
  • On PE2# (My<->BT Network 2nd CE-PE Handoff)
    • See SoO Tag 65432:999 pre-egress/outbound (just before Me->BT) for the BT-side Prefix
    • Because the Me<->BT eBGP Peer has the same SoO Tag set, don't allow it out to BT Router ("reverse behaviour")

The same would then happen for BT->PE2->My MPLS->PE1, performing an overlapping-behaviour for the secondary/dual-homed path between My Network<->BT's Network.

So these bits are wrong then

Which means, given SoO is an "inbound behaviour", not an "outbound" behaviour, the whole concept of tagging these with 65432:999 as the SoO Tag doesn't make sense; it probably should be 2856:999, to show these are BT-native, not My Network-native.

It also means I should re-think why I'm using SoO here, as a Tag+Block/Route Map technique, using bog-standard BGP Communities, might be a better fit for the behaviour I wanted.

I've been here before

Sadly, I've fallen victim to this presumed "outbound behaviour of the config" before with my friend BGP AS-Override, which also has a strange "not the way you might expect" behaviour, but that's one for another blog post. Key points here are:

  • Trust nothing
  • Lab everything
  • Assume that Cisco Support Forums write-up that looked exactly like your scenario was too good to be true