When network operations professionals get together to talk about the Border Gateway Protocol (BGP), two main topics come up: BGP (in)security and the risk of hijacks, and how easy it is to misconfigure BGP and cause a leak that impacts the entire internet. Pretty quickly, the discussion moves on to how to measure performance and monitor your routes to protect brand (and let’s face it – personal) reputation, profits and productivity.

Last week, Catchpoint and NYNOG (the New York Network Operators Group) convened a panel of BGP experts from Catchpoint and NS1 for a lively Network Ops Meeting focused on the Border Gateway Protocol. They spoke in detail about the insecurity of this pillar of the digital world and how its risks can be mitigated through monitoring. Seven major themes emerged and are discussed below.

1. BGP was Designed Without Security in Mind

When BGP was conceived thirty years ago, security was not an issue. Our panelists pointed to an interview in The Washington Post with the designers of BGP, who said that security wasn’t even on the table when they were designing the protocol.

“In the early days of the Internet, getting stuff to work was the primary goal. There was no concept that people would use this to do malicious things… Security was not a big issue.” – K. Lougheed

Security “wasn’t even on the table.” – Y. Rekhter

As a result, the protocol has no built-in mechanism to verify the integrity and authenticity of the routing messages exchanged between autonomous systems (ASes). This is why BGP is prone to attacks and misconfigurations, which can be broadly classified into prefix hijacks and route leaks.

2. BGP Hijacks and Leaks can Impact Companies of any Size

Our experts made it clear that BGP incidents, whether hijacks or leaks, are happening to companies of all sizes, from Cloudflare and Facebook, both of which experienced major outages in 2019, to smaller, lesser-known companies that don't make the headlines.

Our panelists analyzed data provided by BGP Stream, a public website that reports BGP incidents as seen from several vantage points around the world. According to BGP Stream, more than 600 autonomous systems were victims of a hijack attempt this year, and more than 800 were victims of a leak. These incidents affected all kinds of internet services, from websites to DNS.

It’s important to remember that even companies that are too small to have their own AS are impacted when their providers experience an outage. Panelist Luca Sani of Catchpoint also pointed out that hijacks and leaks frequently impact more than just one service provider’s network. The major Cloudflare leak in June of this year involved more than 1,200 ASes, including Facebook, Comcast, T-Mobile, and Bloomberg, as well as ASes hosting sensitive websites and content, among them nine American banks. BGP insecurity is an issue that affects everyone.

3. Several BGP Security Mechanisms Exist

The networking community has put a lot of effort into developing BGP security mechanisms. The best known is Resource Public Key Infrastructure (RPKI). Simply put, RPKI is a mechanism through which the holder of an address block can cryptographically authorize a specific AS to originate a particular prefix, by signing a Route Origin Authorization (ROA). When a BGP router receives an announcement, it can check that announcement against the RPKI database and obtain one of three results:
• VALID
• INVALID
• UNKNOWN
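
The three outcomes above can be illustrated with a minimal sketch of origin validation. It assumes a simplified ROA made of a prefix, a maximum prefix length, and an authorized origin AS; the `Roa` type, the `validate_origin` function, and the example values (documentation prefixes and AS numbers) are hypothetical and only stand in for what a real validator does.

```python
from ipaddress import ip_network
from typing import Iterable, NamedTuple


class Roa(NamedTuple):
    """Simplified ROA: covered prefix, maximum allowed length, authorized origin AS."""
    prefix: str
    max_length: int
    asn: int


# Hypothetical ROA set using documentation address space and AS numbers.
EXAMPLE_ROAS = [Roa("198.51.100.0/24", 24, 64500)]


def validate_origin(prefix: str, origin_asn: int, roas: Iterable[Roa]) -> str:
    """Return VALID, INVALID, or UNKNOWN for an announced prefix and its origin AS."""
    announced = ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # A ROA covers the announcement if the announced prefix lies inside it.
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        # It matches if the origin AS agrees and the prefix is not too specific.
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "VALID"
    # Covered by some ROA but matched by none: INVALID. No covering ROA: UNKNOWN.
    return "INVALID" if covered else "UNKNOWN"
```

In this sketch, an announcement is INVALID both when the wrong AS originates the prefix and when the right AS announces a prefix longer than the ROA allows, while a prefix with no covering ROA is UNKNOWN.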

RPKI does help to filter out some invalid announcements, whether malicious or accidental, but it can’t stop all of them. For example, RPKI won’t protect against BGP leaks or intentional path forgeries. Another problem with RPKI is that adoption is low: few ASes have signed ROAs.

Besides RPKI, other BGP security mechanisms include peer-lock, IRR filtering, and max-prefix limits, each with its own limitations.

The future of BGP security will probably be BGPsec, in which the path information carried in BGP updates is cryptographically signed within the protocol itself. The main challenge with BGPsec will be computational overhead, because each router will have to verify and sign every update it receives. BGPsec also won’t solve every issue: certain kinds of leaks and other misconfigurations will remain possible.

4. BGP Configurations are Complex!

BGP might be a simple protocol, but each router configuration can be very complex. This is because BGP relationships (called peering relationships) are controlled not only by networking parameters, but also by business parameters. One AS may have a customer-provider relationship with another, but a transit relationship with a third. This means that the configurations that are manually added to each router must include filtering rules that specify which routing information to pass along to other ASes and which to stop. A simple typo can cause leaks that impact large parts of the internet.
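
To make the idea of business-driven filtering concrete, here is a minimal sketch. It assumes the widely used customer/peer/provider model of AS relationships (often described as "valley-free" routing), which is one common convention and not necessarily the exact policy any given network applies. The `Relationship` enum and `may_export` function are hypothetical names; real routers express these rules through vendor-specific filters.

```python
from enum import Enum


class Relationship(Enum):
    CUSTOMER = "customer"   # the neighbor pays us for transit
    PEER = "peer"           # the neighbor exchanges customer routes with us
    PROVIDER = "provider"   # we pay the neighbor for transit


def may_export(learned_from: Relationship, export_to: Relationship) -> bool:
    """Decide whether a route learned from one neighbor may be passed to another."""
    # Routes learned from customers may be announced to every neighbor.
    if learned_from is Relationship.CUSTOMER:
        return True
    # Routes learned from peers or providers go only to customers.
    # Passing them to another peer or provider would be a route leak.
    return export_to is Relationship.CUSTOMER
```

Under these assumptions, `may_export(Relationship.PROVIDER, Relationship.PEER)` is `False`: re-announcing a provider's routes to a peer is exactly the kind of leak that a missing or mistyped filter allows.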

5. Trust, but Verify

It’s extremely important to verify that routing configurations are working as expected. Even when expensive automated tools are used to configure routers, things can go wrong. The Cloudflare leak described above started when a BGP optimization tool selected a route that was then erroneously leaked due to improper filters.

In his talk, panelist Nathanael Jean-Francois spoke about ways to monitor from inside the network to ensure that packets are flowing as expected, including simple tools like ping and traceroute. Going further and keeping history for analysis and comparison requires building a data pipeline and a system to crunch all of those numbers, which can get very complex.
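
As a rough illustration of what comparing against saved history can mean, the sketch below checks a new round-trip time and a new hop list against earlier measurements. It is not the panelist's method: the function names, the three-standard-deviation threshold, and the assumption that ping and traceroute results have already been parsed into numbers and hop addresses are all hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def rtt_is_anomalous(history_ms: Sequence[float], latest_ms: float,
                     threshold: float = 3.0) -> bool:
    """Flag the latest RTT if it is far above the historical mean."""
    if len(history_ms) < 2:
        return False  # not enough history to estimate variability
    baseline = mean(history_ms)
    spread = stdev(history_ms)
    if spread == 0:
        # All past samples were identical, so any increase stands out.
        return latest_ms > baseline
    # Number of standard deviations the new sample sits above the baseline.
    return (latest_ms - baseline) / spread > threshold


def path_changed(baseline_hops: Sequence[str], current_hops: Sequence[str]) -> bool:
    """Report whether a traceroute hop list differs from the stored baseline."""
    return list(baseline_hops) != list(current_hops)
```

A sudden jump in latency or a changed hop sequence does not prove a BGP incident on its own, but either one is a reasonable trigger for looking at the routes themselves.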

6. Security Mechanisms and Prevention Alone are Not Enough

Given the limitations of the BGP security mechanisms and the ease of configuration mistakes, prevention alone is not enough. Instead, organizations need to monitor routes for BGP hijacks, leaks, and misconfigurations. Monitoring can help with:

• Quick detection and remediation of network problems that have slipped past prevention mechanisms

• Verification that the problem has been solved

Why is mitigation via comprehensive monitoring so important? BGP hijacks, leaks, and misconfigurations have real business costs. For example, the panel shared that every minute of downtime costs a company $15,000. Furthermore, these incidents reduce employee productivity by 29%. On top of that, there’s a cost associated with the damage to your organization’s reputation with customers and employees.

7. Build-your-own BGP Route Monitoring Tools: Possible but Limited

To create a tool to monitor BGP routes, you “just” need a few things, our experts told us:
• Data
• Tools
• Some programming skills

One option to monitor BGP yourself is to set up a route collector. A route collector is a server running a BGP daemon – it’s basically a BGP router that promises never to make any announcements.

Another option is to use one of the existing public data sources available:

• Route Views – the University of Oregon project, which provides you with routing table (MRT) updates every 15 minutes from 145 ASes sharing a full routing table;

• Isolario – created by Catchpoint BGP researchers Luca Sani and Alessandro Improta, with MRT updates every five minutes from 205 ASes sharing a full routing table. Isolario also uses ADD_PATH, a BGP capability to receive not only the best route from your peer but also all the routes that peer has received from its own peers and providers;

• RIS – the Routing Information Service, managed by RIPE NCC, the European regional Internet registry. It provides MRT updates every five minutes from 209 peers sharing a full routing table. Unlike Route Views and Isolario, RIS also allows for (near) real-time data collection via WebSocket.
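
As a rough sketch of how the (near) real-time RIS feed might be used for monitoring, the code below builds a subscription request and checks incoming messages for an unexpected origin AS. The endpoint URL, the `ris_subscribe` and `ris_message` message types, and the field names (`path`, `announcements`, `prefixes`, `moreSpecific`) reflect our understanding of the RIS Live service and should be checked against RIPE NCC's documentation; the function names, the client name, the prefix, and the AS numbers are hypothetical.

```python
import json
from ipaddress import ip_network

# Assumed RIS Live endpoint; "example-monitor" is a placeholder client name.
RIS_LIVE_URL = "wss://ris-live.ripe.net/v1/ws/?client=example-monitor"


def build_subscription(prefix: str) -> str:
    """Build the JSON text that asks for updates on a prefix and its more-specifics."""
    return json.dumps({
        "type": "ris_subscribe",
        "data": {"prefix": prefix, "moreSpecific": True},
    })


def unexpected_origins(raw_message: str, my_prefix: str, my_asn: int) -> list:
    """Return (prefix, origin) pairs announced inside my_prefix by another origin."""
    message = json.loads(raw_message)
    if message.get("type") != "ris_message":
        return []
    data = message.get("data", {})
    path = data.get("path") or []
    if not path:
        return []
    # The last element of the AS path is the origin. It may be a list when the
    # path ends in an AS set, which this sketch also treats as unexpected.
    origin = path[-1]
    monitored = ip_network(my_prefix)
    alerts = []
    for announcement in data.get("announcements", []):
        for prefix in announcement.get("prefixes", []):
            net = ip_network(prefix)
            inside = net.version == monitored.version and net.subnet_of(monitored)
            if inside and origin != my_asn:
                alerts.append((prefix, origin))
    return alerts
```

To run this against live data, the subscription text would be sent over a WebSocket connection to `RIS_LIVE_URL` using a WebSocket client library, and each received message passed to `unexpected_origins`. An origin mismatch is only a hint of a possible hijack, since legitimate setups can announce one prefix from several ASes.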

The panelists also talked about various open-source tools (MRT readers) that can be used to analyze all of this data – and it’s a lot of data! Each IPv4 routing table has approximately 700,000 routes, so 200 peers send approximately 140 million routes to the tool. A snapshot in time (called a RIB) from Route Views is approximately 1 GB in size. Of the tools presented, bgpscanner stood out as much faster than the others.

The panelists issued one last caution: home-made BGP monitoring tools have limitations. Data from Route Views and Isolario is not real-time, so it takes longer both to detect an issue and to verify that it’s been resolved. Home-made BGP collectors require complex business arrangements with ISPs around the world to obtain real-time data – and even if those succeed, limited vantage points mean that you may not see hijacks and leaks. These tools are a great start, but their limitations can have serious consequences for your business.

If you want to know more and understand how BGP routing really works and best practices for monitoring BGP, including both ‘inside-out’ and ‘outside-in’ monitoring, download the Catchpoint eBook: The Comprehensive Guide to BGP.

Speakers:

Luca Sani, Catchpoint Senior R&D Engineer for BGP & Routing, Co-founder of the Isolario BGP Project

Nathanael Jean-Francois, NS1 Lead Engineer, Network Architecture

Alessandro Improta, Catchpoint Senior R&D Engineer for BGP & Routing, Co-founder of the Isolario BGP Project