How AWS is helping to secure internet routing

Hacker News - Wed Jan 13 20:11

The internet works reliably, in large part, on the basis of a key technology called Border Gateway Protocol (BGP). BGP is the means by which all the junction points on the internet (routers) communicate with each other to dynamically establish the correct (and correctly weighted) paths that network packets should follow to traverse the global networking system and reach their intended destination. Historically, however, BGP did not have built-in security. Routers simply trusted adjacent routers to send correct information. In the modern internet, this simple trust model is no longer adequate.

Over the last several decades, there have been a number of BGP routing incidents in which traffic was sent to the wrong network, known as “BGP hijacks.” Many of these were minor, but some have been very significant. Many have been the result of innocent errors, but intentional, malicious BGP hijacks have been on the rise. Malicious BGP hijacking occurs when attackers are able to falsely claim ownership of groups of IP addresses (prefixes), and use them to reroute internet traffic to a location of the attacker’s choice. Not only can users no longer reach the real systems, but attackers can, for example, create fake websites that attempt to install malware or do other damage.

To help put an end to BGP hijacking, AWS has been working closely with other industry leaders to make the use of Resource Public Key Infrastructure (RPKI) to digitally sign route announcements a standard practice across the industry. This is not a simple process, and it has taken lots of time, effort, and cooperation. We are excited to announce today some important milestones in that journey. Over 99% of our IPv4 and IPv6 address space is now covered by a Route Origin Authorization, and we are dropping RPKI-invalid routes in every Point-of-Presence for AS16509.

Let’s look at this important topic in more detail.

BGP and RPKI

Hijacks can happen due to accidental mis-configuration (typographical errors, for example), or deliberate malicious activity to redirect traffic to an unauthorized network. Internet routing uses Border Gateway Protocol (BGP) as the control-plane, allowing networks to exchange routing information with one another. BGP provides networks the ability to announce to each other the IP address ranges they can serve, and provides the internet with the ability to route traffic between networks.

Figure 1: A BGP hijack occurs when a network announces IP prefixes that do not belong to it, which may divert traffic away from the path to the actual destination.

Each network on the internet that participates in BGP is represented by an Autonomous System Number (ASN), which is a numeric value that is allocated to specific networks (for example, the primary ASN for AWS is 16509). One of the core challenges with internet route security is trust: how do you know if a specific ASN on the internet is authorized to advertise a particular IPv4 or IPv6 prefix? Today, networks rely on trust and routing policy filters to control which blocks of IP addresses (known as prefixes) can be advertised and propagated. But these filters assume good intentions and can be easily subverted.

AWS is taking a two-step approach to increasing routing security. First, we are using Resource Public Key Infrastructure (RPKI) to secure our announcements and discard invalid ones within our global network. Second, we are collaborating with other network operators to do the same through industry initiatives like Mutually Agreed Norms for Routing Security (MANRS), an Internet Society (ISOC) supported initiative aimed at securing global internet routing. Both steps aim not only to improve the security posture of our customers but also to help the internet as a whole protect itself.

Route Origin Authorization

Signing our IP space with a Route Origin Authorization (ROA) enables the internet as a whole to make sure that the IP addresses of AWS and our customers can only originate from AWS-authorized autonomous systems (AS). This is the first, and perhaps the most important, step of our journey to a more secure internet. A ROA is a cryptographically signed object that pairs an originating AS with a certain prefix, the length (size) of the prefix, and an expiration date. Over the last year, we have produced ROAs for large portions of our address space and published them under each Regional Internet Registry (RIR) Certificate Authority (CA) on its publication servers.

Figure 2: A ROA is a cryptographic object that pairs an origin ASN and an IP prefix into one object that is authorized by the RIR and the user, and signed by the RIR CA.
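To make the fields of a ROA concrete, here is a minimal Python sketch of the payload described above. It is an illustration under stated assumptions, not the real encoding: the class name `Roa` and its field names are hypothetical, the "length" field is interpreted as the longest prefix length the origin may announce, the cryptographic signature is left out entirely, and the prefix and ASN are documentation-only values.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from ipaddress import IPv4Network, IPv6Network, ip_network
from typing import Union


@dataclass(frozen=True)
class Roa:
    """Hypothetical in-memory view of a ROA payload (signature omitted)."""

    origin_asn: int                            # AS authorized to originate the prefix
    prefix: Union[IPv4Network, IPv6Network]    # the covered address block
    max_length: int                            # assumed: longest announcement allowed
    not_after: datetime                        # expiration date

    def __post_init__(self) -> None:
        # The allowed length cannot be shorter than the prefix itself,
        # nor longer than the address width (32 for IPv4, 128 for IPv6).
        if not self.prefix.prefixlen <= self.max_length <= self.prefix.max_prefixlen:
            raise ValueError("max_length must lie between the prefix length and the address width")

    def is_current(self, now: datetime) -> bool:
        """Return True while the ROA has not yet expired."""
        return now <= self.not_after


# Illustrative values only: a documentation prefix and a documentation ASN.
example_roa = Roa(
    origin_asn=64496,
    prefix=ip_network("192.0.2.0/24"),
    max_length=24,
    not_after=datetime(2030, 1, 1, tzinfo=timezone.utc),
)
```

In a real deployment these objects are produced and signed through the RIR's tooling; the sketch only shows which pieces of information travel together.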

Each ROA is specific to one of the five RIRs:

  • RIPE NCC – Europe, Middle East, and Central Asia
  • AFRINIC – Africa
  • LACNIC – Latin America
  • APNIC – Asia Pacific
  • ARIN – North America

These RIRs have the authoritative databases of information on the IP addresses designated for each operator. AWS operates 77 availability zones in 24 regions, served from over 220 global edge sites. As such, we must interact with all five RIRs to be able to sign our IP space in each part of the world. We activate our CA, what RPKI environments typically call a Trust Anchor (TA), under each RIR’s service area. We then sign independent ROAs for our resources under that TA. These objects are then presented in each RIR’s repository of ROAs, which ISPs around the world consume to perform Origin Validation (OV) in their networks.

Figure 3: NIST’s RPKI Monitor tracks all signed IP space globally and breaks it down into /24 equivalents. The graph shows AWS with 148,000 signed ROAs under AS16509 and 70 signed ROAs under AS14618.

As shown in this graph, AWS is currently the largest combined owner of signed IP space on the internet. This data comes from the RPKI Monitor operated by the United States National Institute of Standards and Technology (NIST), which collects ROAs. The graph shows that AS16509 and AS14618, the two largest ASNs that AWS uses to originate routes on the internet, together have signed space covering about 215,000 /24 equivalents.
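For readers unfamiliar with the "/24 equivalents" unit, the following Python sketch shows one way to compute it for a set of IPv4 prefixes: count the addresses covered and divide by 256, the size of a /24. The function name `slash24_equivalents` is hypothetical, the prefixes are illustrative, and this is an assumption about the arithmetic rather than a description of how NIST's monitor is implemented.

```python
from ipaddress import collapse_addresses, ip_network
from typing import Iterable


def slash24_equivalents(prefixes: Iterable[str]) -> float:
    """Express the IPv4 space covered by `prefixes` as a number of /24 blocks."""
    networks = [ip_network(p) for p in prefixes]
    if any(net.version != 4 for net in networks):
        raise ValueError("/24 equivalents are only defined here for IPv4 prefixes")
    # collapse_addresses merges adjacent blocks and drops prefixes that are
    # contained in others, so overlapping space is only counted once.
    merged = collapse_addresses(networks)
    total_addresses = sum(net.num_addresses for net in merged)
    return total_addresses / 256  # a /24 holds 256 addresses


# A /22 is four /24s; the nested /24 below adds nothing to the /16 that covers it.
four = slash24_equivalents(["198.51.100.0/22"])
deduplicated = slash24_equivalents(["10.0.0.0/16", "10.0.1.0/24"])
```

Here `four` is 4.0 and `deduplicated` is 256.0, since a /16 contains 256 blocks of size /24 and the nested prefix is not double counted.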

Figure 4: Global statistics from RIPE’s CA-stats on all current ROAs under each RIR, showing the amount of address space covered under each RIR CA.

The preceding graph (Figure 4) shows the amount of address space covered under a valid ROA over time under each RIR’s Certificate Authority. 2019 and 2020 were two good years for ARIN, where adoption more than doubled. AWS’s rapid progress on this effort contributed to the large influx of new address space covered under a ROA during this period, which is especially visible in the ARIN graph in mid-2020.

AWS was the first major cloud provider to launch a Bring Your Own IP (BYOIP) feature, and our customers are rapidly adopting that capability. The creation of ROAs is an integral part of the process, used by Amazon EC2 as well as AWS Global Accelerator not only to verify ownership but also to help customers protect their assets through the RPKI ecosystem.

Origin Validation

As noted previously, AWS has finalized a global roll-out of Origin Validation in the AS16509 network. For us, this is the second big step in using the RPKI system. ROAs are generated to give other networks on the internet the ability to determine if a prefix is valid or invalid. With Origin Validation, AWS will discard any RPKI-invalid routes learned on the internet via its peering infrastructure, which includes thousands of unique ASNs.

When doing RPKI Origin Validation on an inbound peer, three different states are possible for every prefix announced: Valid, Unknown, and Invalid.

“Valid” is when the prefix an operator announces has a ROA that corresponds with the route and origin announced. In this case, we take no further action and let the route progress further in the ingress policy chain, accepting it if no other problems are found.

“Unknown” is set when there is no ROA that corresponds with the prefix announced. In this case, we take no action other than letting the prefix continue through the policy chain, where it is accepted if it is otherwise OK. Such a route cannot be protected by Origin Validation, which makes it a less reliable option.

The third state is “Invalid”, and that is when the prefix announced to us actively conflicts with a ROA we have learned from our validators, either by having the wrong prefix length (netmask) or by having the wrong origin ASN. This route is immediately dropped at our border, and we do not use that announcement in our route selection process, as it might be an active hijack. This ultimately protects the end user of that network by not letting our network re-route to this new invalid destination.

RPKI State | Description | Outcome
Valid | Correct masklength and origin AS according to the registered ROA | Accept
Unknown | No ROA found | Accept
Invalid | Incorrect masklength and/or origin AS according to the registered ROA | Reject

Table 1: Current RPKI policy in the AWS network
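The three states and the accept/reject outcome in Table 1 can be sketched in a few lines of Python. This is a simplified illustration in the spirit of the standard origin validation procedure, not the AWS implementation: the names `Vrp`, `validate`, and `accept` are hypothetical, the ROA data is a hand-written list rather than output from a real validator, and the prefixes and ASNs are documentation-only values.

```python
from enum import Enum
from ipaddress import ip_network
from typing import List, NamedTuple


class State(Enum):
    VALID = "Valid"
    UNKNOWN = "Unknown"
    INVALID = "Invalid"


class Vrp(NamedTuple):
    """Hypothetical validated ROA payload: prefix, maximum length, origin ASN."""
    prefix: str
    max_length: int
    origin_asn: int


def validate(route_prefix: str, origin_asn: int, vrps: List[Vrp]) -> State:
    """Classify an announcement against the known ROA payloads."""
    route = ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        roa_net = ip_network(vrp.prefix)
        # Only ROAs of the same address family that contain the route matter.
        if roa_net.version != route.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # Valid needs both an allowed prefix length and the right origin AS.
        if route.prefixlen <= vrp.max_length and vrp.origin_asn == origin_asn:
            return State.VALID
    # Covered by a ROA but matching none of them: Invalid. Not covered: Unknown.
    return State.INVALID if covered else State.UNKNOWN


def accept(state: State) -> bool:
    """Table 1 policy: accept Valid and Unknown, reject Invalid."""
    return state is not State.INVALID


vrps = [Vrp("192.0.2.0/24", 24, 64496)]
valid = validate("192.0.2.0/24", 64496, vrps)         # State.VALID
wrong_origin = validate("192.0.2.0/24", 64511, vrps)  # State.INVALID
too_specific = validate("192.0.2.0/25", 64496, vrps)  # State.INVALID
no_roa = validate("198.51.100.0/24", 64496, vrps)     # State.UNKNOWN
```

Note that a route is only marked Invalid when at least one ROA covers it; a prefix with no covering ROA stays Unknown and continues through the rest of the policy chain.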

If you operate a network and think we are dropping your routes due to RPKI, use our operational contact-details at PeeringDB to reach out. We will work with you to solve the problem.

Collaborating with our peering partners to enable and promote Origin Validation

Improving the security posture of the internet is a group-effort where no technical solution can be successful without widespread deployment. Therefore, we are working to accelerate this effort with all our peering partners and the larger ecosystem. The AWS global infrastructure interconnects with thousands of networks around the globe in hundreds of different places. Ensuring Origin Validation is in place with the biggest networks is key to bringing the safety, stability and security of RPKI to the internet. This is especially true for large Service Providers that span multiple continents and must make sure that invalid route announcements are minimized or eliminated as much as possible.

Today we see that large networks, such as NTT, Telia, GTT, AT&T, and others, perform Origin Validation on all of their peers. When large networks like these implement Origin Validation, the overall effectiveness of route hijack prevention on the global internet improves. We track anomalous BGP activity on the internet, and one hijack attempt this summer had 30% less propagation within ROA-covered space compared to non-ROA-covered space. This is a very encouraging number! It shows the value, even at this early stage, of global Origin Validation. With each network that adopts ROAs and Origin Validation, that percentage will grow even larger.

We strongly encourage all networks on the internet to implement Origin Validation as well as signing their IP address space with ROAs.

Figure 5: Graph that shows the number of invalid prefixes in large provider networks at a given time, with data from the RIS-live and Routeviews datasets. Implementation of RPKI Origin Validation over the last year resulted in a reduction of BGP prefixes with an RPKI state of Invalid.

Some of the larger service provider networks have implemented RPKI Origin Validation in the last year. This can be seen in the preceding chart (Figure 5) by looking at the reduction of BGP prefixes with an Invalid RPKI state accepted by their networks. Telia Carrier deployed in February, and many other large operators followed suit afterwards. The number of signed ROAs has increased drastically over this year, so the potential for Invalids has gone up significantly, yet the total number of Invalid routes has gone down. This is a positive sign that the ecosystem is making progress.

Johan Gustawsson, head of network engineering and architecture at Telia Carrier, shared his thoughts on these changes:

“We are thrilled about the current momentum around routing security amid key networks committing to implement RPKI Route Origin Validation and sign ROAs for their IP resources. For Telia Carrier, directly connecting about 60% of the global internet routes and being an early adapter to fully deploy RPKI ROV, it is seen as the first and constructive step in moving away from the fragile trust-based model. BGP hijacks have time and time again, irrespective of being intended or unintended, wreaked havoc on the internet. RPKI has proven to have an enormous impact in preventing the propagation of such illicit announcements, but requires all major networks to assume their responsibility in order to establish a globally secure routing infrastructure. And we are happy to collaborate with AWS on this.”

– Johan Gustawsson, head of network engineering and architecture @ Telia Carrier

We are also participating in important industry working groups, such as MANRS, operated by the Internet Society. AWS has been an active participant in the MANRS Cloud and CDN working group since the initiative began, and we are working to help accelerate the important work it is doing. Since the start of the programme, AWS and other participants in the working group have wanted to take RPKI a step further. Since December 2020, we have been working on an extension to the programme that puts more emphasis on routing security going forward, with explicit requirements on RPKI deployment.

Future enhancements

We are excited to be part of future enhancements of the RPKI ecosystem, and advocate for a more distributed method to publish ROAs (instead of relying on the current model with a RIR-hosted solution).

There are many other new additions and adjustments to the RPKI standards coming out of the Internet Engineering Task Force (IETF). These include Autonomous System Provider Authorization (ASPA), which will help protect entire AS paths and not only origins. We are also paying special attention to draft-ietf-sidrops-rpkimaxlen-05, an upcoming standard designed to address a common problem with DDoS-mitigation techniques used with RPKI integrity.

We are continuing to work enthusiastically with the Regional Internet Registries (RIRs) to build out and enhance the underlying infrastructure of RPKI. The RPKI CA of the RIR is now a critical part of internet infrastructure and must be operated as such. Being an open, transparent, and good member of the community is an important part of the process.

Getting involved

We encourage operators and end users that are looking to deploy RPKI to look at the community-driven knowledge bank at readthedocs. This documentation goes through the ecosystem holistically and gives a full rundown of what software you need and what steps you must take in order to secure your routing ecosystem.

ICANN has also released a good technical analysis of the RPKI ecosystem, which is a must-read for anyone who wants to deploy RPKI at scale.

The not-for-profit organization Stichting NLNOG maintains an excellent guide on good-practice route-filtering. This includes practical implementation guides with configuration snippets for most common network vendors.

Conclusion

With the cooperation of our partners across the industry, AWS has reached these important milestones. Widespread RPKI adoption will advance internet security for everyone. By closing the gaps in the way networks exchange routing information today, routes will be more trusted and BGP hijacking attempts will have far less space to propagate. This work is part of our ongoing commitment to provide our customers with industry-leading security. But there is more to do. AWS will continue to work with other industry leaders to increase internet safety, reliability, and security, and to improve where needed so that we can put an end to BGP hijacks once and for all.

Fredrik Korsbäck

Fredrik is a Senior Technical Business Developer at AWS working with Peering, Routing and BGP for the AWS Global Network, recognized in the routing-tables as AS16509. He is passionate about routing-security, route-control and development of the BGP-protocol to strengthen the capabilities of the very core of the Internet.