1st December 2023
At 9 a.m.
MOIN RAHMAN: Good morning, everyone. Hope you had a nice dinner, I know you are all tired from yesterday but we are also jam packed with two different presentations and three lightning talks, please do not forget to rate the papers to shape the future programmes. We like to know what the community is expecting from us and what you feel about the presentations we have already chosen for this meeting. Today's session will be chaired by me and Doris. Doris, do you have anything to add?
DORIS HAUSER: No.
MOIN RAHMAN: I will hand over the mic to our first presenter Thomas, who will be presenting on xBGP, faster innovation in routing protocols.
THOMAS WIRTGEN: Hello, good morning, everyone. So I will present xBGP, which is a new way to add programmability and innovation in routing protocols. This is joint work with my colleagues, with Laurent and Randy Bush.
So, the first question we can ask ourselves is: why would we like to bring programmability inside BGP? As you might know, in a modern network you have routers from different vendors for many reasons, the first ones being economic and stability reasons. For example, if one vendor has a crash or a bug, there is less chance that the other vendors have the same problem. But this is the first problem, because the vendors do not all implement the same set of functionalities, and this really restrains what network operators can do on their networks.
And so, well, operators would like to constantly tune their network for the viability of their networks, but they are limited by two main factors. The first one is the network operating system, so the OS of the vendors: you can consider it as a black box, where you have only the interface provided by the vendor, and you cannot do anything else with the implementation. The second one is the protocols: protocols such as BGP are standardised by the IETF, so this is something which is really constrained.
So, for example, say I would like to add a simple functionality to analyse the visibility of the BGP control plane. This is really complex, because if I have a router that prefers, for example, one route among the others, I have no clue. Yes, I agree that you can do this with BGP communities, but this is really cumbersome: you have to add a lot of configuration on all the routers of your network. This is why a draft was proposed in 2016 which simply adds a new BGP attribute carrying the latitude and longitude of the router that receives the route.
So, the idea is really simple: each time you receive a BGP update from your peers, you add the latitude and longitude of the router that received the update, you spread the information inside the network, and then, to avoid leaking your custom attribute, you remove it at the edge routers of the network. This is quite easy to understand and maybe to implement, but that is not all, because there is a long process to get this feature inside BGP. The first part is the standardisation step: all the routers must talk the same language, I would say, so there is a process at the IETF. Here on the slide you have the standardisation delay of the 40 RFCs related to BGP, and you can see that you have to wait roughly four years to have one standard for one feature. But that's not all, because you also have to wait for vendors to implement your feature inside the router and, finally, you have to update your routers.
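The tag-on-receipt, strip-at-the-edge idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Route` class, the `geoloc` attribute name, the session labels and the coordinates are all hypothetical and are not taken from the draft or from xBGP.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

GEOLOC = "geoloc"  # hypothetical attribute name, used only in this sketch


@dataclass
class Route:
    prefix: str
    attrs: Dict[str, object] = field(default_factory=dict)


def on_receive(route: Route, session: str, router_coords: Tuple[float, float]) -> Route:
    """Tag a route learned from an external peer with the receiving router's coordinates."""
    if session == "ebgp" and GEOLOC not in route.attrs:
        route.attrs[GEOLOC] = router_coords
    return route


def on_export(route: Route, session: str) -> Route:
    """Keep the attribute inside the network, strip it when announcing to external peers."""
    if session == "ebgp":
        kept = {k: v for k, v in route.attrs.items() if k != GEOLOC}
        return Route(route.prefix, kept)
    return route


# Example: a route received at an edge router, then announced internally and externally.
received = on_receive(Route("192.0.2.0/24"), "ebgp", (50.67, 4.61))
to_internal_peer = on_export(received, "ibgp")
to_external_peer = on_export(received, "ebgp")
```

In this sketch the internal copy keeps the coordinates while the external copy does not, which mirrors the "spread inside, remove at the edge" behaviour described in the talk.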
But the problem is, if you want to add a new functionality as a small network, let's consider Belnet, in Belgium, you cannot influence steps one and two, because IETF people do not have the same ideas as you and may not be interested in your feature, and the vendor side is also complex because you may not have the right licence to ask for new functionality inside the network. Finally, you have to update the router, but this is something you can fetch from the vendor.
Okay, so let us summarise the main problem inside BGP or routing protocol in general. So the first problem is that you have routers from different vendors. Then, protocol extensions are not implemented in all routers and you have also the slow upgrade process, as I explained before.
And with xBGP we would like to bring innovation and programmability back inside existing protocols.
I will explain the basics of xBGP, but let me first recall how we update our routers in the classical way. It is the vendor that must add the new feature and implement it inside their routing stack. They need to compile it, and then the network operator can fetch the update and deploy it in its network. But during this process, the operators must flash the router and restart it, and they will lose all the peering sessions they have established with other peers.
And so, yes, to bypass this classical update, I would say, we leverage eBPF, which has two core components. The first one is a bytecode, which is multi-architecture, because routers have CPUs of different architectures. The second one is a runtime environment that fetches the bytecode and isolates it inside its environment, so you can think of it as a kind of Java virtual machine.
With this eBPF technology, what we can do is this: first, as the router vendor, you have to add the virtual machine inside your routers. Then, if you want to add a new functionality as a network operator, you simply write your own plug-in, compile it to eBPF bytecode, and inject it inside the virtual machine, which will interact with the vendor's code so the plug-in can fetch the right data structures to compute the functionality.
And thanks to this method, you don't have to reboot or flash a new image in your router.
Okay, with this view we would like to shift to another paradigm: from something which is more of a black box, where vendors have full control over their routers, to something more grey, where the BGP code is still closed by the vendors but you can now interact with this code through the eBPF virtual machine, so the network operator has a way to influence its routers.
Okay, so how do we manage to make a BGP implementation xBGP compliant? For that we rely on the BGP workflow that has been defined in RFC 4271, and this workflow is normally implemented in all BGP implementations. The BGP workflow is the following: each time you receive BGP messages from your peers, they go to the Adj-RIB-In, you apply some import policies, and after that they go to the Loc-RIB, where the BGP decision process selects the best route for each prefix. Then you apply the export filters, the routes go to the Adj-RIB-Out and you announce the best route to your neighbours.
Okay, so this is the workflow, but how do we access it? On a traditional implementation of BGP you have only the interfaces provided by the vendor, the famous ones being the CLI or SNMP. With xBGP we open the lid of BGP and we expose this BGP workflow, so that the xBGP virtual machine can execute your extension to this basic BGP workflow. If I take back the geographical location plug-in that I introduced previously, it needs to be executed somewhere, and for that we introduce insertion points. Those are the green circles here on the figure, where you can execute any code from the operator.
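To make the notion of insertion points more concrete, here is a minimal Python sketch of a BGP-like workflow with two hook locations. It is not the xBGP API: the class, the insertion point names and the much simplified decision process (higher local preference, then shorter AS path) are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional

Route = Dict[str, Any]                      # {"prefix": str, "local_pref": int, "as_path": [int]}
Hook = Callable[[Route], Optional[Route]]   # a hook returns None to drop the route


class BgpPipeline:
    # Hypothetical insertion point names for this sketch.
    POINTS = ("import_filter", "export_filter")

    def __init__(self) -> None:
        self.hooks: Dict[str, List[Hook]] = defaultdict(list)
        self.loc_rib: Dict[str, Route] = {}

    def attach(self, point: str, hook: Hook) -> None:
        if point not in self.POINTS:
            raise ValueError(f"unknown insertion point: {point}")
        self.hooks[point].append(hook)

    def _run(self, point: str, route: Route) -> Optional[Route]:
        for hook in self.hooks[point]:
            route = hook(route)
            if route is None:
                return None
        return route

    @staticmethod
    def _better(a: Route, b: Route) -> bool:
        # Simplified decision process: higher local_pref wins, then shorter AS path.
        return (-a["local_pref"], len(a["as_path"])) < (-b["local_pref"], len(b["as_path"]))

    def receive(self, route: Route) -> Optional[Route]:
        """Import hooks -> Loc-RIB and decision -> export hooks; returns the route to announce."""
        route = self._run("import_filter", dict(route))
        if route is None:
            return None
        best = self.loc_rib.get(route["prefix"])
        if best is None or self._better(route, best):
            self.loc_rib[route["prefix"]] = route
            return self._run("export_filter", dict(route))
        return None


pipeline = BgpPipeline()
# Operator-provided code: drop routes crossing AS 64666, prepend our own AS on export.
pipeline.attach("import_filter", lambda r: None if 64666 in r["as_path"] else r)
pipeline.attach("export_filter", lambda r: {**r, "as_path": [65000] + r["as_path"]})

announced = pipeline.receive({"prefix": "203.0.113.0/24", "local_pref": 100, "as_path": [64500, 64501]})
dropped = pipeline.receive({"prefix": "198.51.100.0/24", "local_pref": 100, "as_path": [64666]})
```

The point of the design is that the pipeline itself stays fixed while the operator's functions are attached at named locations, which is the role the green circles play on the slide.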
And so, yes, the geographical location extension is divided into parts, because it must cover the whole BGP workflow: you have to first decode the geographical location that has been sent by your peers, you have to add it inside the route, use it, and encode it. Each part goes to the right insertion point, so that this part of the plug-in is executed at the right location of the BGP workflow.
Okay, so thanks to that, you have to implement the geographical location extension only once, and then it will be executed on every implementation of BGP, or at least those that are xBGP compatible. We managed to modify FRRouting and BIRD to add the xBGP layer, so the virtual machine can execute any plug-in that the operator provides.
So, you can think of xBGP as a link between the operators and the vendors. The vendors provide the OS, with the routing information table, the peer state, memory, I/O, BGP state and stuff like that, and the operators interact with the xBGP layer so that they can interact with those parts of the network operating system.
So the advantage is that you only implement your xBGP layer once on each BGP implementation and that's it. You can now execute your plug-ins, your xBGP extensions.
So, this is just for your information: we had to slightly modify FRRouting and BIRD, because they did not have all the information needed to execute our xBGP code, so we had to modify some lines of code inside both FRRouting and BIRD.
Of course, we have also implemented other use cases to prove that the solution covers a lot of the use cases that operators might want to add inside their networks. I won't explain them in detail in this presentation, but I will explain one of them, which is BGP zombies. What is this? These are simply routes that are no longer reachable but are still in the routing table of some routers in the Internet. Normally, if you have a prefix P and you would like to advertise it to other peers inside the Internet, you rely on a routing protocol, here BGP, to announce your update to the rest of the Internet. Then, for some reason, the prefix P becomes unreachable, and so the routers that detect it announce a withdraw to the other routers, so that the route is no longer reachable in the Internet. But for some reason, one router, maybe because of a bug or something else, does not correctly process the withdraw and does not announce it to the other routers. So the network is split in two: some routers have the information that the prefix P is not reachable any more, and other routers still have the route. This causes a lot of problems, blackholing being one of them. But with xBGP you can add a new plug-in that checks the routing table each night, for example, and if a route has been in the routing table for longer than, for example, four weeks, then you can ask your upstream routers to confirm whether the route is still valid, and the router will make the corresponding exchange of messages to say whether the route is still reachable or not.
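The periodic zombie check described above can be sketched as a small background task. This is a simplified model, not the xBGP plug-in itself: the routing table is reduced to a mapping from prefix to the time the route was learned, and `confirm_upstream` is a hypothetical callback standing in for the message exchange with the upstream router, whose details the talk does not give.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List

MAX_AGE = timedelta(weeks=4)  # the "four weeks" example threshold from the talk


def find_stale(rib: Dict[str, datetime], now: datetime, max_age: timedelta = MAX_AGE) -> List[str]:
    """Return the prefixes whose route is at least max_age old."""
    return sorted(p for p, learned in rib.items() if now - learned >= max_age)


def sweep(rib: Dict[str, datetime], now: datetime,
          confirm_upstream: Callable[[str], bool]) -> List[str]:
    """Re-validate stale routes; remove and report the ones upstream does not confirm."""
    withdrawn = []
    for prefix in find_stale(rib, now):
        if confirm_upstream(prefix):
            rib[prefix] = now          # still valid: reset its age
        else:
            del rib[prefix]            # zombie route: remove it locally
            withdrawn.append(prefix)
    return withdrawn


now = datetime(2023, 12, 1)
rib = {
    "192.0.2.0/24": now - timedelta(weeks=5),     # old, but upstream still confirms it
    "198.51.100.0/24": now - timedelta(weeks=6),  # old and no longer confirmed
    "203.0.113.0/24": now - timedelta(days=2),    # recent, not checked
}
still_valid_upstream = {"192.0.2.0/24"}
withdrawn = sweep(rib, now, lambda prefix: prefix in still_valid_upstream)
```

Because `find_stale` returns a list before any deletion happens, the table can be modified safely while the stale prefixes are processed.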
So, this is quite a special use case, because it does not directly influence the BGP workflow; it is more what I would call a background task. So we have a kind of insertion point where you can execute some of these tasks, for maintenance and so on.
Now we have a way to add programmability and innovation inside the network, but one question is: does using xBGP have an impact on router performance? To answer this question, we made a small experiment where the routers on the left inject full routing tables into our xBGP router, and we measure the time taken by this green router. The first thing we checked is whether there is additional overhead when no xBGP programme is executed inside the router. Actually there is a small overhead, because you have to initialise some data structures related to xBGP and also check whether some plug-in is present inside the implementation. Then we considered a worst case, which is the reimplementation of route reflection. We believe this use case is a complex one for xBGP, and this is the worst use case you can have. There is a quite high overhead, I would say: for FRRouting you have a convergence time of plus 13%, and for BIRD 8%. The difference is explained by the data structures of FRRouting and BIRD, since it is easier in one than in the other for xBGP to convert them into a structure that an xBGP extension can use inside the plug-in, and also by the fact that the eBPF virtual machine is not as efficient as native code. But it is still a prototype, and there are some improvements to be made inside the prototype of xBGP.
So the last thing I would like to share is this: now we have something to add new features inside routers, but the code that is written by the operators and executed is untrusted from the point of view of the routers. If I take back my geographical location extension, maybe it has not been correctly written, maybe there are some corner cases I did not think about during the development process, and this could crash the routers. This is something we do not want, because routers are generally expected to run 24 hours a day.
So, we managed to develop a kind of framework on top of verification tools, where you pass the source code of your extension, annotated with custom verification macros that we have developed. This annotated source code is then passed to the verification tools, and if all the properties are satisfied, you can compile it to eBPF bytecode and inject it into your router.
There are a lot of properties: there are basic ones and properties related to BGP. The basic properties include termination: we do not want the plug-in to be stuck in an infinite loop and completely break the router. There is also the question of memory safety: our prototype runs on top of C code, but I agree that if you write your extension in Rust, for example, then the question does not arise.
And there is also some stuff about virtual machine isolation and API restriction. For example, if I would like to use some part of the xBGP API that I am not allowed to use, there are checks made by our framework.
The most important ones are the properties related to BGP, and I will again take my GeoLoc example. When you want to add this attribute and advertise it from your routers, it must satisfy the wire format of the attribute header. So what you have to do is annotate your source code with the corresponding values: the flags must be 8, the type code 2A and so on, and the latitude and longitude must be a valid location. You have to write this yourself, and then the offline verification tools will make sure that the header is correctly formatted.
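As an illustration of what "the header is correctly formatted" means, here is a small Python sketch that encodes and checks a GeoLoc-style path attribute. Only the general layout (one octet of flags, one octet of type code, one octet of length, then the value) follows RFC 4271. The concrete flag value `0x80`, the type code `0x2A` and the payload of two 32-bit floats are assumptions chosen for this sketch; they are not the draft's or xBGP's actual encoding.

```python
import struct

FLAG_OPTIONAL = 0x80          # RFC 4271 attribute flag bit: optional attribute

GEOLOC_FLAGS = FLAG_OPTIONAL  # assumed flags for this sketch (optional, non-transitive)
GEOLOC_TYPE = 0x2A            # assumed, placeholder type code
GEOLOC_LEN = 8                # assumed payload: latitude and longitude as 32-bit floats


def valid_location(lat: float, lon: float) -> bool:
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def encode_geoloc(lat: float, lon: float) -> bytes:
    """Build flags, type code, length and value in network byte order."""
    if not valid_location(lat, lon):
        raise ValueError("latitude/longitude out of range")
    value = struct.pack("!ff", lat, lon)
    return struct.pack("!BBB", GEOLOC_FLAGS, GEOLOC_TYPE, len(value)) + value


def check_geoloc(attr: bytes) -> bool:
    """Check header fields and payload range, as an offline verifier might require."""
    if len(attr) != 3 + GEOLOC_LEN:
        return False
    flags, type_code, length = struct.unpack("!BBB", attr[:3])
    if (flags, type_code, length) != (GEOLOC_FLAGS, GEOLOC_TYPE, GEOLOC_LEN):
        return False
    lat, lon = struct.unpack("!ff", attr[3:])
    return valid_location(lat, lon)


attr = encode_geoloc(41.9, 12.5)
wrong_flags = bytes([0xC0]) + attr[1:]   # same attribute with different flags
```

In the framework described in the talk these constraints are stated as annotations and proven before deployment; the runtime check here only shows which fields such annotations would constrain.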
So, this concludes my presentation, so we believe that with xBGP, BGP implementation will become truly extensible. We have made the job for BGP but we believe that the same methodology can be applied to other routing protocols such as IS‑IS and if you are interested to know more about the use case and other stuff that we can do with xBGP you can check the corresponding paper. Thank you.
MOIN RAHMAN: Okay. Thanks, Thomas. Do we have any questions?
SPEAKER: Hi Berislav Todrovic from Juniper networks. I have just one question, I guess you tested this feature on a number of vendors, which vendors did you test?
THOMAS WIRTGEN: We only tested on FRRouting and BIRD. We do not have access to others, but we would be happy to have some implementations so we can put the xBGP stuff on real vendors.
SPEAKER: Okay, that's my question, because, for instance, some vendors, including my company that I am working for, have the possibilities to add some code that would be, let's say, a kind of a code that can be executed on the box and I am not talking only about scripting, programmable RPD which provides you exactly the access to the routing part of the router, to do the control plane where you can add your own feature so that might be interesting for you.
MOIN RAHMAN: Okay, nice to know. Thank you.
DORIS HAUSER: We have one comment from online, from Sander Steffann from 6connect, and he is just saying thank you.
THOMAS WIRTGEN: Thank you.
MOIN RAHMAN: Thanks, Thomas. I think we don't have any more questions.
THOMAS WIRTGEN: Okay, thank you very much.
Now invite up my old friend Robbie for an interesting presentation on indexing Europe's Internet resilience.
ROBBIE MITCHELL: Hello, hello. Thank you very much for coming and staying to the end of the conference, I appreciate that, for my other speakers speaking this morning.
Today, I am talking about the Internet Society's efforts to help decision‑makers understand the Internet's resilience and a little bit about the health and evolution about the Internet at the same time, and what I hope you leave with here today is that it's worth keeping abreast of the resilience of your and your country's Internet ecosystem as much as it is your own network, there's plenty to learn from the strengths and weaknesses and insight you can offer to improve the overall health of the Internet.
As an overview, I will discuss three case studies of how Internet resilience in North America, Europe and Australia has been compromised in the past 17 or so months. I will share a snapshot of Europe's Internet resilience and highlight some of the strengths and weaknesses. And finally I'll discuss the need to improve national and regional Internet ecosystem data resolution to allow decision-makers to make better informed decisions.
So, the Internet Society has been running its Measuring the Internet project for three years. Its outward-facing product is the Pulse platform, which curates open data to examine Internet trends and tell stories, so that decision-makers and others can better understand the health, availability and evolution of the Internet.
Our current focus areas are to do with Internet shutdowns, so we have an Internet shut down tracker and recently released an economic tool to show the economic costs of Internet shutdowns, called net loss. The state of deployment of enabling technologies, IPv6, HTTPS, DNSSEC, TLS 1.3, the concentration of infrastructure services in markets and finally, the resilience of Internet in more than 170 countries.
So, regarding the last of these focuses, we develop the Internet resilience index which I will refer to in this presentation as IRI, and this draws upon around 20 open data sources, and uses best practice methodologies to calculate a snapshot of a country's Internet resilience in terms of its infrastructure, its performance, security, and market readiness.
I will speak to each of these what we call pillars as I go through my presentation and as well as the metrics within these pillars.
Before I get into the IRI, I want to put this into context and note three case studies where a country's Internet resilience has been shown up in the past 18 months, in Australia, Canada and Italy. The most recent of these incidents, as many will be aware, happened last month in Australia (we are in December), where a minor technical slip-up by the second largest operator caused one-third of Australians to lose connectivity and disrupted emergency services, hospitals, banks, ATMs, the lot. While there has been no root cause analysis forthcoming yet, plenty of commentary is swirling around as to what happened. One of my colleagues noted on the Pulse blog that the warning signs of a lack of resilience compared to other carriers were apparent, most notably the disengagement from local peering. The most likely cause of this was business market strategy, but it does overlook the opportunity to bolster network resilience through diverse connection points, a metric that we consider as part of the IRI.
So, this is the Internet Resilience Index profile for Australia. While it does not offer insight at a network level, it offers it at the national level, and it shows how concentrated the market is in Australia, looking at the upstream provider diversity and market diversity. If you are familiar with Australia's telecommunications, it's a two-and-a-half horse race, which isn't too dissimilar to elsewhere around the world.
Given the wide‑ranging outage the Australian Government ordered an inquiry to examine the roaming and emergency services impacted and have already put forward suggestions as to whether rival telcos can offer access to their services during network shutdowns. It plans to investigate the adequacy of Optus's communications on the day of the outage.
The second case study shares similar traits with the first, including a chief executive who is no longer chief executive at either of these organisations. The Rogers outage in July 2022 spread across Rogers' cable, mobile and fixed line services, directly affected all of its 12 million subscribers and indirectly prevented all businesses nationwide from being able to accept debit card transactions, affected several government agencies, including border security, impacted the timing of one-quarter of traffic lights in the Toronto area and denied access to 911 emergency calls. Overall it was estimated to have cost the Canadian government 142 million USD and Rogers 150 million USD in customer credits. Rogers has never been forthcoming, except for noting that the outage resulted from a configuration change that deleted a routing filter, resulting in a core network failure with broad impact.
Rogers has acknowledged the need to plan and implement network separation of wireless and wireline core functions, and it's currently spending 261 million dollars on doing this.
The interesting thing will be, since the merger between Rogers and Shaw is going through, whether or not that will also include Shaw being diversified as well under that umbrella.
If we look at the IRI profile for Canada you can see the upstream provider diversity is about the same as Australia, so it's 37%, but its market diversity is quite a bit higher than Australia at 59%. It will be interesting again to see how this will change with that merger between Rogers and Shaw. Like the Australian incident, I think the Australian Government was understanding of what happened in Canada and Canada implemented the things that Australia is now going through and looking into the inquiry.
The final incident that I wanted to draw your attention to happened here in Italy this year, where, again, nearly a third of Internet users in the country were left without Internet connectivity, for five hours this time. Like the other two incidents, the ISPs in question haven't been forthcoming with a root cause analysis. Still, thanks to analysis from the community, including my former colleague Max Stucchi here in the front row, it was clear the issue was related to a lack of redundancy on the side of the operator and its upstream provider Sparkle, which is part of the same parent company, combined with a lack of local peering. As seen here on Hurricane Electric's BGP Toolkit, TIM relies only on Sparkle for international connectivity. If we compare that to Vodafone Italy, it has five upstream providers. This set-up provides plenty of redundancy when one of these upstreams has an outage, as we know.
If we look at Italy's IRI, it's interesting to note how upstream diversity ranks highest of all three of these countries, which does point to a slight limitation of what this index is, and why you, as a community, also need to provide insight into tools like these and metrics like this. Decision-makers look at them and say, upstream provider diversity, we look pretty safe for that, but it's often common knowledge within the technical community that there are these pressure points that need to be addressed.
Okay. So, Internet resilience of Europe.
Europe leads all other regions in the world by a fair margin. Of the top ten countries, nine are from the European region, with Switzerland and Iceland leading all. If we look at Switzerland's profile, we can see they don't have many weaknesses, though we can note their market diversity is lower than that of the previous countries. We do need to take into account that Switzerland's population is a lot lower, so it's always going to be difficult to increase market diversity in less populated countries.
The one interesting outlier I found from this, though, was their DDoS protection is very low and that can be something that can be addressed fairly simply. Similarly, Iceland has many strengths, with IPv6 adoption and again market diversity being among its weakest resilience metrics. It's going to be difficult to see IPv6 increasing there again at low population, low diversity of networks, they have already got plenty of IPv4 I would imagine, but as the rest of the world connects to IPv6, I am sure they'll catch up.
Of note with both countries, they have needed to overcome differing challenges in establishing their prominence in Internet connectivity, one being a landlocked, highly mountainous country, so there is a lot of physical infrastructure involved in overcoming that, and one being an island and highly volcanic. We are looking at how we will be able to bring natural and climate changes into the resilience index too.
The one country I wanted to highlight of Europe's top is Bulgaria, which ranks fifth in Europe and sixth globally, and of course first in Eastern Europe.
This is interesting when considering that, as per the Brookings Report from 2021, Bulgaria still lags behind the rest of Europe regarding adopting digital technologies, so they are right down the end there. Poor digital literacy and skills, low levels of investment and research and development, and incomplete digitalisation of public services are perceived to be holding the country back, and yet they have one of the most resilient Internet ecosystems, and ICT has grown to be the top export in services.
I just read also that last week or the week before the government has committed another €260 million in increasing its broadband infrastructure to rural communities, so they are not sitting on their laurels.
And with that, there's still plenty of room for them to make changes and even lead Internet resilience for the rest of the world. We could imagine to see 5 to 10 kilometre reach within those new investments, but they can start deploying IPv6, which will give greater increases in their security pillar score, as well as DDoS protection, just like Switzerland, and increase their peering, with seven IXPs, which is quite high for, again, the population, which I think for Bulgaria is around 5 to 7 million. Someone can correct me.
If we turn our attention to the countries with the least resilient Internet ecosystems in Europe, we can see there's plenty of opportunity for Europe to increase its overall resilient connectivity. More so, we can highlight the importance of resilience at a country level by looking at the country that has perhaps had its Internet ecosystem come under the most stress in the past 18 months.
So, Ukraine has been a role model for Eastern Europe for many years and has consistently scored relatively high on the four pillars of the IRI. Having a solid base, particularly in its infrastructure, security and market readiness, has helped it maintain its Internet connectivity during the war.
Four metrics that I wanted to draw your attention to are the number of IXPs (it has 27 IXPs in country), its MANRS score, rated at 72%, upstream redundancy and market diversity. All of these have helped insulate it from the targeted attacks that Russia has made on Ukraine's infrastructure.
As you can see here, it has strong local peering fabric. Importantly, this hasn't changed very much, even though 100 ASs have transferred out of the country, most to Russia.
And while Kyivstar is the dominant ISP in the country, multiple other providers provision three-quarters of the remaining traffic. This has been especially important given it has experienced increased latency and decreased throughput over the last 18 to 22 months. Notably, 71% of networks are implementing best current routing security practice, 99% of which have documented their routing announcements in the IRR and 40% in RPKI, so there are good practices happening still.
While we don't yet show exit points in the index, we recognise it as an essential indicator of a country's Internet resilience, to give you some idea of those metrics, we don't yet involve it because we don't have enough data points to be able to show for a certain amount of countries.
Ukraine has no submarine cables and relies purely on terrestrial links with its neighbouring countries, which it thankfully has many neighbouring countries that aren't at war with it.
So, limitations, which I touched on earlier. What I have shown you here is merely a guide, and some of you may have scoffed at some of the data and scores and sources, and you may have your own data and sources that say something totally different. What we are trying to do with this tool is to make these open data sources more readily available to decision-makers, so they can get a sense of the Internet ecosystem at a higher level, and to demonstrate the various aspects of the Internet as well. And I pointed to how we need the technical community as well to make sense of this along the way. By no means do I say this is a source of truth; it does provide a digestible view that is going to help you, and especially the decision-makers, identify weaknesses and hopefully guide research into validating what we see with those gaps.
To this point, we advocate for greater localised data sourcing and sharing, which provides greater resolution on what is really happening at the edge, so more RIPE Atlas, more only probes, go. This is also the first version of the index framework, and we plan to expand it to include new metrics, including how resilient a country's ecosystem is to natural and climate disasters; that is something that is planned for 2024.
To conclude, understanding what is happening upstream and beyond your country's borders is equally as important as knowing your network's health. You may want to know that the networks you are peering with and the routes you are taking are resilient, so you won't be up for sudden costs to purchase capacity on the chance of any mishap. Some of you may also be servicing customers in other countries or considering doing so, and again, knowing the Internet resilience of those countries and the links between them can help you understand the risks and where you might need to invest. Having an insightful national measurement system in place helps validate this sort of data that we and decision-makers are becoming more reliant on to help us address the weaknesses in the ecosystem.
And while it's unfortunate that others have mishaps, you can learn from them and should share your own mishaps and successes, in forums just like these and local NOGs so others can learn from you and them, too.
As the adage goes it's not a case of if but when, but the impact doesn't necessarily have to follow those who have already been impacted.
Finally, if you are interested in keeping abreast of all things about the project, please subscribe to our monthly newsletter. If you want to partner with us, or learn more, contact us through pulse at ISOC.org and I have a link to the methodology for the index so you can understand the sources and relative conclusions associated with getting to the scores.
MOIN RAHMAN: Any questions? Doris, do we have any questions?
DORIS HAUSER: Not currently.
ROBBIE MITCHELL: Easy.
MOIN RAHMAN: Okay, thank you, Robbie, for your presentation.
Now we have a lightning talk and we will call Mariano Scazzariello who is talking about ROSE‑T.
MARIANO SCAZZARIELLO: Good morning, everyone, and thank you for joining this presentation. Today, I will talk about our ongoing project called ROSE‑T in collaboration with others.
So, we all know that routing security is really important nowadays, and the MANRS initiative is playing a key role in making the Internet a safer place.
In this talk, I will mainly focus on the MANRS network operators guidelines, which I briefly summarise in this slide: they are coordination, global validation, anti-spoofing and filtering, and these are the four actions that a network operator must perform to be MANRS compliant.
After reading them I ask, let's assume I am a network operator and I want to ensure that my network is MANRS compliant, how can I verify this?
So, the reality is that there is no automatic or comprehensive tool to verify MANRS compliance, so the validation is done manually by the operators and becomes a cumbersome process, because they have to manually verify the configurations and policies, and this also makes it difficult to replicate the process in case this is needed. So, yes, while ensuring MANRS compliance is crucial, it's not an easy task.
So, to address this problem, in this talk I will briefly present ROSE‑T, which is the first Open Source tool to automatically verify MANRS compliance. Since the time is not enough, I will briefly go inside the process.
So, first, the candidate provides the AS number, and from this we can get the IRR entry and parse the aut‑num objects. For now we rely on manual parsing of the RPSL syntax, but we plan to switch to better services such as IRRd. We also take the latest RIB dump from a route collector, and from this we can filter out the routes originated by the candidate AS by checking the AS path. So now we have a Prolog script that will check if the networks announced to the transits in the wild are the same as the ones declared in the IRR entry and vice versa.
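As an illustration of the check described above, here is a minimal sketch in Python (the talk says the actual check is a Prolog script, so this is not ROSE‑T's code). The function names, the input shapes and the exact‑match comparison are assumptions made for the example; the AS numbers and prefixes are documentation values.

```python
from ipaddress import ip_network


def originated_by(rib_entries, asn):
    """Return the prefixes whose AS path ends in `asn` (the origin is the last AS)."""
    return {prefix for prefix, as_path in rib_entries if as_path and as_path[-1] == asn}


def compare_with_irr(irr_prefixes, announced_prefixes):
    """Compare prefixes declared in the IRR with prefixes seen in the wild.

    Assumption: prefixes must match exactly; covering or more-specific
    prefixes are reported as mismatches.
    """
    declared = {ip_network(p) for p in irr_prefixes}
    announced = {ip_network(p) for p in announced_prefixes}
    return {
        "announced_but_not_declared": sorted(str(n) for n in announced - declared),
        "declared_but_not_announced": sorted(str(n) for n in declared - announced),
    }


# Hypothetical input: (prefix, AS path) pairs taken from a route collector dump.
rib = [
    ("192.0.2.0/24", [64500, 64496]),
    ("198.51.100.0/24", [64500, 64497]),
    ("2001:db8::/32", [64501, 64496]),
]
# Hypothetical route objects registered for AS64496.
irr = ["192.0.2.0/24", "203.0.113.0/24"]

report = compare_with_irr(irr, originated_by(rib, 64496))
```

Both directions are reported because the talk mentions checking "and vice versa": a prefix announced but not declared, and a prefix declared but never announced, are different findings.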
At this point, we can take the vendor configuration that is provided by the candidate, and we need to transform it into an agnostic format that can be easily processed in the rest of the pipeline. So how can we do that? Currently, in ROSE‑T we exploit Batfish, which understands most vendor configurations and transforms them into a unified format, but during the process we noticed that the Batfish output lacks some important information, for example it doesn't support IPv6. So we use the Batfish output as a baseline and we extract the missing information with our custom parser. We use Batfish in this phase of the project, but we plan to switch to a full custom parser made by us.
So, now we have a data structure that holds the important information in an agnostic format, but we don't know the relationships between the ASs inside this configuration, so we rely on the IRR to set the relationships between the autonomous systems, and to complete the network we plug in a dummy OTT, which is connected to all the providers of this small network.
So now again we use the RIB dump to add to each autonomous system the prefixes that are originated by that AS, so this results in a small network that behaves like a realistic one connected to the candidate. In this example we only show how ROSE‑T handles the direct peerings, but we also support multi‑hop peerings from the candidate AS towards other autonomous systems, with quite complex syntax and logic.
At this point we transform this intermediate representation into a runnable network scenario, and to run this scenario we rely on an emulator based on containers. So each autonomous system of this representation is then converted into a router, and for all the routers, except the candidate one, we use FRRouting as the routing suite. As for the candidate router, we use the specific vendor container that is derived from the configuration provided by the candidate, and for now we support Cisco IOS and Junos, but we plan to support others, since the architecture can be extended to do that.
After waiting a bit for the network convergence we can start performing the actual verification. Before that, we add two hosts that will be used to perform the tests. So, for time reasons I will only show how the anti‑spoofing check is performed. For doing that we take the providers and for each one of them we first add a client inside the same autonomous system. We assign both IPv4 and IPv6 addresses to each client, and the challenge here is to choose subnets that are correctly announced and mutually reachable, from the information that we have at our disposal and from the emulated network.
Another challenge is to select a valid random IP address to assign to the Internet host. This address and subnet should not overlap with the other networks announced inside this small network, and once we choose it, this subnet will act as the malicious subnet.
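The subnet selection just described can be sketched with Python's standard `ipaddress` module. This is only an illustration, not ROSE‑T's implementation: the function name, the candidate pool (`198.18.0.0/15`, a range reserved for benchmarking) and the /24 size are assumptions chosen for the example.

```python
import random
from ipaddress import ip_network


def pick_spoofed_subnet(announced, pool="198.18.0.0/15", new_prefix=24, seed=None):
    """Pick a random subnet from `pool` that overlaps none of the announced networks."""
    rng = random.Random(seed)
    used = [ip_network(p) for p in announced]
    candidates = [
        subnet
        for subnet in ip_network(pool).subnets(new_prefix=new_prefix)
        # Only compare networks of the same IP version.
        if not any(u.version == subnet.version and subnet.overlaps(u) for u in used)
    ]
    if not candidates:
        raise ValueError("no free subnet left in the pool")
    return rng.choice(candidates)


# Hypothetical prefixes announced inside the emulated network.
announced = ["198.18.0.0/16", "2001:db8::/32"]
spoofed = pick_spoofed_subnet(announced, seed=1)
```

Using `overlaps` rather than a simple equality test matters here: a candidate that is a more‑specific of an announced prefix would otherwise be accepted even though it is reachable in the emulated network.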
Now we perform the actual check, and to do that we run tcpdump on the Internet host to sniff packets, and we craft an ICMP packet from the candidate client; this packet will have the spoofed IP as the source and the provider client as the destination. We send this packet and two conditions can arise. If the packet is forwarded by the candidate router towards the provider, then we check if the packet is then received by the Internet client. If so, it means that the configuration is not compliant with this action.
In the other case, if the candidate correctly blocks the packet, so no packet is received by the Internet client, then the configuration can be considered compliant. As a double‑check, we also verify that legit packets are correctly forwarded and received by the provider host.
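The decision described in the last two paragraphs can be written down as a small function. This is a hedged sketch rather than the tool's code: the function name, the inputs (source addresses seen in the capture, plus a flag for the legitimate packet) and the extra "inconclusive" outcome are assumptions added for illustration.

```python
from ipaddress import ip_address, ip_network


def antispoofing_verdict(captured_sources, spoofed_subnet, legit_received):
    """Decide the outcome of the anti-spoofing test.

    captured_sources: source addresses seen in the packet capture
    spoofed_subnet:   the subnet chosen to act as the malicious one
    legit_received:   True if the legitimate control packet arrived
    """
    spoofed = ip_network(spoofed_subnet)
    leaked = any(ip_address(src) in spoofed for src in captured_sources)
    if leaked:
        return "non-compliant"   # a spoofed packet made it through the candidate
    if not legit_received:
        return "inconclusive"    # assumption: nothing is forwarded, so the test proves little
    return "compliant"


verdict_leak = antispoofing_verdict(["198.19.7.10"], "198.19.7.0/24", True)
verdict_ok = antispoofing_verdict(["192.0.2.10"], "198.19.7.0/24", True)
verdict_dead = antispoofing_verdict([], "198.19.7.0/24", False)
```

The third branch reflects the double‑check mentioned in the talk: a router that drops everything would also "block" the spoofed packet, so the legitimate packet has to get through before the result is trusted.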
So, to conclude, I briefly presented ROSE‑T, which is the first tool to automatically verify MANRS compliance. ROSE‑T allows network operators to finally test configurations without relying on manual procedures, and this will surely help in the adoption of the MANRS principles if this project becomes widespread, which I hope. Indeed, the work is preliminary and surely there is a lot of future work. The first step is to extend the verification from a single router to multiple routers. Then, we plan to extend the verification to all IXPs and the CDN and Cloud providers. Moreover, the tool was born to verify MANRS, but we also plan to have additional support, for example, to validate the RPKI deployment of the candidate. Another good feature would be to release a certificate that is tangible proof that a certain network is MANRS‑compliant. And lastly, the tool is right now CLI based, but we plan and hope to move to a web UI.
I am at the end, thank you for the attention, and as I said, ROSE‑T is an Open Source tool, so you can check more on the GitHub repository, and any contribution by the community is welcome. Thank you.
MOIN RAHMAN: We have a minute for some questions.
ANDREI ROBACHEVSKY: I would like to congratulate your team, that's a great tool, and we were thinking of something like this. We are developing MANRS+, there is a Working Group called MANRS+, to do more profound checks. I feel this is the last piece of the puzzle, which allows operators to prepare for MANRS and also to verify that their configuration is correct. Thank you very much and I hope we collaborate on that.
MARIANO SCAZZARIELLO: Sure, thank you.
DORIS HAUSER: No questions online.
MOIN RAHMAN: Okay. Thanks for the presentation.
There was a huge outage at AMS‑IX on the 22nd and 23rd of November, and Stavros is going to present on this.
STAVROS KONSTANTARAS: Thank you very much, morning, everyone. I am here on stage to present you not a very nice topic, but I hope you had good night yesterday so at least we can compensate on that.
I had because my voice is not very good.
Okay. I am going to explain you briefly what happened last week in Amsterdam internet exchange where we experienced a big outage and I am going to take you via the technical details of this outage.
As you can see, I am a senior network engineer in the AMS‑IX NOC, which dealt with this case.
I would like to say to you guys that some vendor names will be mentioned here, and that's to make your life easier, to understand what happened and in which part of the platform and network, but it is not to blame anyone; we are not blaming people or vendors here. It's about sharing knowledge with you, the community, and with our colleagues from other IXPs. We had already done seven migrations, putting new machines from the new vendor in the network in such a short period, and everything was super good and smooth, but sometimes things can go wrong.
So, let's go further. A very quick recap of the AMS‑IX platform, which you can also see on our website: it's a spine‑leaf topology in a dual star; it has two MPLS cores and all the others are PE routers. In front of every PE router there is a cross‑connect, we deploy two PE routers, which works as a demarcation point for our customers. The network consists of three generations of equipment: we have some very old Brocade MLXe‑16/32; the second generation is the Extreme SLX 9850, which supports a lot of 100 gig connections; and we recently started adopting the Juniper MX10008 platform, as Juniper is the new vendor for us for the future.
The protocols stack that we are using is not something unusual, we are a little bit behind compared to the industry, we are still using MPLS/VPLS, OSPF protocol to give some signalling on the point‑to‑point interfaces, we use LDP for label distribution and RSVP to provision strict paths between the machines and LACP for aggregating connections of customers but also in the backbone network.
So, what happened on Wednesday afternoon, where everything started right?
So, around 7 o'clock, Amsterdam time, we noticed that customers on the Science Park campus moved automatically from one PE router to another. That happens automatically with us: we have scripts and systems that monitor, and if a machine experiences a failure, all customers are moved automatically by these layer 1 switches to the other PE router. So this is what happened, in an automatic way, and then people immediately started looking: okay, what happened, perhaps a machine crashed, or a line card. We had these issues in the past, we are very experienced in handling them, but soon we realised there was more meat on this case, as customers started reporting loss of traffic, a lot of unstable connections and remote peers. So we saw, okay, that's something bigger than we expected or what we have experienced in the past.
We started investigating, but we quickly found that the issue was not in a particular PE or colocation, because it was all over the platform. And when we were going through the logs we saw lots of RSVP messages, which means the network is not super stable. We had a small clue at that point, however: we saw a lot of unstable link aggregation connections, but that could not help us with the investigation at the time. As you can understand, it was all NOC hands on the call, on the table, everybody mobilised, the crisis button was pushed and we started looking into the case. We immediately engaged Extreme Networks, because we saw that an SLX machine had an issue, so it was the logical step to call Extreme to help us identify the root cause.
We saw a lot of RSVP flapping at that moment. We disconnected a suspicious customer who had connected to the network two hours before, thinking it might be this guy that creates the issue. We disconnected him, but things didn't improve. So we started looking further, and we recognised a known Interop issue that we had at that moment, which we had identified ten days before, and that is between Extreme SLX and Juniper MX.
I will talk about it a bit later. We placed an RSVP policer to cut the excess amount of messages between nodes, which can result in slower convergence of the network, but that helped us to stabilise the network at that moment and made things easier, so we managed to conclude this work around 1 a.m.
So, Thursday morning, as we started investigating what happened the previous afternoon, we started around office time, so 9 o'clock, the whole team. We started fine‑tuning the RSVP rate limiters, and by 9:38 the issue came back. Okay, we started rolling back all the changes we did, but this didn't help at all, the network was still very unstable. At 10:22, in a desperate action, we isolated the Juniper core router, but of course the issue wasn't fixed; again, we didn't have clear visibility of what the real root cause was. At 10:52 a colleague of mine noticed that the suspicious customer of yesterday was connected again to the network, so he disconnected the guy and immediately the network started stabilising, becoming more calm, and the issue was actually almost gone.
And that rang a lot of bells: what is happening here?
So, we started analysing, and I am going to present you, of course in brief, what we saw. This suspicious customer, who was connected to one of our Juniper machines, was sending us LACP packets, but we hadn't configured the port as an aggregation port; it was not clearly communicated between the AMS‑IX NOC and the customer that he wanted to establish a bundle with us. He started sending us LACP packets, and we hadn't configured our side properly. You know, LACP packets have a multicast destination. The machine was receiving the packets with a multicast address and said: I don't have LACP, here you are, I am sending this to the network. So they were flooded to the rest of the network, reaching all the other PE routers of course. This is the whole network in this picture.
You might say: okay guys, don't you have some protections? Yes, we do, on the customer ports, and especially for all the customers that have LACP we have a so‑called LACP protection that actually protects the customers from this kind of situation: they should not get leaked LACP frames. Unfortunately, this didn't work, and the LACP packets that were coming from the suspicious customer leaked to other customers who had the LACP protocol enabled. As a result, because the system ID and all the parameters didn't match, for the normal customers who had LACP aggregations the connections started flapping. LACP was resetting again and again, which means the customers were connecting and disconnecting from the VPLS network, especially those who have a lot of private interconnections and other services. That resulted in a huge RSVP storm towards the core, and it was triggering the Interop bug that I am going to explain to you now.
We had an Interop bug for the last ten days, but it was under control until the moment this case happened. The Interop bug we have is this. For example, you have a simple MPLS network, so this is a diagram, very simple for you to understand, where you have a core that's a normal MPLS router and, let's say, a couple of PEs, and one of your PEs gets disconnected for X, Y, Z reason, it doesn't matter, in this case PE 12. What happens is that the other PEs try to re‑establish the session. We configure strict paths for every MPLS node in our network, so we don't have dynamic paths, the paths are very strict. The PE router needs to follow those strict paths and tries to provision the session with the missing node. However, that node is missing, and in that case the P router sends to the other nodes a path error message where the Path State Removed flag has been set, and all the other PE routers get that packet. So the PE 11 router gets the packet, the path error message, and it removes the state. However, it doesn't respect the retry timer, which is 30 seconds. A couple of milliseconds later it creates a new state for that particular remote node and tries again to provision the LSP. The P router gets the message, sends the Path State Removed again, and it goes on again and again in an endless loop that creates the storm.
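To give a feeling for why ignoring the retry timer turns into a storm, here is a small back‑of‑the‑envelope sketch. The 30‑second timer comes from the talk; the 5 ms retry gap (standing in for "a couple of milliseconds"), the 60‑second window and the count of 100 affected LSPs are assumptions picked only to make the arithmetic concrete.

```python
def signalling_attempts(window_ms, retry_interval_ms, lsp_count=1):
    """Count re-signalling attempts in a time window.

    Each LSP is attempted at t=0 and then once every `retry_interval_ms`
    for as long as the remote node stays unreachable.
    """
    attempts_per_lsp = window_ms // retry_interval_ms + 1
    return attempts_per_lsp * lsp_count


WINDOW_MS = 60_000  # observe one minute

# Retry timer of 30 seconds respected.
with_timer = signalling_attempts(WINDOW_MS, 30_000, lsp_count=100)

# Retry timer ignored: a new attempt every 5 ms (assumed value).
without_timer = signalling_attempts(WINDOW_MS, 5, lsp_count=100)
```

Under these assumptions the first case gives 300 attempts in a minute and the second about 1.2 million, and every attempt also triggers a path error message in return, which is consistent with the RSVP policer described earlier having a stabilising effect.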
The same happens, however, in another case: we also noticed a similar behaviour, again based on this Interop issue, where you have an unstable OSPF network and some of the remote PE routers appear to be reachable via PE 21. That router then sends a path error message to the core router that says: okay, you try to provision LSPs through me, but I am not a core router, I am an edge router, a PE. So it sends it back to the core, again as a path error message, however with a different error, because of the wrong destination loopback in that case. The P router then sends it back to PE 11, which receives the message, removes the state and creates new state immediately, because it doesn't respect the retry timer, and that creates the RSVP storm.
So, we notice both behaviours.
So, in short, what happened in that particular outage: as I mentioned, we had LACP packets leaked into the platform. LACP belongs to a category called slow protocols, and IEEE 802.3 describes how you handle slow protocols. We believe those packets shall be dropped at the port of the router if you don't have LACP enabled, of course, so the routers shall ignore those packets, unless you explicitly allow them to go via your network to reach a remote peer because of a business case. So far in our history we never had such an issue with LACP packets leaking; we noticed that the other machines we used in the past immediately ignore the packets if they don't have LACP configured.
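For readers less familiar with slow protocols, the sketch below shows how such frames can be recognised on the wire and what the port behaviour argued for above would look like. The constants (destination MAC `01:80:C2:00:00:02`, EtherType `0x8809`, subtype `0x01` for LACP) are the IEEE 802.3 values; the function names and the three actions are assumptions made for the example, not any vendor's implementation, and the parser only handles untagged frames.

```python
SLOW_PROTOCOLS_MAC = bytes.fromhex("0180c2000002")  # Slow Protocols multicast address
SLOW_PROTOCOLS_ETHERTYPE = 0x8809
LACP_SUBTYPE = 0x01


def is_slow_protocol(frame: bytes) -> bool:
    """True for an untagged Ethernet frame addressed to the slow protocols group."""
    if len(frame) < 15:  # dst (6) + src (6) + EtherType (2) + subtype (1)
        return False
    ethertype = int.from_bytes(frame[12:14], "big")
    return frame[0:6] == SLOW_PROTOCOLS_MAC and ethertype == SLOW_PROTOCOLS_ETHERTYPE


def port_action(frame: bytes, lacp_enabled: bool) -> str:
    """Return what the ingress port should do with the frame."""
    if not is_slow_protocol(frame):
        return "forward"
    if frame[14] == LACP_SUBTYPE and lacp_enabled:
        return "process"  # consumed locally by the LACP state machine
    return "drop"         # never flood slow protocol frames into the fabric


# Hypothetical frames: an LACPDU and an ordinary IPv4 frame.
src = bytes.fromhex("020000000001")
lacpdu = SLOW_PROTOCOLS_MAC + src + b"\x88\x09" + b"\x01" + bytes(40)
ipv4 = bytes.fromhex("020000000002") + src + b"\x08\x00" + bytes(40)
```

In this sketch a frame counts as a slow protocol frame only when both the destination MAC and the EtherType match; a stricter filter could drop on either condition alone.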
Of course that was not enough. As I said, the LACP protection didn't work as expected, so wherever it was created, it didn't work as expected. We had something similar on Junos; it didn't work, but that was partially our fault because we didn't pay much attention to configuring it strictly, it was a little bit loose on our side. And of course, on top of all that stuff, we had the Interop issue that magnified the whole situation and created the chaos.
Once we understood the situation and the problem, of course, we managed to take immediate measures: we updated all the Junos firewall filters for non‑LACP‑enabled customers to drop such packets. We immediately updated the AMS‑IX systems, so if there is a future port connecting to the network, those changes will be implemented there and those packets will be dropped.
What else? We also started working on some filters that can drop such slow protocols that come from the pseudowires, not only LACP but STP and other protocols, so those should be dropped. That filter will be applied in the VPLS instance; it exists only on Junos, we don't have it in the previous generation of machines.
We are working with both vendors to provide them all the data that are needed in order to solve the case in a more permanent way. If it is a bug it should be fixed, and everybody should be happy that this bug will not exist in the future versions.
And before I thank the people over here for attending my talk, I would like to take 20 seconds of your time because I want to say kudos to the whole NOC team for doing that. We had people leaving the dinner table, stopping feeding the kids, coming to the Zoom calls and starting to work on the case. It was amazing team work, it was very good to see everybody on the call.
Also I am going to say thank you to the people that made this RIPE meeting different for me. As I said to my team on Friday, I am going to do the walk of shame on Monday morning when I go to the RIPE meeting because of this incident. This didn't happen, however. I will still remember Flavio on Monday morning giving me a hug, as a father to a son, because of that, Alexander Azimov for making me ‑‑
Alexander Azimov was telling me stories from his past experience trying to make me happy that this shit happens to everybody. Wolfgang from DE‑CIX, he says go go, go to the presentation, don't think about it, it's your story, shit happens to everybody. He was motivating me and make me feel, he's right, he's right. And at the end, yeah, okay, shit happens but you know we are a community here, everybody of you made me feel better and I want to thank every and each of you that made this RIPE meeting special. Thank you.
MOIN RAHMAN: Lots of questions.
WOLFGANG TREMMEL: From DE‑CIX. Also working in the IXP industry. Thank you for this talk, thank you for being so open. Also, thank you for the communication during the incident, and lots of sympathy, I have been in your place ten years ago when we had a big problem.
STAVROS KONSTANTARAS: Thank you.
GERT DOERING: Not an IXP operator, just a network operator with lots of diverse gear in our network, migrating from this vendor to that one, and three vendors trying to talk to each other: why do we have a BGP loop here all of a sudden?
So, yes, shit like that happens. Thank you very much for bringing forward what happened and explaining it, so others can avoid the same mishap, or if a similar situation happens they can remember: oh shit, it might be LACP, AMS‑IX had that issue, let's look there. So this really helps and it's not something to be ashamed of. Your team did tremendously good work in finding the cause and fixing it and documenting it, so that's something to be proud of.
STAVROS KONSTANTARAS: Thank you, you are right.
SANDER STEFFANN: Hi, 6connect. Yeah, stuff always goes wrong, mistakes are made, and it's not about what goes wrong, it's about how you fix it. So job well done. Like Gert said, presenting it here will help the rest of the community, and I really like how seriously AMS‑IX took it, like I saw the CTO fly out last minute to attend here. You set a great example, thank you very much.
STAVROS KONSTANTARAS: Thank you, Sander.
DORIS HAUSER: There are no comments online, but I would also like to just comment: well done on your side. Your team has obviously made a good effort and has managed to do this, and I am very happy that you are talking about this, because in my eyes it also shows a lot of professionalism if you can own up to your mistakes and say this is how we handled it, and this could have been made better, and this worked fine.
STAVROS KONSTANTARAS: Thank you, Doris.
SPEAKER: I wanted to say everything has been said already, but you have my sympathy, and thank you for sharing the incident report because that's the most important thing, so we can learn from it as a community and make the Internet more stable. Thank you.
STAVROS KONSTANTARAS: Thank you, you are right.
MOIN RAHMAN: I want to add it's actually great that you have been transparent to the community, and you are actually addressing that publicly, coming here and disclosing what you just presented, hats off for that. Any more questions? Okay, thank you, Stavros.
MOIN RAHMAN: The last was about one single incident and we will see how Emile is going to present remotely.
EMILE ABEN: I hope you can hear me.
MOIN RAHMAN: We can.
EMILE ABEN: Yes, I am a data scientist at the RIPE NCC and we looked into this outage. On the one hand, when such an outage happens it makes me feel pretty bad, because you know that maybe 20 kilometres from where I sit there's a team just like Stavros described, and I want to add to the people who commended Stavros for being so open about this. I feel a bit like a vulture looking at such events, but because they are so rare, they are also interesting: something like this doesn't happen very often, and it helps us shed some light on the question of whether the Internet routes around damage. We did a couple of earlier case studies around the very large IXs: AMS‑IX in 2017, DE‑CIX in 2018, LINX in 2021. And actually, two weeks ago I was half jokingly saying to my colleague Hasime, we are up for a large one in 2024, and unfortunately it happened a little bit earlier.
But yeah, what we can actually do with systems like RIPE Atlas is see what an event like that looks like from the outside. I would like to compare that to other fields of research, chemistry and physics, where they shoot lasers through things they want to look at. We don't have laser beams, well, we have lasers through fibre, but if you compare that to what we can do with Atlas, you could see the Atlas probes as sources of lasers, where you point them are the destinations, and the beams themselves are traceroutes. If you just make them go through your infrastructure, in this case it was AMS‑IX but it could also be a large network or another IX, you can actually see how they behave.
So if we calibrate RIPE Atlas and only take source‑destination pairs that reliably go through the IX and reach a destination, so we have end‑to‑end connectivity, and we take that for a single day and apply that to AMS‑IX, we see in the order of 66,000 source‑destination pairs; that's around 7,000 probes and 1,100 destinations. If you then just look at these source‑destination pairs over time around when an event happened, this is what you can see, and I will explain the picture.
What we see here in dark blue are the source‑destination pairs that have end‑to‑end reachability, so we get responses in traceroutes from the destinations, and where we see the infrastructure that we studied, in this case AMS‑IX. So you can see that before the event everything was basically following this pattern. Then, when the events happened, at the first one we actually see a drop: we don't see AMS‑IX for roughly half of the source‑destination pairs, and the other half still had end‑to‑end connectivity. So in as far as we can see, there was routing around damage. If there was a failure in end‑to‑end connectivity, you would have more red there.
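The classification behind the picture can be sketched roughly as follows. This is an illustrative reconstruction from the description in the talk, not the RIPE NCC analysis code: the function names, the traceroute representation (a list of hop addresses, with `None` for unanswered hops) and the peering LAN prefix (a documentation prefix used as a stand‑in) are all assumptions.

```python
from collections import Counter
from ipaddress import ip_address, ip_network


def classify(hops, destination, ixp_lan):
    """Classify one traceroute by end-to-end reachability and IXP visibility."""
    lan = ip_network(ixp_lan)
    reached = bool(hops) and hops[-1] == destination
    via_ixp = any(h is not None and ip_address(h) in lan for h in hops)
    if not reached:
        return "no end-to-end connectivity"
    return "end-to-end, IXP seen" if via_ixp else "end-to-end, IXP not seen"


def summarise(traces, ixp_lan):
    """Count the classes over all source-destination pairs in one time bin."""
    return Counter(classify(hops, dst, ixp_lan) for hops, dst in traces)


IXP_LAN = "203.0.113.0/24"  # placeholder for the real peering LAN prefix

# Hypothetical traceroutes for one time bin: (hops, destination).
traces = [
    (["10.0.0.1", "203.0.113.7", "198.51.100.9"], "198.51.100.9"),
    (["10.0.0.1", "192.0.2.1", "198.51.100.9"], "198.51.100.9"),
    (["10.0.0.1", None, None], "198.51.100.9"),
]
counts = summarise(traces, IXP_LAN)
```

Running `summarise` once per time bin and plotting the three counts over time gives the kind of stacked view described above: pairs still seen through the IXP, pairs that kept connectivity without it, and pairs that lost connectivity.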
At the top of this graph we see a little bit of red, which indicates that we didn't have end‑to‑end connectivity anymore. You can actually see the two distinct events that Stavros described, and you can see a slow shift back. That shift back is probably a combination of people manually reconfiguring, as well as the BGP selection process, where somewhere low down in the selection process you have the age of the route, and after an outage the new route via an exchange is typically not the oldest.
We can also look at the alternatives that we see. The picture here on the right shows the same analysis, the same methodology, but we then look at: did we see other IXPs in the path? And we roughly see an even split between the paths without IXPs, which is the green line in the middle, so that's transit or lateral peering, we cannot really say from this, and three major IXPs. We saw a little bit of other IXPs, but what you see here is NL‑IX, DE‑CIX and LINX taking over, and these are expected alternatives given the sizes and the localities of these IXPs.
An interesting difference between the first and the second event is the huge uptake in NL‑IX here. We can speculate that this was manual reconfiguration, people setting the preferences differently after the second time something happened.
So if we look back at the other events that I mentioned, you can basically see that mostly the Internet routes around damage. Same colours; we see a little bit more red here than in the latest one at AMS‑IX. And I don't know what it is, there are too few data points in this study, and it's probably a function of where we have Atlas probes and who we are measuring. I have to give credit to Malte at IIJ for creating these independently.
And then, another interesting one here is the differences in how fast things return to initial state and again this is like four case studies, so I don't think you can draw any conclusions on that. But I find the differences interesting.
And a bigger question, I think, is: can we do this around other IXPs? Because we see there is resilience around these very large IXPs. Can we also see resilience around others? Hopefully these situations don't happen, but when they happen, can you see the resilience? And this is a prototype that we developed, together with my colleague, where we look at how much diversity around IXPs we have with RIPE Atlas. If you just look at that in terms of the number of ASs with probes within two milliseconds, you can actually see we cover the bigger IXs quite well, so I am confident about how diverse the set is. But there's a lot of IXs; I think for this we took PeeringDB, and there are 648 LANs in there, so there's some work to be done if you also want to measure more IXPs here.
So, take aways:
Well, the first one, as already mentioned: shit happens. This is only human‑made and, as Robbie said, it's not if, it's when. In cases where we see outages at these large IXPs, we see routing around damage. We might need Atlas around smaller IXPs to see similar things in their local context, and this is a measurement study, it's not an Atlas study. We don't know why this is; my guess is that it's the large local ecosystem, and I don't know if it's the same for other locations.
And with that, I would like to open the floor for questions.
MOIN RAHMAN: Thanks, Emile. Any questions? Doris, anything in the queue?
PETER HESSLER: I can add a little bit of comment on why the return to normal from the most recent AMS‑IX outage took longer than possibly expected. It is because the Thursday morning of the second incident was US Thanksgiving, and the day after is Black Friday, which, as everybody knows, is a major shopping event in the US and around the world. So I imagine any large networks based in the US and connected to AMS‑IX decided to either disconnect from or depreference the AMS‑IX platform during the shopping event to minimise disruptions, either for the connectivity they provide to the store vendors, or for their eyeball networks with people buying from the stores, or anything related to that.
EMILE ABEN: I hadn't thought of that, thanks, Peter, good insight.
PETER HESSLER: I think it could be interesting to look at other outages and response times and see if there are any major holidays or other events, and it could also be things like a sporting event, where they don't want to disrupt during the Super Bowl or World Cup or things of that nature.
MOIN RAHMAN: Thanks, Emile, for the wonderful presentation, and that brings the session to an end, so we can go for a coffee break and return for the closing. But before you go to the coffee break, I would request you again to please rate the presentations, and thank you, everyone, for being here.