DNS Working Group
29th November 2023
At 9 a.m.:
JOAO DAMAS: Good morning, good morning, everyone, please take a seat. Welcome to the RIPE DNS Working Group, first session on Wednesday morning. I am glad you could find the side room, it was quite a challenge at times.
I am the outgoing co‑chair of this Working Group, the other two current co‑chairs are sitting here, Moritz and Willem, excellent people to work with.
Just a few words. Up there, you can see the agenda that we will go through today. I would like to thank the RIPE NCC for having Anthony be the scribe for us, and the transcribing people, who will make things easier with some speakers.
As you may know, there was supposed to ‑‑ there has to be a change of Chairs in this meeting. I am outgoing, I was term‑limited, I am done. I have been a RIPE Working Group chair for 20 years in a row, that's plenty enough, someone new should come in. There were three candidates, all of them excellent I think, and there was a request for input and the Chairs were reading through the mailing list. I think everyone will agree there is consensus that Doris Hauser should be the next co‑chair, and she is sitting over there. Thank you very much for volunteering and keeping things going.
With that ‑‑
WILLEM TOOROP: Sorry, I flew over here yesterday and at the airport a very colourfully dressed person with black strokes on his face handed me this and asked me to convey it to you, so here you go. I think you need to read this.
JOAO DAMAS: Out loud?
WILLEM TOOROP: Yes.
JOAO DAMAS: We are going to run over time. Which one goes first? 29 November 2023, the year of Joe sinned ‑‑ so it's anniversary. Has been watching you work now for a while, from what I saw I cannot help but smile because like me you love to hand out presents by putting together programmes with the most wonderful content, during the pandemic organised the Working Group sessions together with some other people. On behalf of our community I am very grateful for the job you have performed so faithful, Moritz and Willem will take good care to follow you up, yours sincerely.
Thank you very much. Excellent. Thank you. I really appreciate that.
First speaker up is Ed, go for it.
EDWARD LEWIS: Good morning, thanks for finding the room, it took me a while to find it myself.
So, I am going to talk about DNSSEC non‑deployment, something that I have been concerned about, and some ideas about what to do about it. I am going to talk about how we got to be here and how we might go forward.
A little perspective, I will start out with this: about 1990, there was a researcher who noticed that the DNS had a vulnerability involving cache poisoning. It was a very vulnerable protocol, it would believe anything it was told, and as a result some wheels were set in motion, and in the early 90s there were contracts and work in the IETF to develop DNSSEC. The basic protocol design happened about the mid‑90s.
On April 1st 1998, there was a small meeting where the person who sponsored all the work at the US Government asked the people in ISC, those of us who were doing the work, how come no one is using it yet, and that has set the direction of DNSSEC deployment for the past 25 years. We haven't gotten very far in deployment. The assumption was operators needed to be told the benefits of DNSSEC: they should want to do this, it's great to have security and it's easy to adopt.
So 25 years I have been watching this stuff, I have lots of measurement data on its non‑deployment, but I started looking at RPKI and I noticed it too seemed to be lacking deployment, at least for the past couple of years; it's picking up a bit now. Maybe the problem hasn't been the operators all along, maybe the operators are doing the right thing here by rejecting DNSSEC, maybe it has some flaws that should be addressed, or maybe future protocols should be written with operators in mind. So I began looking at this problem, this is what I have been doing these last couple of months. In this slide deck I don't have solutions yet because I haven't had enough time to study the operators' side of the problem. I want to talk to them to try to dredge out reasons why DNSSEC, or other technologies, are not deployed in places, what's holding things back.
I have learned quite a bit in the past few months so I am going to try to bring some of that up here.
Now, I start out with deployment numbers as being my kicker, but that isn't the goal. I don't want to see 100 percent; I want a usable protocol, to be useful, to be rolled out and to be put in play all the time, and with that, deployment numbers come afterwards. Now, again, as I am talking, I am in learning mode right now, still starting conversations about how to get out of this.
Now, one of the things ‑‑ the word operators is thrown around a lot. I have this problem too, I think of them as being one kind out there, but there are a bunch: DNS hosting operators, TLD operators, ISP operators, public DNS operators, even individuals who do this for themselves or just for a small group or just their company. So I want people to keep in mind that the word operators has a very far‑flung meaning: it's anyone who touches DNS here, but it's wider for the Internet.
Now operators, I have run into some operators who are operators first, Internet second, where they just run black boxes and don't need to know what's in the box, they want to make sure it works, and we have to respect and understand what they are driven by.
There are some who are experts in the service they deploy, and that was the problem with DNSSEC: the early operators deploying it were people who knew the protocol inside and out.
So, what do operators want?
They have two rules they go by. One is, whatever it is they are running, keep it running and make sure it doesn't go down, keep the capacity up and keep it refreshed. The second rule is, if it ever breaks, come back fast, as quickly as they can. Mean time to repair is a very important parameter. And those are the two things that operators look at in services or protocols or anything they are running.
Now, operators do make changes. A lot of us have been led to believe operators, once they roll something out, will never change it, that change is risk, but in operations land change is necessary: they do tech refreshes, they do look at long term trends out there, they are not going to stick with the same thing forever, they know if they do that it's going to be outmoded at some point. Part of keep it running means long term maintenance, but there are two things about changes that we have to keep in mind, they have two rules: one, a change has to improve things, either increased profit, increased revenue, service or value, or number two, or maybe also number two, decreased cost. So revenue and cost are the two things we are looking at here, benefit and cost. Now, operators are a service, they are not so much interested in how it's done, they are providing something to the outside. They have a customer: the operator's customer is generally the employer, it could be the regulator that's put them into the service, it could be whatever business they are part of. The customers they handle are the people out there that rely on their services, so there are customers here and also an operator, two directions, up and down.
Now, DNS operators: we are in a field that's pretty young, about 25 years of DNS operations, going back to when the first companies were incorporated to run DNS as a service. It's still an evolving area out there, and this is one reason why now is probably the time we are looking back and saying we want to do something. It's about the same time DNSSEC has been out there, so we have a good reason to look back and say we should take a look at what operators really want.
I have been asked over the years about DNSSEC, why it's so hard to operate. It's built on top of DNS, which is no joyride itself, and you all know that. DNS is an old protocol and has lots of quirks to it and it's hard to understand; it's simple in some aspects but it gets very complicated quickly, and DNSSEC just amplifies that.
So, operational reality: the other factor I want to bring up, that we often don't recognise, is that the big thing about operations is the staff of people they hire to do this. They are going to have people at the terminal who come and go; these people may not stay for many years, or they may stay. They have to make training pretty quick, they have to be able to roll over staff pretty quick. So this is going to come down to some of the impacts of what I see later on for protocols. The operators also report to service owners or regulators, that's an important thing. They are going to look to the regulator or the owner of the service more than they are going to look at standards out there; they really want to make sure they make their customers, the people paying their pay cheques, happy. And they have a wide range of environments they exist in, they have so many different inputs to what they do. It's not just the technology of the protocol, which we think is simple; they have to deal with all the regulatory things out there, not giving up the zone file for example, which came up yesterday in a talk.
What's going to make a protocol deployable? I am not exactly sure yet, but I have a few hints. One thing is simplicity, a protocol has got to be simple. We keep churning out protocols which are Swiss army knives, which is one description that has been given to me, where they do everything, but that's not very helpful when you are in a panic situation and need to get this thing done right. You want a very simple tool, because you may not have had time to train the operator who is there when things are happening. Clarity is important, so the situation is clear. Don't keep redefining things, like how records are read; that means you don't know how long ago the software involved was written, or how the data was going back and forth. And one statement that was given to me by an operator that was very interesting was: complexity causes centralisation. As much as people don't like centralised DNS these days, a lot of times only a few can afford to run it correctly any more because the protocol has got quite complex. These are sound bites, things to keep in mind. I am not sure how we are going to put this into action, but these are things I have been running across so far. If you hear these things and you think I am right or wrong, bring it up at the mic or to me afterwards in the discussions. I am really trying to learn and make sure I get a good solid compilation of these lessons.
DNSSEC where did it go wrong? I think the ideals were solid, it wants secure things, I am not going to read all the words, I want to pick up the pace a bit.
In the 1990s we had a good understanding of the protocol, we understood digital signatures, we had a good idea of how to do scaleable key distribution, and somehow all of this has gone horribly wrong.
So, what I looked at, and what I want to spend some time on, is the 1990s network environment; this is what we were solving for when we had DNSSEC on the operating table. I want to go through these individually: host security was weak, that's a big thing; zone administrators ran everything; end‑to‑end networking was still everywhere; network abuse was DoS, not DDoS; and cryptography was a different thing back then, how we thought about it.
We had very weak host security, hosts were knocked over all the time, and because of that there was a rule against ever having private key material on a host attached to the Internet. If you look at the original DNSSEC specifications, we expected someone to sign the zone off‑line and transfer it magically across an air gap to the Internet, where it would be put on a name server and spread out. The first impact is that all of the answers had to be pre‑computed, and this is a term I didn't realise was true until Daniel Karrenberg reminded me last night, pre‑computed: all the answers are pre‑computed, and that was because DNSSEC required that.
Now, pre‑computing answers is great if you have the answer you want to give out there; it's not so great for a negative answer, and because of that we have had NSEC and NSEC3. Those two things came out directly from the fact we had pre‑computed answers. You could ask me a question, and I don't know the questions ahead of time; if I don't have the answer, I have to have a way of proving that to you, and the amount of work is crazy. It's not just a matter of ‑‑ it's a matter ‑‑ how do we do it? First I had to prove to you what I had: here is what I do have, you can see it's not there. We had to arrange things in order, and that causes a lot of problems in the name server: we had to put things in a heap or a tree internally, versus a hash table, and so on. What if we had online keys? We can see the commercial providers actually have online keys, they have been working, no one has had a problem with that yet.
Tailored‑to‑the‑query responses would give us a new kind of negative answer that could be much simpler than NSEC or NSEC3. We could improve the internal storage, hinting at the tree versus hash table data structures.
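[Editorial illustration, not part of the talk: a minimal Python sketch of the pre-computed negative answer described above. The zone contents and function names are hypothetical, and the sort key only approximates DNSSEC canonical ordering by comparing lowercased labels from right to left. It shows why the server has to keep names in sorted order: the proof that a name is absent is the already-signed pair of existing names that surround it.]

```python
from bisect import bisect_right

def canonical_key(name):
    """Approximate canonical order: compare labels right to left, case-insensitively."""
    labels = name.rstrip(".").lower().split(".")
    return tuple(reversed(labels))

def build_nsec_chain(names):
    """Pre-compute (owner, next_owner) pairs over the sorted zone.

    The last name wraps around to the first, closing the chain.
    These pairs are what would be signed offline, ahead of any query.
    """
    ordered = sorted(set(names), key=canonical_key)
    return [(ordered[i], ordered[(i + 1) % len(ordered)]) for i in range(len(ordered))]

def covering_nsec(chain, qname):
    """Return the pre-computed pair whose gap covers qname, or None if qname exists."""
    keys = [canonical_key(owner) for owner, _ in chain]
    q = canonical_key(qname)
    i = bisect_right(keys, q) - 1
    if i >= 0 and keys[i] == q:
        return None  # the name exists, so no denial is needed
    return chain[i]  # i == -1 selects the wrap-around pair at the end of the chain

# Hypothetical zone contents, for illustration only.
zone = ["example.com.", "a.example.com.", "mail.example.com.", "www.example.com."]
chain = build_nsec_chain(zone)
print(covering_nsec(chain, "ftp.example.com."))
```

[The lookup depends on ordered storage, which is the tree versus hash table point made above; an answer signed online for the exact query name would not need the sorted chain.]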
Zone administration was something else that was much different. We assumed whoever ran the zone ran everything; we didn't have the notion of multi‑signer, or multiple providers of DNSSEC for a zone, and that has had a lot of impact also. We were trying to put into DNSSEC the provisioning of DNSSEC, we wanted keys to go up and down the tree between the parent and child, and we didn't realise at the time we didn't have things like provisioning protocols. EPP, for example, was not yet invented when this was going on.
In fact, secure dynamic update was also defined at just about the same time as DNSSEC; we could probably have made use of that for putting keys up and down. So what if DNSSEC came after EPP? We might actually have shoved a lot of the complexity of key management into EPP, or made it more general than it is today, so we could go up and down, not just as a registrar and registry protocol; it could be used between any two end points.
End‑to‑end networking was threatened at the time. There was a big fear about firewalls, they were just coming online. We had a problem with that in my company, where we started signing our responses and they kicked all of our data out for two days and no one knew about it; we went two days without hearing from our sponsor until they called us and we realised DNSSEC was being kicked out. We weren't worried about response size at the time; none of the engineers, we didn't conceive that would be a problem. Today we talk about it all the time. What if we had started on a stream based protocol? We wouldn't worry about this so much, we could have designed for QUIC and so on, but they didn't exist back then.
We also didn't see that DDoS was going to become the big deal. DDoS was helped by DNSSEC, or at least initially the attacks about ten years ago, ten plus years ago: you ask a small question of the DNS and it comes back as a huge whopping amount of data, and that helped amplify all the things out there. We did not concern ourselves with response size even then, not until much more recently when we did the key rolls.
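[Editorial illustration, not part of the talk: the "small question, huge answer" effect can be written as a simple ratio. The byte counts and the attacker's send rate below are made-up round numbers chosen for the arithmetic, not measurements, and the function names are hypothetical.]

```python
def amplification_factor(query_bytes, response_bytes):
    """Ratio of response size to query size for one reflected exchange."""
    if query_bytes <= 0:
        raise ValueError("query_bytes must be positive")
    return response_bytes / query_bytes

def reflected_rate(attacker_bits_per_second, query_bytes, response_bytes):
    """Traffic arriving at the victim when every spoofed query is answered in full."""
    return attacker_bits_per_second * amplification_factor(query_bytes, response_bytes)

# Illustrative sizes only: a 64-byte query that draws a 3200-byte signed response.
factor = amplification_factor(64, 3200)
print(factor)                                  # 50.0
print(reflected_rate(10_000_000, 64, 3200))    # 500000000.0 bits per second
```

[The larger the signed response relative to the query, the larger the factor, which is why response size became an operational concern.]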
Now, code availability for cryptography has changed quite a bit. When we started DNSSEC there were no public libraries available. There was SSLeay; Eric A. Young was the person, and he and another person eventually did OpenSSL in the late part of the 90s, but we didn't have a standard library. We didn't think all resolvers and servers would have the same code for signing and validation. HSMs did not exist in the DNS world at that time either, we had no concept of what they were; they came later, they weren't part of the protocol design, from operators saying we want to have our keys protected. Patents were a big deal: I didn't have access to that code until just about a year before we started going to larger scale distribution with the DNSSEC prototypes. Also, it was enlightening when it came along, because we were aware of multi protocol situations but we had no way to test them, and we found a lot of code paths had gotten it wrong. And there were legal restrictions back then about export, which I will just lay out there.
So, cryptography is one area where, if you went back and looked at it, what could we do about it? What if we knew there were certain protocols that could be widespread? We thought people would pick for each piece of data, depending on what they wanted to do with the data. Pick one and just go with one at a time? We didn't conceive of that in the DNS. We also didn't conceive of the idea that when you roll from one algorithm to another, some special things have to happen. A lot of the protocol elements are silent about algorithm change because we didn't think much about that, the lifecycle of the operator.
What if we ‑‑ nowadays there is a proposal to make use of that, so things are going that way.
So, where do we go from here?
So, this goes back to recognising that there have been a lot of calls for DNSSEC to be pushed aside: is it worth the money and effort to do this stuff? There's been effort to chip away at the issues out there. For example, child‑parent sync right now, upload and download, the proposals out there: it's in operation in some places but not very many, frankly, about five operators who are actually doing it at TLD level. It hasn't gone much beyond there, and when it was asked to be spread to other places there were questions about the completeness of the specification, so we had to go back: what happens if you have a policy situation, and feedback on errors? Multiple back end operators, that's another one. For major consumers, they want to have more than one provider of services because they don't want to depend on any one. The DNS protocol now has so many rules saying everything has got to be coherent across everything, which makes it hard to have more than one provider at a time, and it's hard to change: when you change providers you want to have the incumbent be there for some amount of time and bring on the new one and have that running before you switch. We didn't see that back then.
Switching algorithms, I have mentioned that a couple of times: we didn't have any way of dealing with changing the cryptographic algorithm out there. Right now double signing is the proposal, and that means larger response sizes, things we didn't anticipate in our protocol.
The last two I threw in here are more a recognition of operators. One is figuring out how to avoid DNSSEC mistakes: how many times have you heard someone has rolled out a new zone, and because the new zone is out there the new data gets bottled up? Sometimes it's because you do a tech refresh of the whole pipeline and your policies don't match up; it could be software bugs, where it starts panicking when it doesn't see the right alignment of things. We don't have tools that let us look at the changes between zone 1 and zone 2; we didn't look at that as part of the things which should be considered as operational help here.
And further, recovering from mistakes, the mean time to repair, I don't think we are spending much time on that. We have negative trust anchors, and that's about all we have done in that area. There are operators out there who are afraid of rolling out, because if they make a mistake, can they get out of it? That's rule number two. They can't debug or manage the time out because we haven't spent much time on recovery: how do we turn off DNSSEC when it's gone bad?
A couple of ways to get there, I have some ideas. First of all we have to know where "there" is. I don't have a "there" in mind right now, I am looking for help. Protocol developers need requirements or goals to get there. If you don't have requirements you don't know when to stop working; when we hit the requirements we can stop working on this. There is one path, looking at things we have out there right now, and my argument against it is that I get concerned when we take a record and change its definition and roll out new software to understand the new definition. Now the operator gets confused, because was that record written by an old speaker or a new one, and what was their intent? It's not clear when you change the definition of records out there; even though it's easy to parse it, it may not be that easy for operators to handle this.
Now, creating new records and new code paths are things that we say will take forever to roll out, but I don't know that that's any slower ‑‑ it's a better way to avoid confusion because it's all brand new, you know what's going on out there. We need to know how to transition to the future, and DNSSEC doesn't have a rugged transition mechanism. NSEC3 was added a long time ago through the use of new algorithm numbers in the keys out there, if you remember that bit of history. That was the great algorithm ‑‑ sorry, the DNSSEC security algorithm roll at one point; that's why we had algorithms 5 and 7, which I think are the same cryptography but a different meaning for NSEC records.
So, a consideration: the IETF delivers documents which describe protocols, the RFCs out there. We have to remember the RFCs are good for software developers and not necessarily helpful to operators. I had one operator say to me that a 70 page document is no help to the operations community; they are not going to write the code, but they want to know how to run what's out there, what the default values are, which security algorithm should I use? When I talk about it I have shared a pie chart showing which ones are popular; I can tell you what other operators do. What this points to is that we don't have operational profiles. There's no DNSSEC operational profile, a guide saying: yes, we have a special case for this, everything over here, but which should you actually stick to, which are the best default values? We rely on software developers to put those in their Open Source software.
Perhaps there's a need for a series out there for this, for DNSSEC. And one thing I didn't have here: I have realised that in most companies we have project managers. They are the folks that put a project together, looking at the technologists at one end and bringing everybody else involved in the process together. We don't have that in the Internet, we don't have a project manager for the Internet, there's no one running this thing. That's how the Internet is built, and it makes it difficult to coordinate long term transitions from one thing to the other.
So, with that, I have run out of slides. But I am still up here. That's the contact information.
I am going to open up now, if the Chairs permit for some questions and comments and if we want to make up time because Joe had to read all that congratulatory material.
SPEAKER: Rob Carolina with ISC, this is going to be a personal question, but as I am listening to your very good, very interesting talk about the history of DNSSEC, I am sitting here wondering what problem are you trying to solve with it or what problem are we trying to solve with it? And the reason I'm wondering that is we live in an era where I guess something like more than 95% of online traffic is going over TLS, so if the user community already has an authentication solution at the application layer, how many operators are going to want to take the significant investment in people, equipment, hassle, reduced response times etc., to deploy another authentication protocol? What is this adding or what problem are we now ‑‑ I know what problem you were trying to solve in the 1990s, I am trying to figure out what problem are we trying to solve now?
EDWARD LEWIS: In terms of what DNSSEC should be solving, I will get to that first: we want to have a secure computing base, ultimately. You would like to have the DNS be secure for whatever the upper protocols are. There are those that want to use things like DANE, not tied to the web; they want to be able to secure the DNS that way. I hear promoters of other services wanting to have secure answers coming out of the ‑‑ whether it's for the web or not for the web.
SPEAKER: To follow up, I don't want to belabour it too much. That's a good explanation of what some members of the technological community would like to achieve. What I am not hearing is what unfulfilled need of the end user would be met by this additional layer of security. What's the huge desire ‑‑ I don't, I don't understand what's trying to drive people.
EDWARD LEWIS: I will give you a short answer. DNSSEC is like a seatbelt in a car to me: most people don't use them because they want to, but they will save your life if there's an accident. DNSSEC is not going to have an end user value that's obvious. We can talk about this more, but I am mindful of time.
JOAO DAMAS: Next up is Lars‑Johan Liman.
LARS‑JOHAN LIMAN: Thank you, Ed, this was a very good and interesting presentation. I am sorry I am not with you, I got a stroke of Covid; I am recovered now, but it doesn't make sense to go down to Rome this late in the process. I think much of what you say is very spot on. I would like to caution a little about involving EPP in the DNSSEC circus, because I think that's adding another Swiss army knife to the one we already have, and that's not going to simplify things. And what I ‑‑ a gist of what I hear you ‑‑ the mean time to repair is, I believe, a very important part of what you are looking at here, but in order to repair something you need to understand the thing that you are going to repair, so we do have a large challenge with education. The other approach to that is to make it more black box, and I will compare, which I sometimes do, with motor cars: in the 1920s you had to carry your own toolbox to repair your car, but today, if your car breaks down, you just call for service and someone will come and rescue you on the highway. So maybe that's the future for the DNS as well, we will have a number of, you know, tow‑trucks that are ‑‑ the problem is, how long does it take the tow‑truck to arrive at the point where the DNS service is needed?
Just a prophecy for future. But you cannot repair something that you don't understand how it works so this is a challenge and a balance that needs to be struck. Thank you.
JOAO DAMAS: Thank you. Peter, if you can be brief, that would be nice. If people are taking time to switch, let's go with Jim.
JIM REID: Freelance consultant. I think you have given a good description of some of the bad decisions and things we didn't think out properly when the protocol was being developed over so many years, but I think there's a much more fundamental problem we have got here: the economic incentives for deploying DNSSEC are completely perverse. You get no benefit from signing your zones, everybody else does. You get no benefit from validating, everybody else does. So I think that the business case for deploying it is very, very hard to make, because you have got costs, you have got no clear benefits, and in some cases you are adding new risks to your enterprise which wouldn't otherwise be there. I think that's one of the major failures of DNSSEC, and I think that's probably why we see we have only got 2 or 3 of the busiest Internet websites sitting behind secure DNS, and that's not changed a lot in the last ten years or so. It would be good to try and get some information from the operators of things like CNN.com and Amazon to find out why they have not deployed DNSSEC when it's been around for so long. The second consideration, in IETF terms: we have got all these tweaks going on, and let's make another little tweak, maybe that's the thing that will drive deployment, but in actual fact, in my view, that is hindering deployment, because a lot of people will look at this and think this thing still isn't quite stable enough, so let's not make a decision about deploying it yet.
EDWARD LEWIS: There's no business case for using a seat belt either. What I am trying to do is tease out whether or not this protocol could have been an easier decision to deploy, technology‑wise, rather than to try and answer the business case.
JIM REID: Cheers Ed
PETER HESSLER: I have been running DNS for myself for 20ish years and I have been working in operations for most of my career, often for smaller companies that have very few resources, and in many companies I have been the only person who knows what a computer is, let alone can do anything with DNS. One of the issues that I have seen many, many times is that my zones are entirely in text files, they are served by BIND or NSD or pick your favourite, and they are static zones, they don't change very often, maybe once every two or three years. So doing DNSSEC on top of that is quite complicated, because you don't have a database to keep your new DNSSEC information in and keep it signed. Also, in a lot of places your authoritative servers are often, even somewhat recently, are ‑‑ a zone file on each authoritative server that you have added in manually, and of course doing DNSSEC there is painful.
The other issue is that there is often zero access to the registrar for me. That is owned by the CFO; the CFO doesn't understand what DNSSEC is and doesn't care. They signed up for it once and they pay the bill every year and that's it. So I want to add this operator perspective.
EDWARD LEWIS: Thanks for that. That's very true. Going back 20 years ago none of that existed, and that's why we have the issues today, and let's be aware of why we have this. We have actually come up with an entire way to run the DNS based on what we thought DNS was in 2000, and I think we need to recognise the scaling from your experience to the major players, put it that way. I think we have to come up with a scaleable way to do that.
JOAO DAMAS: I think these are all good points, Robert's and other people's. If you remember, what we have now is the second version of DNSSEC, because the first one was theoretically perfect and then completely undeployable, even for us technical people.
EDWARD LEWIS: Right. Yeah. Actually, the current version is the third version.
JOAO DAMAS: Okay.
EDWARD LEWIS: That's for trivia buffs only.
JOAO DAMAS: Thank you very much. So, I guess as a sort of follow‑up to this one, we will have Dave come up with a new idea for how to change DNS.
DAVE LAWRENCE: I have a couple of responses for both Robert and Jim, but I will not use up my time at the microphone on that. As Joe just said, if you hear people referring to Tale, and you don't know me already, that's me, that's my nickname. It is lovely to see you all this morning.
So, legend has it that the Internet is built on the back of a turtle, and when you are trying to resolve a domain name, that turtle will give you the name of another one down in the stack to ask for more information about it, and it will refer you to another name, and it's turtles all the way down. What if it could tell you something more than the name it's giving you, that in fact it's not another turtle you have to talk to but a tiger, something else to resolve your domain name? This is where the new proposed DELEG record comes in. That's it, and thank you for this talk!
I want to go back though to clarify something about the metaphor. I wasn't particularly fond of what the turtle versus tiger implied, because in fact the domain name system has been tremendously successful, fast, efficient, highly resilient. If I built a metaphor around cockroaches it would probably not have gone as well.
Brief history, how we got here: Peter had suggested at the last IETF hackathon that we get together and discuss what our big wishes were for the DNS. If we were unconstrained by the current protocol, how could we move forward with addressing everything from niggling issues, like if it should only have one TTL, to bigger things like how can we scale, transport, be able to handle millions of zones?
So, you may have heard your business executives calling this sort of thing a BHAG. If we are going to get anyone to talk this new way of imagining doing things, how could we get them over there? For something like that to succeed you can have something like a flag day, or a completely separate parallel protocol that you expect people to be able to switch over to through serendipity, but we needed a way to get the new stuff out there without disrupting the legacy DNS, in a way that resolvers could kind of serendipitously discover it on their own without having to go and find it.
That's how we came up with DELEG. It turns out somebody else had already come up with it: Tim April had posted a proposal for how we could do this, essentially leveraging the service binding record, which was in the process of being standardised at the time and was clearly on track to getting its RFC. He wanted essentially to take the ideas behind service binding and turn them into a way to have additional information at the delegation that could indicate the authoritative server that you are being directed to. So this is just an example, because we are still actually working on the draft to get it out to DNSOP, but this is the basic idea of how a DELEG record would work. We will skip over that one in the record at the start of the RDATA, but it gives the name of another server you are supposed to talk to, and you can get that address in a number of different ways, including, as you would see here, how it could use the glue in name.
So, some of the very important features of DELEG:
Top of the list is opportunistic discovery. As you can see in that previous sample output, we expect the DELEG record to be handed back right in the normal delegation, with the NS record and DS record, so a modern resolver would be able to just use that information. All the resolvers that don't understand it are believed to, and I will get back to that "believed to", just ignore that extra unknown record type. That's what makes it transparent to legacy resolvers. It's also extensible with key‑value pairs: the service binding record format has these service parameter keys and values in the RDATA. It's also a parent‑side‑only record, and this was a key feature both of Tim April's original proposal and also, I want to mention, of a draft by Joe Abley, who had been thinking about very similar issues, for making a parent‑side‑only NS record that retained all the semantics of the original.
We are also pretty confident this would have minimal implementation needs for authoritative servers, because there's not going to be any special processing; all you have to do is accept the new RR type in your authoritative zone.
And one of the other big features if you are familiar with the service binding record it has something called AliasMode which allows you to say I want you to look at this other record that describes all the attributes you need to know in order to process your ongoing resolution and handling of this.
Finally, we would also continue to allow legacy DNS and sub‑delegations, say you did get diverted from a turtle to a tiger, you could still come back to a turtle further down in the hierarchy.
The indirection piece is that special priority of zero, which was one in the previous example. There is still some debate going on about whether we even need priorities, but at the moment we are using the same format as the service binding record, where 0 is a special value that says: if you are going to be resolving this name, I want you to go ahead and find out the information from this other name. Notably, this allows some of the operator indirection problems to be solved. Currently many commercial operators, for example, rely on customers having to interact with their registrar in order to get new DS data published in the registry, so one of the big things is that we'd be able to leverage this to continue to build your chain of trust but using the operator's DS, so they could be in charge of one DS.
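To make the priority‑zero indirection concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `DelegRecord` class, the `resolve_delegation` function, the hop limit and all the names in the sample data are hypothetical, and nothing here reflects the actual record format, which the talk says is still being drafted.

```python
from dataclasses import dataclass, field


@dataclass
class DelegRecord:
    """Hypothetical in-memory model of a DELEG-style record (not a wire format)."""
    priority: int   # 0 = "go and look at this other name", >0 = usable server entry
    target: str     # name to follow (priority 0) or server name (priority > 0)
    params: dict = field(default_factory=dict)  # SVCB-style key/value pairs


def resolve_delegation(owner, records_by_name, max_hops=8):
    """Follow priority-0 indirections until ordinary server entries are found.

    max_hops is an assumed safety limit so that alias loops terminate.
    """
    name = owner
    for _ in range(max_hops):
        records = records_by_name.get(name, [])
        aliases = [r for r in records if r.priority == 0]
        if not aliases:
            # No indirection left: return server entries, lowest priority first.
            return sorted(records, key=lambda r: r.priority)
        name = aliases[0].target
    raise RuntimeError("too many indirections while resolving " + owner)


# Illustrative data: the customer zone points at a name managed by its operator.
SAMPLE = {
    "example.com.": [DelegRecord(0, "config.operator.example.")],
    "config.operator.example.": [
        DelegRecord(2, "ns2.operator.example."),
        DelegRecord(1, "ns1.operator.example.", {"alpn": ["dot"]}),
    ],
}

servers = resolve_delegation("example.com.", SAMPLE)
```

The point of the sketch is that the customer's delegation only names the operator's record; the operator can then change servers or parameters at that one name without the customer going back to the registrar.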
Notably, it also allows for alternative transports. Say you are an operator and now you want to support DoT or DoQ. Right now, there's no good way for resolvers to know that they can talk to your servers that way. The resolver doesn't know how to get to them, so there are a couple of ways this is currently done: one is through additional look‑ups for DNS service binding records, but usually it's been done through out‑of‑band configuration. For example, famously, Firefox had a list of servers that it knew it could talk DoH to. Here, instead, we can say: we want you to use this underlying transport type, identified by its ALPN, application‑layer protocol negotiation, and here is the information you are going to need to know in order to connect to it.
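As a rough illustration of how a resolver might act on such a transport hint, here is a short Python sketch. The `pick_transport` function and the preference list are hypothetical, "do53" is used only as a placeholder label for legacy DNS (it is not an ALPN identifier), and returning `None` when nothing matches is an assumption, since the talk does not say what should happen in that case.

```python
# Transports this hypothetical resolver can speak, most preferred first.
RESOLVER_PREFERENCE = ["doq", "dot", "do53"]


def pick_transport(params, preference=RESOLVER_PREFERENCE):
    """Choose a transport from the ALPN list advertised in a delegation.

    params is an SVCB-style dict; when it has no "alpn" key, legacy DNS is
    assumed, mirroring the "not explicit, just assumed" behaviour in the talk.
    """
    advertised = params.get("alpn", ["do53"])
    for transport in preference:
        if transport in advertised:
            return transport
    return None  # nothing in common; fallback behaviour is left open here


choice_encrypted = pick_transport({"alpn": ["dot", "doq"]})
choice_legacy = pick_transport({})
choice_unknown = pick_transport({"alpn": ["h3"]})
```

The resolver's own preference order decides between the advertised options, which is one possible design; a priority carried in the record could equally drive the choice.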
So, one of the other problems we have had for a long time is that we have specified these other protocols and seen ways that they could be pretty effective for client‑to‑resolver communications, but we have had a harder time securing the resolver‑to‑authoritative path, and this would be a way to address that.
So, there were a lot of ideas in the BHAG list that we first brought to the hackathon. One of the big reasons this idea of getting unshackled from all the existing constraints is starting to succeed now is that we had a critical mass of people. Even though a couple of people had had ideas before, we got enough people sitting down at a table who said we really should be able to do this, so there was a lot of excitement and energy around that. So now, if you had DELEG deployed in the network and a way to leverage it, you could, in this example, say "talk DNS protocol version 2", and you might have a brand new wire format, or better synchronisation for very large zones. And one of the big things I want to say to Ed and to Robert, separate from some of the other comments: imagine a fully secured DNS push. You could have example.net tell you about names in example.com by sending out all the records you need when you resolve www.example.net: "I bet you want the images host name too, because you are visiting that website". And you have a push where all the DNSSEC‑signed records are fully verifiable and come right from your DNS server, which knows the usage patterns and what to expect.
So, the proposal is not out just yet; it is being actively worked on. As I said, IETF just happened a couple of weeks ago, and unfortunately one of the co‑authors got back and was sick, and then American Thanksgiving happened, but we are close. Currently it is split up into three separate documents: the core definition, which basically covers that very first record; the transport draft, which would be the one that specifies how to use DoT and DoQ and other ALPNs; and the DNSSEC draft, which would be the one that describes how you can maintain your security through an operator's DS. We are also going to need an EPP draft because, as mentioned, in order to get acceptable deployment levels the parents have to implement it, and so it's going to fall on registrars and registries to accept the data for publication in the zone.
As I said, there was a lot of energy about it. I think we finally hit that critical mass and it's going to move this idea forward. Many people have observed that we have talked about DNS v2 for 25 years and it never really got off the ground, and I think this is providing the path. If you look through that list you will see registries represented, registrars, implementers; people from across the DNS community are really interested. What I found really encouraging is that as I brought up this idea to people outside the immediate community, web folks, they also said this could be really helpful to us.
So, I don't know what's about to happen here, but I will say that, at least walking around the halls of IETF, there was a lot of positive feedback: please pursue this, this sounds like it's going to be for the betterment of all. We do need to test and discuss things. We know what is true about a couple of items, or rather we believe it to be true based on implementer assertions, but we haven't tested them, and some implementations were not represented among those people.
And also, if this DELEG record is in place and specifies, say, an alternative DNSSEC path, how would existing forwarders and validators handle it? Right now, because of the way transport got split off into a separate document, Do53, that is, legacy DNS over conventional UDP, is not explicitly indicated in the DELEG record; it is just assumed if you don't have another transport mechanism specified. There's still some ongoing debate about whether it should be explicit. Based on our testing, would there be any conditions under which we would return DELEG or not? Do we return it unconditionally? That doesn't seem wise, especially for any resolvers that are still using the 512‑byte UDP packet limit. So maybe we return it only if we are pretty sure we are talking to a resolver modern enough that it asks for an EDNS buffer size that's a little bit bigger. Then of course we'd have other conditions under which DELEG probably should always be returned, even despite those conditions.
Another topic of ongoing internal debate has been whether we allow side delegations or not: you ask over here, but then example gives you a delegation to another server. You wouldn't like that, but it could be a way to offload if you are under one of those TLDs, and some are notorious. If you really wanted people to be able to use your DoT server, how would you tell them to get to it if you were in one of these TLDs? I am one of the advocates for having a way to side load; there is some argument that maybe this is not the best strategic choice right now. We are going to have our usual bikeshedding going on. Personally, I like blue.
Oh, so, when I made my slides in Google Slides I had marked this one as skip, but apparently when you export to PDF it shows up anyway, so as long as we are here I might as well say it. I have some young kids at home and we like to play the why game sometimes, and you can see that the why game usually ends in nonsense because you just end up on a path of whys that can't go anywhere else. But this was still kind of true, and this is why I wrote this slide originally: so why are we doing this? Because we are trying to make the Internet work better for everybody. One of the things I again want to say about the success of the DNS is that many of the people here in the room have worked on it for years to make it the fast, efficient, resilient system that it is, and hundreds of millions of people every day use the work that you do. And all we are trying to do with DELEG now is kind of pump it up to the next step, to say how can we keep growing on that path to make it even better for people. The jury is still out on whether the whole Internet thing was a good idea or not, but here we are.
Questions and comments?
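The open question about when an authoritative server should include DELEG in a referral can be sketched as a small decision function. This Python example is purely illustrative: `should_include_deleg` and its parameters are hypothetical, the "must still fit in the advertised buffer" check is an added assumption, and the talk explicitly leaves the real conditions undecided.

```python
CLASSIC_UDP_LIMIT = 512  # bytes; the limit for DNS over UDP without EDNS


def should_include_deleg(edns_buffer_size, referral_size, deleg_size):
    """Decide whether to add DELEG to a referral sent over UDP.

    edns_buffer_size is None when the query carried no EDNS OPT record.
    referral_size is the size of the referral without DELEG, in bytes.
    """
    if edns_buffer_size is None or edns_buffer_size <= CLASSIC_UDP_LIMIT:
        # Looks like a legacy resolver: keep the referral as it always was.
        return False
    # Assumed extra condition: only add DELEG if the response still fits.
    return referral_size + deleg_size <= edns_buffer_size


no_edns = should_include_deleg(None, 200, 100)
small_buffer = should_include_deleg(512, 200, 100)
modern = should_include_deleg(1232, 200, 100)
too_big = should_include_deleg(1232, 1200, 100)
```

A real implementation would also need the "always return it" cases mentioned in the talk, which are not modelled here.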
JOAO DAMAS: Thank you. I think there is a question online.
MORITZ MULLER: Niall O'Reilly asks: "Is this a better idea than the MX records we took so much trouble to chase out of the TLD zones?"
DAVE LAWRENCE: Which records?
JOAO DAMAS: Years ago people used to put MX records in the TLD zones and that made for a lot of headaches.
DAVE LAWRENCE: Right, right. So again, as I mentioned, one of the big things that came up in the big wish list from the brainstorming was that we wanted to have separate NS records, a different parent‑child relationship. The same record, the same type of RRset, being in both parent and child led to a lot of problems, as with these other situations. DELEG, by being only a parent‑side record like DS, should avoid a lot of those problems.
JOAO DAMAS: I see the need for this, I think that it is good, but we were just discussing Swiss army knives, and there are other records that have become part of the DNS recently that also seem like a little bit of a Swiss army knife: you can configure so many parameters, it can mean different things, and I wonder if that's a consideration you need to keep in mind?
DAVE LAWRENCE: It is. One of the slightly limiting factors on that is that it's not completely arbitrary. There is a service binding parameters registry which indicates what the values are, and the service binding document also specifies how those parameters are meant to be interpreted. So it's a numeric registry and has a text representation, but it's limited ‑‑ it's not that anybody can throw anything in there.
JOAO DAMAS: Absolutely.
Erik. I was shocked about the ADD stuff, right? Two questions, and basically more curiosity because I like the idea. One: it seems such a big effort, does it deserve a BoF and a Working Group in the IETF, do you think?
DAVE LAWRENCE: So our original belief is, and we will see, but we have talked to the three co‑chairs of DNSOP already and they are well aware of this. In fact, the one thing I didn't link yet, because we are hoping to move the discussion, is that we have a channel on the DNS Mattermost that has four or five dozen people in it at this point, from all across the DNS discipline. But in talking with Suzanne, Tim and Benno, the initial sentiment, which is not completely solidified, is that there's a decent chance we could get DELEG through DNSOP, but the much bigger job of pursuing the BHAGs would warrant a BoF. There are already some people working on alternative wire formats, but I think DNSOP has a lot on its plate, and this is a big enough effort that we are going to be looking for a separate Working Group for it.
SPEAKER: When you give out information to the AAAA records and DELEG what will happen if the information given is different?
DAVE LAWRENCE: So, that's actually one of the considerations that's considered a feature. For example, you could give DoT addresses that are completely separate from your legacy DNS addresses; the operator can build separate infrastructure for serving them, so the glue records that might come through the regular legacy delegation are not necessarily expected to be consistent with those. There's a separate question: if you did provide offloading in the child and that record got desynchronised from the parent, is that an issue, because it has been an issue with NS record sets? My position is it doesn't really matter, right? But that's a discussion, an argument to take place further in DNSOP or wherever.
JOAO DAMAS: Thank you very much. Shane, on the resolver task force, right?
SHANE KERR: Good morning, everyone. I think we are a little behind time so I will get right in. I work at IBM, I am the chair of the RIPE DNS resolver recommendations task force and this is an update about the work that we have been doing for the past year or so.
So, very brief history about where this came from and why we are doing it. Public resolvers like Google, Quad9 and Cloudflare were getting more popular a few years ago, with more people using them for their DNS. All of these large services were run by companies based outside of Europe, outside of the EU. This is probably a little bit worrying for regulators and government officials: if you have critical infrastructure which is outside of the control of your regulation, it can be worrying. Maybe everything is fine, but if it's not then it's not.
So, what did the EU decide to do about this? In very EU fashion they decided to spend some money on a business, and what they did is they made an RFP, a request for proposals, for someone to run a public resolver based in the EU, following all the EU things that we like, such as privacy protections and respect for filtering and all these kinds of things.
So, the RIPE community got wind of all this and decided we didn't want anything to do with any of that stuff but we do have a role and important place to play in this whole idea and so Joel agitated to create a task force and the basic idea is, what we can do is come up with a framework and come up with a list of recommendations for what we consider the right way to run a DNS resolver and then once we have that, we can then encourage people to take this and set up either public resolvers, new ones or ‑‑ I hope that wasn't me ‑‑ set up new public resolvers or evaluate existing ones based on this or set up resolvers in a smaller scale which is probably even better. Go back to the days when every ISP had a resolver and let's keep doing that but have it run in a way that provides a service that we would like people to have.
Anyway, that was the history and why we created the task force and what our goals were, so what I am going to do now is go through our draft document, which I sent to a few of the mailing lists, the DNS Working Group is the one where the discussion should happen and feedback should happen.
So we have a draft and I am going to go over some of the contents of that draft briefly here and talk about where we are at with it and what we want to do.
Here is an example from the current draft, so this is how it would look, basically. We have a section title which says what it's about, packet fragmentation avoidance, and we have a recommendation which we tried to make very clear; I put it in bold so hopefully you can't miss it: resolvers should be configured to avoid fragmentation, which sounds good. We have a brief description of who that recommendation is for. Within the document we target people running not only public resolvers but any resolver, so some of the recommendations don't make sense for everyone; that's why we specify that.
We have a very brief discussion about the issue in general. In this case it talks about the dangers of fragmentation, that it can cause reduced service and all kinds of problems, and very briefly what you can do to minimise those problems, and it gives a reference to a document which goes into a lot more detail. The idea is that an operator can look at this recommendation and say, okay, I need to check out how my server is configured with respect to this.
As you see, in this case the reference is a draft; we are not even pointing to a best current practice document or anything like that. That's because we are trying to represent the state‑of‑the‑art, what current operators really consider their best thinking on this stuff.
So here is a different example, and this is a little bit less focused, because we don't have nice bullet points saying who it is for. In this case the recommendation is about something that's not a very direct technical thing that you can set: it's talking about transparency, and in this specific context it's about transparency reports. This is kind of an industry‑wide best practice, I guess about a decade or so old now, of explaining, in a public way, what interactions you had with law enforcement and things like that. RIPE NCC, for example, publishes a report every year about all the requests that they have gotten for legal actions and things like that. It's a really good idea, and it's in the document, but it's a different kind of recommendation because it's a different kind of thing.
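One common way to reason about a fragmentation‑safe EDNS buffer size is simple header arithmetic, sketched below in Python. This is a worked illustration, not a value taken from the task force draft: the function name is hypothetical, and the 1232‑byte figure is the one widely cited in operator guidance, which follows from the IPv6 minimum MTU of 1280 bytes.

```python
IPV6_MIN_MTU = 1280   # every IPv6 link must carry packets of at least this size
IPV6_HEADER = 40      # fixed IPv6 header, bytes
IPV4_HEADER = 20      # IPv4 header without options, bytes
UDP_HEADER = 8        # UDP header, bytes


def max_unfragmented_dns_payload(path_mtu, ip_header=IPV6_HEADER):
    """Largest DNS message that fits in one UDP datagram on the given path MTU."""
    return path_mtu - ip_header - UDP_HEADER


conservative = max_unfragmented_dns_payload(IPV6_MIN_MTU)    # 1280 - 40 - 8
ethernet_v6 = max_unfragmented_dns_payload(1500)             # 1500 - 40 - 8
ethernet_v4 = max_unfragmented_dns_payload(1500, IPV4_HEADER)  # 1500 - 20 - 8
```

Advertising the conservative value works on any IPv6 path, while the larger Ethernet values assume nothing on the path (tunnels, for example) reduces the MTU.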
What are the details about the document. Well, we have got a bit of framing text to help explain what the document is about for people that read it. The idea being that the document itself is kind of self‑contained and it might get copied around so I thought it would be ‑‑ so it's clear for people getting the document, is this something I need to worry about, is it a matter for me, things like that. We explain who it's for and so on.
It's not for everyone. If you are running an authoritative server, and at my company we now spend most of our time running authoritative DNS, these recommendations don't really matter for us; that needs to be clear as well, and that is in the document. And, as I will hopefully talk about at the end of my presentation, it's not a simple checklist. Even that first recommendation that I showed basically has a lot of flexibility in how you can do it, right? It could have been very simple, just saying you must set the EDNS buffer size to 1280 bytes or whatever, 1,400 and something; anyway, we don't do that, we give a bit more leeway. So you can't run a scanner on a service and say "have you met these requirements?" That wasn't the goal, that wasn't the intention; we decided it was better to have a little more flexibility and move that stuff out of scope.
That was the introduction and explanation text, and then we actually get into the specific recommendations. I am going to go through each of those in great detail and I will be taking questions later. I am kidding. I am going to show you the list just to give a general idea. The first section is about how you set up the systems and the networks: how you make sure that you can handle the load, how you make sure that your system doesn't go down, how you deal with security problems, whether certifications are interesting for you, things like that.
And then we have a whole bunch of very specific, I call them DNS knobs, that you can tweak. DNS is a very old protocol, there's a whole bunch of different things you can do, and people have tried to improve it in a lot of ways, so we have a lot of things here. As I mentioned before, the fragmentation item points to a draft. Other things point to documentation which never went far in standards‑making organisations at all but is quite commonly deployed, and things like that. It's all in there.
And everything is up for discussion but we think that things are pretty okay.
We also have ‑‑ we finally ended up with discussions about privacy, about logging, and in this case it's not the technical aspects of which port do I set my logging server up on and TLS; it's how long do I keep the logs, what information do I include, how do I comply with local regulations and things like that. So that's the context of that. There is a similar discussion for filtering and blocking. Basically, for these we try to move in the direction of user‑focused recommendations: don't filter unless you have to, don't log unless you have to, and things like that. And if you do have to filter, try to make it transparent, try to make it so users can opt out. You may not have those abilities, laws may not allow it, but if possible we encourage you to do that. We finally end up with a discussion about human rights considerations.
Yeah. So that's the stuff that's in there. It's a lot of text, though not a huge, huge amount. If you are interested in this stuff I encourage you to have a look, and if you think you are knowledgeable, I also encourage you. We have gotten a bit of feedback already, and I have a couple of slides here going over that. One observation was made that the document doesn't do anything to encourage people to run a resolver, and in fact some people said maybe it's too much, it might discourage people from running a resolver: you have these 15 pages of technical recommendations I have to implement, that sounds way too hard, I am not going to bother. We may need to add a section on what the benefits are for you and the rest of the world, I don't know. Another thing is, we originally included a section on high availability which got eliminated, basically because I think we didn't discuss it very much; I guess possibly we need to add that back in. And as I mentioned earlier, we don't currently have a way to measure each of these recommendations. I'm not sure if we are going to actually implement that, but we need to at least take that feedback into account and respond to it, so we are going to have to discuss that as a task force. We have a typo in one of our RFC numbers. We made a suggestion that you don't have to have all four numbers of your IPv4 address be the same, you don't have to be, you know, 13 dot 13 dot 13 dot 13; you don't have to, it's okay. There's some feedback that you should do that because it's cool, and we are going to have to figure that out.
I suggest we discuss it tonight over a drink. Also, there's a suggestion that maybe we shouldn't require aggressive NSEC caching because it has limited utility and maybe it doesn't help. Also, maybe we shouldn't be encouraging DNS cookies because, similarly, it was an interesting technology but maybe it doesn't help. There was a suggestion that we should omit RPZ, which is a way to transfer blocking and filtering information, and just leave it as a generic statement saying you can block, figure it out, I don't know. This is all feedback; I officially am totally neutral on all of it right now. We haven't discussed it as a task force; that is the next step.
We are going to review the feedback, and we have to figure out what to do with all of it. It means there's going to be at least one more iteration of the document, a new version of it. Probably that will be the last one; I'm very eager to get this work done and move on with my life. And then the ultimate goal of the task force is of course to publish a RIPE document: once we have all the feedback taken in and we are happy as a task force with it, we will hand it over to RIPE NCC for edits and publication and it becomes a document. This is not a RIPE community document; it is our responsibility to produce this, so we are asking for feedback, but it's not going to go through a policy development process or anything like that. We don't have to achieve consensus; I am the Chair and I can say we are going to do it tomorrow. We want feedback and information, but I want to make clear how the process is going to work.
After that we have a big party and then we have a document and then what?
So, maybe for me that's the most interesting thing to discuss here today. There are two things I think we have to discuss. The first is the state of the document, getting it in good shape and finishing it up. The second is, once the document is complete, which will hopefully be in a very few short weeks, what then? As I said, the whole reason for this was the impetus from the EU and this discussion around public resolvers. It turns out, according to measurements, fewer people are now using public resolvers. I haven't looked at it too deeply, I basically trust those. Maybe this isn't a huge problem any more. It's still super important to have this document, but we can go a lot of different ways. We as a RIPE community could start to figure out how we encourage people to run their own resolver services; we could come up with more operational documents that are more specific; we can consider training; we can consider interacting directly with friendly public resolver operators. Maybe we can engage Quad9: I have worked at IBM for a few months now, which has a very close relationship with Quad9, and maybe we can use that to review their practices and both improve the recommendations and maybe improve the service. So we can go in different ways, and it's really up to the community to decide what the priority is and where we want to put our energy. So, with that, what are our priorities and where do we want to put our energy?
JOAO DAMAS: Okay, thank you, Shane. And thank you ‑‑ it is what it is. And thank you for actually making it happen, because I couldn't. Is there ‑‑
SPEAKER: Can you please go back to your slide on the history, history slide. Because this is something which seems a little bit confusing to me, so for the benefits of the audience because you were saying that all based outside of the EU, actually Quad9 is based in like Switzerland which is again looking, if you are saying EU like as region or EU like the European Union, it can be, but you also mentioned like privacy and actually Switzerland has strong privacy protections, so I just wanted to highlight this part ‑‑
SHANE KERR: So I don't know, Quad9 moved, right? When ‑‑ they were ‑‑ were they originally based in the US? And they moved?
SPEAKER: I don't know the exact date but it was some time ago, yes.
SHANE KERR: It was some time ago.
SPEAKER: Yes. And you were saying history so this might be like some time ago but I ‑‑ I wanted to mention for the benefit of the audience.
JOAO DAMAS: The initial reaction was to the EU ‑‑
SPEAKER: Three years ago.
JOAO DAMAS: The original action was by EU, not Switzerland. The fact of Quad9 being in Switzerland kind of came up to see how far could the EU reach because as you are aware, the fact that you are in Switzerland didn't prevent you from being sued in Germany for privacy‑related access to data. Does this extend further? Yes. In the end, this triggered the thought process but the thought process is the ‑‑ the goal of the document is to be generic, to be applied anywhere, it's a recognition for public resolvers ‑‑
SPEAKER: No, no, because it was mentioned like public resolvers and EU and privacy so I just wanted to ‑‑
SHANE KERR: That's fair.
JOAO DAMAS: How it started only.
SPEAKER: From dot RS clearly. I just wanted to give a comment, I don't have a question: I support your work and thank you for doing that, it's very important on all levels. If we talk about public resolvers, that is done, and I believe there will be more of them to share the load, not only Google used by 50% of the Internet and the rest using the other ones. It is important for ISPs, because I have seen ISPs which have wrongly configured resolvers and this will help: validation on one server, TCP, and they do not validate on another one.
SHANE KERR: Yes
SPEAKER: So there is no reason of doing that. And also it is, could you imagine it is not checklist but is good list of recommendations that should be followed. So thank you and I support your work.
SHANE KERR: Great, thank you very much.
JOAO DAMAS: Thank you very much, thank you, Shane.
SHANE KERR: Sorry for being late, please come with ideas if you are too scared to share at the microphone here.
JOAO DAMAS: We will definitely be running over the allocated time, so let's go with Tim Wicinski co‑chair of the DNSOP giving us a summary of what's going on.
TIM WICINSKI: Do you want to run the slides there or shall I?
JOAO DAMAS: We will run them from here.
TIM WICINSKI: I don't have many and I will sort of move things along. I co‑chair DNS operations inside the IETF with Benno and Suzanne. Next slide.
JOAO DAMAS: Apparently, you have to do it yourself, I'm told.
TIM WICINSKI: Oh, I have to do it, oh, okay. There we go. So, you know what, we are three chairs, you can't blame Benno for any of this talk. The DNS never sits still, as you saw from David's talk earlier always good stuff going on. So currently DNSOP has 15 Working Group documents in various stages and we have published 7 RFCs in the last two years and it's a bunch of work and people get overrun by it and I try to organise this talk based on how I view all the work and I view it in different views, I use views in the very DNSy kind of way.
So, you know, one view is the zone operators' point of view: what kind of RFCs and what kind of work are we doing that helps zone operators, or just implementers as well? Examples are catalog zones, which was recently published; caching resolution failures, which is a big thing for operators; and things like the zone version, which puts an EDNS option in there to signal a version of the zone, and which operators are finding could be very useful. And then something like extended DNS errors, which has been out for a couple of years but is starting to gain traction. That is a very good example of something that I see in different places: if you look at these different views, it doesn't show up in just one, it's used by different groups, right? Implementers love this, folks that run a lot of zones care about this, and users care about this.
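For readers unfamiliar with extended DNS errors, the option payload is simple to decode: a 16‑bit info code followed by optional UTF‑8 text (RFC 8914, EDNS option code 15). The Python sketch below decodes just that payload. The function name is hypothetical, and the table lists only a handful of the registered codes for illustration.

```python
import struct

EDE_OPTION_CODE = 15  # EDNS option code assigned to Extended DNS Errors

# A few info codes from RFC 8914; the registry contains more than these.
EDE_NAMES = {
    0: "Other",
    3: "Stale Answer",
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    15: "Blocked",
    17: "Filtered",
    22: "No Reachable Authority",
    23: "Network Error",
}


def parse_ede(option_data):
    """Decode the payload of one EDE option into (code, name, extra_text)."""
    if len(option_data) < 2:
        raise ValueError("EDE option needs at least a 2-byte info code")
    (info_code,) = struct.unpack("!H", option_data[:2])
    extra_text = option_data[2:].decode("utf-8", errors="replace")
    name = EDE_NAMES.get(info_code, "not listed in this sketch")
    return info_code, name, extra_text


decoded = parse_ede(b"\x00\x06signature check failed")
```

Extracting the option from a full DNS message is left out; a DNS library would normally hand over the option code and payload separately.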
And just on the DNSSEC deployment, you heard Ed's talk, I got Ed and I can argue for hours about some of this stuff but we have done a lot in DNSOP but how to make it more robust, how to deploy stuff and DNSSEC in a more robust manner, a lot of it is either doing multi‑signer, trying to work on automation boot strapping, the consistency discussion about the DS key, things about NSEC3 or even aggressive TTL stuff that is debatable, I guess, a good way to put it. This is a lot of stuff that I think it's tried to sharpen up the squishy parts, it's gotten a lot more sharper, a lot more focused and robust. People are doing this stuff. And that's good. That makes it more easy to deploy. And then how do you scale stuff from applications point of view? And this is stuff that I don't think DNS people think about, things about domain verification techniques, I am a person that looks at a lot of data, I look at a lot of DNS data all the time, and I just, I guess I care what goes in there, right? And it's interesting stuff but I don't think like zone operators care ‑‑ people that implement stuff and people that operate zones, they don't care so much, as much as sort of the app users or the end users. It's the big thing, it's ‑‑ I think that's really going to sort of take off in more places. I see a lot of use in that and it's growing. Willem made a good point when we were talking last week, he didn't see a lot of focus using the encrypted client hello and I think some of that is because that's still churning down the pipe, I think it's still going ‑‑ gain traction, sort of thing, and hopefully it will happen, who knows. And this is restructured errors, the structured DNS servers is oh we are going to return JSON, right? When you get a DNS server. 
Which kind of breaks my mind, but at the same time I know how useful it is, so it's ‑‑ DNS people just, you know, the more you think about it, if you are in error situations you find this super interesting, is sort of the way to say it. It already happens today in error prices.
And another point of view: there's stuff from the root/TLD operators' point of view that doesn't interest folks that run DNS on a daily basis, like dealing with priming queries or trust anchors, all the rollover stuff. It's very important, but folks that run DNS on a daily basis are like, that's not our view. So, they don't really plug into that, is a good way to view it.
And then we do ‑‑ we do work on updating and improving the standards; terminology updates are happening all the time. Big discussions on glue, right? If you want to talk to me about glue, we can go on for hours, kind of thing.
And of course, there's the DNSSEC BCP, which was basically taking those documents from 20 years ago, 4034, 4035, 4036, and putting them into one BCP that people can reference. It's a lot of, I think, cleaning up: as we all say, after going around for 25, 30 years, we sort of clean up after ourselves, and we are seeing a lot of that. It's not sexy work, but it's stuff we have to do, is I guess the good way to view it, right?
So, that's just giving a sort of overview of the many views of all the stuff we are doing in DNSOP. Instead of trying to look at all the stuff we are working on, I try to put it in sort of buckets, different types of stuff that people focus on and people care about. Instead of trying to take it all in and getting overwhelmed, it's a very good way of approaching this. So, just to give you a heads‑up. And that's how I sort of view it when I do all this work, and I am hoping that can help people, because I think some people see all the stuff we are doing and they are just like, oh, there's way too much stuff going on. And we do have a lot of stuff going on, but we take in a bunch of stuff and we are the ‑‑ Suzanne calls us the DNS dispatch of the IETF, right? We welcome the delegation discussion and we feel that we will push to have the BoF/Working Group stuff really quickly, as pointed out. It's a very good way of viewing it.
So, that's just ‑‑ I think there's even more.
And Willem had this great idea of doing fancy Venn diagrams, and I tried and totally failed, so I suck at all that, so I failed all of you for that. But that should get us back on track a little quickly. That was just my quick update on what we are doing. Talk to us, talk to any of the Chairs, please, you know, we are always willing to listen to what folks are doing, sort of thing.
JOAO DAMAS: Thank you, Tim, thank you very much for that.
I am always kind of surprised, because I remember there used to be two Working Groups, DNSEXT and DNSOP, and at one point someone said the protocol stuff is done, let's focus on the operations, and I think that was one of those hold‑my‑beer moments, because since then we have been rewriting the whole thing.
TIM WICINSKI: Yes. Oliver told me that for years.
JOAO DAMAS: I don't see anyone going to the microphone here. Are there any questions online? I think this is more of a review and there is homework to be done now, so thank you very much for providing the road map for that, thank you, Tim.
Our last talk today is Anand from the RIPE NCC on the RIPE NCC DNS update.
ANAND BUDDHDEV: I am from the RIPE NCC and I will be doing a brief update on some of our recent activities and incidents.
So first I would like to talk about our authoritative DNS cluster expansion; we have been talking about this for a while now. Unfortunately, we faced issues with hardware delivery: after Covid there was a lot of delay and lag in receiving hardware because all the various companies were going through their own problems. But I'm happy to report that we finally did receive all the hardware for this earlier this year, and we are on the verge of adding a new site to this cluster. This is going to be deployed in Tokyo alongside our K‑root instance there.
So, hopefully by the start of next year, 2024, this site will be fully operational.
Next, I'd like to talk about an incident we had just recently, I think most of you, if not all of you, will have noticed this, and will have been affected by this, because we broke our own zone, ripe.net. And this happened on 1st November, mid‑morning, Netherlands time, and the issue was that most of the signatures in the zone had expired.
And this was a very unfortunate case of being a perfect storm kind of situation, and it started with a human error and somebody accidentally introduced a record with a TTL of 10 days instead of 1 day, by adding an extra zero. And this then triggered the signer to, you know, spew errors and refuse to refresh the signatures, so I mean, I believe that the signer should have refreshed all the other signatures and ignored this particular one, but unfortunately, it failed to refresh signatures.
Then our monitoring was monitoring only the SOA record and the signatures in there instead of every signature in the zone, and so the SOA record was being updated and being incrementally signed but the rest of the signatures were not being refreshed so we did not notice this until, you know, it affected everyone and went public.
So, we fixed this and we also published an incident report and the link to this is here in the slide so you can download the slides and click on this, this was also sent to the RIPE DNS Working Group mailing list so you can also find it in the archives. And all the details are in there.
So, we of course want to prevent this kind of error in the future, and one of the things we've done is introduce more checks to prevent this particular human error. We have also been in discussion with the signer vendor and we have asked them to modify their code. Of course they have an opinion on how the signing should be done, and we as an operator have our views, so we had to come to some kind of compromise, but they have a fix, so hopefully this can be avoided in the future. Basically, they agree that they should continue signing and just log errors for specific records.
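As an illustration only (the talk does not describe how the new checks are implemented), a pre-publication check for this kind of typo could be sketched as below. The one-day ceiling, the function name and the record tuples are assumptions made for the example.

```python
# Hypothetical policy: no record in the zone may have a TTL above one day.
MAX_TTL_SECONDS = 86400


def find_excessive_ttls(records, max_ttl=MAX_TTL_SECONDS):
    """Return the (owner, rtype, ttl) tuples whose TTL exceeds max_ttl."""
    return [record for record in records if record[2] > max_ttl]


# Example input: the second record has one zero too many (10 days, not 1).
records = [
    ("www.example.", "A", 86400),
    ("mail.example.", "A", 864000),
]

offenders = find_excessive_ttls(records)
for owner, rtype, ttl in offenders:
    print(f"TTL too high: {owner} {rtype} {ttl}")
```

A check like this would run before the zone is handed to the signer, so the edit is rejected instead of reaching production.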
And then we improved our monitoring by updating our checks, so we now transfer all the zones out of the signers and perform a full check on every record and every signature in there, and alert if signatures are, you know, not being refreshed.
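A minimal sketch of such a full-zone signature check follows; it is not the RIPE NCC's actual monitoring. It assumes the transferred zone is available as text with one record per line, that TTL and class are written out on every line, and that RRSIG timestamps use the YYYYMMDDHHMMSS form. The function names, the sample zone and the three-day threshold are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def parse_rrsig_expiry(line):
    """Return (owner, covered_type, expiry) for an RRSIG line, else None.

    Expected layout (assumption): owner TTL IN RRSIG covered alg labels
    orig_ttl expiration inception keytag signer signature
    """
    fields = line.split()
    if len(fields) < 10 or fields[3] != "RRSIG":
        return None
    expiry = datetime.strptime(fields[8], "%Y%m%d%H%M%S")
    return fields[0], fields[4], expiry.replace(tzinfo=timezone.utc)


def stale_signatures(zone_text, now, min_remaining):
    """List (owner, covered_type) for signatures expiring too soon."""
    stale = []
    for line in zone_text.splitlines():
        parsed = parse_rrsig_expiry(line)
        if parsed and parsed[2] - now < min_remaining:
            stale.append(parsed[:2])
    return stale


ZONE = """
example. 3600 IN RRSIG SOA 13 1 3600 20231115000000 20231101000000 12345 example. c2ln
www.example. 3600 IN RRSIG A 13 2 3600 20231101090000 20231018090000 12345 example. c2ln
www.example. 3600 IN A 192.0.2.1
"""

now = datetime(2023, 11, 1, 10, 0, tzinfo=timezone.utc)
problems = stale_signatures(ZONE, now, timedelta(days=3))
print(problems)
```

In the sample zone the SOA signature is fresh while the signature over the A record has already expired, which is the situation that checking only the SOA would miss.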
The next thing I want to talk about is Zonemaster, so Zonemaster is software developed jointly by the Swedish registry and the French registry and it is used for checking the correctness of DNS zones and you can use this to check the correctness of zones that have already been deployed and you can also use it for pre‑delegation checks, so at RIPE NCC we make use of Zonemaster for doing pre‑delegation checks on all Reverse‑DNS delegation requests that come into the RIPE database via domain records and we expose this to our users as well, but we are now integrating the Zonemaster user interface into RIPEStat because we would like this to be, you know, the one place that people go to for looking up information about their IP address space and related DNS.
So, if you go to RIPEStat, and you look at the top bar ‑‑ bar at the top, the search bar at the top I mean, you can now actually type a Reverse‑DNS zone in there. I have an example, which is 14.0.193.in‑addr.arpa., and this will trigger a Reverse‑DNS check using Zonemaster. The results are displayed, and if you look at where the arrow points, it gives you a summary of any issues that might be found. If there are warnings or notices, that is usually okay, but if you notice errors or critical then you should check your zone and see if there's anything you need to fix in there. At some point early next year we are going to deprecate the Zonemaster native user interface and redirect users to RIPEStat, so that this will become the only interface for users to use the RIPE NCC instance of Zonemaster.
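For readers unsure how a reverse zone name relates to address space, here is a small sketch that derives the name from an IPv4 prefix. It is illustrative only: the function name is made up, it handles only prefixes on octet boundaries, and the prefix 193.0.14.0/24 is simply the one implied by the zone name in the example above.

```python
import ipaddress


def reverse_zone(prefix):
    """Return the in-addr.arpa zone name for an octet-aligned IPv4 prefix."""
    net = ipaddress.ip_network(prefix)
    if net.version != 4 or net.prefixlen == 0 or net.prefixlen % 8 != 0:
        raise ValueError("this sketch handles only IPv4 /8, /16, /24 and /32")
    # Keep the octets covered by the prefix and reverse their order.
    octets = str(net.network_address).split(".")[: net.prefixlen // 8]
    return ".".join(reversed(octets)) + ".in-addr.arpa."


zone = reverse_zone("193.0.14.0/24")
print(zone)
```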
And finally, I want to talk briefly about software updates. I mentioned at the last RIPE meeting that we are aiming to upgrade our DNS servers: we currently run CentOS 7 and we wanted to upgrade to Oracle Linux 9. We have started this process and we have already updated some DNS servers. One of the things we had to do is relax the default crypto policy on Oracle Linux 9, because it doesn't allow things like verification of SHA‑1 digests, but we still need to do those verifications from time to time.
And we noticed this when trying to use utilities like kzonecheck, which is part of Knot DNS, because when we fed it a zone it choked on SHA‑1 and GOST digests, so we have spoken with the CZ.NIC people and they will fix this.
And we are soon going to upgrade more of our servers after this RIPE meeting and we hope by the next time we meet in Krakow, most of our servers will be updated. With that, I come to the end of my presentation, if you have any comments or questions, please let me know.
JOAO DAMAS: Thank you very much, Anand.
I believe Liman is in the queue.
LARS‑JOHAN LIMAN: Too many buttons here. Hello, Anand, thank you for the presentation. Are you using the default settings for Zonemaster? To me Zonemaster is a useful tool, but it's jam‑packed with people's opinions rather than fact, because DNS is a minefield of opinions, so if ‑‑ how shall I phrase this. Do warnings prevent you from having your reverse zone delegated?
ANAND BUDDHDEV: Thanks for that question. Zonemaster comes with certain opinions, as you say, about how DNS should be. We have deviated slightly from the default policy for a couple of things that our users felt were too strict, so we have overrides for those in the policy. This is one of the nice things about Zonemaster: you can override the default policy, so we are doing that. And what we do is that notices and warnings do not prevent Reverse‑DNS delegation, so they are emitted and people should check for things, but we don't prevent delegation. We only prevent delegation in case of errors or critical messages from Zonemaster, because those actually mean that the zone might be broken or might not function properly.
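The decision rule described in this answer can be sketched in a few lines. This is not Zonemaster's API or the RIPE NCC's code: the message format as (level, text) pairs, the function name and the sample messages are assumptions for illustration, and only the rule that errors and critical messages block delegation comes from the answer itself.

```python
# Severity levels that block a Reverse-DNS delegation, per the answer above.
BLOCKING_LEVELS = {"ERROR", "CRITICAL"}


def delegation_allowed(messages):
    """Return True unless any message has a blocking severity level."""
    return not any(level in BLOCKING_LEVELS for level, _text in messages)


# Hypothetical check results, written as (level, text) pairs.
only_warnings = [
    ("NOTICE", "hypothetical notice about the zone"),
    ("WARNING", "hypothetical warning about the zone"),
]
with_error = only_warnings + [("ERROR", "hypothetical error about the zone")]

print(delegation_allowed(only_warnings))
print(delegation_allowed(with_error))
```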
LARS‑JOHAN LIMAN: That's a very fair stance, so thank you for doing that. Thank you.
JOAO DAMAS: Thank you. I don't currently see anyone else so thank you very much, Anand, for providing this update.
And with that, we are done, sorry for running a little bit over time, I think the idea was to make sure you didn't have to put up with the queue of people at the barista, see you in Krakow in May.