Open Source Working Group
30 November 2023
9 a.m.

MARTIN WINTER: Good morning. Or at least to the ones who didn't come to the whiskey BoF yesterday: good morning. It's his fault that we are at 9 a.m.

MARCOS SANZ: I just saw it was a 90-minute slot, and I was so happy we could get a 90-minute slot that I said yes, that's for us.

MARTIN WINTER: We know that open source doesn't happen in the morning. But unfortunately this is RIPE and we have to make it happen in the morning, so thank you to those of you who made it up. I hope a few more people wake up and walk in while we are going on.

So, let's take a look at the agenda. (It's not coming up here again; the ports are probably wrong.)
This is the welcome, and we have an agenda. The agenda has been online and there is no any other business, so thank you very much. That's the agenda for today.

Approval of the minutes from the previous Working Group meeting: the minutes from the last one are online and we haven't received any comments on them, so thank you very much to the scribe for taking the minutes, and with that I consider the minutes approved.

We are going to have a very, very interesting item on the agenda, the one with the co-chair election. We'll get back to that in a second. Moving on, we also have a presentation from Valerie, and we'll have a presentation about the challenges in the area of QA.

MARTIN WINTER: As you know, we always have co-chair elections in the fall, and this is the first time in the history of the Open Source Working Group that we actually got more candidates than open slots. I have to say thank you to the ones who stood forward; it's great. I had started to feel that I might have to do this forever.

There is only one candidate slot open, so basically one out of these three people gets elected.

And with that, what I want to do is a quick round of introductions: each of the candidates will come up, give a quick intro, explain their history or whatever they want to say, and what they may want to do better or differently. After that, you here have a chance to ask them questions if you have any.

The election itself will be on the mailing list afterwards, so there isn't an election here in the room. We'll give you two weeks afterwards on the mailing list to express your support for one of the candidates, or multiple ones, and then we'll see who is elected.

And with that we start with the first candidate, which is Christian ‑‑
The ports must have changed again.

SPEAKER: So, my name is Christian Sheila ‑‑ it's hard to pronounce, you had better call me Chris for convenience. I'm 51 years old and live in Germany, nearly in the middle of Germany. I started networking, designing networks and networking solutions using Linux and open source software, more than 30 years ago. My first job was at a really small ISP which was just coming up, called Nacamar, so there are people here who might know it, and I was helping them build their first infrastructure. Afterwards I continued to work mostly with open source solutions, helping customers to get rid of Windows at that time. I joined RIPE at RIPE 70, and I have mostly tried to be at every RIPE meeting since. I learned a lot in the Working Groups; I especially have to appreciate the work of the IPv6 Working Group, and this is why ‑‑


Open Source Working Group ‑‑ Part 2.

AUDIENCE SPEAKER: Peter Beller, Working Group Chair of IoT, speaking for myself. I have known Christian for 20 years, and he inspired me last time, at RIPE 86, to come to Rotterdam. Today, after the whiskey, it does not seem so inspiring, but he really made me take part. I think Christian is somebody who really encourages people: he gives developers a hard time sometimes, but he also explains the reasons for things and solves the layer 8 problem, as we know it here. So...

He is not only a developer but someone who gives purpose and gets things developed. And, as you understand, he has lived from open source for 20 years, and that is also a good sign.

MARTIN WINTER: Any other comments, questions?

MARTIN WINTER: I just want to add to it: one of the challenges is always to get interesting talks. We depend on the people here in the room for that, but I would appreciate it if whoever gets elected has good ideas, knows projects or some interesting stuff to talk about, and can get something in there.

Okay. Thank you.

Less about specific projects and talking about specific software, but trying to find generalities, commonalities, cross-cutting aspects, as we call them in software. And this is what we have tried to do today: put together an agenda that explores that kind of thing. Specifically, we want to deal now with how to give credit for contributions to open source projects. Valerie is going to give a very interesting presentation on that, and she is going to try to crystallise what the best practices on the topic might be. So, Valerie, the floor is yours.

VALERIE AURORA: I want to talk about possible best practices for giving credit for open source software contributions, which you maybe didn't realise you needed.

So, people disagree about credit for contributions a lot more than most people think. I have two specific examples here. The first one can be summarised as "I was robbed of my first Linux kernel contribution." And the second one is "I was unfairly accused of stealing code I didn't even look at."
I'm suggesting today that we work together as a group to create an example written policy that open source projects can use to make decisions about this. I'll talk a little bit about some of the issues, like how to decide who, quote, deserves the credit, unquote, and how to give credit in the way that best serves your goals, and hopefully you will have some discussion. I have been working as an inclusion consultant for twelve years, and my work has been plagiarised multiple times. I have also had to correct people for giving me credit incorrectly. I once wrote an article about btrfs and people thought I had written the file system. No.
All right. So, here is a concrete example. You might have seen this going around the Internet about two months ago: "How I got robbed of my first kernel contribution." Ariel Miculas fixed a six-year-old bug that corrupted the task_struct on PowerPC in the Linux kernel. The maintainer made some minor changes, basically moved some inline code around, and checked it in with himself as author, giving Ariel credit with a Reported-by tag. Remember, it was reported six years ago. This is Ariel's take: "My first contribution to the kernel was a really frustrating and discouraging experience, dealing with people who do not think it's important for you to get proper recognition for your work."

There is sort of a happy ending. After writing this blog post, it got a tonne of online discussion, over 700 comments on Hacker News. It inspired this Working Group session. And the maintainer did apologise: "I am sorry about the way I handled your patch. I should have spent more time working with you to develop your patch. I agree that the tag doesn't properly reflect the contribution you made. I should have realised that at the time."

I'll point out that the commit still doesn't show Ariel Miculas as the author.
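As a hedged illustration of what is being disputed here (this is not the actual kernel workflow, and all names, e-mail addresses and the repository are hypothetical): Git keeps authorship and committership as separate fields, so a maintainer can apply and even rework someone's patch while leaving the contributor as the recorded author, and record other roles as commit-message trailers.

```shell
# Sketch: commit a contributor's fix with them as the Author,
# while the maintainer who applies it appears as the committer.
cd "$(mktemp -d)"
git init -q .
git config user.name "Maintainer"
git config user.email "maint@example.org"

echo 'fix' > fix.c
git add fix.c

# --author preserves the contributor in the Author field;
# trailers record everyone else's role.
git commit -q --author="Ariel Example <ariel@example.org>" \
  -m "powerpc: fix task_struct corruption

Co-authored-by: Maintainer <maint@example.org>
Reported-by: Ariel Example <ariel@example.org>"

git log -1 --format='author: %an, committer: %cn'
# author: Ariel Example, committer: Maintainer
```

The point is that rewriting a patch does not force the maintainer to take over the Author slot; the committer field already records who applied it.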

Counterpoint here. This is also an experience that happens frequently: "I was unfairly accused of stealing credit." A private story, anonymised. Sometimes a contributor is wrongly accused: somebody submits a patch, it goes into the giant pile of unreviewed commits, and some time later another contributor writes a patch that fixes the same problem, and the first contributor says: hey, you stole my patch, why did you do that? I just want to acknowledge what all of us know: open source contributors are overworked. You are often not paid to do this as your entire job, and you are especially not paid to mentor new contributors; the review backlog is a huge issue.

Just to be very clear here, I want to say that contributions to open source projects come in many forms: things like devops, bug reports, design, documentation, event organisation, mentoring, and standards definition. A very specific example is IETF RFCs and Working Groups, where there is some documentation of people using drafts without giving credit.

This obviously happens to more than just women. It happens to every single person out there.

All right, I just want to be super clear: people disagree about how to give credit and who should get credit. Just some comments from the Hacker News thread. Somebody says that would be plagiarism. Someone else says there is no guarantee of getting any accreditation. Another person says it's unethical not to give credit. And another person changed their mind: "I used to think it was fine to take someone else's contribution, create a total rewrite of it in my own branch and push that." Then somebody did that to them, and they were like: wait, this is horrible, why would I ever contribute to a project that did that to me?
There are a bunch of reasons why people disagree. I'm trying to make the case for writing this down.

Open source software depends on community contributions, and getting credit for their work is the main reason people contribute. Even if they are being paid, they are only being paid because they get credit. You can't just say: I wrote a bunch of software, but trust me, it was really mine.

There are a bunch of reasons why people don't get the credit they think they deserve. Mostly it's overworked people making mistakes, but some people do take the credit; like I said, I have had my code stolen multiple times. There are people who want to show up and do a drive-by on something that's not so important, or who steal somebody else's code. There are multiple different ways this doesn't work out.

I also want to point out something I hadn't realised: some people want to exclude others from contributor status. There was a lot of argument on the Hacker News thread along the lines of "you don't deserve to be called a kernel contributor" ‑‑ just for one patch, a patch that fixed an incredibly difficult bug with the task_struct that required hardware debugging. Anyway, whatever. I'm fine. You can be a kernel contributor if you contribute one patch.

I want a solution that reduces work for maintainers instead of increasing it. Sets expectations for contributors and reduces fighting with maintainers. Attracts productive contributors who actually make your job easier as a maintainer. And repels unhelpful contributors who want to have lots of arguments or steal other people's patches.

So, a written public credit policy would do all of these things. One thing I noticed is that there was a lot of discussion from people who said: I was the bad guy once, I was just so busy, and I now realise I could have checked it in with the other person as the author, or done some other thing to make this a good experience for the contributor and gain another contributor for my project. We're all busy: write it down, so you know in advance.

It sets expectations, so people at least agree on what's going to happen and are not surprised when the thing they think should happen does not.

Fewer arguments attracts productive contributors, if you have a policy which rewards people for the behaviour you like. And it repels unhelpful contributors: people who want to show up and do that thing where they lightly rewrite somebody else's work and submit it to your project are not going to show up; you are only going to get people creating good new work.
I want to talk a bit about who deserves credit. This is a philosophical issue which I don't think we need to resolve. By convention, plagiarism, when it comes to written work in the form of prose or poetry or art or things like that, is well defined. People still do it, but there is broad agreement in general; there are some grey areas, but many fewer than in software.

Software, I think, is particularly hard because, let's be real, it's all collaborative work. Everything we have written is built on a giant stack of things that other people have created. Idioms, best practices and standards force code to resemble other code. If you followed some of the court cases: the list of error codes is going to be the same, no matter what.

And it is best practice to copy similar code and slightly alter it, right.

So, who wrote a line of code? Who deserves the most credit if multiple people contributed to it? I say I don't care, what behaviour do you want to reward? And what I would like is for people to think about that and make a policy based on this. So, new contributors, is that something you are really into? Contributions in general, collaboration and creative problem solving. Mentorship and peer support, that would be amazing. So, when in doubt, give credit in the way that rewards these behaviours.

So, a specific example. Take a code contribution whose first version was written by a first-time contributor and whose final version was rewritten by a long-time contributor. What is the effect of giving primary credit to the long-time contributor, versus giving primary credit to the new contributor? Any thoughts on which of these would be more to your taste?

Two. I agree, because I want to have more contributors. If you give it to the long-time contributor, it's likely to discourage the new contributor, and it's unlikely to change the behaviour of the long-time contributor; this wasn't their breaking point. If you give it to the first-time contributor, that's their first patch. I had a party for my first Linux kernel patch that got into the kernel; it involved a lot of coconut rum, a swimming pool, and I don't remember the whole thing. It was amazing. I don't think the long-time contributor is going to be drinking way too much with their friends over this.

So, I'm looking for collaborators on an example credit policy. My idea is to create a menu of options rather than prescriptively saying "you should do this" ‑‑ more like: hey, if this is the behaviour you want, put this line into your policy. It should be specific, so that people don't have to think about things. There are difficult situations; they happen all the time. I am sure all of you have had them occur in your projects.

I would like to encourage people to come up with creative ways to give credit. There are a lot of options that don't look like the author field in the git blame output. That's the most visible one, but there are multiple ways.

If you are interested in contributing to or leading this effort ‑‑ I don't want to lead it, but I will if I have to ‑‑ please e-mail the Open Source Working Group or me at this e-mail address, and we promise to give credit accurately.


MARTIN WINTER: So, we have time for discussion, and I would especially appreciate hearing from the ones here who are really involved in open source and in maintaining projects ‑‑ I would love to hear whether you have had this issue or have seen this issue. I can say that especially in the FRRouting community we had this discussion, and we probably didn't handle it perfectly all the time either. We had a long discussion about how to actually record it, because the classic way is a Signed-off-by, but a Signed-off-by for us basically means that you stand behind the code from a legal, licensing point of view, and if I change someone's code, I cannot simply keep their Signed-off-by in there. So it's a bit of a discussion; we discussed whether something like a Co-authored-by is correct. I would love to hear solutions, and I am sure Valerie would too.

AUDIENCE SPEAKER: Hi, DE‑CIX. Thanks for bringing this here, I think this is very relevant. The first one is a comment, though.

Regarding first-time versus long-time contributors: I think it's way harder to come up with anything new, and iterating is easier. So, for me, there is no question that the first-time contributor gets the credit; that's just common sense.

The second one might be diverging a little bit, and we can take it to another place or another time, but what your talk inspired in me was the question whether people *using* open source are giving those open source projects the credit they deserve ‑‑ in our actual day-to-day work. This is probably something people here in the room might have an opinion on as well. So, that's another perspective, but I would also be interested in that. Thanks.

AUDIENCE SPEAKER: From the Meetecho Q&A, it's a comment, not a question, from Alexander: "Please let the policy be short. Long policies can stop contributions too."

GERT DÖRING: If I have to read 50 pages of policy I will not contribute.

VALERIE AURORA: Good policy is also a message.

GERT DÖRING: I am also a long-time open source enthusiast. I remember the first patch of mine that was accepted into Taylor UUCP, like 20 years before Git, so there was no question of author tags or anything, but I was very happy. So, I fully share that sentiment.

Having such a policy that a project can refer to and say "this is how we do credits" is helpful in situations where it's not obvious. The author thing is complicated, because, say, you have a patch that's 100 lines, of which 95 lines are good and 5 are bad, and the maintainer changes some other 5 lines and introduces a new bug, and then commits it with the contributor set as author ‑‑ the name of the author of the commit sort of stands for the quality of that code. So, if I change somebody's code, I definitely need the agreement of the original author to check that in with their name. So it's not as clear-cut. In OpenVPN we try to avoid that situation: either we try to get the original author to resubmit a version 2 with the things that should be changed actually changed, so we can just take that patch as it is and commit it; or, if the original author says "yes, you found a bug in my patch, but I have no time to work on it", one of the maintainers will submit it under their own name but with credits in the commit message. Both are not perfect, but having a policy to look at might be helpful in these cases, or not.


GERT DÖRING: I would definitely follow this. I'm not promising actual contributions but I already have no time for all the things I have promised, but I will definitely follow it. Thank you.

VALERIE AURORA: I think those are two good points. One thing is that you can have a cascade: best option ‑‑ this is not working; second-best option ‑‑ that's not working; third option. And something like: hey, if you submit a patch, this is the process, you have to reply and sign off within X days. You can just be clear about your expectations.

Briefly, the second thing is that I have thought about the "oh no, you introduced a bug" problem. Well, my name is on a bug right now that was inherited from the previous code; I just had to move the code around and slightly change it. It's never going to be perfect.

AUDIENCE SPEAKER: Nat Morris, NetBox Labs. Two questions. Firstly, I enjoyed your presentation.

What's your opinion on projects having an AUTHORS file?

And the second question is: do you think an organisation having a CLA, a contributor licence agreement, is off-putting for first-time contributors?

VALERIE AURORA: First one: what do I think about an AUTHORS file? I was thinking about the problem of "oh, I just realised I checked that in with the wrong author name, how do I fix it?" Well, there is a thing called git mailmap, and that is a full e-mail-plus-name remapping mechanism, a very good one. You could imagine a "git creditmap" which remaps credit for commits. I do think that adding any particular new way to give people credit, such as an AUTHORS file, makes sense. Sure, it gets out of date. I don't really care; what's your goal? What was the second one?
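As a concrete illustration of the mailmap mechanism mentioned here, a `.mailmap` file in the repository root remaps a wrong or outdated name and e-mail in `git shortlog` output without rewriting any history. The identities below are hypothetical:

```shell
cd "$(mktemp -d)"
git init -q .

# Commit under the "wrong" identity.
git config user.name "Wrong Name"
git config user.email "old@example.org"
echo a > f && git add f && git commit -qm "initial"

# .mailmap line: canonical identity first, then the identity to remap.
cat > .mailmap <<'EOF'
Correct Name <new@example.org> Wrong Name <old@example.org>
EOF

# git shortlog applies .mailmap by default: the single commit
# is now attributed to "Correct Name".
git shortlog -sn HEAD
```

Because the mapping lives in a normal tracked file, fixing a misattributed commit is a one-line change rather than a history rewrite.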

AUDIENCE SPEAKER: Do you think organisations that have a CLA ‑‑

VALERIE AURORA: So, I totally understand the thing where people want to have the copyright ‑‑ it definitely puts people off, for sure. I think twice when somebody says you have to sign over your copyright.

AUDIENCE SPEAKER: Maria, BIRD developer. This is not my talk yet; I just want to return to the point about the one line of code and plagiarism. Well, there is a joke about somebody taking their car to a garage, where the mechanic just tightens one screw and then the invoice is €100 ‑‑ and that's exactly it. You don't have to author a big piece of code; you just have to know where to put one single line, and even a one-byte patch is sometimes the result of five weeks of debugging. Been there, done that, please don't repeat my mistakes.

VALERIE AURORA: I completely agree. That's actually the kind of code I love the most. My most important contribution to the Linux kernel is relatively tiny ‑‑ it's five lines of code ‑‑ and to me it is extremely important. When I see contributors who spend a lot of time checking in tiny little nitpicky patches, I know that they are looking at their GitHub graph. So...

All right. Thank you.

MARTIN WINTER: I want to add something from my side. I think the other thing which often gets forgotten: next to the people who contribute, we also welcome people who review things and comment on things, and there is normally no way of getting any recognition for that part. So, I would hope maybe there are some ideas on how to get that in there too.

VALERIE AURORA: Code review is in such short supply because it's not rewarded.

GERT DÖRING: That is something we do in OpenVPN. Every patch that goes in has to be formally acked and tested by an established contributor. This is a bit fuzzy, but somebody has to say "I have tested this and I say this is fine", and this gets recorded in the Git commit. If you go through the git log, you can actually see who did this, so the most active reviewers get their piece of credit. I get nothing; I just test it and merge it.
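The trailer-based review credit described here can be tallied straight from history with standard Git tooling. A small sketch, with hypothetical names; the grep-based extraction is one possible approach, not a prescribed workflow:

```shell
cd "$(mktemp -d)"
git init -q .
git config user.name "Maintainer"
git config user.email "maint@example.org"

# A commit whose message records who reviewed and tested it.
echo a > f && git add f
git commit -qm "add feature

Acked-by: Reviewer One <one@example.org>
Tested-by: Reviewer Two <two@example.org>"

# Tally reviewer credit from the trailers across all history,
# e.g. to build a leaderboard of the most active reviewers.
git log --format=%B | grep '^Acked-by:' | sort | uniq -c | sort -rn
```

Because the credit lives in the commit messages themselves, it survives mirroring and forking, and no separate database is needed to answer "who reviews the most?".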

VALERIE AURORA: Yeah. I think one way to look at this is: of the people who are working on your project, some are doing it for the fun of it, and some are doing it for the fun plus getting paid. They need to go back to their manager and say "here is what I was doing all day long", and creating ways to make it easier for people to make that case is helpful.

MARTIN WINTER: Thank you. So please remember, contact Valerie; I hope there is a good discussion. I am looking forward to some ideas and maybe a policy, and I would appreciate an occasional update on the mailing list on how that's going.

With that, we are going on to the next talk, from Maria, about...

MARIA MATEJKA: Hello. I am Maria; I am developing, and basically leading the team behind, BIRD. I very much appreciate that we are speaking today about the things open source projects have in common, because it's not only new, it also helps others: it's not a commercial for one project and a commercial for another, it's really solving real problems.

Also, I want to say: Valerie, count on us, we are going to help with those policies.

Anyway, let's start with it.

Oh, sorry. There are many projects that are open source only formally. There are projects that are open source in the sense that the source is open ‑‑ you can read it ‑‑ but you cannot, in fact, modify the code, or if you modify it, they won't accept your submission. So you get the source code, but the modifications don't get back to the community. And there are also some projects which were required to go open source, or needed to go this way because they have to be compiled in place, but in reality the code is obfuscated and you have no idea what's going on there.

Well, obfuscated code is partially a BIRD thing too, but let's return to that later.

There are projects that accept basically only minor fixes. I have seen some of them, but the majority of projects are moved forward by the original authors or by a team of developers, which in most cases is also what we are doing in BIRD, because most of the work is done by the core team.

There are some reasons for this. I think we'll return to that later.

There are situations where you get large submissions, and these large submissions start to cause problems. The problem is about the quality of the code. You can get a good submission which is completely okay, makes complete sense, and is perfectly mergeable and maintainable. But as upstream you have to support those features: if you accept a patch which adds a feature, you have to support it, or you have to find reasons why not to support it in some future version, and this gets quite complicated. When people expect different things from your software, you have different options. You can say: well, we only accept minor fixes and we want the big things to be developed by ourselves. You can say: well, we are going to accept this, but it must make sense for us. Or: well, this is not what I wanted. You can do many, many things, but it all boils down to the contributor policy ‑‑ not only giving the right credit, but also saying what we actually expect from contributors.

Because what happens now ‑‑ it's time for this slide ‑‑ is that you can get complete abuse of the internal API. You get a perfect submission which works for this specific version, but it won't merge with something you are currently planning or currently implementing. This is what has actually been happening in BIRD for several years, because we have an offshoot, a second branch, which is a bigger piece of work and moves the internal API a lot.

And then, when somebody comes along and doesn't write code which is easily mergeable with those principles, you are screwed. You are completely done, and you are going to be merging the code for five weeks.

And I think I'm understating this. There is code which I refused to merge because it was abusing the internal API so much that I would have had to rewrite it almost completely.

So, well, they solved it somehow in the Linux kernel. I don't exactly know how, but I have some ideas about how it works. In the Linux kernel there are thousands of developers contributing different things to different branches, and it all gets merged somehow. I think this is one of the only projects in the whole world which can be said to be a truly community thing, because there is not only a maintainer, not only a group of maintainers; the individual authors have such a low contribution percentage of the whole body of code that ‑‑ that's it.

So where to go?
Does your project have programmers' documentation? Do your contributors know how to fix things in your software? Is it up to date? Is the programmers' documentation really up to date with the last things you did, the last changes, the last API ‑‑ or does it describe the API as you reworked it several years ago, so that it formally says you are using some function while the function actually does something completely different? Do you like incoming patches when you maintain software? And do you actually merge them? That is something which also happens to us, and it is a psychological thing: many people work in a way that when you get something which is too hard to merge, you just keep putting it off, and you basically never merge it, because you never get enough energy to overcome all the problems that come with it. And it's not only a problem with merging; it's also a problem with the initial review, which again boils down to a lack of persistence, a lack of programmers' documentation, and a lack of a document saying what we should do. I have some experience from a completely different field where it's basically essential to have written rules ‑‑ not so that somebody can be blamed, but because when you have written rules, you don't have to think about it when you apply them.

So, I advocate having such rules not only for giving credit, but also for contributing, and it should not just be a manual for the contributor; it must be a manual for the maintainer. And it's not a rule that could be used by somebody to sue us; it's a rule for us, so that we don't have to think about it again and again, because it's still the same problem and we should be consistent ‑‑ we should not have to remember how we handled the same problem yesterday, last week, or five years ago.

And what other problems are you dealing with?

Your turn.

MARCOS SANZ: Thank you very much, Maria. Comments, questions?

AUDIENCE SPEAKER: Valerie, my own bad self. So, I'm trying to write a novel and it's really hard to do the work by myself. One of the things I immediately thought about was: first, the way the Linux kernel solves some of these problems is to say that maintainers don't do merges ‑‑ a patch either applies or it doesn't ‑‑ so that's a thing you can document. The second part is reviewing, telling people how to redo things, and/or merging: these are better done as a group. Maybe there is a review party every month, that sort of thing.

How can you make it fun to do the miserable parts of open source? And of course there is the bit where people need to get promoted, and as usual it's very hard to surface the work that doesn't seem so glamorous. So...

MARIA MATEJKA: Yeah. You led me to pointing out that yes, reviews are fucking hard. And if you get more and more patches to review, and you basically have to point out the same things that are not going to work ‑‑ for example, for performance reasons ‑‑ and those people just can't see it, well, that's a reason to write the policy: to save money and time for both parties. Because you can then point to the policy and say: we are trying to adhere to this policy, and this policy says that we check performance, and if performance drops because of your change, we can't merge it, for example.

MARCOS SANZ: One aspect is pointing to the document or the policy documentation; the other aspect is the mentoring aspect ‑‑ when somebody comes over and over again, it's like: hey, come, let me take you by the hand and explain to you how to do this better, so that next time it's better. But how much time do you have left for that?

MARIA MATEJKA: Exactly. It comes back to the programmers' documentation, because if you don't have programmers' documentation, nobody is going to know how to contribute.

AUDIENCE SPEAKER: Martin Winter. The challenge is always that most people think you submit something ‑‑ a patch or a feature ‑‑ and that's it. But at least in our community, if you write a new feature, you must include tests, and for new features you must also include documentation. So the person who just wanted to write a small feature has to figure out how to test it, and now has to figure out the documentation system, which in our case may need to get compiled into other things, and that potentially puts a few people off. Or some people say: here is the fix, but I don't have time for the rest. So it's a challenge there too. Keeping the documentation up to date is nice to have, I think, but I would love to hear from a project that succeeds at that.

MARIA MATEJKA: That comes back to the authorship question: when somebody sends a patch and does not document it and does not write a nice commit message, but we know exactly that it is right ‑‑ who gets the credit then? Can we give full credit to the author who didn't document it, when we had to write the documentation and the commit message which explains what is actually happening?

AUDIENCE SPEAKER: Hi. Sasha Romijn, maintainer of IRRd amongst other things. I feel your first point is a bit of a personal attack: I have programmers' documentation; it's just not up to date any more. However, one of the challenges I also find with the earlier points, documentation or testing, is what has happened in the open on IRRd: someone said, I don't know how to deal with your testing framework, and now the build is failing in CI. But there I can help, because that's doable.

What I find one of the more challenging things, when people make their contributions, is that for them it's a feature they contribute, and it works, and we get the documentation and testing right and we can merge it. But now I have a new problem, because I have new code, and now it's my problem because I have to maintain this potentially forever — and someone else, maybe the person who contributed it, is not a supporting customer.

Now, my supporting customers are going to depend on this; they are going to expect it to keep working and to interact with all the other components it might influence. That is sometimes the source of my reluctance: I like this feature, this should be a thing, but I don't want to maintain this for the next five years.

MARIA MATEJKA: Yes, that's a big problem. And sometimes, as I was saying, it completely breaks the internal API — we once got such a submission and it was a real pain. So I definitely feel this very much. Any other questions?

So, thank you for coming and discussing. We are done, and if you want to talk about any of this some more, just reach out to me; I am here until Friday evening. Thank you.


MARCOS SANZ: And now, finally, we have plenty of time. Martin offered, in spite of the early time in the morning, to talk about specific challenges in the area of testing, with FRRouting as the example — challenges he has found in general, but specifically in open source. Thank you very much for taking the time to prepare this.

MARTIN WINTER: You know, as a Working Group co-chair, I at least had the luxury to say I want to go last, so I have another extra hour to wake up. That's the benefit of becoming a co-chair.

So, I am curious: who in the room here is actually involved in testing, either open source or commercial or their own thing? I am just curious.

So, okay, quite a few.

So, I want to talk a bit about the challenges. This is not about the FRRouting project as such — I am involved there and I take it as an example. I want to talk about some of the things which go well and some things which don't go that well, just to give some separation there; the project just serves as an example here.

I don't think I need to introduce myself much. I have basically worked on FRRouting forever, and I have been testing code and routers for even longer.

I work for NetDEF. That may be something to keep in mind; we have a bit of a special role. The FRRouting community is not controlled by one company; it's multiple companies working together. We put the thing together from Quagga. NetDEF is basically the non-profit, the one behind it that doesn't have a product to sell, and we are running the CI, basically because I started building that up and testing the whole thing.

We are the maintainers of the code.

So, before I start talking about the challenges, I want to go through a bit of how the FRRouting community works and what that looks like.

Classic workflow. The key thing is we are all Git driven. Some time back, FRRouting decided we want an all-Git model, constantly merging things, and we ended up picking GitHub. The whole workflow, the issue tracking, everything goes through GitHub.

So, in our case, anyone is welcome to submit anything; it's completely open. Anyone who feels they have something to contribute can open a pull request on the system. The key thing — which we'll talk about in more detail — is that after you open a pull request, the CI automatically kicks in and does its testing, all the automated testing at that level. That takes a few hours. And if it's not okay, before we ping the person, normally it's expected the author will go back, look at the failures and fix them. Common issues: people obviously develop only on one platform — maybe they only have their Debian Linux and they never tried to compile it on FreeBSD — that's the classic build problem. But there are thousands of tests running there.

But the idea is that we want to avoid causing load on the maintainers, so basically no person needs to look at it until the code has reached quality.

Once it gets to that level, it gets reviewed by maintainers. We also have weekly meetings where people are welcome to join in, talk about issues, and sometimes ask for help if they can't find somebody to review their change. That's sometimes not optimal, because there are never enough people to maintain things.

And then you may go through another few passes of being asked for fixes and changes, based on the maintainers' input. And only once all that is done, a maintainer from a different organisation than the author is allowed to merge it. So we make it very clear: you are never allowed to merge something which was contributed from your own company.

So, the talk today is mainly about the building and testing block. There is also more testing which we do on releases, obviously.

Test framework. A long time back I picked a CI system, Atlassian Bamboo. I know people have other preferences — go for what you want. When I started, it was very stable. I looked at a few other ones, and I didn't feel like I wanted to debug the CI system; I wanted to debug our code.

In the meantime, we have grown quite large. As you can see, Atlassian is a commercial product — it's built on a lot of open source — but if you have an open source project or a non-profit, you get free unlimited licences from Atlassian for all their software.

So we have quite a large system; we have currently grown to about 600-plus agents. An agent is normally one executing worker which the CI controls, which can do some of the testing.

We frequently have, on average, I would say 150 to 200 agents running at the same time.

We don't use any of the Travis or GitHub Actions services. There are a bunch of online hosted CI services for open source which are free; in our case they don't work — we don't fit in their scale. I think the last time somebody came up with "why don't we use that" and added it, we hit the limits on the first day. It doesn't work for us any more.

We run our own equipment; our test equipment is about six racks, distributed between the US and Switzerland.

On building: FRRouting is quite Unix-centric, so I feel for those in here who have to test on Windows too. We basically try to compile on all these different systems you can see here, and we do some basic checks. You will notice some are quite a bit outdated. Part of the challenge when you have so many systems is that distro updates come out all the time, and it's a discussion: is somebody using that? Should I go update it? Do I have time? When do I get to it? So that's unfortunately where the pain is.

There are also some platforms in here, like RedHat, where I am currently questioning whether it even makes sense: with their policy change, I am not sure if anyone really cares much about them any more, or whether we should move over to something else — which potentially will soon not be compatible any more.

We have our own framework, developed quite some time ago, which is one of the integral parts: the topotests. It's quite hated in the community, and I am the person who started it, so throw the tomatoes now.

I'm not that proud of it any more, but it was a start. We learned a lot from it. It's still used; it still kind of works. It's based on pytest. Back then it used the Mininet framework for building multiple namespaces, building up network topologies. And we normally require, when somebody adds a new feature, that they write a test for it.

It gets executed on different platforms. As I mentioned: 32-bit, 64-bit, Intel; we also have a bit of Arm in there too.

We have limited unit tests. Not something we are proud of. Just for those who are not familiar: a unit test is where you test specific functions. Normally it's written into the code you have; in our case you have this `make check` after you build it, and it runs and tries a few tests. The topotest is more like a system test, where we bring up a whole network topology and verify some features.
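
The distinction can be sketched roughly like this — all names and functions here are invented for illustration, this is not actual FRRouting code:

```python
import ipaddress

# Unit-test style: exercise one function in isolation,
# the kind of check that `make check` runs after a build.
def prefix_matches(prefix: str, length: int, ip: str) -> bool:
    """Toy prefix-containment check, used only for illustration."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(f"{prefix}/{length}")

def test_prefix_matches():
    # One function, no topology, runs in microseconds.
    assert prefix_matches("10.0.0.0", 8, "10.1.2.3")
    assert not prefix_matches("10.0.0.0", 8, "192.0.2.1")

# System ("topotest") style: bring up a whole topology and verify a
# feature end to end. Here the routers are only simulated objects.
class FakeRouter:
    def __init__(self, name):
        self.name, self.routes = name, set()
    def learn(self, prefix):
        self.routes.add(prefix)

def test_route_propagation():
    r1, r2 = FakeRouter("r1"), FakeRouter("r2")
    r1.learn("10.0.0.0/8")
    r2.routes |= r1.routes  # stand-in for a real protocol session
    assert "10.0.0.0/8" in r2.routes
```

The real topotests do the second kind with actual daemons in network namespaces, which is why they take hours where unit tests take seconds.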

So, it's mostly historic. I don't see much development on the unit tests at all.

We have some other testing. We have a commercial framework, Ixia ANVL, which is an RFC compliance tester. It's an expensive tool.

It basically verifies your code against the RFC standards, so we run all the different protocols through it. It gives interesting feedback, and it was interesting for me to learn from too, because it works through the CLI as it configures the boxes for each test. It goes through the RFC and for each paragraph it tries to figure out a test, reconfiguring from one setup to the next. And I actually found quite a few bugs where the problem wasn't the main protocol code, but somewhere in the CLI: when you configure or unconfigure something, something is not correctly removed or not set up. So it's quite interesting, the bugs I found in that direction as well.
We do fuzzing. Coverity is a big thing too: if you have an open source project, I can highly recommend that you sign up there; they do quite a good job — you can submit things there and they test them.

There is also Google's open source fuzzing programme, where Google internally does fuzzing for at least some open source projects. We use that quite extensively too, and they frequently send us security reports when they find things.

We also run the sanitisers — you are probably familiar with AddressSanitizer and the others. Personally I love them much more than static analysis: the sanitisers have far fewer false positives. If a sanitiser finds a bug, then normally it really is a bug, and it crashes the code and gives you some extra information about what happened.

We also do packaging. When we build the code, I build complete packages for every pull request, every change, for most of the platforms.

Coverity, just a quick overview: it gives us statistics where you can look at what your defect density is, and it gives you findings to look into. At the beginning you get a lot of very trivial ones, and then it gets more and more difficult to figure out whether something is a real issue or a false alarm.

We also rebuilt the topotests, because we had a lot of complaints. A lot of things got merged, so the framework wasn't perfect — obviously, like everything at the beginning — and a lot of people started changing it around. The topotests are now effectively about four frameworks, which people use in different ways, and that makes it very hard to maintain, or to look at someone else's test and try to understand it. So we came back to it and built a new framework, which we call Topotato. It's still in testing, but it will probably start going live quite soon. One interesting thing we did: as part of the test setup you are basically forced to write down the network diagram, so the information to build up the network is right there in the framework — it's good enough that it can create a graph from it. We also added better timing, and we changed our code so that we not only capture PCAPs of what's going on anyway, but also push the log messages from the different routing daemons into the PCAPs. So you have one PCAP file which doesn't just have the network messages: you also see the log messages and configuration messages, so you get a nice view of what's going on in which order.
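
The idea of declaring the topology inside the test itself can be sketched like this — a hypothetical shape only, not Topotato's real API:

```python
from collections import defaultdict

def build_graph(links):
    """Turn a list of (router_a, router_b) link pairs into an adjacency map."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph

# The test carries its own network diagram as data ...
TOPOLOGY = [("r1", "r2"), ("r2", "r3")]
graph = build_graph(TOPOLOGY)

# ... which is enough to render a picture of the setup or to
# validate the wiring before any daemon is started.
assert graph["r2"] == {"r1", "r3"}
```

The point is that the diagram and the test can never drift apart, because the framework builds the network from the same data the test declares.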

So that's, anyway, how we are doing it. Now I want to talk about the fun things we learned, sometimes the not-so-fun things, and the things which we could probably improve.

Part of the reason for this talk is that there is a lot of discussion between different projects about code and what features people can add in open source, but I don't see much discussion about how to improve the testing. I have the feeling every community basically builds its own test setup; a lot of them figure out some cool things and struggle with other parts. My hope is that we can learn from each other better and figure out ways to exchange information. So please, after the talk, come to me if you have ideas or want to know something — maybe there is know-how or resources we can share, maybe you have another idea how we can improve things.

So, first of all, when you talk about testing, look at the whole complexity. I mean, we are open source; we are not allowed to spy on the users. When I talk to commercial vendors, sometimes it's: let me figure out if that feature is used — and they go back, do some magic, and say: no, out of our users, only a few use that feature.

Unfortunately, in open source, nobody will accept that we have spies. So we have no idea what operating system our users are using. We don't know what hardware they run it on. We have no idea which dependency libraries they use, we don't know what features they use, we don't even know what they changed or how they compile it. That makes the testing very hard. We tried a few times to ask, but we get a very small snapshot. For the different distros, I tried to look at our own package systems, our RPM and Debian repositories, but most people obviously install from the public distro repositories, so I have no idea what they are doing — or they build it themselves.

So it's very hard to figure out what to do, and sometimes you have to make guesses.

So, let me talk about the hardware.

Obviously: what are you testing on? Some of it is simple. Intel 64-bit, sure. 32-bit — well, if you are running VMs, that's trivial; you can set it up, it's a no-brainer, you don't need many resources. But then what else? Arm was a big thing which came up more and more, so yes, Arm we have to do. But how about PowerPC? How about RISC-V, potentially soon? That's a big question. And especially when you look at all these platforms, the question comes up: big endian or little endian systems?

If you look here, most of this is all little endian, and it's very hard these days to get a big endian system. But if you write network code, the byte ordering is quite challenging, and we ran into problems a few times: the code didn't work on big endian, people made mistakes, and nobody noticed for months or even years.

There is also the memory. Sometimes the code behaves quite differently depending on whether you give it a lot of memory or not — the timing is completely screwed up, or it does something else, or memory consumption grows too much because of a new feature. You may want to look at that part too.

As an example: we started around 2016, when everyone said, oh, the cloud services are using FRR on Arm boxes, on the high-end Arm servers — we should test on those too. I initially ended up with some hosted Arm servers from the French company Scaleway — maybe somebody is here from there. I started testing on them; it worked kind of okay, but their Arm boxes were not that reliable, and later I needed to scale up. In about 2020 we tried again to buy Arm servers, classic ones, but if you don't buy at least 100 servers, you cannot get high-end Arm servers. They somehow couldn't sell us just one either.

Because it's their own proprietary thing.

So we ended up with Raspberry Pis and started hacking around. You see on the left side of this picture a shelf with about 16 Raspberry Pis nicely packed together. We ran all the Raspberry Pis with a small SSD connected over USB, mainly because the SD cards would wear out way too fast. We had about 60 of these Raspberry Pis to run the whole thing. It worked reasonably well: once we figured out which SSDs were reliable, we didn't have many outages any more. However, even on a Raspberry Pi 4 — we used the high-end 8 GB model — the build task was about 40 minutes just to do a normal build and build the packages, and it was getting ridiculously long.

This year — well, end of last year — we finally heard from our supplier that there are now Supermicro Arm servers you can actually buy: yes, we'll sell you just a single one if you want. We ended up with two of these servers, the 128-core Arm ones — beautiful boxes and really fast. What took about 40 minutes on a Raspberry Pi went, limited to the same number of cores on these new boxes, from about 45 minutes down to about three minutes, just by having faster memory, faster everything; the whole thing is beautiful.

That solved a lot, and it works well.

Big endian versus little endian — that was an interesting challenge. Try to buy a big endian system these days, or find one hosted anywhere. There are basically none, unless again you are a big company which may be using FRRouting in its core. After a long time we found some old PowerPC boxes; we were lucky to find two of these Freescale development systems, which are the highest-end PowerPCs we could find — 24-core systems. Unfortunately we only have two, so I cannot really put them in the normal CI; it's just not enough resources for our CI, unfortunately.

I am still looking at the best way to test on them and run some simulations there, so we can actually verify that yes, the big endian stuff still works. On the other hand, I don't know if it will still be relevant in the near future.

So, I'll talk a bit about software. That was always a discussion too: when do you develop your own thing? How does the whole thing scale? How about the quality, the performance? It's a challenging thing, as probably everyone here doing testing knows.

The key thing, especially in open source: preferably I can hand the framework out to everyone and they can run their own tests. If I can't, then when I find errors, I can hopefully at least document them and give people the exact details of what went wrong.

There are some things where that doesn't work, like the commercial tester I mentioned. Their policy is: I can publish which RFC we are violating, I can publish which section of the RFC — I can say, I am broken on this chapter — but I cannot say how to test it, because they claim that is their secret sauce for figuring it out.

It helped us a lot, because it gave us a large framework to start testing with. Now, we discussed a few times: should we replace it? And: can we replace it? Because obviously I know exactly what it tests — if I were to write my own framework, could they potentially claim that I cloned theirs? It's not that easy. So if you are going for any commercial software, think very carefully about whether that's the right way to go.

It might solve a lot of things in the beginning, but in the long term it might not be the right choice any more.

There are also sometimes open source frameworks. We picked something — that whole Mininet part — and modified it into our own thing. And there are obviously online services like GitHub, which we have by now integrated into so deeply that it would unfortunately not be that easy for us to move to another Git hosting if we ever have to. It's always interesting. As for the CI, I am personally still happy with it; I know a few other people think they could do it better, but yeah...
Scaling is also interesting. As I mentioned, we do all these different platforms — you see the outdated ones here. I mean, we shouldn't still have Ubuntu 18.04; we have mostly moved to Debian 12 as the standard platform. The problem is that the topotests have about 12 hours of total run time. I split it up into multiple pieces, and we are looking at getting more cores so we can speed it up a little. Currently I split everything into ten pieces, so that means, with building — which may be half an hour, sometimes up to 40 minutes, across all the platforms — plus another two hours for testing, we currently end up at about three hours for a single pull request before the person gets the result. In my opinion, that is too long. My old rule was always that it should be less than two hours, which is already quite a long time, and we are working on trying to improve it.

We changed a lot of things over to LXC, mainly so that I don't have full VMs any more. Some things work easily: if it's a classic Debian Linux, I can just use a separate container. If it's RedHat, I can manage somehow — the building is quite easy, I can cross-compile, that works quite well too.

I can do at least that part.

It's a bit more challenging for things like FreeBSD, NetBSD, OpenBSD — though it could be that I just don't understand the tools well enough.

Scaling the testbed itself was always interesting. Initially we ran the topotests on the Raspberry Pis, each running one test at a time, and we were obviously limited by the Raspberry Pi performance. Then we moved some things to LXC containers on the large Arm server. The problem — for those who have looked into LXC — is that the containers share one kernel, and network-wise on Linux that ends up in a central lock. So on that 128-core Arm server I cannot get over about 5% CPU load and about 100 GB of RAM use; the beast is basically sitting idle, because adding a single IP address may take five seconds, just because of the kernel lock and the waiting. So I am stuck there: it's still faster than before, but I lose a lot of resources. We are now hacking around with micro VMs, something I realise not many people have done. If you are not familiar: the idea of a micro VM is that you take a kernel, remove everything you don't need, and slim it down to the minimum required. You get fast boot times — I now have a boot time of a bit less than half a second for a VM. I have my own kernel again, and I can easily switch kernels around. They are obviously not the exact kernels you see in that specific Debian distribution, or in that specific CentOS or RedHat distribution, and that's still sometimes a concern; I'm not sure how big an issue it is.
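
For a rough idea of what launching such a guest looks like, here is a sketch that assembles a QEMU command line for its x86 "microvm" machine type. The paths, flags chosen, and the helper itself are illustrative assumptions, not FRRouting's actual setup:

```python
# Illustrative sketch: build a QEMU command line for a slimmed-down
# "microvm" guest (QEMU's minimal x86 machine model). The kernel and
# rootfs paths are placeholders, not a real configuration.
def microvm_cmd(kernel, rootfs, cpus=2, mem_mb=512):
    return [
        "qemu-system-x86_64",
        "-M", "microvm",          # minimal machine model, fast boot
        "-no-reboot", "-nodefaults", "-nographic",
        "-smp", str(cpus),
        "-m", f"{mem_mb}M",
        "-kernel", kernel,        # custom slimmed-down kernel image
        "-append", "console=ttyS0 root=/dev/vda rw",
        "-drive", f"id=root,file={rootfs},format=raw,if=none",
        "-device", "virtio-blk-device,drive=root",
    ]

cmd = microvm_cmd("/path/to/vmlinux", "/path/to/rootfs.img")
```

Because each test run gets its own kernel this way, the central-lock contention of sharing one host kernel across containers goes away, at the cost of the kernels no longer being the exact distro builds.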

But it is an interesting thing.

The other thing we recently started looking into — no good solution yet: when you test network protocols, there are a lot of timers with more or less fixed times because of what the RFCs say. I am looking at going back to VMs and hacking around so that I can speed up the time inside the VM. With QEMU you can actually change the clock so that it runs at a higher speed inside the VM. Obviously, you lose all external network connectivity if you do that, which makes it a bit challenging to control it while it's running.
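
A purely software-side variant of the same idea — a sketch, not what FRRouting does — is to run protocol logic against an injected clock, so that a fixed RFC timer expires instantly in a test instead of after real seconds:

```python
# Sketch: inject a fake clock so protocol timers can be tested
# without real waiting. Class names are invented for illustration.
class FakeClock:
    def __init__(self):
        self.t = 0.0
    def now(self):
        return self.t
    def advance(self, seconds):
        # Jump virtual time forward; no wall-clock sleep involved.
        self.t += seconds

class HoldTimer:
    """Expires `interval` seconds after creation, per the injected clock."""
    def __init__(self, clock, interval):
        self.clock = clock
        self.deadline = clock.now() + interval
    def expired(self):
        return self.clock.now() >= self.deadline

clock = FakeClock()
timer = HoldTimer(clock, interval=40)  # e.g. a BGP-style hold time
clock.advance(40)                      # 40 virtual seconds, zero real ones
assert timer.expired()
```

The VM approach Martin describes achieves the same effect one layer down, without having to rewrite the daemons to accept an injected clock.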

Another big thing: if you are community-based and you push test writing into the community, a lot of bad tests come in, which are very hard to review. The classic example: people write the feature, they write the test, the test passes, it gets submitted. A few months later somebody else changes something, the test doesn't pass, and you end up with error messages like this one from an OSPF graceful-restart helper test, where the message is basically "the active restart count is true" — and you try to figure out what went wrong. The classic fix would be to add an actual message saying what they were looking for, but it's hard to enforce, because nobody seems to test a newly submitted test for the case when it fails.
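
The failure-message problem can be shown with a made-up example — both checks test the same condition, but only one tells you anything useful months later when it fails:

```python
# Made-up illustration: the helper and the state dictionary are
# stand-ins for querying a live routing daemon.
def get_gr_helper_active(router_state):
    return router_state.get("gr_helper_active", 0)

router_state = {"gr_helper_active": 0}

def check_bad():
    # When this fails, the report is essentially "assert 0" —
    # useless to whoever has to debug it later.
    assert get_gr_helper_active(router_state)

def check_good():
    active = get_gr_helper_active(router_state)
    # When this fails, the report states the expectation and the
    # observed value, so the failure is debuggable without the author.
    assert active >= 1, (
        f"expected at least 1 active graceful-restart helper session, got {active}"
    )
```

Reviewing a submitted test therefore means deliberately making it fail once and reading what it prints — which, as Martin notes, almost nobody does.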

Stability. Obviously always a big challenge. Once you run a large number of VMs and other systems, things go wrong. The network can go wrong; sometimes the connection to GitHub breaks. You have a lot of different distros: RedHat decides — because I constantly roll back the VMs — that something is wrong with the licence check and denies my scripts the package downloads, and I have to fix the VM again. There are tonnes of things which can go wrong, and unfortunately it causes a trust problem: the users at the end say, oh, the CI failed, it's not my code that failed, the CI failed. It's sometimes hard to keep the trust in the community, to explain that if things are failing, you should take it seriously. And sometimes it's maybe not their code; it's from someone before, and because of changed timing, something else starts failing.

And the final thing, obviously: there is a big money issue. Running an infrastructure like that is not cheap. People want good testing, and they come back with: but I can build it and run the test framework in 20 minutes, why can't you? Because they put 64 cores on a single run; if I run 20 or 50 in parallel, I cannot put that many cores on one test run, and I have to split the resources. There are also a lot of people happy to pay for support — they may even pay for features to be developed — but when you ask them about maybe contributing to the CI, that's sometimes quite a hard sell. As an example, we have a few donations which help us run the CI, but we get way more requests for support or feature development.

And that's basically it. I just want to end on a good note.

The great thing I like: I tested for big vendors before, and they never want to publish the bad things. In the open source community, people accept that I can publish errors, mistakes, things which go wrong.

And running everything as a non-profit also helps: we are clearly independent, so we can do things without really needing to worry about it.

And yes, we can pick the tests we think are useful.

We probably don't have much time for discussion — actually, he told me I'm allowed to take a bit more time.

MARCOS SANZ: Maybe we have time for one or two questions...

MARTIN WINTER: But I would love it if we could somehow figure out a way to exchange ideas, share resources, talk. Feel free to ping me, feel free to send something to the Working Group list, and let's see if we can figure out a way to share more.

AUDIENCE SPEAKER: Gert Döring, in this case speaking as OpenVPN. We try to do decent CI, decent test coverage. We don't have such a huge budget yet, so we're not running 600 VMs. We spoke about this at lunch two days ago.

Something I have been thinking about is the environmental impact of this, and it worries me already for my small test environment, because I see how much CPU I am burning. If I, as a human, look at a patch, I can see: okay, this patch is a comment, so as long as it compiles, I'm sure there are no funky characters in it, but I don't need to compile it everywhere and I don't need to run everything. Or: this patch is in the Linux section, so there is no need to run it on the other platforms. I am a bit smarter than my CI so far, so I can see this, but maybe this would be something interesting for evolving the CIs: for the initial patch submission, only run a subset of tests, and once a day run the full set on what you have in the repository. We do this for parts of our tests: some get run all the time, some get run just twice a day — basically because they are not properly integrated yet. But it also makes sense due to test run time: the daily test run takes so long that I want to run it in the night and look at the results in the morning.

MARTIN WINTER: So, we talked about this before; the power consumption angle is very interesting. I thought I might want to go back and first figure out how much energy I'm wasting on one run — that's kind of an interesting idea. Figuring out what to run: I think it's a good idea. To be honest, I currently have no idea how to do that reliably and automatically. If you ever figure something out, I would love it if you share that.

AUDIENCE SPEAKER: Alexander: "I wonder what costs more the infrastructure or the people?"

MARTIN WINTER: The people. That's quite easy, because for running and maintaining it we don't have that many people working on the infrastructure; unlike classic companies, we cannot afford people that easily. So we have to automate as much as possible — way more than any commercial company I have seen — just because the hardware is far cheaper.

MARCOS SANZ: Thank you very much, Martin.


And with that, we are done. We want to remind you to vote: express your support for one or the other co-chair candidate on the mailing list. Voting is open now with an official e-mail to the mailing list, for a period of two weeks from now. And that's it. See you in Krakow.

MARTIN WINTER: We'll do our best, even before the new Working Group co-chair starts, to move it back to the afternoon.

MARCOS SANZ: See you at coffee.

(Coffee break)