Initial support for BitTorrent distribution of ISOs #2126
For reference, you can't just use the Fedora tracker blindly. The Fedora Infrastructure Team must submit every torrent. As for the web seed, the blocker is range-request support on the downloads: https://community.cloudflare.com/t/public-r2-bucket-doesnt-handle-range-requests-well/434221
Indeed. It would need to be coordinated with them somehow. But it would also have to be automated and handled very differently from their regular procedure, which means it would require even more assistance from them. So that patch definitely needs to wait. (It's a draft and it shows.) But if that turns out to be impossible (e.g. because of licensing issues), we can either forgo a tracker entirely or use some general-purpose one instead. I guess we will want to avoid ones even remotely associated with piracy, so something "Linux-related" would be best. Maybe https://linuxtracker.org/? The documentation is pretty sparse, so from a quick glance I can't really tell if it's suitable. It seems to rely on using the web interface and accounts to submit new torrents (which could probably be automated with a simple
Yeah. I've now found a few reports like that. Seems to be a Cloudflare bug (or at least an edge case in the HTTP specification which doesn't mesh well with many clients). Nothing conclusive, though.

I was hoping to test how well Transmission handles it. That's why I would love to see that first patch tested in action. It would let me try downloading an ISO with just the web seed so I could check how reliable it is for me. But it has just occurred to me I can simply create this
From my understanding, the web seed will not function at all, because it depends on the Range header being respected and working correctly. I have not tried making a torrent to test web seeding yet, but I have confirmed that the Range header does not produce the expected responses from the download CDN.

There is also fosstorrents, which runs an open tracker. They have a Discord for arranging official partnerships, one of which is with Ultramarine.

On the topic of whether or not to have a tracker: DHT is not something everyone can use. Depending on their NAT, some clients require a tracker to function at all.
I joined their Discord in December with the intent of investigating. It seems like a good fit for us. Might be worth looking into further and then sending over some cash to help with infra?
I've now tested this with Transmission 4.0.6. I couldn't get it to work with https://download.bazzite.gg/bazzite-deck-stable.iso as the web seed, but whether R2 reliably supports range requests seems to vary between people, so it might work for someone else. Then I tried the same, but with the image served by Python's
Yeah. I meant that in case we couldn't find a suitable tracker, getting this done would still be useful to some people. But while testing these things out I found one more reason not to use trackerless torrents:
I've reached out to fosstorrents and applied for support for our images.
From what I linked before in the Cloudflare community support thread, R2, where the ISOs are stored, does support range requests. However, when using a custom domain, it applies Cloudflare's cache to the files, and that cache does not support Range headers on files over 500 MB. Setting a page rule for the (sub)domain to bypass Cloudflare's cache fixes the issue, as all requests then hit R2 directly, which handles the Range header correctly.
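For anyone who wants to check a given URL themselves, here is a minimal sketch (Python standard library only) that asks for a single byte and looks at how the server answers. The function names are made up for illustration. The assumption is the usual HTTP behaviour: a 206 with a Content-Range header means the range was honoured, while a plain 200 means the server (or a cache in front of it) ignored the Range header and is sending the whole file.

```python
import urllib.request


def classify_range_response(status: int, headers: dict) -> str:
    """Classify how a server answered a ranged GET request."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {k.lower(): v for k, v in headers.items()}
    content_range = lowered.get("content-range", "")
    if status == 206 and content_range.startswith("bytes "):
        return "range honoured"
    if status == 200:
        return "range ignored (full body offered)"
    return f"unexpected status {status}"


def probe(url: str, start: int = 0, end: int = 0) -> str:
    """Request bytes start..end of url and classify the answer.

    Note: urlopen raises urllib.error.HTTPError for 4xx/5xx answers
    (e.g. 416 Range Not Satisfiable), so those surface as exceptions.
    """
    request = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        # The body is not read; only status and headers matter here.
        return classify_range_response(response.status, dict(response.headers.items()))
```

Calling `probe("https://download.bazzite.gg/bazzite-deck-stable.iso")` would run the check against the download link used elsewhere in this thread. The result may differ between people and over time, depending on whether the cache is involved.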
Here are two examples for anyone who wants to test them out.

They were created with these two commands:

    mktorrent --announce={udp,http}"://fosstorrents.com:6969/announce" --piece-length=20 --web-seed=http{,s}"://download.bazzite.gg/bazzite-deck-stable.iso" .../bazzite-deck-stable.iso
    mktorrent --piece-length=20 --web-seed=http{,s}"://download.bazzite.gg/bazzite-deck-stable.iso" .../bazzite-deck-stable.iso

What to check for:
Co-authored-by: AzemaViator <[email protected]>
I've pushed a v2 and I will address your earlier replies tomorrow.
I went over all the loose ends here and prepared a list of answers, questions and ideas. But it has been a long day for me and I'm simply too tired right now to turn all of that into a clear and coherent text. Unfortunately, that will have to wait for tomorrow. Sorry for the delay and thanks for all the assistance so far!
Alright, I'm back working on this. Sorry for making you all wait far longer than I had stated.

There's a lot to unpack here, so I divided the whole thing into sections. Some of the later parts are general ideas on things I noticed when researching and working on this. These don't need to be acted on and some are frankly quite far-fetched. But I would still love to hear your thoughts on them (as long as you have the time and energy for it). I put todo checkboxes next to the points directly linked to and necessary for this PR.

Cloudflare R2 vs Cloudflare proxy cache
You're right! When I had first looked at it I saw a bunch of seemingly conflicting reports but no official statement from Cloudflare support. For the record, here's the OP of that thread explaining the whole situation.
Other web seeding ideas

Besides uploading the build artifacts to R2, the GitHub action also stores them within GitHub itself for other workflows to access. Could we use GitHub as a second web seed here? We can get stable URLs for them now, but these only work with authentication. Thankfully there's this GitHub app that proxies the traffic and enables anonymous downloads. Alternatively, we could just attach the artifacts to their respective releases, as these are already being created anyway. (rel)

Tangentially, it would be awesome if some torrent clients out there could still handle web seeds that don't support range requests. Transmission handles such cases by just not using them (and then trying over and over, I guess). I couldn't easily find what libtorrent does. I mean, when there's no other source available, trying to download the entire file in one go is a much better option than failing indefinitely. Probably more trouble than it's worth, but I'll ask how hard it would be to implement that.

DHT and trackers

DHT most likely works, as my node had some activity even though neither of the tracker URIs worked on my end. It received about a megabyte and sent nothing (I have ~120MB total, mostly from the LAN web seed tests I did earlier). Oh, and for a split second I also saw some peers connect (one from Hungary and one from France).
While we're at it, could someone else go over my logic here and tell me whether I understand this correctly? When using multiple trackers, you organize them in tiers. Several tracker URLs can be put into the same tier for the purpose of load balancing, as just one tracker from a tier will be chosen at any particular time. On the other hand, tiers are tried in order until a satisfactory number of peers is obtained. So if we find a second tracker, it should definitely be put in a different tier.

But what about our case, when the tracker offers two URLs, one per protocol? Should they go in the same tier (for load balancing) or in consecutive tiers (to give priority to the protocol we consider better)? Or does it not matter, because it will be handled reliably either way and there's no point in chasing any efficiency here (since this traffic is easily dwarfed by the transfer of the actual data)? It matters much more for the server, though, so we should think it through at least a bit.

What about HTTPS trackers? It would be one step forward in terms of maintaining privacy, but it's a public torrent anyway, so it really doesn't matter, right? But what is the threat model here, though? And what about adoption / compatibility? Same goes for HTTP web seeds, I guess.

Testing
Before opening this PR I tried to run the workflow from my clone of the repo. It failed, which is not surprising since I didn't really have anything set up. Would I need to get this working or can someone else test it much more easily?

Future steps

Once the

Is there anything we would like to coordinate with fosstorrents.com on? On the main page they have a "recent releases" section. Maybe we should help in getting Bazzite onto that list? (or maybe not, since it has a very rapid release cadence)

Random thoughts and ideas:
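On the tier question raised under "DHT and trackers" above, here is a small sketch of the selection rule as I read it in BEP 12 (the multitracker extension). It is an illustration only: real clients are known to deviate from it (some announce to every tier, for example), and the helper names are invented. The two URLs are the fosstorrents ones used in the mktorrent command earlier in the thread.

```python
import random
from typing import Callable, List, Optional

UDP = "udp://fosstorrents.com:6969/announce"
HTTP = "http://fosstorrents.com:6969/announce"

# Option A: both protocols share one tier (order inside the tier is shuffled).
SAME_TIER = [[UDP, HTTP]]
# Option B: consecutive tiers, so UDP is always preferred while it works.
SEPARATE_TIERS = [[UDP], [HTTP]]


def load_tiers(announce_list: List[List[str]], rng: random.Random) -> List[List[str]]:
    """Copy the announce-list and shuffle each tier once, as on first read."""
    tiers = [list(tier) for tier in announce_list]
    for tier in tiers:
        rng.shuffle(tier)
    return tiers


def announce(tiers: List[List[str]], try_tracker: Callable[[str], bool]) -> Optional[str]:
    """Try tiers in order and URLs inside a tier in their current order.

    The first tracker that answers is moved to the front of its tier and
    returned; later tiers are only reached if every earlier URL failed.
    """
    for tier in tiers:
        for url in list(tier):  # iterate over a copy, the tier is mutated below
            if try_tracker(url):
                tier.remove(url)
                tier.insert(0, url)
                return url
    return None


if __name__ == "__main__":
    rng = random.Random()
    udp_blocked = lambda url: url.startswith("http")  # pretend UDP is filtered
    print(announce(load_tiers(SEPARATE_TIERS, rng), udp_blocked))
    print(announce(load_tiers(SAME_TIER, rng), udp_blocked))
```

Under this reading, both layouts fall back to the HTTP URL when UDP is unreachable. The difference only shows when both work: separate tiers always pick the first tier's URL, while a shared tier picks whichever URL the shuffle put first and then sticks with it.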
Thanks for your work on this! I went ahead and made a donation to fosstorrents.com on Universal Blue's behalf (so we can get Aurora and Bluefin going too). I'm currently unable to spend time coordinating with them, but ideally we can connect this work to whatever automation they have to make it all nice and magical. I've sent them the link to this issue, but they have a Discord if someone wants to follow up! (I'm too time-limited right now to help!)
Hello everyone! Before even becoming a user, I present you with this patchset. The aim here is to make BitTorrent an alternative supported way of obtaining Bazzite ISO images for installation.

The PR is marked as a draft because not everything here is ready to be merged just yet. While the first patch should work (and result in .torrent files being published for each built ISO), I wasn't able to test it on my own. When I try to run the GH action from a cloned repo, the build fails at the "Determine Flatpak Dependencies" step (docker: Error response from daemon: Head "https://ghcr.io/v2/mskiptr/bazzite/manifests/stable": denied.).

The second patch is a work-in-progress. I would like to mark each generated .torrent with the information which build number | version | revision it corresponds to. But how do I obtain that? I couldn't find it anywhere among all variables in that YAML file.

Finally, the third patch adds the official Fedora tracker. It isn't necessary, since there's always the DHT and peer exchange, so the torrent should work without a tracker but adding one might still be beneficial. The patch is marked as a draft because using that tracker will have to be coordinated with the Fedora infrastructure admins most likely. (And btw, if this really needs a tracker, we can always look for some general-purpose one.)

Now, how will the swarm start up? At first I thought that for each variant and version someone will have to obtain the ISO manually and inject it into a torrent client. That would certainly be not ideal to say the least, but at least it should be simpler than getting someone else to send you parts of the file when the transfer keeps failing for you.

But it turns out we have a much better option. There's this thing called Web Seeds. It basically means that a .torrent file can contain a list of URIs that should be tried automatically, side-by-side with any actual BitTorrent peers.

The apparent lack of support for HTTP range requests from the current CDN might throw a wrench in that. But even then having an official .torrent available would still be an improvement here. Another problem might be that currently the web download link is not unique between consecutive Bazzite releases. So if someone tries to download the ISO using a .torrent file from an older release, the software might get confused.
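To make the web seed idea and the "older release" concern above a bit more concrete, here is a rough sketch of what ends up inside a single-file .torrent. It assumes the BEP 19 style of web seeding, where the URIs live in a top-level `url-list` key, and uses a hand-rolled bencoder. The helper names and the tiny stand-in "ISO" contents are invented for illustration; only the download URL comes from this PR.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists and str-keyed dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must appear sorted as raw byte strings.
        items = sorted((key.encode("utf-8"), val) for key, val in value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def piece_hashes(data: bytes, piece_length: int) -> bytes:
    """Concatenated SHA-1 digests, one 20-byte digest per piece."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def make_metainfo(name: str, data: bytes, web_seeds: list, piece_length: int) -> dict:
    return {
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            "pieces": piece_hashes(data, piece_length),
        },
        "url-list": list(web_seeds),  # web seeds sit outside the info dict
    }


def piece_is_valid(meta: dict, index: int, piece: bytes) -> bool:
    """Check a downloaded piece against the digest stored in the torrent."""
    expected = meta["info"]["pieces"][index * 20:(index + 1) * 20]
    return hashlib.sha1(piece).digest() == expected


# Stand-ins for two consecutive releases published under the same URL.
old_iso = b"release-1" * 1000
new_iso = b"release-2" * 1000
meta = make_metainfo(
    "bazzite-deck-stable.iso", old_iso,
    ["https://download.bazzite.gg/bazzite-deck-stable.iso"],
    piece_length=4096,  # tiny for the demo; --piece-length=20 means 2**20 bytes
)
torrent_bytes = bencode(meta)
```

With this setup, `piece_is_valid(meta, 0, old_iso[:4096])` holds while the same check on `new_iso[:4096]` does not. In other words, a client using an old .torrent against a URL that now serves a newer image should see every piece from that web seed fail its hash check rather than silently end up with a mixed file. How gracefully a given client reacts to that is a separate question.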