Gitea Diaries: Part 1

First installment of a series of articles about moving Blender’s main development website from Phabricator software to Gitea.

Blender’s website developer.blender.org is the well-known go-to place for reporting bugs, submitting patches and checking for updates on fixes for problems people have reported.

It’s clear that a project the scale of Blender involves countless interactions between our users, artists, tinkerers, testers and developers, bugfixers and other contributors.

developer.blender.org has been doing this for our community for years now, and successfully so. Sadly Phabricator, the software running behind the scenes, has come to be discontinued; the announcement about this arrived a bit over a year ago as of writing this post. The writing on the wall was clear: we’d have to look for a replacement.

Since early 2022 I have, among other things, been tasked with looking into this: checking what’s available, what our choices are, and helping us reach a decision on what to do. This brings us to the title of this post.

Of all the horses we’ve looked at, Gitea is the software we’ve decided to put our faith in to serve as our next-generation development forge.

Over the first half of 2022, work has been put into understanding which options out there we could live with, would function well, and would be future-proof enough to spare us another move as involved as the one we face now. There are the obvious answers, of course: GitHub, GitLab and Atlassian all have something that works well as a forge, and they assuredly have happy customers.

For the Blender project, however, just being available and usable are not the only requirements; it became clear that if we wanted to adhere to principles regarding the betterment of open-source software, there was an obvious option available to us that we should be taking seriously. Hence, Gitea has been chosen, also because it is mature, feature-packed and built around modern development idioms which give the promise of it being built to last us a good long time.

Next to this, it promises to deliver on a number of things we’ve been wanting to have Phabricator do for us; including more straight-forward project-management, better CI/CD integration, and a more well-understood/well-known process for people to use for submitting code.

This article aims to be the first in a series of posts about the journey that we have set out on, now with Gitea as the focus.

The aim of this journey is to have a working replacement for Phabricator in the course of October 2022 (in time for Blender Conference, in fact!).


First Steps

Now that the tool-of-choice has been chosen, we get to look at how to get it to do all that we want of it. Does it perform? Does it handle a load? Can it handle our data? Will we be able to keep our development history after we’ve moved? How hard will that be? What compromises might have to be made?

For that, we need data. In particular, our current data inside of Gitea. It’s time for a first attempt at importing data into Gitea from Phabricator.

Sadly, no finished, off-the-shelf importer exists for moving from Phabricator to Gitea. Not many other platforms have one, in fact. There may be several reasons for this, but one of them definitely is that the way Phabricator “does things” is just very different from how most other platforms choose to do things. In other words, some of the core concepts of a forge, like repositories, projects, teams, tasks/issues, labels and statuses, have different meanings and different relationships to each other in Phabricator and Gitea.

The biggest hurdle that needs to be tackled is just how Gitea may or may not support importing pull-requests and review-data (called Differentials in Phabricator) in an ‘elegant way’™. Let me explain…

Gitea supports the Fork → Branch → Pull-request → Merge workflow that is common on other platforms. This is the workflow we intend to use for future development, though there are some challenges to be worked out, particularly the amount of disk space it requires: in its default configuration, Gitea allocates a full copy of the data for each fork. With a current repository size of around 800MB, it is easy to see that this presents scaling challenges.

However, given that Phabricator has patches that don’t have a ‘source’ reference, these would be somewhat awkward to import.

The main avenues of investigation regarding this issue are currently:

  • See if we can import them in a way that’d look the same as what you’d get if you’d accept a pull request from a branch in a fork that’d get deleted afterwards. That’s essentially ‘the same kind of thing’ (nothing to point to anymore).
  • See if we can make a ‘synthetic’ fork-repo of the main blender repository that’d reflect all the changes of each ‘Differential’ that ever got created in Phabricator. This’d be ‘technically a cool feat’, but not expected to be possible without a serious amount of hackery/wizardry.
  • Gitea might support the concept of an ‘imported’ Pull-request, migrated from a different platform, without needing to have the context originally available on the platform migrated away from; this would be an almost direct fit.
  • Last but not least: request doing some customization on the Gitea code to support having PR’s without context to refer to.

These are just the approaches we’ve currently identified and it’s clear there’s going to be some way to handle this in a way we’d like. It’s less a showstopper or even a hurdle than that it is just where we’re at at the moment.

There’s a good list of things still to be worked out; here’s a short-list of some of the things to tackle:

  • Consolidating (most!) users between Phabricator and Blender ID so that when you login to Gitea, you get to do everything as you used to, just via Blender ID.
  • Getting workboards into Gitea that are based on a query and not just ‘project’ and be able to save them for re-use on dashboards, etc. Allow queries to include ‘labels’.
  • Importing attachments in a way that their use as ‘inline media’ can be supported as was the case in Phabricator.
  • Badge-support (in conjunction with Blender ID badges).
  • As noted above already: Find a way to mitigate disk-space usage implications when forking a repository.
  • Have issues be able to belong to multiple projects.
  • Possibility of supporting ‘priorities’ in another way than using ‘labels’ for that.
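
To put a rough number on the disk-space item in the list above, here is a back-of-the-envelope sketch. It assumes every fork is a full, unshared copy of a repository of around 800MB, as described earlier; the fork counts are hypothetical and only there to show the linear growth.

```python
def fork_storage_gb(repo_size_mb: float, fork_count: int) -> float:
    """Disk space (decimal GB) used when every fork is a full copy."""
    return repo_size_mb * fork_count / 1000


REPO_SIZE_MB = 800  # approximate repository size quoted in the article

# Hypothetical fork counts, chosen only for illustration.
for forks in (10, 100, 1000):
    print(f"{forks:>5} forks -> {fork_storage_gb(REPO_SIZE_MB, forks):,.0f} GB")
```

Any mechanism that lets forks share objects with the main repository would bring the per-fork cost well below this estimate.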

Meanwhile

Wait; does all of this mean that all that’s been done is just hypothetical/theoretical work and no actual things have been made yet?

Well, no, not quite.

So far, this is what we already have:

  • There’s a Blender ID authentication module that allows you to log in natively.
  • There’s a Python-based exporter/converter tool that takes data out of Phabricator and is able to mush things into Gitea’s ‘restore-repo’ format with (partial) incremental-fetch functionality (think ‘rsync’). This is being prepared for hosting as an open source project, for re-use by others facing the same challenge.
  • There’s a mapping of Phabricator projects to Gitea labels.
  • There’s a running instance with all Blender-project issues (around 72K) imported at https://gitea1.dev.blender.org/blender-foundation/blender (login with Blender ID).
  • And, quite significantly, we’re in the finishing stages of having a support agreement with the Gitea project to have them support us in our migration by funding work on missing features, code and bugfixes that will be available 100% to Gitea users under Gitea’s MIT license.
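
As an illustration of the exporter/converter idea in the list above, here is a minimal sketch of an incremental fetch combined with a project-to-label mapping. Everything in it is an assumption made for the example: the `Task` fields, the project and label names, and the output dictionary are hypothetical, and the output is not Gitea’s actual ‘restore-repo’ format. It is not the tool described in this post.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical, simplified stand-in for an exported Phabricator task."""
    task_id: int
    title: str
    projects: list[str]
    modified: int  # Unix timestamp of the last change


# Hypothetical mapping table: Phabricator project name -> Gitea label name.
PROJECT_TO_LABEL = {
    "Modeling": "module/modeling",
    "User Interface": "module/user-interface",
}


def changed_since(tasks: list[Task], last_sync: int) -> list[Task]:
    """Incremental fetch: keep only tasks modified after the previous sync."""
    return [t for t in tasks if t.modified > last_sync]


def to_issue(task: Task) -> dict:
    """Convert a task into an issue dict; unmapped projects are dropped."""
    labels = sorted(
        PROJECT_TO_LABEL[p] for p in task.projects if p in PROJECT_TO_LABEL
    )
    return {"number": task.task_id, "title": task.title, "labels": labels}


def sync(tasks: list[Task], last_sync: int) -> tuple[list[dict], int]:
    """Return converted issues for changed tasks plus the new sync marker."""
    changed = changed_since(tasks, last_sync)
    issues = [to_issue(t) for t in changed]
    new_marker = max((t.modified for t in changed), default=last_sync)
    return issues, new_marker


tasks = [
    Task(1, "Crash on startup", ["Modeling"], modified=100),
    Task(2, "Typo in tooltip", ["User Interface", "Unmapped"], modified=250),
]
issues, marker = sync(tasks, last_sync=200)
print(issues, marker)
```

Keeping the sync marker separate from the conversion step means a change in the conversion logic can be re-run over already exported data without fetching everything again.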

Next Steps

For the coming week(s), most of my time will be going towards making sure that we can have the right feature descriptions and understand just how important (or not) they are for us so that we can understand what to work on first, what to get ready and make sure we have functioning before anything else.

There are going to be things we can leave for later; implement after a possible migration date; but a number of things are obvious ‘must-haves’ before even considering being able to switch. Getting that right is important.

With that, I also expect that somewhere at the end of next week, there might be enough new things to report to deserve another code-blog post: Gitea Diaries: Part 2!

Until then, happy coding!
Arnd

24 comments
  1. Hey, I’m Stan, the project leader for the FOSS game 0 A.D: Empires Ascendant (https://play0ad.com), which also happens to use Blender for its 3D modeling needs :) (Still stuck with collada, though)

    We’re also using Phabricator (but with SVN), and migrating everything (Differentials and their comments, Commits with their concerns and comments, Pastes) seems tricky. There is an old script for Gitlab that only supports up to 12… And if we go that way I also have to switch from Jenkins to Gitlab CI, which seems like a very long process. And finally, we’d have to migrate Trac (issues and wiki) to Gitlab, while keeping all the information about reporters and committers.

    Since we have the exact requirements for being self-hosted I totally understand the struggle and wish you the best.

    1. Does anyone have a link to the Devtalk where the missing Gitlab features were mentioned? (Maybe we suffer from the same limitations)
    2. What type of CI do you hook with Gitea? Maybe I can save some work by not having to migrate Jenkins, although it makes one more platform to maintain
    3. With regards to server load is Gitea lighter than Gitlab?
    4. Is there a place where I can get in touch with you, or at least follow your process (IRC, Matrix, XMPP, etc.)

    Best regards,

    • Hi Stan,
      Sorry to be slow getting back to you about this.
      In short, here’s the answers to some/all of your questions:

      1. There’s no link as such; it’s more of a dispersed set of conversations that have been going on about the topic in several places, a lot of it in personal talks with developers on blender.chat, etc. A big omission in gitlab OSS was the ability to have multiple reviewers per PR. This would be something that’d need to be added back into the software for it to work well for our workflow. This inevitably led to the question of what value there is in adding back features that are simply already present in the enterprise version(s).
      In the end, this really didn’t sit well with us and we decided to put our money where our mouth is with regards to supporting open source solutions whenever they are available. Hence our decision to migrate to Gitea and to have it be to the betterment of both our projects.

      2. We currently have a BuildBot setup (see builder.blender.org). This setup isn’t currently ‘hooked up’ to our Phabricator instance very intimately, but the intention is for that to be the case once we have migrated over to Gitea. We anticipate that this’ll take some engineering, as our setup is currently rather atypical, but it functions well enough in the meantime. If it turns out to be easier to move to a different CI that is better integrated, we may do that. This is an open question.

      3. Gitea so far has proven to be a very light-weight piece of software, consisting of one Go binary that handles all the requests and connects to an SQL database. This is not to say that all pieces of Gitea code are more efficient than Gitlab’s or those of others, but ‘footprint-wise’, it seems quite a lean piece of machinery. The only other pieces of spinning code are the OpenSSH server and the Git binary used to do Git operations with. Especially the latter might have room for optimizations. The brief experience I have had with Gitlab tells me that while it certainly doesn’t underperform, it comes with quite a few ‘extra’ things out of the box that start spinning in parallel. If you use all that stuff, this might be exactly what you want.

      4. I’m available on blender.chat as ‘Arnd Marijnissen’. Feel free to reach out to me!

      Again, sorry for the late reply.

  2. I want to add something regarding the full copying mechanism of Gitea: `git-clone(1)` will actually _hard link_ clones made from the same machine (you can tell it to do so explicitly with the `-l`/`--local` flag), so all of the Git objects in `.git/objects`, where the blobs (files) and trees (directories) are stored, won’t be copied and you won’t use much extra space (except for maybe copying all the filenames for the new inodes, etc).

    • I’m not sure if Gitea uses this feature, but it might be worth investigating…

      • We’d have to investigate this. The internals of the git-fork process need to be looked at to see what it does, how and why and see if there’s ways of mitigating the expected disk-usage. The above suggestion is a useful one. Thanks!

        • Yes, the default “nogogit” backend does use it, since it uses the git binary that does this by default …

          • And beside git everything else you can store in an s3 compatible object storage (like git-lfs stuff)

          • I wonder how this interacts with garbage collection. I would imagine that running git gc on each repository breaks the hard links? Or does it work to just never run git gc, and performance remains good enough?

          • Hard links are implemented in the filesystem so afaik deleting a hard link to data shared by other hard links won’t delete the data itself, unlike symbolic links.

            For example:

            $ echo foo > foo
            $ ln foo bar
            $ rm foo
            $ cat bar
            foo
            $ ln bar foo
            $ ls -lh bar foo
            -rw-rw-r-- 2 b-fuse b-fuse 4 Jul 13 11:47 bar
            -rw-rw-r-- 2 b-fuse b-fuse 4 Jul 13 11:47 foo

          • Right, I understand that hard links ensure the data remains there. What I meant is that if git gc runs on original and/or forked repositories, then those hard links will be replaced by new files local to each repository, and disk usage goes up quickly.

  3. Why was GitLab community edition decided against? I’m just curious

    • Without repeating the explanations given on Devtalk, the main argument there is that it currently lacks a set of features that we require that ARE present in the ‘Ultimate Edition’. This means that if we would choose to go for the CommunityEdition, we’d run into having to re-create features found in their commercial offering.
      It’s clear that these kinds of changes would thus never really become part of the ‘mainline’ code of Gitlab; and we’d forever be having to port these changes over to any new version of Gitlab to be able to upgrade.
      That’s rather a precarious situation to be in, next to being rather ‘thankless work’ that’s not re-usable by others. Missing the point of what it is to have something be open source.
      This is not to say that Gitlab CE isn’t a nice piece of software. It just doesn’t fit very well with what Blender would like to achieve in contributing to open source development.

      We have some features in mind for Gitea that we’d really need and would love to see implemented in such a way that everyone using Gitea will be able to benefit from those, too.

      • Thanks for the clarification. And sorry for making you repeat yourself, I watched that episode with Pablo where you explained why as well but I completely forgot 😅

    • If you are interested in a comparison: https://docs.gitea.io/en-us/comparison/

  4. Hey Arnd, thanks for sharing this. KDE folks did the migration from Phabricator in a not so distant past. Even though they ended up choosing Gitlab instead of Gitea, I’m pretty sure they can share some useful insights/tools to aid Blender on the migration. Have you tried to talk with them?

    • We haven’t, though we’ve looked at their process towards Gitlab. They might have some useful insights about getting data out of Phabricator, however. The issue is not so much getting things out of Phabricator as it is how to rework/format data into the forge we chose: Gitea. So in that sense there might not be so much we’d be able to reuse.
      The intention to ping them is there, however.

  5. Proud of you guys

  6. gitea core maintainer here :wave:

    nice to see you choose us :)

    for migration:
    https://docs.gitea.io/en-us/migrations-interfaces (the interface you just have to implement …)
    https://github.com/go-gitea/gitea/issues/8689 (there you can find pulls that did add support for other forges etc …)

    I created a new issue ( https://github.com/go-gitea/gitea/issues/20344 ), where we can track stuff

    For CI/CD https://www.drone.io/ and https://woodpecker-ci.org/ are widely used .. but there are more (https://gitea.com/gitea/awesome-gitea#devops)

    by the way gitea will understand https://forgefed.org in future – if it is finally stable it would be awesome to see you federating too.

    • Heyhey! Cheers for having you join us in here !

      It is very unlikely we’ll be doing a ‘standard migration’ for our move from Phabricator->Gitea as there are a few things that’d make that an unwieldy method.
      First, it’d take hours and hours to get the data over. That wouldn’t be so bad if we’d only have to do it once, but for development and acceptance purposes, we’d want to be able to do a migration perhaps multiple times per day just to see what the result of a software change in the import logic might be.
      Next to that, the way that the Blender software has been put into Phabricator isn’t ‘optimal’ and abuses Project-labels in a somewhat ‘non-standard’ way. For this, and some other kinds of data types, we’re currently using an ‘export’ tool to get data OUT of Phabricator, and a ‘convert’ tool that is able to collate/combine/modify the data into something that we’d like Gitea to import.
      This means remapping Phabricator ‘projects’ to things like ‘Projects/Labels/Etc’ in Gitea. It means augmenting the user data that Phabricator exposes with info about email addresses (which Phabricator doesn’t export by default), and, while we’re at it, matching the data with the correct Blender ID OAuth login so that things will work ‘as they should’ when we stop using local login info for our forge.

      So…while having a ‘Phabricator Migration’ support in Gitea is good; it’s likely not going to be directly how we’ll need to be doing things.

      On the CI/CD side, we already have a BuildBot stack that we intend to re-use. See builder.blender.org. There’s support for that in Gitea (addon), so we’ll be looking at that with interest.

      Cheers!

    • I am curious why you don’t use your own project to manage itself??

  7. Good luck…

  8. I have no idea what the Goal of this blog post is but I know one thing for sure. And that is… “I Love Blender” 😍
