About three weeks in, it was time for another update on just where we are, where we’re going, what’s happened, etc.
However, a few things got in between: a short vacation as well as a bit of bad luck with Covid-19. But everything’s alive and kicking. Thank god for online software development.
On the 10th of August, I recorded a set of short videos, in between recovering from Covid and leaving for a two-week trip.
It sadly took until today for me to get around to posting them publicly.
Below is a playlist of 10 videos that give an overview of where we stood on the 10th of August.
Picking up from where we left off, I am happy to report that there has since been progress in some of the areas mentioned.
Baseless patches
Specifically, the issue discussed in part 8 (‘Specific issues with baseless patches’) has seen some approaches tested with promising results that allow us to get them into Gitea’s data-model properly and neatly.
The upshot is that we’re iterating through the data we get from Phabricator for every pull-request it has, checking for any info it hands us. If we have a commit-hash for each patch-revision that’s part of the pull-request/review, then we obviously have no real issues and can get the data properly into Gitea.
There are however pull-requests where only one revision (typically the last) has a commit-hash associated with it. In fact, there are also quite a few that have NO commit-hash associated whatsoever. These are the troublesome ones, but it’d seem that using a git-bisect approach allows us to retroactively come up with a *valid* (read: not necessarily correct) commit-hash that the patch applies cleanly to.
It is likely we’ll add some kind of attribute/comment to these to indicate they were ‘best match’ retrofits and not original data.
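To make the idea concrete, here is a minimal sketch of such a retrofit, not the actual migration code. The names `BaseMatch`, `find_base` and `applies_cleanly_with_git` are hypothetical, and the search is a plain newest-to-oldest scan over candidate commits rather than a true bisect, since "the patch applies cleanly" is not guaranteed to flip only once along the history. The `retrofitted` flag stands in for the ‘best match’ marker mentioned above.

```python
import os
import subprocess
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class BaseMatch:
    commit: str
    retrofitted: bool  # True when the hash was inferred, not recorded in Phabricator


def find_base(
    candidates: Sequence[str],
    applies_cleanly: Callable[[str], bool],
    recorded: Optional[str] = None,
) -> Optional[BaseMatch]:
    """Return the recorded hash if there is one, else the first candidate
    commit (in the order given) that the patch applies cleanly to."""
    if recorded is not None:
        return BaseMatch(recorded, retrofitted=False)
    for commit in candidates:
        if applies_cleanly(commit):
            return BaseMatch(commit, retrofitted=True)
    return None  # no candidate fits; leave this one for manual handling


def applies_cleanly_with_git(repo: str, patch_path: str) -> Callable[[str], bool]:
    """One possible predicate (an assumption): check out each candidate in a
    scratch clone and ask `git apply --check` whether the patch would apply.
    This changes the working tree of `repo`, so use a throwaway clone."""
    patch = os.path.abspath(patch_path)

    def check(commit: str) -> bool:
        subprocess.run(
            ["git", "-C", repo, "checkout", "--quiet", "--detach", commit],
            check=True,
        )
        result = subprocess.run(["git", "-C", repo, "apply", "--check", patch])
        return result.returncode == 0

    return check
```

Passing the predicate in as a function keeps the search itself independent of git, so it can be tried out on plain lists of hashes before pointing it at a real repository.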
The issue of what to do with these kinds of ‘baseless patches’ was quite a complicating topic, so it’s nice to see that the first code-experiments of our suggested approach seem to be working out.
Next update
A next update is expected to land somewhere at the end of this week or the beginning of the next.
Thanks for the update. I listened to these while working and found it interesting to know what’s going on behind the scenes.
The issue of repositories and de-duplication is quite interesting. It’s unfortunate that file-system level de-duplication isn’t a good long-term solution (as you mention, the underlying files diverge).
In Part 6 of your playlist, you asked for input, so here’s mine:
Yes, that sounds like a useful feature; but instead of adding an offset, you could encode them as odd/even: Task `n` would map to `2*n` in Gitea, Differential `n` would map to `2*n+1`. The reverse mapping from Gitea to Phabricator is just a division by 2, where the remainder tells you which type it was. If you have `k` types of data, multiply by `k` instead of 2.
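The odd/even scheme above can be sketched in a few lines. This is only an illustration of the arithmetic; the `KINDS` tuple and the function names are made up for the example and are not taken from the migration scripts.

```python
# Order fixes the remainder: index 0 -> even IDs, index 1 -> odd IDs.
KINDS = ("task", "differential")


def to_gitea(kind: str, n: int) -> int:
    """Interleave Phabricator IDs: kind number r with ID n becomes k*n + r."""
    return len(KINDS) * n + KINDS.index(kind)


def from_gitea(m: int) -> tuple:
    """Reverse mapping: the quotient is the original ID, the remainder the kind."""
    n, r = divmod(m, len(KINDS))
    return KINDS[r], n
```

With two kinds, Task 5 becomes 10 and Differential 5 becomes 11; adding a third entry to `KINDS` switches the scheme to multiples of 3 without further changes.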
A big thanks to you and all the other folks at Blender for all you do.