Skip to content

Commit 5231cc0

Browse files
authored
Replace experimental path walk feature with upstream version (#5689)
This reverts the experimental version of the `git repack --path-walk` feature and replaces it with the version that recently merged with upstream. It includes `fixup!` commits for the reverts of the previous feature for easy reduction in the branch thicket during the 2.51.0 release window. > Note: In the current version, I have not performed any local validation, only resolved one text merge conflict.
2 parents 4d32d83 + 80c1092 commit 5231cc0

20 files changed

+188
-127
lines changed

Documentation/config/pack.adoc

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -156,12 +156,8 @@ pack.useSparse::
156156
`true`.
157157

158158
pack.usePathWalk::
159-
When true, git will default to using the '--path-walk' option in
160-
'git pack-objects' when the '--revs' option is present. This
161-
algorithm groups objects by path to maximize the ability to
162-
compute delta chains across historical versions of the same
163-
object. This may disable other options, such as using bitmaps to
164-
enumerate objects.
159+
Enable the `--path-walk` option by default for `git pack-objects`
160+
processes. See linkgit:git-pack-objects[1] for full details.
165161

166162
pack.preferBitmapTips::
167163
When selecting which commits will receive bitmaps, prefer a

Documentation/git-pack-objects.adoc

Lines changed: 17 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -10,13 +10,13 @@ SYNOPSIS
1010
--------
1111
[verse]
1212
'git pack-objects' [-q | --progress | --all-progress] [--all-progress-implied]
13-
[--no-reuse-delta] [--delta-base-offset] [--non-empty]
14-
[--local] [--incremental] [--window=<n>] [--depth=<n>]
15-
[--revs [--unpacked | --all]] [--keep-pack=<pack-name>]
16-
[--cruft] [--cruft-expiration=<time>]
17-
[--stdout [--filter=<filter-spec>] | <base-name>]
18-
[--shallow] [--keep-true-parents] [--[no-]sparse]
19-
[--name-hash-version=<n>] [--path-walk] < <object-list>
13+
[--no-reuse-delta] [--delta-base-offset] [--non-empty]
14+
[--local] [--incremental] [--window=<n>] [--depth=<n>]
15+
[--revs [--unpacked | --all]] [--keep-pack=<pack-name>]
16+
[--cruft] [--cruft-expiration=<time>]
17+
[--stdout [--filter=<filter-spec>] | <base-name>]
18+
[--shallow] [--keep-true-parents] [--[no-]sparse]
19+
[--name-hash-version=<n>] [--path-walk] < <object-list>
2020

2121

2222
DESCRIPTION
@@ -376,15 +376,16 @@ when writing reachability bitmap files with `--write-bitmap-index` and it
376376
will be automatically changed to version `1`.
377377
378378
--path-walk::
379-
By default, `git pack-objects` walks objects in an order that
380-
presents trees and blobs in an order unrelated to the path they
381-
appear relative to a commit's root tree. The `--path-walk` option
382-
enables a different walking algorithm that organizes trees and
383-
blobs by path. This has the potential to improve delta compression
384-
especially in the presence of filenames that cause collisions in
385-
Git's default name-hash algorithm. Due to changing how the objects
386-
are walked, this option is not compatible with `--delta-islands`,
387-
`--shallow`, or `--filter`.
379+
Perform compression by first organizing objects by path, then a
380+
second pass that compresses across paths as normal. This has the
381+
potential to improve delta compression especially in the presence
382+
of filenames that cause collisions in Git's default name-hash
383+
algorithm.
384+
+
385+
Incompatible with `--delta-islands`, `--shallow`, or `--filter`. The
386+
`--use-bitmap-index` option will be ignored in the presence of
387+
`--path-walk.`
388+
388389
389390
DELTA ISLANDS
390391
-------------

Documentation/git-repack.adoc

Lines changed: 2 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -259,17 +259,8 @@ linkgit:git-multi-pack-index[1]).
259259
See linkgit:git-pack-objects[1] for full details.
260260

261261
--path-walk::
262-
This option passes the `--path-walk` option to the underlying
263-
`git pack-options` process (see linkgit:git-pack-objects[1]).
264-
By default, `git pack-objects` walks objects in an order that
265-
presents trees and blobs in an order unrelated to the path they
266-
appear relative to a commit's root tree. The `--path-walk` option
267-
enables a different walking algorithm that organizes trees and
268-
blobs by path. This has the potential to improve delta compression
269-
especially in the presence of filenames that cause collisions in
270-
Git's default name-hash algorithm. Due to changing how the objects
271-
are walked, this option is not compatible with `--delta-islands`
272-
or `--filter`.
262+
Pass the `--path-walk` option to the underlying `git pack-objects`
263+
process. See linkgit:git-pack-objects[1] for full details.
273264

274265
CONFIGURATION
275266
-------------

Documentation/technical/api-path-walk.adoc

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -56,6 +56,14 @@ better off using the revision walk API instead.
5656
the revision walk so that the walk emits commits marked with the
5757
`UNINTERESTING` flag.
5858

59+
`edge_aggressive`::
60+
For performance reasons, usually only the boundary commits are
61+
explored to find UNINTERESTING objects. However, in the case of
62+
shallow clones it can be helpful to mark all trees and blobs
63+
reachable from UNINTERESTING tip commits as UNINTERESTING. This
64+
matches the behavior of `--objects-edge-aggressive` in the
65+
revision API.
66+
5967
`pl`::
6068
This pattern list pointer allows focusing the path-walk search to
6169
a set of patterns, only emitting paths that match the given
@@ -69,5 +77,5 @@ Examples
6977

7078
See example usages in:
7179
`t/helper/test-path-walk.c`,
80+
`builtin/pack-objects.c`,
7281
`builtin/backfill.c`
73-
`builtin/pack-objects.c`

builtin/pack-objects.c

Lines changed: 47 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -44,6 +44,7 @@
4444
#include "blob.h"
4545
#include "tree.h"
4646
#include "path-walk.h"
47+
#include "trace2.h"
4748

4849
/*
4950
* Objects we are going to pack are collected in the `to_pack` structure.
@@ -187,8 +188,14 @@ static inline void oe_set_delta_size(struct packing_data *pack,
187188
#define SET_DELTA_SIBLING(obj, val) oe_set_delta_sibling(&to_pack, obj, val)
188189

189190
static const char *const pack_usage[] = {
190-
N_("git pack-objects --stdout [<options>] [< <ref-list> | < <object-list>]"),
191-
N_("git pack-objects [<options>] <base-name> [< <ref-list> | < <object-list>]"),
191+
N_("git pack-objects [-q | --progress | --all-progress] [--all-progress-implied]\n"
192+
" [--no-reuse-delta] [--delta-base-offset] [--non-empty]\n"
193+
" [--local] [--incremental] [--window=<n>] [--depth=<n>]\n"
194+
" [--revs [--unpacked | --all]] [--keep-pack=<pack-name>]\n"
195+
" [--cruft] [--cruft-expiration=<time>]\n"
196+
" [--stdout [--filter=<filter-spec>] | <base-name>]\n"
197+
" [--shallow] [--keep-true-parents] [--[no-]sparse]\n"
198+
" [--name-hash-version=<n>] [--path-walk] < <object-list>"),
192199
NULL
193200
};
194201

@@ -203,6 +210,7 @@ static int keep_unreachable, unpack_unreachable, include_tag;
203210
static timestamp_t unpack_unreachable_expiration;
204211
static int pack_loose_unreachable;
205212
static int cruft;
213+
static int shallow = 0;
206214
static timestamp_t cruft_expiration;
207215
static int local;
208216
static int have_non_local_packs;
@@ -3291,6 +3299,9 @@ static int add_ref_tag(const char *tag UNUSED, const char *referent UNUSED, cons
32913299
static int should_attempt_deltas(struct object_entry *entry)
32923300
{
32933301
if (DELTA(entry))
3302+
/* This happens if we decided to reuse existing
3303+
* delta from a pack. "reuse_delta &&" is implied.
3304+
*/
32943305
return 0;
32953306

32963307
if (!entry->type_valid ||
@@ -3315,16 +3326,16 @@ static int should_attempt_deltas(struct object_entry *entry)
33153326
return 1;
33163327
}
33173328

3318-
static void find_deltas_for_region(struct object_entry *list UNUSED,
3329+
static void find_deltas_for_region(struct object_entry *list,
33193330
struct packing_region *region,
33203331
unsigned int *processed)
33213332
{
33223333
struct object_entry **delta_list;
3323-
uint32_t delta_list_nr = 0;
3334+
unsigned int delta_list_nr = 0;
33243335

33253336
ALLOC_ARRAY(delta_list, region->nr);
3326-
for (uint32_t i = 0; i < region->nr; i++) {
3327-
struct object_entry *entry = to_pack.objects + region->start + i;
3337+
for (size_t i = 0; i < region->nr; i++) {
3338+
struct object_entry *entry = list + region->start + i;
33283339
if (should_attempt_deltas(entry))
33293340
delta_list[delta_list_nr++] = entry;
33303341
}
@@ -3336,10 +3347,10 @@ static void find_deltas_for_region(struct object_entry *list UNUSED,
33363347

33373348
static void find_deltas_by_region(struct object_entry *list,
33383349
struct packing_region *regions,
3339-
uint32_t start, uint32_t nr)
3350+
size_t start, size_t nr)
33403351
{
33413352
unsigned int processed = 0;
3342-
uint32_t progress_nr;
3353+
size_t progress_nr;
33433354

33443355
if (!nr)
33453356
return;
@@ -3422,7 +3433,10 @@ static void ll_find_deltas_by_region(struct object_entry *list,
34223433
}
34233434

34243435
if (progress > pack_to_stdout)
3425-
fprintf_ln(stderr, _("Path-based delta compression using up to %d threads"),
3436+
fprintf_ln(stderr,
3437+
Q_("Path-based delta compression using up to %d thread",
3438+
"Path-based delta compression using up to %d threads",
3439+
delta_search_threads),
34263440
delta_search_threads);
34273441
CALLOC_ARRAY(p, delta_search_threads);
34283442

@@ -4489,11 +4503,11 @@ static void mark_bitmap_preferred_tips(void)
44894503
}
44904504
}
44914505

4492-
static inline int is_oid_interesting(struct repository *repo,
4493-
struct object_id *oid)
4506+
static inline int is_oid_uninteresting(struct repository *repo,
4507+
struct object_id *oid)
44944508
{
44954509
struct object *o = lookup_object(repo, oid);
4496-
return o && !(o->flags & UNINTERESTING);
4510+
return !o || (o->flags & UNINTERESTING);
44974511
}
44984512

44994513
static int add_objects_by_path(const char *path,
@@ -4521,7 +4535,7 @@ static int add_objects_by_path(const char *path,
45214535
OBJECT_INFO_FOR_PREFETCH) < 0)
45224536
continue;
45234537

4524-
exclude = !is_oid_interesting(the_repository, oid);
4538+
exclude = is_oid_uninteresting(the_repository, oid);
45254539

45264540
if (exclude && !thin)
45274541
continue;
@@ -4553,11 +4567,11 @@ static void get_object_list_path_walk(struct rev_info *revs)
45534567
{
45544568
struct path_walk_info info = PATH_WALK_INFO_INIT;
45554569
unsigned int processed = 0;
4570+
int result;
45564571

45574572
info.revs = revs;
45584573
info.path_fn = add_objects_by_path;
45594574
info.path_fn_data = &processed;
4560-
revs->tag_objects = 1;
45614575

45624576
/*
45634577
* Allow the --[no-]sparse option to be interesting here, if only
@@ -4566,8 +4580,13 @@ static void get_object_list_path_walk(struct rev_info *revs)
45664580
* base objects.
45674581
*/
45684582
info.prune_all_uninteresting = sparse;
4583+
info.edge_aggressive = shallow;
4584+
4585+
trace2_region_enter("pack-objects", "path-walk", revs->repo);
4586+
result = walk_objects_by_path(&info);
4587+
trace2_region_leave("pack-objects", "path-walk", revs->repo);
45694588

4570-
if (walk_objects_by_path(&info))
4589+
if (result)
45714590
die(_("failed to pack objects via path-walk"));
45724591
}
45734592

@@ -4617,7 +4636,7 @@ static void get_object_list(struct rev_info *revs, int ac, const char **av)
46174636

46184637
warn_on_object_refname_ambiguity = save_warning;
46194638

4620-
if (use_bitmap_index && !path_walk && !get_object_list_from_bitmap(revs))
4639+
if (use_bitmap_index && !get_object_list_from_bitmap(revs))
46214640
return;
46224641

46234642
if (use_delta_islands)
@@ -4767,7 +4786,6 @@ int cmd_pack_objects(int argc,
47674786
struct repository *repo UNUSED)
47684787
{
47694788
int use_internal_rev_list = 0;
4770-
int shallow = 0;
47714789
int all_progress_implied = 0;
47724790
struct strvec rp = STRVEC_INIT;
47734791
int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
@@ -4947,17 +4965,18 @@ int cmd_pack_objects(int argc,
49474965

49484966
strvec_push(&rp, "pack-objects");
49494967

4950-
if (path_walk && filter_options.choice) {
4951-
warning(_("cannot use --filter with --path-walk"));
4952-
path_walk = 0;
4953-
}
4954-
if (path_walk && use_delta_islands) {
4955-
warning(_("cannot use delta islands with --path-walk"));
4956-
path_walk = 0;
4957-
}
4958-
if (path_walk && shallow) {
4959-
warning(_("cannot use --shallow with --path-walk"));
4960-
path_walk = 0;
4968+
if (path_walk) {
4969+
const char *option = NULL;
4970+
if (filter_options.choice)
4971+
option = "--filter";
4972+
else if (use_delta_islands)
4973+
option = "--delta-islands";
4974+
4975+
if (option) {
4976+
warning(_("cannot use %s with %s"),
4977+
option, "--path-walk");
4978+
path_walk = 0;
4979+
}
49614980
}
49624981
if (path_walk) {
49634982
strvec_push(&rp, "--boundary");

builtin/repack.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1188,7 +1188,7 @@ int cmd_repack(int argc,
11881188
OPT_INTEGER(0, "name-hash-version", &po_args.name_hash_version,
11891189
N_("specify the name hash version to use for grouping similar objects by path")),
11901190
OPT_BOOL(0, "path-walk", &po_args.path_walk,
1191-
N_("(EXPERIMENTAL!) pass --path-walk to git-pack-objects")),
1191+
N_("pass --path-walk to git-pack-objects")),
11921192
OPT_NEGBIT('n', NULL, &run_update_server_info,
11931193
N_("do not run git-update-server-info"), 1),
11941194
OPT__QUIET(&po_args.quiet, N_("be quiet")),

ci/run-build-and-tests.sh

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,6 @@ linux-TEST-vars)
2626
export GIT_TEST_NO_WRITE_REV_INDEX=1
2727
export GIT_TEST_CHECKOUT_WORKERS=2
2828
export GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=1
29-
export GIT_TEST_PACK_PATH_WALK=1
3029
;;
3130
linux-clang)
3231
export GIT_TEST_DEFAULT_HASH=sha1

pack-objects.h

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -125,8 +125,8 @@ struct object_entry {
125125
* as given by a starting index and a number of elements.
126126
*/
127127
struct packing_region {
128-
uint32_t start;
129-
uint32_t nr;
128+
size_t start;
129+
size_t nr;
130130
};
131131

132132
struct packing_data {
@@ -135,7 +135,7 @@ struct packing_data {
135135
uint32_t nr_objects, nr_alloc;
136136

137137
struct packing_region *regions;
138-
uint32_t nr_regions, nr_regions_alloc;
138+
size_t nr_regions, nr_regions_alloc;
139139

140140
int32_t *index;
141141
uint32_t index_size;

path-walk.c

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -503,7 +503,11 @@ int walk_objects_by_path(struct path_walk_info *info)
503503
if (prepare_revision_walk(info->revs))
504504
die(_("failed to setup revision walk"));
505505

506-
/* Walk trees to mark them as UNINTERESTING. */
506+
/*
507+
* Walk trees to mark them as UNINTERESTING.
508+
* This is particularly important when 'edge_aggressive' is set.
509+
*/
510+
info->revs->edge_hint_aggressive = info->edge_aggressive;
507511
edge_repo = info->revs->repo;
508512
edge_tree_list = root_tree_list;
509513
mark_edges_uninteresting(info->revs, show_edge,

path-walk.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,13 @@ struct path_walk_info {
5050
*/
5151
int prune_all_uninteresting;
5252

53+
/**
54+
* When 'edge_aggressive' is set, then the revision walk will use
55+
* the '--object-edge-aggressive' option to mark even more objects
56+
* as uninteresting.
57+
*/
58+
int edge_aggressive;
59+
5360
/**
5461
* Specify a sparse-checkout definition to match our paths to. Do not
5562
* walk outside of this sparse definition. If the patterns are in

0 commit comments

Comments
 (0)