| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> | |
| <head> | |
| <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" /> | |
| <meta name="generator" content="AsciiDoc 10.2.0" /> | |
| <title>Bundle URIs</title> | |
| <style type="text/css"> | |
| /* Shared CSS for AsciiDoc xhtml11 and html5 backends */ | |
| /* Default font. */ | |
| body { | |
| font-family: Georgia,serif; | |
| } | |
| /* Title font. */ | |
| h1, h2, h3, h4, h5, h6, | |
| div.title, caption.title, | |
| thead, p.table.header, | |
| #toctitle, | |
| #author, #revnumber, #revdate, #revremark, | |
| #footer { | |
| font-family: Arial,Helvetica,sans-serif; | |
| } | |
| body { | |
| margin: 1em 5% 1em 5%; | |
| } | |
| a { | |
| color: blue; | |
| text-decoration: underline; | |
| } | |
| a:visited { | |
| color: fuchsia; | |
| } | |
| em { | |
| font-style: italic; | |
| color: navy; | |
| } | |
| strong { | |
| font-weight: bold; | |
| color: #083194; | |
| } | |
| h1, h2, h3, h4, h5, h6 { | |
| color: #527bbd; | |
| margin-top: 1.2em; | |
| margin-bottom: 0.5em; | |
| line-height: 1.3; | |
| } | |
| h1, h2, h3 { | |
| border-bottom: 2px solid silver; | |
| } | |
| h2 { | |
| padding-top: 0.5em; | |
| } | |
| h3 { | |
| float: left; | |
| } | |
| h3 + * { | |
| clear: left; | |
| } | |
| h5 { | |
| font-size: 1.0em; | |
| } | |
| div.sectionbody { | |
| margin-left: 0; | |
| } | |
| hr { | |
| border: 1px solid silver; | |
| } | |
| p { | |
| margin-top: 0.5em; | |
| margin-bottom: 0.5em; | |
| } | |
| ul, ol, li > p { | |
| margin-top: 0; | |
| } | |
| ul > li { color: #aaa; } | |
| ul > li > * { color: black; } | |
| .monospaced, code, pre { | |
| font-family: "Courier New", Courier, monospace; | |
| font-size: inherit; | |
| color: navy; | |
| padding: 0; | |
| margin: 0; | |
| } | |
| pre { | |
| white-space: pre-wrap; | |
| } | |
| #author { | |
| color: #527bbd; | |
| font-weight: bold; | |
| font-size: 1.1em; | |
| } | |
| #email { | |
| } | |
| #revnumber, #revdate, #revremark { | |
| } | |
| #footer { | |
| font-size: small; | |
| border-top: 2px solid silver; | |
| padding-top: 0.5em; | |
| margin-top: 4.0em; | |
| } | |
| #footer-text { | |
| float: left; | |
| padding-bottom: 0.5em; | |
| } | |
| #footer-badges { | |
| float: right; | |
| padding-bottom: 0.5em; | |
| } | |
| #preamble { | |
| margin-top: 1.5em; | |
| margin-bottom: 1.5em; | |
| } | |
| div.imageblock, div.exampleblock, div.verseblock, | |
| div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock, | |
| div.admonitionblock { | |
| margin-top: 1.0em; | |
| margin-bottom: 1.5em; | |
| } | |
| div.admonitionblock { | |
| margin-top: 2.0em; | |
| margin-bottom: 2.0em; | |
| margin-right: 10%; | |
| color: #606060; | |
| } | |
| div.content { /* Block element content. */ | |
| padding: 0; | |
| } | |
| /* Block element titles. */ | |
| div.title, caption.title { | |
| color: #527bbd; | |
| font-weight: bold; | |
| text-align: left; | |
| margin-top: 1.0em; | |
| margin-bottom: 0.5em; | |
| } | |
| div.title + * { | |
| margin-top: 0; | |
| } | |
| td div.title:first-child { | |
| margin-top: 0.0em; | |
| } | |
| div.content div.title:first-child { | |
| margin-top: 0.0em; | |
| } | |
| div.content + div.title { | |
| margin-top: 0.0em; | |
| } | |
| div.sidebarblock > div.content { | |
| background: #ffffee; | |
| border: 1px solid #dddddd; | |
| border-left: 4px solid #f0f0f0; | |
| padding: 0.5em; | |
| } | |
| div.listingblock > div.content { | |
| border: 1px solid #dddddd; | |
| border-left: 5px solid #f0f0f0; | |
| background: #f8f8f8; | |
| padding: 0.5em; | |
| } | |
| div.quoteblock, div.verseblock { | |
| padding-left: 1.0em; | |
| margin-left: 1.0em; | |
| margin-right: 10%; | |
| border-left: 5px solid #f0f0f0; | |
| color: #888; | |
| } | |
| div.quoteblock > div.attribution { | |
| padding-top: 0.5em; | |
| text-align: right; | |
| } | |
| div.verseblock > pre.content { | |
| font-family: inherit; | |
| font-size: inherit; | |
| } | |
| div.verseblock > div.attribution { | |
| padding-top: 0.75em; | |
| text-align: left; | |
| } | |
| /* DEPRECATED: Pre version 8.2.7 verse style literal block. */ | |
| div.verseblock + div.attribution { | |
| text-align: left; | |
| } | |
| div.admonitionblock .icon { | |
| vertical-align: top; | |
| font-size: 1.1em; | |
| font-weight: bold; | |
| text-decoration: underline; | |
| color: #527bbd; | |
| padding-right: 0.5em; | |
| } | |
| div.admonitionblock td.content { | |
| padding-left: 0.5em; | |
| border-left: 3px solid #dddddd; | |
| } | |
| div.exampleblock > div.content { | |
| border-left: 3px solid #dddddd; | |
| padding-left: 0.5em; | |
| } | |
| div.imageblock div.content { padding-left: 0; } | |
| span.image img { border-style: none; vertical-align: text-bottom; } | |
| a.image:visited { color: white; } | |
| dl { | |
| margin-top: 0.8em; | |
| margin-bottom: 0.8em; | |
| } | |
| dt { | |
| margin-top: 0.5em; | |
| margin-bottom: 0; | |
| font-style: normal; | |
| color: navy; | |
| } | |
| dd > *:first-child { | |
| margin-top: 0.1em; | |
| } | |
| ul, ol { | |
| list-style-position: outside; | |
| } | |
| ol.arabic { | |
| list-style-type: decimal; | |
| } | |
| ol.loweralpha { | |
| list-style-type: lower-alpha; | |
| } | |
| ol.upperalpha { | |
| list-style-type: upper-alpha; | |
| } | |
| ol.lowerroman { | |
| list-style-type: lower-roman; | |
| } | |
| ol.upperroman { | |
| list-style-type: upper-roman; | |
| } | |
| div.compact ul, div.compact ol, | |
| div.compact p, div.compact p, | |
| div.compact div, div.compact div { | |
| margin-top: 0.1em; | |
| margin-bottom: 0.1em; | |
| } | |
| tfoot { | |
| font-weight: bold; | |
| } | |
| td > div.verse { | |
| white-space: pre; | |
| } | |
| div.hdlist { | |
| margin-top: 0.8em; | |
| margin-bottom: 0.8em; | |
| } | |
| div.hdlist tr { | |
| padding-bottom: 15px; | |
| } | |
| dt.hdlist1.strong, td.hdlist1.strong { | |
| font-weight: bold; | |
| } | |
| td.hdlist1 { | |
| vertical-align: top; | |
| font-style: normal; | |
| padding-right: 0.8em; | |
| color: navy; | |
| } | |
| td.hdlist2 { | |
| vertical-align: top; | |
| } | |
| div.hdlist.compact tr { | |
| margin: 0; | |
| padding-bottom: 0; | |
| } | |
| .comment { | |
| background: yellow; | |
| } | |
| .footnote, .footnoteref { | |
| font-size: 0.8em; | |
| } | |
| span.footnote, span.footnoteref { | |
| vertical-align: super; | |
| } | |
| #footnotes { | |
| margin: 20px 0 20px 0; | |
| padding: 7px 0 0 0; | |
| } | |
| #footnotes div.footnote { | |
| margin: 0 0 5px 0; | |
| } | |
| #footnotes hr { | |
| border: none; | |
| border-top: 1px solid silver; | |
| height: 1px; | |
| text-align: left; | |
| margin-left: 0; | |
| width: 20%; | |
| min-width: 100px; | |
| } | |
| div.colist td { | |
| padding-right: 0.5em; | |
| padding-bottom: 0.3em; | |
| vertical-align: top; | |
| } | |
| div.colist td img { | |
| margin-top: 0.3em; | |
| } | |
| @media print { | |
| #footer-badges { display: none; } | |
| } | |
| #toc { | |
| margin-bottom: 2.5em; | |
| } | |
| #toctitle { | |
| color: #527bbd; | |
| font-size: 1.1em; | |
| font-weight: bold; | |
| margin-top: 1.0em; | |
| margin-bottom: 0.1em; | |
| } | |
| div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 { | |
| margin-top: 0; | |
| margin-bottom: 0; | |
| } | |
| div.toclevel2 { | |
| margin-left: 2em; | |
| font-size: 0.9em; | |
| } | |
| div.toclevel3 { | |
| margin-left: 4em; | |
| font-size: 0.9em; | |
| } | |
| div.toclevel4 { | |
| margin-left: 6em; | |
| font-size: 0.9em; | |
| } | |
| span.aqua { color: aqua; } | |
| span.black { color: black; } | |
| span.blue { color: blue; } | |
| span.fuchsia { color: fuchsia; } | |
| span.gray { color: gray; } | |
| span.green { color: green; } | |
| span.lime { color: lime; } | |
| span.maroon { color: maroon; } | |
| span.navy { color: navy; } | |
| span.olive { color: olive; } | |
| span.purple { color: purple; } | |
| span.red { color: red; } | |
| span.silver { color: silver; } | |
| span.teal { color: teal; } | |
| span.white { color: white; } | |
| span.yellow { color: yellow; } | |
| span.aqua-background { background: aqua; } | |
| span.black-background { background: black; } | |
| span.blue-background { background: blue; } | |
| span.fuchsia-background { background: fuchsia; } | |
| span.gray-background { background: gray; } | |
| span.green-background { background: green; } | |
| span.lime-background { background: lime; } | |
| span.maroon-background { background: maroon; } | |
| span.navy-background { background: navy; } | |
| span.olive-background { background: olive; } | |
| span.purple-background { background: purple; } | |
| span.red-background { background: red; } | |
| span.silver-background { background: silver; } | |
| span.teal-background { background: teal; } | |
| span.white-background { background: white; } | |
| span.yellow-background { background: yellow; } | |
| span.big { font-size: 2em; } | |
| span.small { font-size: 0.6em; } | |
| span.underline { text-decoration: underline; } | |
| span.overline { text-decoration: overline; } | |
| span.line-through { text-decoration: line-through; } | |
| div.unbreakable { page-break-inside: avoid; } | |
| /* | |
| * xhtml11 specific | |
| * | |
| * */ | |
| div.tableblock { | |
| margin-top: 1.0em; | |
| margin-bottom: 1.5em; | |
| } | |
| div.tableblock > table { | |
| border: 3px solid #527bbd; | |
| } | |
| thead, p.table.header { | |
| font-weight: bold; | |
| color: #527bbd; | |
| } | |
| p.table { | |
| margin-top: 0; | |
| } | |
| /* Because the table frame attribute is overridden by CSS in most browsers. */ | |
| div.tableblock > table[frame="void"] { | |
| border-style: none; | |
| } | |
| div.tableblock > table[frame="hsides"] { | |
| border-left-style: none; | |
| border-right-style: none; | |
| } | |
| div.tableblock > table[frame="vsides"] { | |
| border-top-style: none; | |
| border-bottom-style: none; | |
| } | |
| /* | |
| * html5 specific | |
| * | |
| * */ | |
| table.tableblock { | |
| margin-top: 1.0em; | |
| margin-bottom: 1.5em; | |
| } | |
| thead, p.tableblock.header { | |
| font-weight: bold; | |
| color: #527bbd; | |
| } | |
| p.tableblock { | |
| margin-top: 0; | |
| } | |
| table.tableblock { | |
| border-width: 3px; | |
| border-spacing: 0px; | |
| border-style: solid; | |
| border-color: #527bbd; | |
| border-collapse: collapse; | |
| } | |
| th.tableblock, td.tableblock { | |
| border-width: 1px; | |
| padding: 4px; | |
| border-style: solid; | |
| border-color: #527bbd; | |
| } | |
| table.tableblock.frame-topbot { | |
| border-left-style: hidden; | |
| border-right-style: hidden; | |
| } | |
| table.tableblock.frame-sides { | |
| border-top-style: hidden; | |
| border-bottom-style: hidden; | |
| } | |
| table.tableblock.frame-none { | |
| border-style: hidden; | |
| } | |
| th.tableblock.halign-left, td.tableblock.halign-left { | |
| text-align: left; | |
| } | |
| th.tableblock.halign-center, td.tableblock.halign-center { | |
| text-align: center; | |
| } | |
| th.tableblock.halign-right, td.tableblock.halign-right { | |
| text-align: right; | |
| } | |
| th.tableblock.valign-top, td.tableblock.valign-top { | |
| vertical-align: top; | |
| } | |
| th.tableblock.valign-middle, td.tableblock.valign-middle { | |
| vertical-align: middle; | |
| } | |
| th.tableblock.valign-bottom, td.tableblock.valign-bottom { | |
| vertical-align: bottom; | |
| } | |
| /* | |
| * manpage specific | |
| * | |
| * */ | |
| body.manpage h1 { | |
| padding-top: 0.5em; | |
| padding-bottom: 0.5em; | |
| border-top: 2px solid silver; | |
| border-bottom: 2px solid silver; | |
| } | |
| body.manpage h2 { | |
| border-style: none; | |
| } | |
| body.manpage div.sectionbody { | |
| margin-left: 3em; | |
| } | |
| @media print { | |
| body.manpage div#toc { display: none; } | |
| } | |
| </style> | |
| <script type="text/javascript"> | |
| /*<+'])'); | |
| // Function that scans the DOM tree for header elements (the DOM2 | |
| // nodeIterator API would be a better technique but not supported by all | |
| // browsers). | |
| var iterate = function (el) { | |
| for (var i = el.firstChild; i != null; i = i.nextSibling) { | |
| if (i.nodeType == 1 /* Node.ELEMENT_NODE */) { | |
| var mo = re.exec(i.tagName); | |
| if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") { | |
| result[result.length] = new TocEntry(i, getText(i), mo[1]-1); | |
| } | |
| iterate(i); | |
| } | |
| } | |
| } | |
| iterate(el); | |
| return result; | |
| } | |
| var toc = document.getElementById("toc"); | |
| if (!toc) { | |
| return; | |
| } | |
| // Delete existing TOC entries in case we're reloading the TOC. | |
| var tocEntriesToRemove = []; | |
| var i; | |
| for (i = 0; i < toc.childNodes.length; i++) { | |
| var entry = toc.childNodes[i]; | |
| if (entry.nodeName.toLowerCase() == 'div' | |
| && entry.getAttribute("class") | |
| && entry.getAttribute("class").match(/^toclevel/)) | |
| tocEntriesToRemove.push(entry); | |
| } | |
| for (i = 0; i < tocEntriesToRemove.length; i++) { | |
| toc.removeChild(tocEntriesToRemove[i]); | |
| } | |
| // Rebuild TOC entries. | |
| var entries = tocEntries(document.getElementById("content"), toclevels); | |
| for (var i = 0; i < entries.length; ++i) { | |
| var entry = entries[i]; | |
| if (entry.element.id == "") | |
| entry.element.id = "_toc_" + i; | |
| var a = document.createElement("a"); | |
| a.href = "#" + entry.element.id; | |
| a.appendChild(document.createTextNode(entry.text)); | |
| var div = document.createElement("div"); | |
| div.appendChild(a); | |
| div.className = "toclevel" + entry.toclevel; | |
| toc.appendChild(div); | |
| } | |
| if (entries.length == 0) | |
| toc.parentNode.removeChild(toc); | |
| }, | |
| ///////////////////////////////////////////////////////////////////// | |
| // Footnotes generator | |
| ///////////////////////////////////////////////////////////////////// | |
| /* Based on footnote generation code from: | |
| * http://www.brandspankingnew.net/archive/2005/07/format_footnote.html | |
| */ | |
| footnotes: function () { | |
| // Delete existing footnote entries in case we're reloading the footnodes. | |
| var i; | |
| var noteholder = document.getElementById("footnotes"); | |
| if (!noteholder) { | |
| return; | |
| } | |
| var entriesToRemove = []; | |
| for (i = 0; i < noteholder.childNodes.length; i++) { | |
| var entry = noteholder.childNodes[i]; | |
| if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote") | |
| entriesToRemove.push(entry); | |
| } | |
| for (i = 0; i < entriesToRemove.length; i++) { | |
| noteholder.removeChild(entriesToRemove[i]); | |
| } | |
| // Rebuild footnote entries. | |
| var cont = document.getElementById("content"); | |
| var spans = cont.getElementsByTagName("span"); | |
| var refs = {}; | |
| var n = 0; | |
| for (i=0; i<spans.length; i++) { | |
| if (spans[i].className == "footnote") { | |
| n++; | |
| var note = spans[i].getAttribute("data-note"); | |
| if (!note) { | |
| // Use [\s\S] in place of . so multi-line matches work. | |
| // Because JavaScript has no s (dotall) regex flag. | |
| note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1]; | |
| spans[i].innerHTML = | |
| "[<a id='_footnoteref_" + n + "' href='#_footnote_" + n + | |
| "' title='View footnote' class='footnote'>" + n + "</a>]"; | |
| spans[i].setAttribute("data-note", note); | |
| } | |
| noteholder.innerHTML += | |
| "<div class='footnote' id='_footnote_" + n + "'>" + | |
| "<a href='#_footnoteref_" + n + "' title='Return to text'>" + | |
| n + "</a>. " + note + "</div>"; | |
| var id =spans[i].getAttribute("id"); | |
| if (id != null) refs["#"+id] = n; | |
| } | |
| } | |
| if (n == 0) | |
| noteholder.parentNode.removeChild(noteholder); | |
| else { | |
| // Process footnoterefs. | |
| for (i=0; i<spans.length; i++) { | |
| if (spans[i].className == "footnoteref") { | |
| var href = spans[i].getElementsByTagName("a")[0].getAttribute("href"); | |
| href = href.match(/#.*/)[0]; // Because IE return full URL. | |
| n = refs[href]; | |
| spans[i].innerHTML = | |
| "[<a href='#_footnote_" + n + | |
| "' title='View footnote' class='footnote'>" + n + "</a>]"; | |
| } | |
| } | |
| } | |
| }, | |
| install: function(toclevels) { | |
| var timerId; | |
| function reinstall() { | |
| asciidoc.footnotes(); | |
| if (toclevels) { | |
| asciidoc.toc(toclevels); | |
| } | |
| } | |
| function reinstallAndRemoveTimer() { | |
| clearInterval(timerId); | |
| reinstall(); | |
| } | |
| timerId = setInterval(reinstall, 500); | |
| if (document.addEventListener) | |
| document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false); | |
| else | |
| window.onload = reinstallAndRemoveTimer; | |
| } | |
| } | |
| asciidoc.install(); | |
| /*]]>*/ | |
| </script> | |
| </head> | |
| <body class="article"> | |
| <div id="header"> | |
| <h1>Bundle URIs</h1> | |
| </div> | |
| <div id="content"> | |
| <div id="preamble"> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>Git bundles are files that store a pack-file along with some extra metadata, | |
| including a set of refs and a (possibly empty) set of necessary commits. See | |
| <a href="../git-bundle.html">git-bundle(1)</a> and <a href="../gitformat-bundle.html">gitformat-bundle(5)</a> for more information.</p></div> | |
| <div class="paragraph"><p>Bundle URIs are locations where Git can download one or more bundles in | |
| order to bootstrap the object database in advance of fetching the remaining | |
| objects from a remote.</p></div> | |
| <div class="paragraph"><p>One goal is to speed up clones and fetches for users with poor network | |
| connectivity to the origin server. Another benefit is to allow heavy users, | |
| such as CI build farms, to use local resources for the majority of Git data | |
and thereby reduce the load on the origin server.</p></div> | |
| <div class="paragraph"><p>To enable the bundle URI feature, users can specify a bundle URI using | |
| command-line options or the origin server can advertise one or more URIs | |
| via a protocol v2 capability.</p></div> | |
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_design_goals">Design Goals</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>The bundle URI standard aims to be flexible enough to satisfy multiple | |
| workloads. The bundle provider and the Git client have several choices in | |
| how they create and consume bundle URIs.</p></div> | |
| <div class="ulist"><ul> | |
| <li> | |
| <p> | |
| Bundles can have whatever name the server desires. This name could refer | |
| to immutable data by using a hash of the bundle contents. However, this | |
| means that a new URI will be needed after every update of the content. | |
| This might be acceptable if the server is advertising the URI (and the | |
| server is aware of new bundles being generated) but would not be | |
ergonomic for users using the command-line option.
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The bundles could be organized specifically for bootstrapping full | |
| clones, but could also be organized with the intention of bootstrapping | |
| incremental fetches. The bundle provider must decide on one of several | |
| organization schemes to minimize client downloads during incremental | |
| fetches, but the Git client can also choose whether to use bundles for | |
| either of these operations. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The bundle provider can choose to support full clones, partial clones, | |
| or both. The client can detect which bundles are appropriate for the | |
| repository’s partial clone filter, if any. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The bundle provider can use a single bundle (for clones only), or a | |
| list of bundles. When using a list of bundles, the provider can specify | |
| whether or not the client needs <em>all</em> of the bundle URIs for a full | |
| clone, or if <em>any</em> one of the bundle URIs is sufficient. This allows the | |
| bundle provider to use different URIs for different geographies. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The bundle provider can organize the bundles using heuristics, such as | |
creation tokens, to help the client avoid downloading bundles it does
| not need. When the bundle provider does not provide these heuristics, | |
| the client can use optimizations to minimize how much of the data is | |
| downloaded. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The bundle provider does not need to be associated with the Git server. | |
| The client can choose to use the bundle provider without it being | |
| advertised by the Git server. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The client can choose to discover bundle providers that are advertised | |
| by the Git server. This could happen during <code>git clone</code>, during | |
| <code>git fetch</code>, both, or neither. The user can choose which combination | |
| works best for them. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The client can choose to configure a bundle provider manually at any | |
| time. The client can also choose to specify a bundle provider manually | |
| as a command-line option to <code>git clone</code>. | |
| </p> | |
| </li> | |
| </ul></div> | |
| <div class="paragraph"><p>Each repository is different and every Git server has different needs. | |
| Hopefully the bundle URI feature is flexible enough to satisfy all needs. | |
| If not, then the feature can be extended through its versioning mechanism.</p></div> | |
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_server_requirements">Server requirements</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>To provide a server-side implementation of bundle servers, no other parts | |
| of the Git protocol are required. This allows server maintainers to use | |
| static content solutions such as CDNs in order to serve the bundle files.</p></div> | |
| <div class="paragraph"><p>At the current scope of the bundle URI feature, all URIs are expected to | |
| be HTTP(S) URLs where content is downloaded to a local file using a <code>GET</code> | |
request to that URL. The server could impose authentication requirements
on those requests with the aim of triggering the configured credential
| helper for secure access. (Future extensions could use "file://" URIs or | |
| SSH URIs.)</p></div> | |
| <div class="paragraph"><p>Assuming a <code>200 OK</code> response from the server, the content at the URL is | |
| inspected. First, Git attempts to parse the file as a bundle file of | |
| version 2 or higher. If the file is not a bundle, then the file is parsed | |
| as a plain-text file using Git’s config parser. The key-value pairs in | |
| that config file are expected to describe a list of bundle URIs. If | |
neither of these parse attempts succeeds, then Git will report an error to
| the user that the bundle URI provided erroneous data.</p></div> | |
| <div class="paragraph"><p>Any other data provided by the server is considered erroneous.</p></div> | |
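| <div class="paragraph"><p>The parse order described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Git's implementation: the names <code>classify_download</code> and <code>parse_simple_config</code> are hypothetical, the signature check only recognizes the version 2 and version 3 signature lines from gitformat-bundle(5), and the config parser handles only section headers and <code>key = value</code> lines rather than Git's full config syntax.</p></div> | |

```python
import re

# Signature lines of bundle format versions 2 and 3, per gitformat-bundle(5).
BUNDLE_SIGNATURES = (b"# v2 git bundle\n", b"# v3 git bundle\n")

# Matches [section] and [section "subsection"] headers.
HEADER = re.compile(r'\[([A-Za-z0-9.-]+)(?:\s+"([^"]*)")?\]')


def parse_simple_config(text):
    """Parse a small subset of Git config syntax into {"section.sub.key": value}."""
    pairs = {}
    prefix = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":  # skip blank lines and comments
            continue
        header = HEADER.fullmatch(line)
        if header:
            section, subsection = header.groups()
            prefix = section + ("." + subsection if subsection else "")
            continue
        if prefix is None or "=" not in line:
            raise ValueError(f"cannot parse line: {raw!r}")
        key, value = line.split("=", 1)
        pairs[prefix + "." + key.strip()] = value.strip()
    return pairs


def classify_download(data):
    """Return ("bundle", None) or ("list", pairs); raise ValueError otherwise."""
    # First attempt: does the content start like a bundle file?
    if data.startswith(BUNDLE_SIGNATURES):
        return ("bundle", None)
    # Second attempt: treat the content as a plain-text bundle list.
    try:
        pairs = parse_simple_config(data.decode("utf-8"))
    except (UnicodeDecodeError, ValueError) as err:
        raise ValueError("bundle URI provided erroneous data") from err
    if not any(key.startswith("bundle.") for key in pairs):
        raise ValueError("bundle URI provided erroneous data")
    return ("list", pairs)
```

| <div class="paragraph"><p>Checking for the bundle signature first matters in this sketch because a bundle's signature line begins with <code>#</code>, which a config parser would otherwise skip as a comment.</p></div> | |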
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_bundle_lists">Bundle Lists</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>The Git server can advertise bundle URIs using a set of <code>key=value</code> pairs. | |
| A bundle URI can also serve a plain-text file in the Git config format | |
| containing these same <code>key=value</code> pairs. In both cases, we consider this | |
| to be a <em>bundle list</em>. The pairs specify information about the bundles | |
that the client can use to decide which bundles to download
| and which to ignore.</p></div> | |
| <div class="paragraph"><p>A few keys focus on properties of the list itself.</p></div> | |
| <div class="dlist"><dl> | |
| <dt class="hdlist1"> | |
| bundle.version | |
| </dt> | |
| <dd> | |
| <p> | |
| (Required) This value provides a version number for the bundle | |
| list. If a future Git change enables a feature that needs the Git | |
| client to react to a new key in the bundle list file, then this version | |
| will increment. The only current version number is 1, and if any other | |
| value is specified then Git will fail to use this file. | |
| </p> | |
| </dd> | |
| <dt class="hdlist1"> | |
| bundle.mode | |
| </dt> | |
| <dd> | |
| <p> | |
(Required) This key takes one of two values: <code>all</code> or <code>any</code>. When <code>all</code>
| is specified, then the client should expect to need all of the listed | |
| bundle URIs that match their repository’s requirements. When <code>any</code> is | |
| specified, then the client should expect that any one of the bundle URIs | |
| that match their repository’s requirements will suffice. Typically, the | |
| <code>any</code> option is used to list a number of different bundle servers | |
| located in different geographies. | |
| </p> | |
| </dd> | |
| <dt class="hdlist1"> | |
| bundle.heuristic | |
| </dt> | |
| <dd> | |
| <p> | |
| If this string-valued key exists, then the bundle list is designed to | |
| work well with incremental <code>git fetch</code> commands. The heuristic signals | |
| that there are additional keys available for each bundle that help | |
| determine which subset of bundles the client should download. The only | |
| heuristic currently planned is <code>creationToken</code>. | |
| </p> | |
| </dd> | |
| </dl></div> | |
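| <div class="paragraph"><p>As a rough illustration, a client-side check of these list-level keys might look like the following Python sketch. The helper name <code>check_list_keys</code> is hypothetical, and it assumes the list has already been parsed into a dictionary keyed by full key names. Passing an unrecognized <code>bundle.heuristic</code> value through unchanged is an assumption of this sketch; the text above does not say how such values are handled.</p></div> | |

```python
def check_list_keys(pairs):
    """Validate bundle.version and bundle.mode; return the list-level settings."""
    try:
        version = int(pairs["bundle.version"])
    except (KeyError, ValueError):
        raise ValueError("bundle.version is required and must be an integer")
    if version != 1:
        # The only current version number is 1.
        raise ValueError(f"unsupported bundle.version: {version}")

    mode = pairs.get("bundle.mode")
    if mode not in ("all", "any"):
        raise ValueError(f"bundle.mode must be 'all' or 'any', got {mode!r}")

    # bundle.heuristic is optional; None means no heuristic was advertised.
    return {"version": version, "mode": mode, "heuristic": pairs.get("bundle.heuristic")}
```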
| <div class="paragraph"><p>The remaining keys include an <code>&lt;id&gt;</code> segment which is a server-designated
name for each available bundle. The <code>&lt;id&gt;</code> must contain only alphanumeric
| and <code>-</code> characters.</p></div> | |
| <div class="dlist"><dl> | |
| <dt class="hdlist1"> | |
bundle.&lt;id&gt;.uri
</dt>
<dd>
<p>
(Required) This string value is the URI for downloading bundle <code>&lt;id&gt;</code>.
| If the URI begins with a protocol (<code>http://</code> or <code>https://</code>) then the URI | |
| is absolute. Otherwise, the URI is interpreted as relative to the URI | |
| used for the bundle list. If the URI begins with <code>/</code>, then that relative | |
| path is relative to the domain name used for the bundle list. (This use | |
| of relative paths is intended to make it easier to distribute a set of | |
| bundles across a large number of servers or CDNs with different domain | |
| names.) | |
| </p> | |
| </dd> | |
| <dt class="hdlist1"> | |
bundle.&lt;id&gt;.filter
| </dt> | |
| <dd> | |
| <p> | |
| This string value represents an object filter that should also appear in | |
the header of this bundle. The server uses this value to distinguish
kinds of bundles so that the client can choose those that
match its object filter.
| </p> | |
| </dd> | |
| <dt class="hdlist1"> | |
bundle.&lt;id&gt;.creationToken
| </dt> | |
| <dd> | |
| <p> | |
| This value is a nonnegative 64-bit integer used for sorting the bundles | |
| list. This is used to download a subset of bundles during a fetch when | |
| <code>bundle.heuristic=creationToken</code>. | |
| </p> | |
| </dd> | |
| <dt class="hdlist1"> | |
bundle.&lt;id&gt;.location
| </dt> | |
| <dd> | |
| <p> | |
| This string value advertises a real-world location from where the bundle | |
| URI is served. This can be used to present the user with an option for | |
| which bundle URI to use or simply as an informative indicator of which | |
| bundle URI was selected by Git. This is only valuable when | |
| <code>bundle.mode</code> is <code>any</code>. | |
| </p> | |
| </dd> | |
| </dl></div> | |
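| <div class="paragraph"><p>The per-bundle keys can be grouped and checked along the following lines. This Python sketch is illustrative only: <code>collect_bundles</code> and <code>select_bundles</code> are hypothetical names, "64-bit" is read here as an unsigned range, and bundles without a creation token are sorted as if their token were 0. It orders matching bundles by decreasing creation token, which is the order a client would want when looking for the newest data first.</p></div> | |

```python
import re

ID_PATTERN = re.compile(r"[A-Za-z0-9-]+")  # alphanumeric and '-' only
MAX_TOKEN = 2**64 - 1  # assumption: "64-bit" is read as unsigned


def collect_bundles(pairs):
    """Group bundle.<id>.<name> keys into {id: {name: value}} and validate them."""
    bundles = {}
    for key, value in pairs.items():
        parts = key.split(".")
        # List-level keys such as bundle.version have only two parts; skip them.
        if len(parts) != 3 or parts[0] != "bundle":
            continue
        _, bundle_id, name = parts
        if not ID_PATTERN.fullmatch(bundle_id):
            raise ValueError(f"invalid bundle id: {bundle_id!r}")
        bundles.setdefault(bundle_id, {})[name] = value

    for bundle_id, info in bundles.items():
        if "uri" not in info:
            raise ValueError(f"bundle {bundle_id!r} has no uri")
        if "creationToken" in info:
            token = int(info["creationToken"])  # raises ValueError if not numeric
            if not 0 <= token <= MAX_TOKEN:
                raise ValueError(f"creationToken out of range for {bundle_id!r}")
            info["creationToken"] = token
    return bundles


def select_bundles(bundles, repo_filter=None):
    """Return (id, info) pairs whose filter matches, newest creation token first."""
    matching = [
        (bundle_id, info)
        for bundle_id, info in bundles.items()
        if info.get("filter") == repo_filter
    ]
    return sorted(
        matching,
        key=lambda item: item[1].get("creationToken", 0),
        reverse=True,
    )
```

| <div class="paragraph"><p>Calling <code>select_bundles(bundles)</code> keeps only bundles without a filter (suitable for a full clone), while <code>select_bundles(bundles, "blob:none")</code> keeps only the blobless ones.</p></div> | |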
| <div class="paragraph"><p>Here is an example bundle list using the Git config format:</p></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle] | |
| version = 1 | |
| mode = all | |
| heuristic = creationToken</code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "2022-02-09-1644442601-daily"] | |
| uri = https://bundles.example.com/git/git/2022-02-09-1644442601-daily.bundle | |
| creationToken = 1644442601</code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "2022-02-02-1643842562"] | |
| uri = https://bundles.example.com/git/git/2022-02-02-1643842562.bundle | |
| creationToken = 1643842562</code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "2022-02-09-1644442631-daily-blobless"] | |
| uri = 2022-02-09-1644442631-daily-blobless.bundle | |
| creationToken = 1644442631 | |
| filter = blob:none</code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "2022-02-02-1643842568-blobless"] | |
| uri = /git/git/2022-02-02-1643842568-blobless.bundle | |
| creationToken = 1643842568 | |
| filter = blob:none</code></pre> | |
| </div></div> | |
| <div class="paragraph"><p>This example uses <code>bundle.mode=all</code> as well as the | |
<code>bundle.&lt;id&gt;.creationToken</code> heuristic. It also uses the <code>bundle.&lt;id&gt;.filter</code>
| options to present two parallel sets of bundles: one for full clones and | |
| another for blobless partial clones.</p></div> | |
| <div class="paragraph"><p>Suppose that this bundle list was found at the URI | |
| <code>https://bundles.example.com/git/git/</code> and so the two blobless bundles have | |
| the following fully-expanded URIs:</p></div> | |
| <div class="ulist"><ul> | |
| <li> | |
| <p> | |
| <code>https://bundles.example.com/git/git/2022-02-09-1644442631-daily-blobless.bundle</code> | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| <code>https://bundles.example.com/git/git/2022-02-02-1643842568-blobless.bundle</code> | |
| </p> | |
| </li> | |
| </ul></div> | |
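| <div class="paragraph"><p>The expansion of relative URIs can be reproduced with Python's standard <code>urllib.parse.urljoin</code>, as in the sketch below. The wrapper name <code>resolve_bundle_uri</code> is hypothetical, and <code>urljoin</code> follows generic URL resolution rules, so it may differ from Git's own resolution in edge cases; it is used here only to illustrate the three cases (absolute URI, path relative to the list, and path relative to the domain).</p></div> | |

```python
from urllib.parse import urljoin


def resolve_bundle_uri(list_uri, bundle_uri):
    """Resolve a bundle URI value against the URI where the bundle list was found."""
    return urljoin(list_uri, bundle_uri)


LIST_URI = "https://bundles.example.com/git/git/"

for value in (
    # Absolute: begins with a protocol, so it is used unchanged.
    "https://bundles.example.com/git/git/2022-02-02-1643842562.bundle",
    # Relative to the URI used for the bundle list.
    "2022-02-09-1644442631-daily-blobless.bundle",
    # Begins with '/': relative to the domain name of the bundle list.
    "/git/git/2022-02-02-1643842568-blobless.bundle",
):
    print(resolve_bundle_uri(LIST_URI, value))
```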
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_advertising_bundle_uris">Advertising Bundle URIs</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>If a user knows a bundle URI for the repository they are cloning, then | |
| they can specify that URI manually through a command-line option. However, | |
| a Git host may want to advertise bundle URIs during the clone operation, | |
| helping users unaware of the feature.</p></div> | |
| <div class="paragraph"><p>The only thing required for this feature is that the server can advertise | |
| one or more bundle URIs. This advertisement takes the form of a new | |
| protocol v2 capability specifically for discovering bundle URIs.</p></div> | |
| <div class="paragraph"><p>The client could choose an arbitrary bundle URI <em>or</em> select
the URI with the best performance through exploratory checks. It is up to the
| bundle provider to decide if having multiple URIs is preferable to a | |
| single URI that is geodistributed through server-side infrastructure.</p></div> | |
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_cloning_with_bundle_uris">Cloning with Bundle URIs</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>The primary need for bundle URIs is to speed up clones. The Git client | |
| will interact with bundle URIs according to the following flow:</p></div> | |
| <div class="olist arabic"><ol class="arabic"> | |
| <li> | |
| <p> | |
| The user specifies a bundle URI with the <code>--bundle-uri</code> command-line | |
| option <em>or</em> the client discovers a bundle list advertised by the | |
| Git server. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| If the downloaded data from a bundle URI is a bundle, then the client | |
| inspects the bundle headers to check that the prerequisite commit OIDs | |
| are present in the client repository. If some are missing, then the | |
| client delays unbundling until other bundles have been unbundled, | |
| making those OIDs present. When all required OIDs are present, the | |
| client unbundles that data using a refspec. The default refspec is | |
| <code>+refs/heads/*:refs/bundles/*</code>, but this can be configured. These refs | |
| are stored so that later <code>git fetch</code> negotiations can communicate each | |
| bundled ref as a <code>have</code>, reducing the size of the fetch over the Git | |
| protocol. To allow pruning refs from this ref namespace, Git may | |
| introduce a numbered namespace (such as <code>refs/bundles/<i>/*</code>) such that | |
| stale bundle refs can be deleted. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| If the file is instead a bundle list, then the client inspects the | |
| <code>bundle.mode</code> to see if the list is of the <code>all</code> or <code>any</code> form. | |
| </p> | |
| <div class="olist loweralpha"><ol class="loweralpha"> | |
| <li> | |
| <p> | |
| If <code>bundle.mode=all</code>, then the client considers all bundle | |
| URIs. The list is reduced based on the <code>bundle.<id>.filter</code> options | |
| matching the client repository’s partial clone filter. Then, all | |
| bundle URIs are requested. If the <code>bundle.<id>.creationToken</code> | |
| heuristic is provided, then the bundles are downloaded in decreasing | |
| order by the creation token, stopping when a bundle has all required | |
| OIDs. The bundles can then be unbundled in increasing creation token | |
| order. The client stores the latest creation token as a heuristic | |
| for avoiding future downloads if the bundle list does not advertise | |
| bundles with larger creation tokens. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| If <code>bundle.mode=any</code>, then the client can choose any one of the | |
| bundle URIs to inspect. The client can use a variety of ways to | |
| choose among these URIs. The client can also fall back to another URI | |
| if the initial choice fails to return a result. | |
| </p> | |
| </li> | |
| </ol></div> | |
| </li> | |
| </ol></div> | |
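| <div class="paragraph"><p>The download and unbundle ordering described for <code>bundle.mode=all</code> with the | |
| <code>creationToken</code> heuristic can be sketched as follows. This is a minimal Python | |
| illustration, not Git's implementation: the <code>Bundle</code> record and the | |
| <code>plan_downloads</code> function are hypothetical names, and the sketch assumes that | |
| "has all required OIDs" means every prerequisite OID of the downloaded bundle | |
| is already present in the client repository.</p></div> | |

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Bundle:
    """Hypothetical in-memory stand-in for one entry of a bundle list."""
    bundle_id: str
    creation_token: int
    # OIDs that must already exist in the repository before unbundling.
    prerequisites: frozenset = field(default_factory=frozenset)


def plan_downloads(bundles, present_oids):
    """Return (downloaded, unbundle_order) under the creationToken heuristic.

    Bundles are downloaded from the largest creation token downward,
    stopping once a downloaded bundle has all of its prerequisites present.
    The downloaded bundles are then applied in increasing token order.
    """
    downloaded = []
    for bundle in sorted(bundles, key=lambda b: b.creation_token, reverse=True):
        downloaded.append(bundle)
        if bundle.prerequisites <= present_oids:
            break  # older bundles are not needed to apply what we have
    unbundle_order = sorted(downloaded, key=lambda b: b.creation_token)
    return downloaded, unbundle_order
```

| <div class="paragraph"><p>During a fresh clone no OIDs are present, so the loop usually walks the | |
| whole list until it reaches a bundle without prerequisites.</p></div> | |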
| <div class="paragraph"><p>Note that during a clone we expect that all bundles will be required, and | |
| heuristics such as <code>bundle.<id>.creationToken</code> can be used to download | |
| bundles in chronological order or in parallel.</p></div> | |
| <div class="paragraph"><p>If a given bundle URI is a bundle list with a <code>bundle.heuristic</code> | |
| value, then the client can choose to store that URI as its chosen bundle | |
| URI. The client can then navigate directly to that URI during later <code>git | |
| fetch</code> calls.</p></div> | |
| <div class="paragraph"><p>When downloading bundle URIs, the client can choose to inspect the initial | |
| content before committing to downloading the entire content. This may | |
| provide enough information to determine if the URI is a bundle list or | |
| a bundle. In the case of a bundle, the client may inspect the bundle | |
| header to determine that all advertised tips are already in the client | |
| repository and cancel the remaining download.</p></div> | |
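| <div class="paragraph"><p>One way to picture this early inspection is the Python sketch below. It | |
| relies on the bundle file format, where a bundle starts with a | |
| <code># v2 git bundle</code> or <code># v3 git bundle</code> signature line and the header ends at | |
| the first empty line. The function names are hypothetical, and treating a | |
| leading <code>[bundle</code> section as a bundle list is an assumed heuristic rather than | |
| a rule from this document.</p></div> | |

```python
BUNDLE_SIGNATURES = (b"# v2 git bundle\n", b"# v3 git bundle\n")


def classify_prefix(prefix: bytes) -> str:
    """Guess what a bundle URI serves from the first bytes of the response."""
    if prefix.startswith(BUNDLE_SIGNATURES):
        return "bundle"
    # Assumption: a bundle list is config-format text opening with [bundle ...].
    text = prefix.decode("utf-8", errors="replace").lstrip()
    if text.startswith("[bundle"):
        return "bundle-list"
    return "unknown"


def advertised_tips(header: bytes) -> set:
    """Collect the tip OIDs listed in a bundle header."""
    tips = set()
    for line in header.split(b"\n")[1:]:  # skip the signature line
        if not line:
            break  # an empty line ends the header; pack data follows
        if line[:1] in (b"-", b"@"):
            continue  # prerequisite line or v3 capability line
        tips.add(line.split(b" ", 1)[0].decode("ascii"))
    return tips


def can_cancel_download(header: bytes, present_oids: set) -> bool:
    """True when every advertised tip is already in the client repository."""
    return advertised_tips(header) <= present_oids
```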
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_fetching_with_bundle_uris">Fetching with Bundle URIs</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>When the client fetches new data, it can decide to fetch from bundle | |
| servers before fetching from the origin remote. This could be done via a | |
| command-line option, but it is more likely useful to use a config value | |
| such as the one specified during the clone.</p></div> | |
| <div class="paragraph"><p>The fetch operation follows the same procedure to download bundles from a | |
| bundle list (although we do <em>not</em> want to use parallel downloads here). We | |
| expect that the process will end when all prerequisite commit OIDs in a | |
| thin bundle are already in the object database.</p></div> | |
| <div class="paragraph"><p>When using the <code>creationToken</code> heuristic, the client can avoid downloading | |
| any bundles if their creation tokens are not larger than the stored | |
| creation token. After fetching new bundles, Git updates this local | |
| creation token.</p></div> | |
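| <div class="paragraph"><p>As a rough illustration of this rule, the Python sketch below filters an | |
| advertised list against the stored creation token. The function names and the | |
| dictionary shape (bundle id mapped to <code>creationToken</code>) are assumptions made for | |
| the example and do not describe Git's internal data structures.</p></div> | |

```python
def bundles_to_fetch(advertised, stored_token):
    """Return ids of bundles newer than stored_token, newest first.

    `advertised` maps a bundle id to its creationToken value.
    """
    newer = {bid: tok for bid, tok in advertised.items() if tok > stored_token}
    return sorted(newer, key=newer.get, reverse=True)


def updated_token(advertised, fetched_ids, stored_token):
    """Return the creation token to store after fetching `fetched_ids`."""
    return max([stored_token] + [advertised[bid] for bid in fetched_ids])
```

| <div class="paragraph"><p>If no advertised token exceeds the stored one, the returned list is empty | |
| and the client can skip bundle downloads entirely.</p></div> | |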
| <div class="paragraph"><p>If the bundle provider does not provide a heuristic, then the client | |
| should attempt to inspect the bundle headers before downloading the full | |
| bundle data in case the bundle tips already exist in the client | |
| repository.</p></div> | |
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_error_conditions">Error Conditions</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>If the Git client discovers something unexpected while downloading | |
| information according to a bundle URI or the bundle list found at that | |
| location, then Git can ignore that data and continue as if it was not | |
| given a bundle URI. The remote Git server is the ultimate source of truth, | |
| not the bundle URI.</p></div> | |
| <div class="paragraph"><p>Here are a few example error conditions:</p></div> | |
| <div class="ulist"><ul> | |
| <li> | |
| <p> | |
| The client fails to connect with a server at the given URI or a connection | |
| is lost without any chance to recover. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The client receives a 400-level response (such as <code>404 Not Found</code> or | |
| <code>401 Unauthorized</code>). The client should use the credential helper to | |
| find and provide a credential for the URI, but match the semantics of | |
| Git’s other HTTP protocols in terms of handling specific 400-level | |
| errors. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The server reports any other failure response. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The client receives data that is not parsable as a bundle or bundle list. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| A bundle includes a filter that does not match expectations. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| The client cannot unbundle the bundles because the prerequisite commit OIDs | |
| are not in the object database and there are no more bundles to download. | |
| </p> | |
| </li> | |
| </ul></div> | |
| <div class="paragraph"><p>There are also situations that could be seen as wasteful, but are not | |
| error conditions:</p></div> | |
| <div class="ulist"><ul> | |
| <li> | |
| <p> | |
| The downloaded bundles contain more information than is requested by | |
| the clone or fetch request. A primary example is if the user requests | |
| a clone with <code>--single-branch</code> but downloads bundles that store every | |
| reachable commit from all <code>refs/heads/*</code> references. This might be | |
| initially wasteful, but perhaps these objects will become reachable by | |
| a later ref update that the client cares about. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| A bundle download during a <code>git fetch</code> contains objects already in the | |
| object database. This is probably unavoidable if we are using bundles | |
| for fetches, since the client will almost always be slightly ahead of | |
| the bundle servers after performing its "catch-up" fetch to the remote | |
| server. This extra work is most wasteful when the client is fetching | |
| much more frequently than the server is computing bundles, such as if | |
| the client is using hourly prefetches with background maintenance, but | |
| the server is computing bundles weekly. For this reason, the client | |
| should not use bundle URIs for fetch unless the server has explicitly | |
| recommended it through a <code>bundle.heuristic</code> value. | |
| </p> | |
| </li> | |
| </ul></div> | |
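| <div class="paragraph"><p>The "ignore and continue" rule for the error conditions above, combined with | |
| the fallback allowed under <code>bundle.mode=any</code>, might look like the Python sketch | |
| below. Both <code>first_working_uri</code> and the <code>download</code> callable are hypothetical, | |
| and the sketch assumes that download failures surface as <code>OSError</code>.</p></div> | |

```python
def first_working_uri(uris, download):
    """Try each bundle URI in order; return (uri, data) for the first success.

    `download` is a hypothetical callable that raises OSError on failure.
    Returns (None, None) when every URI fails, in which case the caller
    proceeds as if no bundle URI had been given.
    """
    for uri in uris:
        try:
            return uri, download(uri)
        except OSError:
            continue  # ignore this URI and fall back to the next one
    return None, None
```

| <div class="paragraph"><p>Whatever this returns, the client still fetches from the remote Git server | |
| afterwards, since that server remains the source of truth.</p></div> | |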
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_example_bundle_provider_organization">Example Bundle Provider Organization</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>The bundle URI feature is intentionally designed to be flexible to | |
| different ways a bundle provider wants to organize the object data. | |
| However, it can be helpful to have a complete organization model described | |
| here so providers can start from that base.</p></div> | |
| <div class="paragraph"><p>This example organization is a simplified model of what is used by the | |
| GVFS Cache Servers (see section near the end of this document) which have | |
| been beneficial in speeding up clones and fetches for very large | |
| repositories, although using extra software outside of Git.</p></div> | |
| <div class="paragraph"><p>The bundle provider deploys servers across multiple geographies. Each | |
| server manages its own bundle set. The server can track a number of Git | |
| repositories, but provides a bundle list for each based on a pattern. For | |
| example, when mirroring a repository at <code>https://<domain>/<org>/<repo></code> | |
| the bundle server could have its bundle list available at | |
| <code>https://<server-url>/<domain>/<org>/<repo></code>. The origin Git server can | |
| list all of these servers under the "any" mode:</p></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle] | |
| version = 1 | |
| mode = any</code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "eastus"] | |
| uri = https://eastus.example.com/<domain>/<org>/<repo></code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "europe"] | |
| uri = https://europe.example.com/<domain>/<org>/<repo></code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "apac"] | |
| uri = https://apac.example.com/<domain>/<org>/<repo></code></pre> | |
| </div></div> | |
| <div class="paragraph"><p>This "list of lists" is static and only changes if a bundle server is | |
| added or removed.</p></div> | |
| <div class="paragraph"><p>Each bundle server manages its own set of bundles. The initial bundle list | |
| contains only a single bundle, containing all of the objects received from | |
| cloning the repository from the origin server. The list uses the | |
| <code>creationToken</code> heuristic and a <code>creationToken</code> is made for the bundle | |
| based on the server’s timestamp.</p></div> | |
| <div class="paragraph"><p>The bundle server runs regularly-scheduled updates for the bundle list, | |
| such as once a day. During this task, the server fetches the latest | |
| contents from the origin server and generates a bundle containing the | |
| objects reachable from the latest origin refs, but not contained in a | |
| previously-computed bundle. This bundle is added to the list, with care | |
| that the <code>creationToken</code> is strictly greater than the previous maximum | |
| <code>creationToken</code>.</p></div> | |
| <div class="paragraph"><p>When the bundle list grows too large, say more than 30 bundles, then the | |
| oldest "<em>N</em> minus 30" bundles are combined into a single bundle. This | |
| bundle’s <code>creationToken</code> is equal to the maximum <code>creationToken</code> among the | |
| merged bundles.</p></div> | |
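| <div class="paragraph"><p>The roll-up rule can be illustrated with a small Python sketch that tracks | |
| only the <code>creationToken</code> values. The <code>roll_up</code> function and its <code>keep</code> parameter | |
| are hypothetical, and the sketch reads the rule literally: the oldest | |
| "<em>N</em> minus <code>keep</code>" entries collapse into one entry that carries their maximum | |
| token.</p></div> | |

```python
def roll_up(tokens, keep=30):
    """Merge the oldest entries of a bundle list once it exceeds `keep`.

    `tokens` holds the creationToken of each bundle in the list. The result
    is the token list after merging, in increasing order.
    """
    ordered = sorted(tokens)
    excess = len(ordered) - keep
    if excess <= 0:
        return ordered  # list is small enough; nothing to merge
    merged_token = max(ordered[:excess])  # token assigned to the merged bundle
    return [merged_token] + ordered[excess:]
```

| <div class="paragraph"><p>Because the merged bundle keeps the maximum token of its inputs, it still | |
| sorts before every bundle that was left unmerged.</p></div> | |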
| <div class="paragraph"><p>An example bundle list is provided here, although it only has two daily | |
| bundles and not a full list of 30:</p></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle] | |
| version = 1 | |
| mode = all | |
| heuristic = creationToken</code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "2022-02-13-1644770820-daily"] | |
| uri = https://eastus.example.com/<domain>/<org>/<repo>/2022-02-13-1644770820-daily.bundle | |
| creationToken = 1644770820</code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "2022-02-09-1644442601-daily"] | |
| uri = https://eastus.example.com/<domain>/<org>/<repo>/2022-02-09-1644442601-daily.bundle | |
| creationToken = 1644442601</code></pre> | |
| </div></div> | |
| <div class="literalblock"> | |
| <div class="content"> | |
| <pre><code>[bundle "2022-02-02-1643842562"] | |
| uri = https://eastus.example.com/<domain>/<org>/<repo>/2022-02-02-1643842562.bundle | |
| creationToken = 1643842562</code></pre> | |
| </div></div> | |
| <div class="paragraph"><p>To avoid storing and serving object data in perpetuity despite becoming | |
| unreachable in the origin server, this bundle merge can be more careful. | |
| Instead of taking an absolute union of the old bundles, the merged bundle | |
| can be created by looking at the newer bundles and ensuring that their | |
| necessary commits are all available in this merged bundle (or in another | |
| one of the newer bundles). This allows "expiring" object data that is not | |
| being used by new commits in this window of time. That data could be | |
| reintroduced by a later push.</p></div> | |
| <div class="paragraph"><p>This data organization has two main goals. First, initial | |
| clones of the repository become faster by downloading precomputed object | |
| data from a closer source. Second, <code>git fetch</code> commands can be faster, | |
| especially if the client has not fetched for a few days. However, if a | |
| client does not fetch for 30 days, then the bundle list organization would | |
| cause redownloading a large amount of object data.</p></div> | |
| <div class="paragraph"><p>One way to make this organization more useful to users who fetch frequently | |
| is to have more frequent bundle creation. For example, bundles could be | |
| created every hour, and then once a day those "hourly" bundles could be | |
| merged into a "daily" bundle. The daily bundles are merged into the | |
| oldest bundle after 30 days.</p></div> | |
| <div class="paragraph"><p>It is recommended that this bundle strategy is repeated with the <code>blob:none</code> | |
| filter if clients of this repository are expecting to use blobless partial | |
| clones. This list of blobless bundles stays in the same list as the full | |
| bundles, but uses the <code>bundle.<id>.filter</code> key to separate the two groups. | |
| For very large repositories, the bundle provider may want to <em>only</em> provide | |
| blobless bundles.</p></div> | |
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_implementation_plan">Implementation Plan</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>This design document is being submitted on its own as an aspirational | |
| document, with the goal of implementing all of the mentioned client | |
| features over the course of several patch series. Here is a potential | |
| outline for submitting these features:</p></div> | |
| <div class="olist arabic"><ol class="arabic"> | |
| <li> | |
| <p> | |
| Integrate bundle URIs into <code>git clone</code> with a <code>--bundle-uri</code> option. | |
| This will include a new <code>git fetch --bundle-uri</code> mode for use as the | |
| implementation underneath <code>git clone</code>. The initial version here will | |
| expect a single bundle at the given URI. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| Implement the ability to parse a bundle list from a bundle URI and | |
| update the <code>git fetch --bundle-uri</code> logic to properly distinguish | |
| between <code>bundle.mode</code> options. Specifically design the feature so | |
| that the config format parsing feeds a list of key-value pairs into the | |
| bundle list logic. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| Create the <code>bundle-uri</code> protocol v2 command so Git servers can advertise | |
| bundle URIs using the key-value pairs. Plug into the existing key-value | |
| input to the bundle list logic. Allow <code>git clone</code> to discover these | |
| bundle URIs and bootstrap the client repository from the bundle data. | |
| (This choice is an opt-in via a config option and a command-line | |
| option.) | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| Allow the client to understand the <code>bundle.heuristic</code> configuration key | |
| and the <code>bundle.<id>.creationToken</code> heuristic. When <code>git clone</code> | |
| discovers a bundle URI with <code>bundle.heuristic</code>, it configures the client | |
| repository to check that bundle URI during later <code>git fetch <remote></code> | |
| commands. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| Allow clients to discover bundle URIs during <code>git fetch</code> and configure | |
| a bundle URI for later fetches if <code>bundle.heuristic</code> is set. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| Implement the "inspect headers" heuristic to reduce data downloads when | |
| the <code>bundle.<id>.creationToken</code> heuristic is not available. | |
| </p> | |
| </li> | |
| </ol></div> | |
| <div class="paragraph"><p>As these features are reviewed, this plan might be updated. We also expect | |
| that new designs will be discovered and implemented as this feature | |
| matures and becomes used in real-world scenarios.</p></div> | |
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_related_work_packfile_uris">Related Work: Packfile URIs</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>The Git protocol already has a capability where the Git server can list | |
| a set of URLs along with the packfile response when serving a client | |
| request. The client is then expected to download the packfiles at those | |
| locations in order to have a complete understanding of the response.</p></div> | |
| <div class="paragraph"><p>This mechanism is used by the Gerrit server (implemented with JGit) and | |
| has been effective at reducing CPU load and improving user performance for | |
| clones.</p></div> | |
| <div class="paragraph"><p>A major downside to this mechanism is that the origin server needs to know | |
| <em>exactly</em> what is in those packfiles, and the packfiles need to be available | |
| to the user for some time after the server has responded. This coupling | |
| between the origin and the packfile data is difficult to manage.</p></div> | |
| <div class="paragraph"><p>Further, this implementation is extremely hard to make work with fetches.</p></div> | |
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_related_work_gvfs_cache_servers">Related Work: GVFS Cache Servers</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>The GVFS Protocol [2] is a set of HTTP endpoints designed independently of | |
| the Git project before Git’s partial clone was created. One feature of this | |
| protocol is the idea of a "cache server" which can be colocated with build | |
| machines or developer offices to transfer Git data without overloading the | |
| central server.</p></div> | |
| <div class="paragraph"><p>The endpoint that VFS for Git is famous for is the <code>GET /gvfs/objects/{oid}</code> | |
| endpoint, which allows downloading an object on-demand. This is a critical | |
| piece of the filesystem virtualization of that product.</p></div> | |
| <div class="paragraph"><p>However, a more subtle need is the <code>GET /gvfs/prefetch?lastPackTimestamp=<t></code> | |
| endpoint. Given an optional timestamp, the cache server responds with a list | |
| of precomputed packfiles containing the commits and trees that were introduced | |
| in those time intervals.</p></div> | |
| <div class="paragraph"><p>The cache server computes these "prefetch" packfiles using the following | |
| strategy:</p></div> | |
| <div class="olist arabic"><ol class="arabic"> | |
| <li> | |
| <p> | |
| Every hour, an "hourly" pack is generated with a given timestamp. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| Nightly, the previous 24 hourly packs are rolled up into a "daily" pack. | |
| </p> | |
| </li> | |
| <li> | |
| <p> | |
| Nightly, all prefetch packs more than 30 days old are rolled up into | |
| one pack. | |
| </p> | |
| </li> | |
| </ol></div> | |
| <div class="paragraph"><p>When a user runs <code>gvfs clone</code> or <code>scalar clone</code> against a repo with cache | |
| servers, the client requests all prefetch packfiles, which is at most | |
| <code>24 + 30 + 1</code> packfiles containing only commits and trees. The client | |
| then follows with a request to the origin server for the references, and | |
| attempts to check out that tip reference. (There is an extra endpoint that | |
| helps get all reachable trees from a given commit, in case that commit | |
| was not already in a prefetch packfile.)</p></div> | |
| <div class="paragraph"><p>During a <code>git fetch</code>, a hook requests the prefetch endpoint using the | |
| most-recent timestamp from a previously-downloaded prefetch packfile. | |
| Only the packfiles with later timestamps are downloaded. Most | |
| users fetch hourly, so they get at most one hourly prefetch pack. Users | |
| whose machines have been off or otherwise have not fetched in over 30 days | |
| might redownload all prefetch packfiles. This is rare.</p></div> | |
| <div class="paragraph"><p>It is important to note that the clients always contact the origin server | |
| for the refs advertisement, so the refs are frequently "ahead" of the | |
| prefetched pack data. The missing objects are downloaded on-demand using | |
| the <code>GET /gvfs/objects/{oid}</code> requests, when needed by a command such as | |
| <code>git checkout</code> or <code>git log</code>. Some Git optimizations disable checks that | |
| would cause these on-demand downloads to be too aggressive.</p></div> | |
| </div> | |
| </div> | |
| <div class="sect1"> | |
| <h2 id="_see_also">See Also</h2> | |
| <div class="sectionbody"> | |
| <div class="paragraph"><p>[1] <a href="https://lore.kernel.org/git/[email protected]/">https://lore.kernel.org/git/[email protected]/</a> | |
| An earlier RFC for a bundle URI feature.</p></div> | |
| <div class="paragraph"><p>[2] <a href="https://github.com/microsoft/VFSForGit/blob/master/Protocol.md">https://github.com/microsoft/VFSForGit/blob/master/Protocol.md</a> | |
| The GVFS Protocol</p></div> | |
| </div> | |
| </div> | |
| </div> | |
| <div id="footnotes"><hr /></div> | |
| <div id="footer"> | |
| <div id="footer-text"> | |
| Last updated | |
| 2023-04-17 18:42:49 PDT | |
| </div> | |
| </div> | |
| </body> | |
| </html> | |