Multiple Git Repositories #6770

Closed
carlescufi opened this issue Mar 23, 2018 · 51 comments
Labels: Feature (a planned feature with a milestone); priority: high (high impact/importance bug)

Comments

@carlescufi
Member

carlescufi commented Mar 23, 2018

Important: The west multi-repo model is discussed and tracked in this document

This issue covers splitting the current zephyr Git repository into multiple ones, and having a tool to manage multiple repositories and contribute to them.

Note: We have edited this issue's description in order to reflect the outcome of all the discussions and work that has taken place since the issue was first raised.

Motivation

Zephyr should avoid mixing external code with original code for the following reasons:

  • Repo size (-)
  • License, IP restrictions (---)
  • Unused code that might contaminate actual use code (--)
  • Marketing reasons, PR, perceptions (-)
  • Customer A might not be interested in anything that is Vendor A related (-)
  • Simplify development outside of the Zephyr tree, including applications and libraries (--)
  • Provide integration with external projects such as MCUboot (-)

We have focused on being able to retrieve a subset of repositories without having to modify the upstream tree, on the ability to maintain downstream forks that replace a subset of repositories and on full support for Linux, macOS and Windows. There is a reason we are doing this, and it is exclusively related to trying to adapt to how the embedded world deals with software today.
It is of critical importance that, in a project that intends to provide a one-stop-shop solution for embedded development, we make it easy for distributors, silicon vendors and product developers to easily replace bits and pieces with proprietary software or forks of open source projects where required.

Requirements

  • Ability to retrieve all required repositories from the command-line
  • Ability to place repositories outside of the main zephyr tree
  • Ability to maintain a manifest with pinned revisions or tracking branches
  • Ability to retrieve only a subset of repositories
  • Ability to remove or replace a subset of repositories without modifying the upstream zephyr tree
  • Ability to work on Linux, macOS and Windows
  • Ability to bisect the main upstream zephyr tree carrying along exact revisions of the projects during the bisection. This implies tracking projects using exact SHAs upstream.
  • Ability to manage (create, delete, push, pull, rebase, etc.) project branches/revisions in diverse situations: locally, in collaboration with other developers, for upstream review
  • Ability to query global status information about local projects (branches, commits, diffs)
  • Ability to manage the current projects in the manifest (list project names and their paths, run commands in multiple projects)
  • Ability to manage the manifest as a first class entity (pin revisions, semantic diff between versions, distribute versions to others without affecting any other component in the system)
  • Ability for contributors to locally reproduce any builds performed by upstream CI without having to modify the vanilla manifest file.
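The subset and replacement requirements above can be pictured with a small sketch. It assumes a manifest is just a list of project records; the field names (`name`, `url`, `revision`), the helper names and the example URLs are all hypothetical and are not west's actual schema or API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Project:
    """One repository entry in a hypothetical manifest."""
    name: str
    url: str
    revision: str  # a branch name or an exact SHA


def select(projects, wanted):
    """Return only the projects named in `wanted`, keeping manifest order."""
    wanted = set(wanted)
    return [p for p in projects if p.name in wanted]


def override(projects, replacements):
    """Return a new project list with some entries swapped out.

    `replacements` maps a project name to field overrides (for example a
    fork URL). The upstream list itself is never modified.
    """
    return [replace(p, **replacements.get(p.name, {})) for p in projects]


# Illustrative data only; these URLs and revisions are made up.
UPSTREAM = [
    Project("zephyr", "https://example.com/upstream/zephyr", "master"),
    Project("net-tools", "https://example.com/upstream/net-tools", "master"),
    Project("mcuboot", "https://example.com/upstream/mcuboot", "master"),
]

# Retrieve only a subset of repositories.
subset = select(UPSTREAM, ["zephyr", "mcuboot"])

# A downstream fork replaces one repository without touching UPSTREAM.
downstream = override(
    UPSTREAM, {"mcuboot": {"url": "https://example.com/fork/mcuboot"}}
)
```

Keeping the upstream list immutable and layering overrides on top is one way to read "without modifying the upstream zephyr tree"; other designs are possible.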

Conclusion

We will use west, a Zephyr meta-tool to manage multiple repositories, using a manifest to define the set of repositories, their revisions and other metadata.

FAQ

Why a single tool?

It has been argued that different functionality (repository management, flashing, debugging, etc.) belongs in different tools instead of trying to come up with a Swiss Army knife that does it all.
While the argument has weight and value, after careful consideration we have decided to provide a single entry point to the west functionality in order to simplify the user experience. That does not mean that all of the code needs to be in a single place, and in fact west uses an extension mechanism that allows us to place the implementation of different west commands in separate repositories, including the build, flash and debug commands in the zephyr repository where they live now (see scripts/west-commands.yml).
There are many examples of tools similar in scope to west:

  • Go's go command-line tool
  • Mynewt's newt command-line tool
  • Mbed's mbed-cli command-line tool
  • CMake's ability to combine fetching external projects with building

Why is it called west?

See here.

Why Python?

Because it's cross-platform, many of our users already know it and most important of all, it is already a dependency for Zephyr.

Why not use Google's repo?

  1. It is Python 2 only
  2. It requires an administrative command prompt on Windows due to its use of symlinks
  3. Its code review system is hardcoded to gerrit
  4. It is poorly documented and maintained ad-hoc
  5. It is not suited to the zephyr upstream multi-repo model of a central repository (zephyr) that holds all of the core code and the manifest itself

Why not use Git submodules?

There would be two possible ways of using submodules with Zephyr:

  1. Add submodules to the main zephyr repository. This would not meet some of the requirements, in particular the ability to retrieve only a subset of repositories, since the paths and existence of those would be hardcoded.
  2. Create a new "meta" repository which only contains submodules to other repos. This would be equivalent to a "manifest" repository. This option would satisfy most of the requirements, but it would require an additional commit on the "meta" repository every time anything is committed to any of the repos.

Neither would fully cover all of the requirements described in the Requirements section.

Additionally, the conclusion to use a meta-tool for multiple uses makes submodules less of a good fit.
Finally, using a meta-tool should not preclude users from still using submodules if they prefer to do so.

Unresolved issues

Unresolved issues before we can split the main repository into multiple ones:

  1. PRs across multiple repos: How to match and retrieve a PR that spans multiple repositories.
    Possible solutions:

    • Use branch names: Same branch name across all repositories, including manifest repo
    • Use an equivalent of a Changeset ID
  2. Upmerging forks (taking upstream into a fork): Forks will use a different manifest, with a different set of repositories. Some will be common, some not. Today one can do: git fetch upstream, git merge upstream/master.
    Possible solutions:

    • west upmerge <repo list> ?
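The "same branch name across all repositories" option for issue 1 above could be sketched roughly as follows. This is only an illustration under the assumption that the tool already knows which branches exist in each repository; the function name and the sample data are hypothetical.

```python
def repos_for_change(branches_by_repo, change_branch):
    """Return the sorted names of repositories carrying `change_branch`.

    `branches_by_repo` maps a repository name to the set of branch names
    available in it (how that set is obtained is out of scope here).
    """
    return sorted(
        repo
        for repo, branches in branches_by_repo.items()
        if change_branch in branches
    )


# Made-up branch listings for three repositories.
BRANCHES = {
    "zephyr": {"master", "feature-x"},
    "manifest": {"master", "feature-x"},
    "net-tools": {"master"},
}

# The cross-repository change "feature-x" touches two of the three repos.
touched = repos_for_change(BRANCHES, "feature-x")
```

A Changeset-ID approach would replace the branch-name lookup with a lookup on some identifier stored in each commit or PR.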
@carlescufi
Member Author

carlescufi commented Mar 23, 2018

Options to achieve this goal:

  1. Fork Google's repo tool and make it work on Windows properly
  2. Take parts or fragments from either repo or gclient inside depot_tools and write our own tool that, in time, can also fulfill the requirements in Command-line Zephyr meta-tool #6205

Additional tools that achieve similar objectives:

  • jiri used in Google's Fuchsia

@zephyrproject-rtos zephyrproject-rtos deleted a comment from mbolivar Mar 23, 2018
@jukkar
Member

jukkar commented Mar 25, 2018

Perhaps this is discussed already in other forums but why do we need to have multiple git repositories in the first place?

@mbolivar
Contributor

Perhaps this is discussed already in other forums but why do we need to have multiple git repositories in the first place?

In my view, Zephyr already has multiple Git repositories. Examples:

https://github.com/zephyrproject-rtos/zephyr
https://github.com/zephyrproject-rtos/Kconfiglib
https://github.com/zephyrproject-rtos/net-tools

At least zephyr and net-tools are already required to use many networking samples in Zephyr in important cases.

Just as Zephyr's networking subsystem already benefits from multiple repositories, why would other areas not also potentially find this useful?

@jukkar
Member

jukkar commented Mar 26, 2018

Just as Zephyr's networking subsystem already benefits from multiple repositories, why would other areas not also potentially find this useful?

I am not questioning this issue. I was just wondering about the reasoning, because the issue started to talk about requirements but was not describing the "why" part.

@nashif
Member

nashif commented Mar 26, 2018

@jukkar good point. Updated with the "why" part. We had this documented somewhere else.

@locomuco
Contributor

what also could be considered:

e.g. at the moment zephyr master is working with net-tools master, but there is no pinning to a specific version, as there would be with git submodules

@MaureenHelm MaureenHelm added the Feature A planned feature with a milestone label Mar 28, 2018
@nashif nashif added this to v1.12 in Release Plan Apr 3, 2018
@nashif nashif moved this from v1.12 to v1.13 in Release Plan May 9, 2018
@carlescufi
Member Author

@locomuco definitely. net-tools can (and probably will) be part of the default manifest

@carlescufi
Member Author

Relevant PR: #7338

@ulfalizer
Collaborator

ulfalizer commented Aug 8, 2018

@carlescufi @SebastianBoe @mbolivar
Working on multiple repository support in West at the moment.

Random brain dump below:

In some previous discussion, people (can't remember who) said they'd prefer if west sync checked out the repositories on a local branch instead of with a detached HEAD (git-repo uses a detached HEAD, if I understand it right).

One advantage of having a branch checked out is that git status automatically gives sensible output (I'm not a Git expert, so that makes it even nicer). It might be less confusing when working manually on the repositories too.

IIRC, someone also said that rebasing on sync in git-repo is confusing. I'm not sure what the alternative would be there though. Throwing away local changes seems less useful (even if the changes can be recovered).

Having a local branch leads to some tricky design decisions:

  • If the repository is on some other branch, what should west sync do? Just rebase it on top of the original branch? Switch back to the original branch? Should the original branch be updated as well?

  • If the user is in a detached HEAD state, what should happen?

  • What should happen if the repository is in some other weird state, e.g. in the middle of a git rebase? Just bail out?

  • Probably other stuff I haven't thought of...

Detached HEAD might be simpler to implement and less "magic". No juggling with local branches. I have a prototype working for that (though there's probably a lot of robustness stuff to add). I could try to do the branch thing though and see if it runs into other trickiness.
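The open questions above amount to a small decision table: what state is the repository in, and what should a sync do in that state? A minimal sketch follows. The state names, the `classify` inputs and especially the `POLICY` table are assumptions made for illustration; the thread does not settle on any of these answers.

```python
from enum import Enum


class RepoState(Enum):
    ON_MANIFEST_BRANCH = "on the branch tracked by the manifest"
    ON_OTHER_BRANCH = "on some other local branch"
    DETACHED_HEAD = "detached HEAD"
    MID_REBASE = "in the middle of a rebase"


def classify(current_branch, manifest_branch, rebase_in_progress):
    """Map a few observable facts about a repository onto a RepoState.

    `current_branch` is None when HEAD is detached.
    """
    if rebase_in_progress:
        return RepoState.MID_REBASE
    if current_branch is None:
        return RepoState.DETACHED_HEAD
    if current_branch == manifest_branch:
        return RepoState.ON_MANIFEST_BRANCH
    return RepoState.ON_OTHER_BRANCH


# One possible policy, chosen only to make the sketch concrete.
POLICY = {
    RepoState.ON_MANIFEST_BRANCH: "rebase",
    RepoState.ON_OTHER_BRANCH: "switch-then-rebase",
    RepoState.DETACHED_HEAD: "bail-out",
    RepoState.MID_REBASE: "bail-out",
}


def sync_action(current_branch, manifest_branch, rebase_in_progress=False):
    """Return the (hypothetical) action a sync would take for this repository."""
    return POLICY[classify(current_branch, manifest_branch, rebase_in_progress)]
```

Writing the policy as data makes each tricky case explicit, so changing a design decision means changing one table entry rather than control flow.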

Another random thing I thought of: Might want to use "branch" instead of "revision" in default.yml, if it's always supposed to be a branch name.

Bit worried that we're reinventing the wheel here too. git-repo is probably mature and stable at this point, with a lot of devs with more Git internals experience having worked on it.

@ulfalizer
Collaborator

Hmz... maybe the sanest thing if we go for the branch thing would be to always switch over to it and then rebase (git pull --rebase or some equivalent), leaving rebasing of any other branches up to the user. That's simple enough to understand.

Might be able to switch back to the previous location too, with git checkout - (just discovered that one).
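The "switch over, rebase, switch back" idea can be written down as a plain command sequence. The sketch below only builds the git invocations and does not run them; the function name and its parameters are hypothetical, and a real tool would also need to handle errors and the case where the repository is already on the manifest branch.

```python
def sync_commands(manifest_branch, return_to_previous=True):
    """Build the git commands for one repository, in the order to run them.

    1. switch to the local branch that tracks the manifest revision
    2. fetch and rebase that branch ("git pull --rebase")
    3. optionally go back to where the user was ("git checkout -")
    """
    commands = [
        ["git", "checkout", manifest_branch],
        ["git", "pull", "--rebase"],
    ]
    if return_to_previous:
        commands.append(["git", "checkout", "-"])
    return commands


plan = sync_commands("master")
```

Each entry could be passed to `subprocess.run(cmd, cwd=repo_path, check=True)`; rebasing any other branches is left to the user, as suggested above.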

ulfalizer added a commit to ulfalizer/west that referenced this issue Aug 9, 2018
This RFC adds manifest parsing and three basic commands (sync, diff, and
status).

More error checking needs to be added. This is mostly to get some
feedback on the approach. There are some cases that turn tricky if you
always keep a local branch to avoid a detached HEAD.

I'm wondering if 'revision' is supposed to always point to a branch (as
opposed to e.g. a SHA). SHAs would be more flexible, but make it even
trickier to keep a local branch.

I've written a bit in
zephyrproject-rtos/zephyr#6770 as well.

Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
@nashif nashif moved this from v1.13 to v1.14 in Release Plan Aug 21, 2018
@tautologyclub
Contributor

@marc-hb The thing is though that the submodules @carlescufi mentions fit that bill pretty nicely - external vendor HALs, external cbor libs, mbedtls, etc. They're supposed to be more or less static in the context. As soon as you try to find a solution for subrepos that don't fit that description, you're flying very close to the sun and will probably, after much time and effort spent, find that you're running into the same design headaches that the guys behind repo/submodules ran into and couldn't find an elegant solution to.

@mbolivar
Contributor

Hi @tautologyclub and thanks very much for your comments!

You could either make a new project where the Zephyr kernel itself is a submodule (would be fine imo), or you could interpret "wrapper around submodules" a bit more liberally to implement some additional features -- or rather, have it present some well-defined interface (such as a specific directory structure) that the build system can infer stuff from reliably. Either way it seems to me that submodules would be pretty well suited to handle the bulk of the logic.

The devil is very much in the details here, I'm afraid.

I encourage you to try fleshing out the "have it present some well-defined interface (such as a specific directory structure) that the build system can infer stuff from reliably" idea in detail for Zephyr.

We certainly tried approaches like that (see #7338 for one example from one of Zephyr's core maintainers), and they all fell down in one use case or another.

Some other comments follow.

but it would require an additional commit on the "meta" repository every time anything is committed to any of the repos.

I claim that this can't be done in a sane way when you're integrating repositories from multiple external sources without doing what this comment by @marc-hb proposes...

Yes and the only vaguely sensible workaround is automate commits in such a repo.

... which is exactly what we (my company, foundries.io, which has been helping a bit with west) have been doing with google repo in a Zephyr project for quite some time (about 2 years).

We tried to push a similar approach upstream in west. It was dead on arrival as it was a dealbreaker for some users. We didn't learn this until quite late in the development cycle; recovering the design was a bit of a last minute roller coaster :).

They're supposed to be more or less static in the context.

("They're" above is referring to "vendor HALs, external cbor libs, mbedtls, etc.".)

I disagree about this "supposed to be". I think it's far more common that users will want to mix and match one or two components but leave the rest mostly the same. I think having a manifest file like repo or west (or a DEPS file like chromium, etc.) makes this easier than alternatives I've seen. And I must say I don't agree with the idea we should just give up if we can't make them static as described here:

As soon as you try to find a solution for subrepos that don't fit that description, you're flying very close to the sun and will probably, after much time and effort spent, find that you're running into the same design headaches that the guys behind repo/submodules ran into and couldn't find an elegant solution to.

I certainly do feel that we've run into many of the same design headaches they have in the past year plus that we have been working on west and trying to gather requirements for and from Zephyr users.

I suppose we'll see if we've missed any critical ones and our wings melt, or not. I am sure that the fun is far from over. I hope you wish us luck!

@mbolivar
Contributor

Hi there @marc-hb:

I would just avoid vague terms like "one-stop shop" and focus instead on actual examples/benefits/features that only a tighter integration of versioning+building can achieve. Providing the references above is great, summarising some of their integration benefits here would be better.

Bootloader (and in particular, MCUboot) integration is a killer app for me personally.

I've maintained some out of tree scripts for a while now that make it easier to build and flash Zephyr images that are child-loaded by MCUboot, and I can tell you from experience both using them myself and helping out members of my team that are less deeply invested in the details of the Zephyr build system than I am that it is really nice to have (especially since we are a remote work company across many timezones, simplicity of interface is a big win UX wise). A lot of the differences between boards and flashing mechanisms can be suitably abstracted away if you have a tool that understands not just the build system but also the details of flashing different zephyr boards with different backends.

We (my company) are also working on some additional tooling for doing automated testing of a multi-repo tree across multiple boards and sample applications -- think shippable but real Zephyr hardware. West integration is a nice selling point as we can rely on the above and extend it.

I would get it if that feels like vaporware to you -- and, from what's available upstream, some of it is (though not all of it, as I've been steadily upstreaming the bootloader integration and plan on finishing the job this week once west is finally merged and part of the core workflow). But I invite you to watch the space as there is real code somewhere in the haze :)

@mbolivar
Contributor

mbolivar commented Jan 29, 2019

Hi @pfalcon

And I'm personally 80% sure that there wouldn't be such an urge to have own cute tool if Mynewt didn't have it ;-).

Given your winky emoticon I am not sure whether you meant this as a joke, but I can assure you I am 100% sure that newt has nothing to do with it from where I am sitting. The features provided by the Android build system (the one in AOSP for building entire images, not the IDE ones for building apps) and tools like repo, fastboot, and adb are a much bigger design influence on me.

@marc-hb
Collaborator

marc-hb commented Jan 29, 2019

active development in ONLY ONE git repo, all other git repos being SLOW moving and very strictly controlled dependencies

They're supposed to be more or less static in the context.

"supposed" and "more or less" isn't good enough. To prove that git submodules are a good fit you'd have to know how every Zephyr project is organized, including closed-source projects and... future projects. Looking at https://docs.google.com/document/d/1HrrMZ11nULWoAv3mR70VxT6nB_I1qMytpnmxFcVPCpM the intention is to very clearly support more than one git repo actively developed at a time.

As soon as you try to find a solution for subrepos that don't fit that description, you're flying very close to the sun, after much time and effort spent,

The high-level design of west-multirepo seems relatively close to Google's repo (at least much closer to it than to submodules). This does mean a fair amount of work but not rocket science either. So quite far from the sun ;-)

find that you're running into the same design headaches that the guys behind repo/submodules ran into and couldn't find an elegant solution to.

Google's repo may not always be "elegant" but it "does the job" - a massive amount of production work actually. Running into design headaches that have been already solved doesn't seem like a bad idea even if some of those solutions were not "elegant" and optimal. "Evolution not revolution"? Plus the ambition is (unfortunately...) limited to support only Zephyr for now.

@mbolivar
Contributor

mbolivar commented Jan 29, 2019

@marc-hb I'm sorry as I meant to reply to some of your other comments in my earlier response but I forgot and hit send. Rather than edit, I'll just add another comment here:

We believe that a one-stop shop tool is the easiest path forward in order to provide a simple user experience for inexperienced users and newcomers.

I do believe that "one stop shop" is not totally as vague as it seems on the surface :). For example, the ability to do things like, say, this:

$ west --help
usage: west [-h] [-z ZEPHYR_BASE] [-v] [-V] <command> ...

The Zephyr RTOS meta-tool.

optional arguments:
  -h, --help            show this help message and exit
  -z ZEPHYR_BASE, --zephyr-base ZEPHYR_BASE
                        Override the Zephyr base directory. The default is
                        the manifest project with path "zephyr".
  -v, --verbose         Display verbose output. May be given multiple times
                        to increase verbosity.
  -V, --version         print the program version and exit

commands for managing multiple git repositories:
  list:                 print information about projects in the west
                        manifest
  diff:                 "git diff" for one or more projects
  status:               "git status" for one or more projects
  update:               update projects described in west.yml
  selfupdate:           selfupdate the west repository
  forall:               run a command in one or more local projects

commands from project at "zephyr":
  build:                compile a Zephyr application
  flash:                flash and run a binary on a board
  debug:                flash and interactively debug a Zephyr application
  debugserver:          connect to board and launch a debug server
  attach:               interactively debug a board

Run "west <command> -h" for detailed help on each command.

is a win that a variety of tools on the PATH -- no matter how carefully named or documented -- will not be able to match in terms of discoverability and ease of use. I think there is a reason why docker is a single command for dealing with containers -- and I think it's not crazy to have a single command for "dealing with Zephyr", which is also a somewhat isolated computing environment that you manage from a host system.
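The single-entry-point idea, with commands contributed by different projects, can be sketched with Python's standard `argparse`. This is not west's implementation: the `COMMANDS` registry, the program name `meta-tool` and the `source` attribute are assumptions made for illustration, with only the command names borrowed from the help output above.

```python
import argparse

# Hypothetical registry: where each command comes from, and its help text.
COMMANDS = {
    "built-in": {
        "list": "print information about projects in the manifest",
        "update": "update projects described in the manifest",
    },
    "zephyr": {
        "build": "compile an application",
        "flash": "flash and run a binary on a board",
    },
}


def make_parser():
    """Build one top-level parser with a subcommand per registered command."""
    parser = argparse.ArgumentParser(
        prog="meta-tool", description="A single entry point for many commands."
    )
    subparsers = parser.add_subparsers(dest="command", metavar="<command>")
    for source, commands in COMMANDS.items():
        for name, help_text in commands.items():
            sub = subparsers.add_parser(name, help=help_text)
            # Remember which project provided this command.
            sub.set_defaults(source=source)
    return parser


args = make_parser().parse_args(["flash"])
```

Here `args.command` names the chosen command and `args.source` names the project that registered it, so the implementation can live in a separate repository while users still see one tool and one `--help`.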

As long as it doesn't cause much more work I would recommend keeping the multirepo part of west as generic as possible.

Yes. We are trying. I would love this to be generic and usable as a separate tool someday too, believe me, but it's just not practical for now. But we aren't losing sight of this.

@mbolivar
Contributor

mbolivar commented Jan 29, 2019

The high-level design of west-multirepo seems relatively close to Google's repo (at least much closer to it than to submodules)

@marc-hb you are right about this.

We basically started with a reimplementation of the minimal subset of google repo that we figured we could get away with, except:

  1. written in python 3 instead of 2
  2. compatible with windows (repo's internal heavy use of symlinks makes it a no go on that platform, which is a first class citizen for zephyr -- and yes, we know about https://github.com/esrlabs/git-repo)
  3. without some of the crazy repo magic behavior (although opinions on how 'magical' west is are not uniform, in fairness)
  4. no assuming the git remote is handled by gerrit (for things like repo upload)
  5. edit: and YAML instead of XML. I hate XML.

If you watch this (by now very out of date) status update I gave on west to the zephyr TSC, you'll hear me admit that we tried to just use repo, but couldn't, mostly because of these issues, towards the end when I raced to recap the multirepo parts:

https://www.youtube.com/watch?v=P6s0HSZAua8

At the end of the day, we had to change tack and incorporate some submodule-style features because the free-form way repo allows the individual repositories to vary did not meet the requirements of some zephyr users.

@mbolivar
Contributor

@carlescufi

That does not mean that all of the code needs to be in a single place, and we are currently looking at an extension mechanism that would allow to place the implementation of different west commands in separate repositories

This is in the issue description and needs an update

@pfalcon
Contributor

pfalcon commented Jan 29, 2019

And I'm personally 80% sure that there wouldn't be such an urge to have own cute tool if Mynewt didn't have it ;-).

Given your winky emoticon I am not sure whether you meant this as a joke, but I can assure you I am 100% sure that newt has nothing to do with it from where I am sitting. The features provided by the Android build system (the one in AOSP for building entire images, not the IDE ones for building apps) and tools like repo, fastboot, and adb are a much bigger design influence on me.

Thanks for the response. Yeah, there's a bit of joke in every joke ;-). So, a case with MyNewt and its "newt" tool must be a coincidence then ;-).

Well, seriously, all-in-one management tools are a well-known pattern in bigger IT ("python setup.py" for all things modules in Python, the management tools of application frameworks such as Django (from starting an app to fishing in its database)). In the embedded space Zephyr isn't the first either. MyNewt is an obvious affinity suspect, but then there are also mbedOS' yotta build/package management tool, and PlatformIO, which is built on this concept.

@carlescufi
Member Author

@carlescufi

That does not mean that all of the code needs to be in a single place, and we are currently looking at an extension mechanism that would allow to place the implementation of different west commands in separate repositories

This is in the issue description and needs an update

Fixed, thanks!

@carlescufi
Member Author

carlescufi commented Jan 29, 2019

Thanks @mbolivar for further commenting on the process that has led us here.

I would like to further clarify what has already been said, especially given the latest feedback by @tautologyclub and @marc-hb: west has been designed around a set of requirements and needs from some of the contributors to the Zephyr project. In particular, and as is described in this issue and in the official west documentation, we have focused on being able to retrieve a subset of repositories without having to modify the upstream tree, on the ability to maintain downstream forks that replace a subset of repositories and on full support for Linux, macOS and Windows. There is a reason we are doing this, and it is exclusively related to trying to adapt to how the embedded world deals with software today.
It is of critical importance that, in a project that intends to provide a one-stop-shop solution for embedded development, we make it easy for distributors, silicon vendors and product developers to easily replace bits and pieces with proprietary software or forks of open source projects where required.
It is also fundamental that we attract users, and again in the embedded world this means supporting Windows as a first-class citizen and also providing an interface that is as simple (yet powerful) as possible.
Finally, and as you may have read about already, Zephyr is also about security and safety. We are in the process of trying to FuSa-certify Zephyr, and that imposes its own set of requirements in how the code is presented and can be split.
All of these reasons have led us to the conclusion that we needed a tool. We certainly never wanted to develop a tool, we want to write embedded software and support as many SoCs and technologies as possible. We studied Google repo and Git submodules extensively, and simply came to the conclusion that neither would be able to fulfill our requirements.

@Vudentz
Contributor

Vudentz commented Jan 29, 2019

I'd suggest adding another requirement:

  • All tests run by CI must have their dependencies, if any, as submodules, so individuals can run those tests locally without having to switch their manifest file.

@pabigot
Collaborator

pabigot commented Jan 29, 2019

we have focused on being able to retrieve a subset of repositories without having to modify the upstream tree, on the ability to maintain downstream forks that replace a subset of repositories and on full support for Linux, macOS and Windows. There is a reason we are doing this, and it is exclusively related to trying to adapt to how the embedded world deals with software today.
It is of critical importance that, in a project that intends to provide a one-stop-shop solution for embedded development, we make it easy for distributors, silicon vendors and product developers to easily replace bits and pieces with proprietary software or forks of open source projects where required.

This is the most clear and compelling core requirement and justification related to multi-repo support I've seen. It's also the first time I've seen it stated this way.

I'm not entirely convinced that it couldn't have been satisfied with existing solutions, nor that it would withstand a rigorous validation process, but it does at least provide a basis for assessing potential solutions.

However, it's long past the point of requirements specification: we have what we have and mighta/coulda/shoulda gets us nowhere. My intent is to give west a shot and if it proves to have problems work to resolve them, develop an alternative solution, or move on in some other way.

@carlescufi
Member Author

@pabigot

This is the most clear and compelling core requirement and justification related to multi-repo support I've seen. It's also the first time I've seen it stated this way.

Thanks, I will add this to the issue description then.

@carlescufi
Member Author

I'd suggest adding another requirement:

  • All tests run by CI must have their dependencies, if any, available as submodules, so individuals can run those tests locally without having to switch their manifest file.

Agreed, will add.

@carlescufi
Member Author

carlescufi commented Jan 29, 2019

@Vudentz @pabigot I added the following requirement (which was omitted from the list by mistake):

  • Ability to bisect the main upstream zephyr tree carrying along exact revisions of the projects during the bisection. This implies tracking projects using exact SHAs upstream.
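
To illustrate what "tracking projects using exact SHAs" means in practice, here is a minimal Python sketch that flags projects whose revision is a branch or tag name rather than a full commit hash. The dictionary layout and the `hal_vendor` project name are assumptions made for illustration; this is not the actual west manifest format.

```python
import re

# A full Git commit hash: exactly 40 lowercase hexadecimal digits.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_projects(revisions):
    """Return the sorted names of projects not pinned to a full SHA.

    `revisions` maps a project name to its revision string. This is a
    hypothetical, simplified stand-in for manifest data.
    """
    return sorted(name for name, rev in revisions.items()
                  if not FULL_SHA.fullmatch(rev))


example = {
    "hal_vendor": "0123456789abcdef0123456789abcdef01234567",  # pinned
    "net-tools": "master",  # a branch name moves over time
}
print(unpinned_projects(example))  # ['net-tools']
```

A revision such as `master` points to different commits over time, so checking out an older zephyr commit during a bisection would not reproduce the project state that commit was tested with.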

@tautologyclub
Contributor

I disagree about this "supposed to be". I think it's far more common that users will want to mix and match one or two components but leave the rest mostly the same. I think having a manifest file like repo or west (or a DEPS file like chromium, etc.) makes this easier than alternatives I've seen. And I must say I don't agree with the idea we should just give up if we can't make them static as described here:

Well, the key point is "mixing and matching", not "doing active development on". It's active development that becomes annoying when using submodules/repo. Cloning an external HAL and some helper libs that you don't intend to modify can surely not be seen as a nightmare using existing tools...?

I suppose we'll see if we've missed any critical ones and our wings melt, or not. I am sure that the fun is far from over. I hope you wish us luck!

Absolutely, not trying to bring you down - it's certainly odd that no one has managed to find a good solution to this problem given the wide audience and the awkwardness of all existing tools. What I'm trying to argue for is that perhaps your needs could be satisfied WITHOUT reinventing the wheel and instead wrapping existing wheels with some python :P

@mbolivar
Contributor

It's active development that becomes annoying when using submodules/repo. Cloning an external HAL and some helper libs that you don't intend to modify can surely not be seen as a nightmare using existing tools...?

Speaking personally as someone whose company chases tip on multiple projects and usually carries out of tree patches in forks of those repositories (that rebase regularly, because those patches make their way upstream often), I do intend to modify, frequently. So while I can see your point for the "doesn't change often" situation, I don't think it covers all the users.

What I'm trying to argue for is that perhaps your needs could be satisfied WITHOUT reinventing the wheel and instead wrapping existing wheels with some python :P

Sure, and I understand that. As you've said, nobody seems to have a really good general solution to this problem, though. We're just scratching our own itch.

To echo @carlescufi, this was definitely not our first choice. If I really believed existing tools could do the job, I would have advocated for them, but I don't believe that's the case. If that turns out to be a mistake, I'm all on board to learn from it and move on. But this is the best way forward that I can see right now.

Thanks again for your feedback.

@tautologyclub
Contributor

Well, let me argue further. Here are the arguments against submodules:

[We'd have to add] submodules to the main zephyr repository. This would not meet some of the requirements, in particular the ability to retrieve only a subset of repositories, since the paths and existence of those would be hardcoded.

I can't see how this holds if you allow the plumbing to be submodules but the porcelain to be west. I'll throw up some imaginary commands to reflect upon:

// start from scratch
git clone https://blala/zephyr.git
./west-setup.sh

// create a new branch foobar, with a separate manifest.yaml
west checkout -b foobar

// parses .gitmodules and adds a new entry, also parses/adds to manifest.yaml
west add https://github.com/foobar/ ext/lib/foobar 

// perhaps also allow for non-path arguments, when the repo to be added is well-known/supported upstream
west add tinycbor

// or perhaps allow argument to be a manifest file
west add samples/net/wap_mesh/manifest_deps.yaml

// NOW we sync
west sync

To reiterate, I can't see how submodules + a thin wrapper doesn't accommodate the requirements.

@pfalcon
Contributor

pfalcon commented Jan 29, 2019

[We'd have to add] submodules to the main zephyr repository.

To start (well, continue), this is a misconception. To use submodules, you don't need to add them to the main Zephyr repo. Anybody can add submodules to their fork/clone. Anybody can update to any revision of a submodule in their branch. Anybody can replace an existing submodule with something else. When they do either of the last two actions, they will, as expected, get a conflict when upstream also updates its submodule definition. To be fair, I dunno how conflict resolution is handled in that case, but I bet that in git 2019, it's done much better than in mytool-v0.NIH-beta. And of course, someone doesn't have to check out all submodules. One can choose only those that are needed.

The whole subconscious idea here is that there must be something hard in git. It stems from the time when a developer familiar with cvs, or at most svn, stood at the entrance to git. Eerie sounds and flickering. A year later, nothing's eerie in day-to-day git, but something eerie must be lurking somewhere in the corner! The myth of frightening git submodules serves that role. The whole myth is based on mixing up "tracking and matching multiple projects is hard" with "git submodules are hard".

That said, all this discussion is rather theoretical now, given that "west" was merged into the mainline.

@marc-hb
Collaborator

marc-hb commented Jan 29, 2019

The whole myth is based on mixing up "tracking and matching multiple projects is hard" with "git submodules are hard".

I think you missed my comment about submodules above, the one where I referred to real nights and real week-ends. Plus a few thousand other people sharing their real-world experience on the Internet. All too stupid to use submodules? Most likely! A "myth": certainly not.

For git itself the best summary is this: https://stevebennett.me/2012/02/24/10-things-i-hate-about-git/ It describes perfectly my years of real-world experience as "the local git support desk" for real people across multiple projects, many of them actively (and of course wrongly) not interested in version control. Off-topic sorry.

@marc-hb
Collaborator

marc-hb commented Jan 29, 2019

A lot of the differences between boards and flashing mechanisms can be suitably abstracted away if you have a tool that understands not just the build system but also the details of flashing different zephyr boards with different backends.

I've never wondered about the value of this type of tighter integration, sorry for any confusion. I don't know much about it but it seems to make a lot of sense to me. The only "integration" question I still have and that you haven't answered yet is very specific and unfortunately getting lost in other, vaguer discussions. The value that is really not obvious to me (yet?) is just the supposed "integration" of:

  1. the new multirepo tool which hasn't shown any obvious sign of being limited to Zephyr yet;
  2. all the Zephyr-specific rest.

Making version control for Zephyr artificially specific to Zephyr actually makes me wonder about potentially negative value, because it means that things like Continuous Integration potentially become more Zephyr-specific too, which means more work for a smaller community. For instance:

We (my company) are also working on some additional tooling for doing automated testing of a multi-repo tree across multiple boards and sample applications...

I do believe that "one stop shop" is not totally as vague as it seems on the surface :). For example, the ability to do things like, say, this: [west --help]

Very interesting! The output of west --help is very clearly split between TWO sections: 1. one for multirepo commands 2. the other section for the Zephyr-specific rest. This tends to show that these two sections could be TWO (not "a variety") separate tools for no usability or user-friendliness difference.

I think there is a reason why docker is a single command for dealing with containers

https://docs.docker.com/engine/reference/commandline/ has a fairly large number of commands but they're all (closely) related to managing images and containers and none of them deals with versioning. "docker history" is a log file and checkpointing system images is very far from versioning code edited by humans.

I would just avoid vague terms like "one-stop shop" and focus instead on actual examples/benefits/features that only a tighter integration between versioning + all the rest can achieve. Providing the references above is great, summarising some of their integration benefits here would be better.

In other words and despite all the digital ink above I've seen no clear rationale yet for thinking too small.

@pfalcon
Contributor

pfalcon commented Jan 29, 2019

the one where I referred to real nights and real week-ends

Here on GitHub I see dozens of people having dozens of cases of not being able to rebase or fix a merge conflict correctly. What does that tell us?

the best summary is this: https://stevebennett.me/2012/02/24/10-things-i-hate-about-git/

This? I thought this: https://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/ , where a Google employee explains the reasoning behind creating ugly tools like "repo", though through all that a confession rings along the lines of "the best thing we've come up with is one giant monorepo where we commit half of the world, flat". Of course, there are still sleepless nights for support staff; that's an unavoidable part of enterprise software development. But at least git submodules can't be blamed.

@mbolivar
Contributor

mbolivar commented Jan 30, 2019

@marc-hb:

  1. the new multirepo tool which hasn't shown any obvious sign of being limited to Zephyr yet;
    [...]
    In other words and despite all the digital ink above I've seen no clear rationale yet for thinking too small.

Code freeze for Zephyr v1.14 LTS is this Friday. We barely squeaked by getting west merged into master with a bunch of Zephyr-specific assumptions baked into it. We will be supporting LTS for over a year and west missing this deadline would have been a Big Problem.

I'm all for generalizing the multi-repo pieces of it when we have time, but we just don't right now. And the haters implying we're too dumb to realize we should have been using submodules all along, or are afraid of git or something, may well be right. And there really are a variety of things in west which are zephyr specific.

  • the default manifest URL for west init is https://github.com/zephyrproject-rtos/zephyr
  • the default URL for the west that gets cloned into an installation is https://github.com/zephyrproject-rtos/west
  • the tool looks for a project in the manifest whose path in the installation is "zephyr" and sets the ZEPHYR_BASE environment variable to that repository's absolute path for the duration of the call, so that the CMake invocations performed by west build and other commands will Just Work, keeping pervasive build system assumptions from having to change without requiring users to set ZEPHYR_BASE themselves (which we have learned in user testing is too hard for big classes of people, like Windows users, i.e. the vast majority of embedded developers)
  • the west package provides a variety of Python modules that individual west extension commands (about which more below) will rely on, and these modules have all sorts of Zephyr-specific things inside, like details about the build system's usage of the CMake cache.
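
The ZEPHYR_BASE behaviour described in the list above can be sketched roughly as follows. This is only an illustration and not west's actual code: the function name, its parameters, and the flat list of project paths are all assumptions.

```python
import os


def env_with_zephyr_base(project_paths, topdir, environ=None):
    """Return a copy of the environment with ZEPHYR_BASE set.

    project_paths: installation-relative project paths taken from the
        manifest (hypothetical, simplified input).
    topdir: root directory of the installation.
    environ: base environment; defaults to the current process environment.

    The caller's environment is never modified, so the variable only
    exists for whatever child process receives the returned mapping.
    """
    env = dict(os.environ if environ is None else environ)
    for path in project_paths:
        if path == "zephyr":
            env["ZEPHYR_BASE"] = os.path.abspath(os.path.join(topdir, path))
            break
    return env
```

A build command could then be launched with something like `subprocess.run(cmake_args, env=env_with_zephyr_base(paths, topdir))`, where `cmake_args`, `paths` and `topdir` are placeholders, so the user never has to export ZEPHYR_BASE in their shell.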

Could we have cleanly separated these things into their own Zephyr-specific and generic portions, and uploaded the non-generic portions to PyPI in their own clean zephyr_rtos package, which would have imported the generic west portions appropriately? Definitely! And then we would have missed the LTS deadline, which has already been extended by several months compared to our usual release cadence :).

To all the requests for a generic tool, I think the only real response is "yes, we know, we'd like that too, all in good time if it's appropriate".

I hope the above helps clarify why.

The output of west --help is very clearly split between TWO sections: 1. one for multirepo commands 2. the other section for the Zephyr-specific rest. This tends to show that these two sections could be TWO (not "a variety") separate tools for no usability or user-friendliness difference.

There isn't any documentation up for this yet (we're working on it) but the big idea here is that any repository can provide west extension commands and they will show up here. For details, see the comments in our manifest file pykwalify schema describing each project entry:

https://github.com/zephyrproject-rtos/west/blob/master/src/west/manifest-schema.yml#L73

And the related schema for the files which each project which declares extension commands in the manifest must then provide:

https://github.com/zephyrproject-rtos/west/blob/master/src/west/commands/west-commands-schema.yml

So this is in fact arbitrarily many extension commands from arbitrarily many repositories, as determined by the corporate user, SoC vendor, Zephyr downstream distributor, individual hacker, other random upstream project in the manifest (like net-tools today), etc.

They will all show up in a unified way in the west -h output as a result of parsing the manifest file, which can be in any repository -- just pass the repository URL to west init -m <URL> and hey presto you've got a custom Zephyr derivative with your own west commands inside that are all discoverable by any user who is familiar with the upstream tool without having to open your documentation at all. Their implementations can all rely on some common infrastructure in the west package. Now, this is not 1.0 software and we've definitely taken on tech debt trying to get there that will require some refactoring and a bit of interface breakage to clean up eventually, but it is an actual generic mechanism for doing Zephyr development, not an arbitrary confluence of exactly two unrelated things.
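
As a rough sketch of the discovery idea, the following Python gathers the commands declared by each project and formats them into one help listing. The data shapes are simplified assumptions rather than the real manifest or west-commands schema, the `vendor-sdk` project and all command names are hypothetical, and the "first declaration wins" rule for name clashes is a choice made for this sketch, not documented west behaviour.

```python
def collect_extension_commands(projects):
    """Map each command name to (project name, help text).

    `projects` is a list of dicts, each with a "name" key and an optional
    "commands" list of {"name": ..., "help": ...} dicts. This layout is
    hypothetical.
    """
    commands = {}
    for project in projects:
        for cmd in project.get("commands", []):
            name = cmd["name"]
            if name in commands:
                continue  # sketch-only rule: the first declaration wins
            commands[name] = (project["name"], cmd.get("help", ""))
    return commands


def format_help(commands):
    """Render the collected commands as lines of a single help listing."""
    lines = []
    for name in sorted(commands):
        project, help_text = commands[name]
        lines.append(f"  {name:<12} {help_text} (from {project})")
    return "\n".join(lines)


projects = [
    {"name": "net-tools",
     "commands": [{"name": "net-setup", "help": "prepare host networking"}]},
    {"name": "vendor-sdk",
     "commands": [{"name": "vendor-sign", "help": "sign a firmware image"}]},
    {"name": "zephyr"},  # a project that declares no commands
]
print(format_help(collect_extension_commands(projects)))
```

Because the listing is derived from the manifest, swapping in a different manifest repository changes the set of available commands without any change to the tool itself.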

We've been getting this "it's just two sets of commands" since the beginning, but it's really not, promise.

Hopefully the above starts to make clear why.

@marc-hb
Collaborator

marc-hb commented Jan 31, 2019

I'm all for generalizing the multi-repo pieces of it when we have time, but we just don't right now.

Thanks! This doesn't match the initial impression(s) the current documentation gave me and probably others.
There's actually a ton of useful information that you and others just shared in this thread that should really be rolled back into the documentation in some shape or form when you can find the time.

On one hand I feel somewhat guilty we just pressured the west developers to justify and share all this while you were working hard on a still evolving implementation and trying to make a deadline. On the other hand I'm happy all these rationales and information were elaborated and shared here because not just the documentation about "how" but also the "why" is extremely important for features affecting the top-level user interface and workflows that much.

I hope you wish us luck!

You bet!

@mbolivar
Contributor

There's actually a ton of useful information that you and others just shared in this thread that should really be rolled back into the documentation in some shape or form when you can find the time.

Absolutely. I am very committed to making that happen for v1.14. Just needed to get the code in first -- zephyr's release management policies make it possible to write documentation after code freeze, so there's a lot of documentation still to be written that unfortunately is just in the heads and meeting minutes of the people that have been working on this.

You bet!

Thanks! I am glad this thread has been useful so far. I will CC you on any documentation PRs. Your feedback has been extremely helpful and I hope you will have time to look at the docs patches too.

@nashif nashif removed their assignment Feb 13, 2019
@mbolivar
Contributor

We've released v1.14 with west, so I'm closing this.
