Dark Horse Package Management Components: DPM and DON

After a short contest to find a name, there’s a definitive winner.

The winner is Marc Wolf, of Longview, Washington — who suggested DPM, standing for Dark Horse Package Manager, as both a tool and a format name.

As to the network repository interaction component that wraps it, this will be called DON, standing for DPM over Network.

So, for example, to install a package, envision:

dpm -iv ./Audacity-3.7.1-2.x86_64.dpm

Or, if you wanted to pull it from a repository:

don install Audacity

As you may have noticed, the interfaces will look immediately familiar to anyone used to enterprise environments.

I’d like to give special thanks to everyone who participated in the naming contest. There was a deluge of submissions to consider, and honestly, most of them were great, viable product names, so it was hard to choose!

In terms of next steps, I feel that making it distro-agnostic is the right approach and is consistent with the development process of everything else built so far, so I will launch a separate project site for DPM and DON.

It will eventually include utilities to:

  • load the RPMDB to DPM-native stores to take over as package manager for RPM-based distributions
  • convert RPMs to DPM format
  • load the /var/lib/dpkg tree to DPM-native stores to take over as package manager for DEB-based distributions
  • convert DEBs to DPM format
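
To make that list concrete, here is the rough kind of usage I have in mind. Every command name below is a placeholder for illustration, not a final interface:

# hypothetical: adopt an existing RPM-based system by importing its RPMDB
dpm-import rpmdb

# hypothetical: convert a single RPM to DPM format
dpm-convert ./Audacity-3.7.1-2.x86_64.rpm

# hypothetical: adopt an existing DEB-based system from its dpkg tree
dpm-import dpkg /var/lib/dpkg

# hypothetical: convert a single DEB to DPM format
dpm-convert ./audacity_3.7.1-2_amd64.deb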

I might include other package manager database conversion utilities later.

I feel that this approach will give DPM/DON full system lifecycle capability in a context far broader than that of a small package manager, without my having to spend a great deal more energy on building it.

A New Package Manager

It’s been too long!

Another round of development on Dark Horse Linux is about to take place.

Beyond just a refresh of Pyrois, and consequently Dyad, package management will be introduced.

I thought long and hard about this, and even changed my mind a few times, but Dark Horse Linux will not use RPM for its native package manager as originally intended. It will instead use its own native package manager, and will have a utility that can convert RPM packages (and a few other package formats) to this native package manager’s format.

RPM hits all the requirements, but it has too many legacy features and introduces a lot of complexity to what should be a very simple thing. It has many design decisions baked deeply into it that would certainly have been “one way to do it” a long time ago, but even then we could have done much better with a lot less complexity. To top that off, other mainstream package managers are rather derivative, as opposed to being the product of a functionality need, and so borrowed a lot of those concepts into their implementations as well. New package managers not tied to a distribution have a different problem: integration with the core system packages for whole-system lifecycle management of packages on the OS. They lead to configuration management issues, unnecessary system resource overhead, and incompatibilities that are needlessly complex to troubleshoot, at the cost of long-term reliability in a lot of cases. We can just do so much better.

It’s easy to say “Well, RPM can do a lot”, and it can, but it shouldn’t. All of the things that RPM can do can be done better by returning to the UNIX design philosophy, with dedicated-purpose components for each type of thing. We won’t lose anything this way, and we stand to do better than RedHat and RedHat derivatives if Dark Horse goes its own way with package management.

Designing a package manager is a daunting task. There’s a lot that goes into it because we have overloaded what a package manager is. If you look at the opening of the Wikipedia page for “Package manager”, you’ll see it’s a pretty overloaded term there, too:

A package manager or package-management system is a collection of software tools that automates the process of installing, upgrading, configuring, and removing computer programs for a computer in a consistent manner.[1]

Truthfully, it doesn’t have to be that big to be secure and usable for everything consumers (and enterprises) need a package manager to do today. Essentially, it just needs to manage the lifecycle of versioned collections of files on the system. That’s it.

If you look at what RPM is doing, or if you’re already familiar with RedHat systems, you’ll see that creating an RPM often involves compiling the software as part of the rpmbuild process, and rpmbuild itself dedicates a great deal of its features to navigating the arduous process of compiling software. Even without compiling, building the package is like navigating a complex maze.

This is an artifact from an earlier time, when users were regularly compiling software and wanted to reduce or automate their compile burden, so the whole product lifecycle got bundled into that process. The user profile has since changed to a base that doesn’t do quite so much of that. The developer profile changed too: RPM is such an involved drag to work with that modern users who write software outright refuse to use it at all, even for major products. This happens in both enterprise and FOSS projects. So, what we’ve all ended up with is a huge ecosystem of software whose install process is more often than not just spraying files onto servers like animals, or some big bloated Ansible job that eats everybody’s time to maintain and is prone to failures, or the use of super shitty third party package management systems like snap, or flatpak, or any of the other absolute train wrecks developers are writing to get around just packaging their software. They can compile their software just fine; their barrier is the complexity that the native package managers of even mainstream distributions introduce, and it’s a lot of unnecessary work.

If you want an example, the Signal project, which actually wants to package its software properly due to the security implications, has been arguing about the right approach to building RPMs for four years and counting as of this writing. If you want another project stunted by this issue, there’s Gitea, which is in the same boat. This is hurting those projects. Sure enough, flatpak and snap got packages for those projects first, because they don’t create unnecessary complexity barriers to format adoption, even if they are horrid, awful package management systems to use.

At the end of the day, RPM is a /good/ package manager compared to what currently exists, and I’m probably going to have some pieces that resemble what it currently does, done in much simpler ways than RPM does them.

For example, while RPM installs the package on the OS, a separate utility, yum/dnf, wraps around it to pull from repos. That’s a good way to keep the purposes of components separated. DNF/YUM is pretty solid. I have some complaints about its repo management aspects, but it’s good.

Then they added a bunch of cruft to that too, with plugins and other things that don’t apply to the overwhelming majority of use cases but get used just enough to create a ridiculous spread in developer practice for building packages, which really should be very straightforward and boringly uniform.

Then, at some point, licensing was introduced as yet another barrier. To me, this is purpose-defeating for a Linux distribution. Even if you sell support and subscriptions, the OS itself should be freely available. Any DRM you bake into it is going to deviate from the F/OSS approach and introduce fragility and trust issues. The way you make money with it is by selling something of value to the people using it: support, add-ons, etc.

So, the Dark Horse Linux package manager, yet to be given a name, will operate in a manner that assumes that the software is already compiled. It won’t care how you compiled your software and it expects you to have already done so. Let’s call it “build system agnostic”.

The package creation component will consist of some files that identify which files are in the package, their checksums, and whether they are “controlled by the package” or “not controlled by the package” (such as configuration files that should be left alone during an update). It will have an option for signing and there will be a package manifest for integrity validation. It will do dependency resolution.

There’s obviously an entire cycle that needs to be spent on design, but currently I’m envisioning something like this for the package structure:

package-name-1.0.0.pkg.xz
├── metadata/                    
│   ├── FILES_DIGEST
│   ├── FILES_DIGEST.sig
│   ├── PACKAGE_DIGEST
│   ├── DEPENDENCIES
│   └── contents.sig
└── contents.xz

This contains some elements of slackpkg, as well as RPM. The package itself is just an XZ archive (I may end up using a gzipped tarball instead).
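
For illustration only, here is a minimal sketch of how such a package might be assembled from a staging directory. Using tar underneath the .xz names is an assumption on my part (xz alone only compresses a single file), and none of the tooling below is final:

# build contents.xz from a staging root laid out like the target filesystem
( cd staging-root && tar -cJf ../contents.xz . )

# bundle the metadata directory and contents.xz into the outer package archive
tar -cJf package-name-1.0.0.pkg.xz metadata contents.xz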

At the top level is a directory named metadata. There is also a contents.xz that contains the package’s files laid out in the paths they are intended to occupy on the system, relative to the root filesystem:

contents.xz/
├── etc
│   └── myapp
│       └── conf.d
│           └── main.config
└── usr
    └── bin
        └── myapp

Under metadata, I envision a FILES_DIGEST and a PACKAGE_DIGEST. The FILES_DIGEST would contain something like a line-delimited table:

C $CHECKSUM /usr/bin/myapp
N $CHECKSUM /etc/myapp/conf.d/main.config

Here you can see files marked as “CONTROLLED” or “NOT CONTROLLED”, to indicate whether update and remove operations will replace those files.
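
As a rough sketch of how FILES_DIGEST might be generated at package build time, where sha256 and the “everything under /etc is not controlled” rule are placeholder assumptions rather than decisions:

# walk the staging root and emit one digest line per file
( cd staging-root
  find . -type f | while read -r f; do
      path="${f#.}"                             # ./usr/bin/myapp -> /usr/bin/myapp
      sum=$(sha256sum "$f" | awk '{print $1}')
      case "$path" in
          /etc/*) echo "N $sum $path" ;;        # left alone on update
          *)      echo "C $sum $path" ;;        # owned by the package
      esac
  done ) > metadata/FILES_DIGEST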

And then the package digest would perhaps be the checksum of the concatenated checksums of the files. Then, notably, there are the signature files, and more to flesh out, with the signature aspect being optional.
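
The package digest part could be as small as a one-liner, again assuming sha256:

# PACKAGE_DIGEST: checksum of the concatenated per-file checksums
awk '{printf "%s", $2}' metadata/FILES_DIGEST | sha256sum | awk '{print $1}' > metadata/PACKAGE_DIGEST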

Another piece is dependencies. I think a line-delimited list of rules to check would be sufficient:

glibc > 2.21.0
glibc < 2.42.0
libstdc++ > 0
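
Checking a rule like those doesn’t require much machinery. A naive sketch, using sort -V for version ordering, with an installed version that would in practice come from the package database:

# succeeds if $1 is strictly greater than $2 in version order
version_gt() {
    [ "$1" != "$2" ] && \
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

installed_glibc="2.39"   # placeholder; would be queried from the database
version_gt "$installed_glibc" "2.21.0" && echo "rule 'glibc > 2.21.0' satisfied"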

And then perhaps a few additional files, like AUTHOR, NAME, VERSION, ARCHITECTURE, DESCRIPTION, SOURCE, et al., that we’re all used to seeing.

Next is the database. You do need a database for querying installed packages: their names, descriptions, dependencies, all the things I listed, including which files a package provides, or reverse lookups for which package provides a certain file, or even a check that asks “have any of the controlled files this package installed been modified?” by doing a little checksum comparison and reporting accordingly based on what’s in FILES_DIGEST and PACKAGE_DIGEST, with perhaps some additional cryptographic mechanism to ensure those haven’t been tampered with (this is an example of where I’m going with it, and it has issues to work out).
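
For the controlled-files check specifically, a minimal sketch, where the path layout follows the speculative /var/lib tree described below and sha256 remains an assumption:

# "have any of the controlled files this package installed been modified?"
meta="/var/lib/pkgmgr/packages/myapp/FILES_DIGEST"   # placeholder path, name TBD
while read -r flag sum path; do
    [ "$flag" = "C" ] || continue                    # skip NOT-CONTROLLED files
    printf '%s  %s\n' "$sum" "$path"
done < "$meta" | sha256sum -c --quiet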

The problem with a package database is that you can hose the system pretty good if you rely on the database as a single point of failure. Even RPM suffers from this: if the rpmdb is wiped, you can recreate it, but it’s not really “all the way repaired” in a lot of cases.

So, maybe something like a directory structure of the objects in metadata, moved to, say, /var/lib/${package_manager_name}/packages/${package_name}.

And then the “database” that actually gets worked with would be, say, an SQLite database generated from that tree on demand, so that the database itself is just a caching mechanism to improve query performance on lookups. It can be deleted at any time if there is an issue, because the next run will simply rebuild it from the metadata directory tree.
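
A minimal sketch of that rebuild, with a made-up schema and the same placeholder paths as above:

# regenerate the disposable query cache from the on-disk metadata tree
db="/var/lib/pkgmgr/cache.sqlite"                    # placeholder path, name TBD
rm -f "$db"
sqlite3 "$db" 'CREATE TABLE packages (name TEXT, version TEXT);'
for d in /var/lib/pkgmgr/packages/*/; do
    name="$(basename "$d")"
    version="$(cat "$d/VERSION" 2>/dev/null)"        # assumes a VERSION metadata file
    sqlite3 "$db" "INSERT INTO packages VALUES ('$name', '$version');"
done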

One thing that I have not accounted for yet is post-operation scripting, such as “restart this service after updating this package” or “apply these tuning parameters to such-and-such after it’s installed”. I may leave this open-ended and just have post-action hooks declarable in files: as a sibling to “contents.xz” and “metadata”, perhaps a directory named “HOOKS”, in which the package manager would check for the existence of reserved filenames, such as:

HOOKS/
├── PRE-INSTALL
├── PRE-INSTALL_ROLLBACK
├── POST-INSTALL
├── POST-INSTALL_ROLLBACK
├── PRE-UPDATE
├── PRE-UPDATE_ROLLBACK
├── POST-UPDATE
├── POST-UPDATE_ROLLBACK
├── PRE-REMOVE
├── PRE-REMOVE_ROLLBACK
├── POST-REMOVE
└── POST-REMOVE_ROLLBACK

And these would just be optionally placed shell scripts that do whatever the software needs for those operations. This way we’re agnostic to the code that’s actually running, without hopping on the runaway complexity train.
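
For example, a package that ships a service might drop something like this in as HOOKS/POST-INSTALL, assuming a systemd host purely for the sake of illustration:

#!/bin/sh
# HOOKS/POST-INSTALL: runs after the package's files are placed on the system
set -e
systemctl daemon-reload
systemctl enable --now myapp.service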

Truthfully, this is a naive approach, because software developers do some terribly stupid and destructive things, to the point that implementing an entire command framework to limit what these hooks can do is probably the only thing that would truly protect users from them. They do things like “the package just installs the vendor’s repo on your machine, and then it downloads from that repo and pollutes your machine with garbage, when all you told it to do was install a software package”. Slack does this. Their engineering department is aware of it, and they refuse to fix it. After a certain point of complaint, instead of fixing it, they dropped support for most distros. So maybe some sanity checks would be appropriate there, because software engineering teams don’t care about the system.

And then, obviously, there is a component that wraps all of this and facilitates repository interaction to fetch packages remotely, much like DNF/YUM does for RPM, but much simpler. Repos should be file-based or accessible via HTTPD, and will rely on reserved directory names for repository metadata, much like DNF/YUM does. This is not a complex piece to implement.
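
As a sketch, where every name below is a placeholder rather than a settled convention, a repository could be as simple as a directory tree served over HTTPD:

repo/
├── REPO_METADATA/                 (reserved directory name, placeholder)
│   ├── PACKAGE_INDEX              (names, versions, dependencies, checksums)
│   └── PACKAGE_INDEX.sig
└── packages/
    └── package-name-1.0.0.pkg.xz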

What I’ve described here is just a stub of the design, and I’m sure it’ll pan out to become wildly different, but, this is where my brain is headed with it. This post is less about what the final design will be and more about the fact that yes, Dark Horse is still moving, and yes, it’s going to have its own package manager.

I will need a name. If someone emails me name suggestions at chris.punches@darkhorselinux.org and their suggestion is used, they will receive a USB stick preloaded with Dark Horse Linux at the next release that contains the package manager.

Cyrios is now Dyad

Cyrios was a placeholder name for the new fork of Pyrois.

Pyrois will be responsible for generating the early builds of Dark Horse Linux and its source-based system compilation.

Dyad, previously called Cyrios, will be responsible for providing step-by-step automation of a variant of LFS, to better enable those who wish to create their own Linux distributions not beholden to existing interests.

Cyrios: Forking Pyrois

After a great deal of thought, I’ve decided to fork Pyrois. The new project is called Cyrios.

Cyrios will be kept under the SILO GROUP umbrella and not the Dark Horse Linux Project space. Its source code is available at:

https://source.silogroup.org/SILO-GROUP/Cyrios

I will at a later date set up a downstream mirror on GitHub.

Pyrois will continue to be part of the Dark Horse Linux Project and will remain where it is, but will be specialized to produce Dark Horse Linux images.

For those of you who are visual thinkers, diagrams of the current and the future state of the project structure accompany this post.

The reason why I’ve decided to fork my own project is a bit nuanced, but in essence, the goal of Pyrois is both important to me and, at the same time, a limit on its utility to the Dark Horse Linux Project. The fork solves both angles.

Here are the highlights:

  • Pyrois was successful in providing a way to build a non-specialized ISO image that someone can extend into a new Linux distribution image of just about any configuration imaginable. It’s supposed to be a “generic distribution factory” that doesn’t make too many assumptions about what you want to build beyond a standard source-based system cross-compiled from raw sources. While there are many projects that will produce a Linux image, I’m not aware of any that do not produce a highly specialized system beholden to upstream project interests with large sponsors behind them.
  • Pyrois is currently used by Dark Horse Linux to create its images. Now that Dark Horse has reached a certain level of maturity in planning, I need to extend Pyrois to be specialized for generating the Dark Horse Linux image, and specifically an installer ISO.
  • I didn’t want to destroy Pyrois’ current facilitation of distribution genesis by just specializing it to DHLP, so I’m forking it to a new project called Cyrios to serve that function.
  • Cyrios won’t be able to get as many resources as Pyrois, so, I’m expecting it to largely stay out of date but to provide a base on which others can build by refreshing it using the LFS documentation.
  • This will pave the way for a more developed DHLP image and eventually an installer image.

Reduced ISO Size and Improved Build Safety

I was able to reduce the generated livecd ISO image from its eyesore of 8GB down to just under 2GB, putting the ISO size for the Dark Horse Linux livecd on par with other distributions.

You can grab the latest image here.

The issue was ultimately an overzealous effort to make the image “provable” from a security perspective. The debuginfo in the compiled binaries of the system, particularly the kernel modules and shared libraries, accounted for about 6GB of the storage. Stripping that debuginfo made up almost all of the size reduction. A 75% size reduction is really quite worth it.
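
For the curious, the bulk of that reduction comes from a pass along these lines. This is a sketch with illustrative paths, not the literal code in the build scripts:

# strip debug info from shared libraries and kernel modules in the sysroot
SYSROOT="/path/to/sysroot"
find "${SYSROOT}/usr" -type f \( -name '*.so*' -o -name '*.ko' \) \
    -exec strip --strip-debug {} + 2>/dev/null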

This version also has some repaired filepaths, as well as additional safety in the scripts that generate the image.

You’ll probably see in the git logs that there are a bunch of fairly recent additions of set -u in the bash executables that Rex kicks off. This tells bash to treat any reference to an unset variable as an error and exit immediately. That way, if you do the admittedly dumb rm -Rf ${unset_path}/${also_unset_path}, bash will refuse to execute that line and will fail the Rex project run due to a non-zero exit code. Obviously this is a stop-gap measure until I can do a full cycle focused on code safety rewrites.
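
To make that concrete, a contrived example:

#!/usr/bin/env bash
set -u   # any reference to an unset variable is now a fatal error

# ${staging_dir} was never assigned, so instead of expanding to "" and
# running "rm -Rf /build", bash aborts here with a non-zero exit code
rm -Rf "${staging_dir}/build"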

Next (besides some more cleanup on the Pyrois codebase) is the introduction of RPM and presumably DNF, followed by some formation of what an installer ISO will look like.

It would potentially be a turning point for this project, as it could be the point at which the distro moves from source-based to precompiled, binary-package-based, depending on how much infrastructure gets introduced during the RPM and DNF build/research work.

Make sure to subscribe to the new mailing list for updates as well.

Email, Mailing lists and More

As part of the productization effort for Dark Horse Linux, one of the things that needed to be done was getting Dark Horse Linux resources off the SILO GROUP domain.

One of those resources was email. Another was mailing lists.

So, I’m bringing mailing lists back. If you’re looking for how to contribute, the dhlp-contributors mailing list is a great place to ask questions and find out about the project:

https://lists.darkhorselinux.org/mailman/listinfo/dhlp-contributors

If you want to contact someone personally, I can be reached at:

chris (dot) punches (at) darkhorselinux (dot) org

As always, I’m eager to bring more people onto the project, so if you know of any potential volunteers, please feel free to point them at that mailing list or to me directly.

I think I may be running out of excuses not to focus on the installer and on bringing in RPM.

New Website

As a result of community feedback, I have created a new website for Dark Horse Linux, and it is in production now.

I am lukewarm on the results, but it does meet the intended purpose for now.

The documentation system is an acknowledged gap, and an eyesore, and I’m not sure how I’ll approach it yet.

I’m not 100% convinced I should be writing the documentation myself so much as overseeing community contributions to it and ensuring the content is accurate.

New LiveCD ISO

Within the next hour or so, I’ll be uploading a new livecd to the downloads section of the site. It is a rather clean and faithful production of an LFS system, with overlayfs added as a dracut module so that the livecd can actually be used as a system with a writable root filesystem.

Changes won’t be persistent, but, this is a much better proof of work than what was up before it.

At this point I do encourage people to try it: download it, fire it up in an emulator, and provide feedback. I’d be thrilled to hear that someone got it running on a physical machine.

While cool in and of itself, this paves the way for transitioning the image to an installer ISO, where Pyrois copies the sysroot to a subdirectory of itself to put onto a target machine, which then boots from local disk after some basic configuration prompts.

The ISO image file size is still an issue at 8GB. I’ll have to depart from LFS to fix that. So, this might be the “goodbye LFS” post I’ve been building up to. Not sure yet.

Progress Update

I am excited to announce a milestone in the Dark Horse Linux project.

The code in the repo has finished creating the initial target system, including the kernel compilation using the default Fedora Linux kernel config.

There is one exception, the fstab file, which has a chicken-and-egg problem to resolve, but I consider this minor; it can be resolved during the ISO file generation, which is the next piece.

I should be able to use grub-mkrescue, or genisoimage/xorriso, or some combination of these, with a thought-out initial ram disk to complete that part.
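
For reference, grub-mkrescue gets most of the way there on its own; a minimal sketch, assuming a prepared directory tree:

# produce a bootable ISO from a staged tree; grub-mkrescue drives xorriso
grub-mkrescue -o darkhorse.iso isoroot/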

Once complete, I’ll start polishing up the Pyrois codebase, branding a fork of it as ALFS-NG, and tagging and releasing that for now.

For the long term, the `ALFS-NG` spinoff is just a side project that got spawned by this effort. I haven’t decided if this is something I want to give to the LFS folks to revive/modernize their seemingly abandoned ALFS project, or if it should remain wholly separate due to the Darcy infection. I have a desire to “play well with others”, but, that can’t occur in a vacuum, and I need to be certain it stays shielded from Canonical based on previous behaviours.

On the other hand, it’s likely to go very stale if I keep it under my thumb, as keeping it up to date beyond serving as an instructional device is not a priority for me. I suppose that raises the consideration that they’ve already abandoned that project once, so it may not be a priority for them either. They likely wouldn’t want it.

I will then want to decide whether I want to introduce librpm for the next phase, or whether I want to create an installer disk, presumably with something like `anaconda`, though I am hesitant to introduce an upstream dependency on the RH ecosystem, given the goals of the DHLP project.

I will want this system to be very familiar to RH-based distro users for many reasons, but all cooperation with external projects must be a voluntary effort that can be easily severed for this mission to work. This is, after all, a contagion firewall. I’m already going the librpm route, which is maintained by RH, but many distros use librpm. Not many use anaconda, so their maneuverability is different with that piece. Not that I think anything would happen there, but it’s a risk surface that needs guarding.

I suppose I’ve answered my own question by thinking it out: anaconda is a package-based distro installer, so if I’m going the librpm route, I need to do that first, before anaconda.

I suppose one approach might be to build my own basic installer as a modification of the sysroot I’ve created, one that just copies the unmodified sysroot to a target system after the user selects and mounts all their target mounts. That’s a thought: copy the sysroot off when the build is done, then engage in a sixth stage that turns one of the copies into an installer that puts the intact copy onto the target system.
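
The copy step itself is the easy part. A minimal sketch, assuming the user’s target filesystems are already mounted under /mnt/target:

# copy the intact sysroot onto the mounted target, preserving ownership,
# hard links, ACLs, and extended attributes
rsync -aHAX --numeric-ids /installer/sysroot/ /mnt/target/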

I may do that first to move forward on a known working path, and then do my reductionist cross-reference with other similar projects to synthesize a new process. It’s a bit of a longer path, but it’s one that can provide consistent progress.

Of course, things are always subject to pivot based on what I learn as I go through this. I have no experience with this aspect of things, so, I’m certainly open to ideas.