
SLotD: Serialise/Deserialise is Copy

Software Lesson of the Day for 2/12/2022. Rather than waste all that time writing a perfect universal .NET object copy library that copes with every possible variation of nested reference types, value types, fields, properties, statics, and twenty-plus years of special cases and regret and technical debt baked into the framework… just serialise your damn source object to JSON and deserialise it back out again. And then be on your way.
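As a minimal sketch of that round trip, assuming System.Text.Json (the post doesn't name a serialiser) and two made-up types, Person and Address, that exist only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical example types, not from the original post.
public class Address
{
    public string City { get; set; } = "";
}

public class Person
{
    public string Name { get; set; } = "";
    public Address Home { get; set; } = new Address();
    public List<string> Tags { get; set; } = new List<string>();
}

public static class JsonCopy
{
    // Deep copy by serialising to a JSON string and reading it straight back.
    public static T DeepCopy<T>(T source) =>
        JsonSerializer.Deserialize<T>(JsonSerializer.Serialize(source))!;
}

public static class Program
{
    public static void Main()
    {
        var original = new Person
        {
            Name = "Ada",
            Home = new Address { City = "London" },
            Tags = { "first" }
        };

        var copy = JsonCopy.DeepCopy(original);

        // Mutating the copy's nested objects leaves the original alone.
        copy.Home.City = "Paris";
        copy.Tags.Add("second");

        Console.WriteLine(original.Home.City);   // London
        Console.WriteLine(original.Tags.Count);  // 1
    }
}
```

The usual caveats apply: with its default settings System.Text.Json only round-trips public properties (not fields), loses the runtime type of anything declared as a base type, and throws on reference cycles. For plain data objects that is often good enough.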

SLotD: Premature Abstraction

Software Lesson of the Day for 2022-10-04: Note to self. If you’re struggling to create a generic, reusable, well-factored abstraction to implement a simple, application-specific piece of functionality – then it may just be that instead you can get away with a couple of simple data fields and a static function or two. Remember this for the next time.


(Let’s see if I can remember to do this more often…)

Adventures in Yak Shaving with System.CommandLine

Over the last few months I’ve been wasting odd moments of free time by tinkering with some code to extract pictures from a Google Takeout archive. The idea is to use the JSON metadata in the archive to restore the image timestamps (which Google removes from the embedded image metadata – for reasons best known to itself), rationalise the file naming, separate edits and originals, etc. The ultimate aim is to be able to grab a Takeout, extract and locally archive all the images from some period of time (say, the last year) so that I can then manually remove those images from my Google Photos collection. And the aim of that is to reduce my exposure to Google arbitrarily closing my account and consequently deleting my pics. And because I’m old fashioned enough to distrust 100% reliance on “the cloud”.

So I wrote some code and got it working as a C# .NET 6 command-line app. It’s a bit rough but it does what I need.

And then I had the genius idea of restructuring it as a set of providers that could be used to extract all the other stuff that you might find in a Takeout archive: contacts, emails, whatever. And of course this would need command-line options that apply to each provider, to allow the output to be customised. Which requires a way of grouping those options – basically I needed the idea of “commands” that delimit groups of options and correspond to the different types of media in the archive. I also needed some options that are global and not associated with a command – for input and output directories, for example. At this point my old CommandLineParser class that I’ve been dropping into console apps for the last decade or so was not going to cut it.

So I did some reading and decided to try System.CommandLine – the shiny new way to parse command line parameters. This is still in beta but my initial impression was favourable. Basically, you create an object model of your command-line syntax, hook it up to handlers, and let the library do the grunt work of parsing the command-line into values, handling errors, automatically generating help text (particularly impressive), and lots of other stuff.

Here’s a little test app that I made:

    public static int Main(string[] args)
    {
        // audio command
        var thresholdOpt = new Option<int>("--threshold");
        var scaleOpt = new Option<double>("--scale");
        var audioCommand = new Command("audio") { thresholdOpt, scaleOpt };
        audioCommand.SetHandler(
            (int threshold, double scale) => { Console.WriteLine($"threshold={threshold}, scale={scale}"); },
            thresholdOpt, scaleOpt);

        // video command
        var monochromeOpt = new Option<bool>("--mono", description: "Monochrome");
        var colourOpt = new Option<bool>("--colour");
        var brightnessOpt = new Option<int>("--brightness");
        var videoCommand = new Command("video") { monochromeOpt, colourOpt, brightnessOpt };
        videoCommand.SetHandler(
            (bool mono, bool colour, int brightness) => { Console.WriteLine($"mono={mono}, colour={colour}, brightness={brightness}"); },
            monochromeOpt, colourOpt, brightnessOpt);

        // root command
        var infileOpt = new Option<FileInfo>("--i");
        var outfileOpt = new Option<FileInfo>("--o");
        var rootCommand = new RootCommand("test");
        rootCommand.AddOption(infileOpt);
        rootCommand.AddOption(outfileOpt);
        rootCommand.AddCommand(audioCommand);
        rootCommand.AddCommand(videoCommand);
        rootCommand.SetHandler(
            (FileInfo infile, FileInfo outfile) => { Console.WriteLine($"i={infile}, o={outfile}"); },
            infileOpt, outfileOpt);

        return rootCommand.Invoke(args);
    }

This implements the commands for an entirely fictitious test program that might be invoked with arguments like:

    test --i "input.dat" --o "output.dat" audio --threshold 42 --scale 3.14 video --mono --brightness 60

Hopefully the similarity to my Takeout extractor should be obvious.

I was initially a bit mystified by the use of lambdas as “handlers” that are passed the values of various options. This meant that there was no single place in the code where everything about the parse was “known”. I didn’t know why it was like that but I thought I could work around it.

The first difficulty I encountered was that, while it is possible to associate options with the root command and also with its subcommands (which have their own options), only the first command is ever parsed. So if I include the audio command then the video command is ignored. Also, if any command is included in the args array then options associated with the root command itself (e.g. --i and --o) are not parsed. Clearly I was either not understanding something, or I wasn’t using it in the way that it was designed to be used. I opened an issue on GitHub and fairly quickly got confirmation that it was the latter.

There was, however, cause for hope: I could split the command-line at command-token boundaries and parse each subset of arguments separately. Since RootCommand.Invoke() is actually an extension method (more of this below) I wrote a new extension method to do this:

    public static int InvokeMultiCommand(
        this RootCommand command, 
        string[] args)
    {
        var commands = new List<Command>() { command };
        commands.AddRange(command.Subcommands);
        foreach (var seg in SegmentArgs(args, commands.ToArray()))
        {
            var exitCode = command.Invoke(seg);
            if (exitCode != 0)
                return exitCode;
        }
        return 0;
    }

SegmentArgs() does the job of chopping up the string[] arguments array into a string[][].
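The post doesn't show SegmentArgs() itself, so what follows is only a guess at its shape, not the original code. This sketch takes the command names as plain strings (an assumption that keeps it free of any System.CommandLine dependency; with the library they could presumably be read from each Command) and starts a new segment whenever it meets one of them:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ArgSegmenter
{
    // Hypothetical stand-in for SegmentArgs(): splits args into runs,
    // beginning a new run at every token that matches a command name.
    public static string[][] SegmentArgs(string[] args, IEnumerable<string> commandNames)
    {
        var names = new HashSet<string>(commandNames, StringComparer.Ordinal);
        var segments = new List<List<string>> { new List<string>() };

        foreach (var arg in args)
        {
            if (names.Contains(arg))
                segments.Add(new List<string>());   // command token opens a new segment
            segments[segments.Count - 1].Add(arg);
        }

        // Drop the leading segment if there were no root-level options.
        return segments.Where(s => s.Count > 0).Select(s => s.ToArray()).ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        var args = new[]
        {
            "--i", "input.dat", "--o", "output.dat",
            "audio", "--threshold", "42", "--scale", "3.14",
            "video", "--mono", "--brightness", "60"
        };

        foreach (var seg in ArgSegmenter.SegmentArgs(args, new[] { "audio", "video" }))
            Console.WriteLine(string.Join(" ", seg));
        // --i input.dat --o output.dat
        // audio --threshold 42 --scale 3.14
        // video --mono --brightness 60
    }
}
```

One weakness of splitting on bare tokens like this: an option value that happens to equal a command name (say, an output file called "audio") would be mistaken for the start of a new segment.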

With that working, I looked at how to customise the help output to include all commands and their options. As it stood, invoking the app with the --help option gave the following:

I needed descriptions for all the commands, and also for their options to be listed.

After reading the documentation for help customisation, and digging into how the help is generated, I realised that what I’d done so far was the easy bit. The library provides a CommandLineBuilder class, instances of which can be wired to lambdas that customise how it generates help text. But having done that, the CommandLineBuilder instance is responsible for doing the parse via its Invoke() method, not the root command. And there didn’t seem to be a way to make this compatible with the code I’d already written: I wanted to parse commands separately but have help generation that was aware of the syntax of all commands. There seemed to be a fundamental mismatch.

I tried extending CommandLineBuilder by the deeply unfashionable approach of sub-classing, but its Build() method (which generates a Parser object to actually do the parse) isn’t virtual so I couldn’t override it. And many of its key methods are implemented as extension methods, so I couldn’t override them either.

I tried wrapping CommandLineBuilder instead, but I found that I was having to wrap more and more of its functionality. And because CommandLineBuilder is injected as a dependency at various points, my non-overriding extension methods weren’t being called anyway.
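The underlying language behaviour can be shown without the library at all. This is a generic illustration using invented stand-in classes (Builder, MyBuilder and Library are hypothetical, not System.CommandLine types): a non-virtual method can only be hidden, not overridden, and an extension method never wins over an instance method with the same signature, so code that holds a reference typed as the base class keeps calling the original.

```csharp
using System;

// Stand-in for a library class whose key method is not virtual.
public class Builder
{
    public string Build() => "library Build";
}

// Subclass attempt: 'new' hides the method, it does not override it.
public class MyBuilder : Builder
{
    public new string Build() => "my Build";
}

public static class BuilderExtensions
{
    // Same signature as the instance method, so b.Build() never binds to this.
    public static string Build(this Builder b) => "extension Build";
}

public static class Library
{
    // Stand-in for library code that receives the builder as its base type.
    public static string UseBuilder(Builder b) => b.Build();
}

public static class Program
{
    public static void Main()
    {
        // Called directly on the subclass type, the hiding method is used...
        Console.WriteLine(new MyBuilder().Build());                 // my Build

        // ...but through the base type the original always runs.
        Console.WriteLine(Library.UseBuilder(new MyBuilder()));     // library Build
    }
}
```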

So I gave up shaving the yak. At the top of my stack of requirements, I just wanted to archive photos. At the bottom of the stack I was hacking on a command-line parsing library to extend it in an unusual way. It was an interesting exercise, but I was wasting time. It’s always good to know when to give up and pop the stack.

Meditation

I used to say that I sell my time for money. But recently it increasingly feels like I sell my life for money.

Go and look at a tree stump. Count the rings. Each of those was a summer of that tree’s life. Trees aren’t made of cellulose, they’re made of time.

And if you’re reading this, so are you.

Some AI Art

Herewith some rather startling art produced by a latent diffusion model implemented in a Colab notebook from @multimodalart. In each case the caption is the phrase that I gave the model to use as the seed for generating the image. I have to remind myself that these images didn’t exist until I generated them today.

I started with a few “X in the style of Y” runs:

An owl in the style of Picasso
An owl in the style of Rubens
Unicorn in the style of Picasso

Then I tried a few more ambiguous phrases:

At night I dream of the whales
Childhood memories of the dream lighthouse
Passageway between dreams of childhood homes

I particularly like the final one.

I’ve played around with quite a lot of ML-based stuff over the last couple of years, but the relative ease with which I was able to do this has me kind of shaking my head in disbelief. We’re a shockingly long way from MNIST now.

Via The Checkpoint newsletter