Pointers

XML &amp; Web Services magazine has a quite detailed article about Microsoft’s structured document editor, “XDocs”. It’s nice to see Microsoft’s product-naming committee abandoning all this trendy “.net” nonsense and reverting to the tradition, dating back to when ActiveX was introduced in… well, a long time ago, of simply sticking an ‘X’ on the name. The idea of subsuming Word, Excel, etc. into one super-app is also vaguely reminiscent of the OLE idea of “compound documents” à la Cairo.

Anyway, the product seems pretty good. Quite a lot of the Word documents I read seem to be written with just a single style-tag, so anything that encourages structure and proper semantics gets my vote. #

Searching for answers to those niggling questions At the Tomb of the IUnknown Interface. #

The world’s first age simulator. This is a very good idea, but people’s experience of older age seems to vary a lot. My father will be seventy in March, but he still cycles and is one of the most active people I know. #

Looks like Eric Raymond is at it again. Not quite as mind-boggling as this, mind, but still pretty worrying for us elitist Euro-commies. Probably best not to make any sudden movements when that guy’s nearby…

The one with Wi-Fi in it

The New York Times has a special report on Wi-Fi. I just love this stuff:

“Wi-Fi is also changing the way that people – at least some young, technologically adept people – go about their work. In Philadelphia, Yvonne Jones, a 33-year-old freelance copywriter, moved her base of operations to a Starbucks about a month ago and said she quickly became “a thousand times” more productive than she was when working at home. “It’s not your house, and you are there for a specific purpose, so the ‘distractions’ aren’t that distracting,” she said.”

I read this and realise for the nth time just how far behind we are here in Britain. I’m young (well, youngish) and technologically adept – why can’t I have this stuff? Why can’t I sit in a nice coffee shop and surf? I seriously doubt whether the number of public wireless hotspots in this country is much above double figures yet. And most of those will be in London. Guaranteed.

PhoneCam, SOAP, DNA

Another interesting article by Dan Gillmor on how camera-equipped mobile phones are being used in everyone’s favourite technology testing ground: Japan. This is the first article I’ve read that considers the social and privacy implications of having large numbers of people walking around with what are effectively always-on, peer-to-peer, instant-broadcasting systems.

Dead Media

After the widespread gnashing of teeth earlier in the year, it’s good to see that the BBC Domesday Project’s data has been brought back from the dead. I remember seeing a demonstration of one of these systems sometime after it was launched in the mid-eighties. This was probably the first multimedia system I ever saw, way ahead of the Acorn Electron I had in my bedroom, and (for its time) it was a very sweet bit of kit.

Slashdot had a lot of discussion about this and the wider implications of storing important cultural artifacts on obsolescence-prone computer media. Who today can read 8″ floppies, punch-cards, paper tape, obscure 1960s magtapes and drums, or even (to take an extreme example) Sinclair Microdrives? Just what is being lost here? A common opinion on Slashdot, and one I’ve also seen elsewhere, was that information should be subject to a sort of Darwinian test. If nobody needs the information then they won’t maintain its accessibility by copying it to current media every few years; so let it die.

I think this is wrong for a few reasons. Firstly, some information gets more valuable as it gets older – though maybe not until it’s very old. The original Domesday Book was created for the purpose of taxation assessment, so it was pretty useful for a few years after it was created, but arguably nowhere near as useful as it now is as a record of medieval Britain. The BBC Domesday discs were fairly interesting in 1986, but told nobody anything fundamental that they didn’t already know. Their value to historians in 1000 years’ time could be huge. So how long do you wait before deciding to abandon the data?

The second reason relates to the way the information is represented. Take some planetary science data recorded in 1971 and sitting on a magtape on a rack in an air-conditioned NASA vault. It is probably stored as a bunch of Fortran records, in a format specific to the obscure compiler used to build the software that gathered the data. Even if the tape hasn’t decayed, and you can somehow read the data and get it onto a modern hard drive, what do you do with it? The meaning of the data is embedded in the software that was written to manipulate it, way back in ’71. Maybe you’ve got the source code, or even a file-structure specification document, but you’ve still got to port the old code, or write new code, to get at that data. That’s expensive to do, and with more and more data being archived each day, the global cost of keeping it all accessible keeps growing. Things will be thrown away not because they’re uninteresting, but because it’s just too expensive to keep transforming the data into a form you can do something with. Further examples are not difficult to find. The newly-opened Library of Alexandria has a 100-terabyte digital archive, including a web archive dating back to 1996. How is this information stored? In the private sector, what about the Lexis-Nexis database?

A solution (of sorts) to this problem is standards. Which file format was used to store the video clips in the BBC Domesday project? I don’t know for sure, but I suspect it was a proprietary format specific to the videodisc players that were used, and which is now documented only in some dusty tech manuals stashed away somewhere. Today we have well-defined, standard video-file formats like MPEG, which are unlikely to be forgotten about. In the short term (say the next five years) I can just buy any MPEG manipulation software I need off-the-shelf. In the medium term (say the next century or two) someone could probably discover enough information about the file format to decode an MPEG without any trouble. Over longer timescales, though, I don’t hold out much hope. The problem is that the specification is itself a document that must be kept accessible over time.

A good approach to these problems is to make information self-describing and as free of opaque encoding as possible. If you do this, you at least work around the problem of how to interpret the information: the information carries its own description of its structure. Paper books are the ultimate realisation of this, but it is also what XML and SGML try to do. Of course, the structure-description has to be obvious and capable of being processed by a computer. Going a step further, these issues are precisely those that have been grappled with by scientists who have developed messages intended to be understandable to alien civilisations: for example, the Pioneer 10 plaque or Frank Drake’s Arecibo message. Although they generally make use of much more redundancy than is desirable for real information storage, I think there are a lot of lessons to be learned from these efforts. A good source of information is Gregory Benford’s book Deep Time: How Humanity Communicates Across Millennia.
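To make the self-describing idea concrete, here’s a minimal Python sketch (the record and its field names are invented for illustration). The same data is stored two ways: as an opaque binary blob, whose layout lives only in the format string, and as XML, where the field names travel with the data.

```python
import struct
import xml.etree.ElementTree as ET

# A hypothetical survey record, stored two ways.
record = {"site": "Cambridge", "year": 1986, "population": 90000}

# Opaque form: meaningless without the struct format string, which lives
# only in the software that wrote it (the '71-magtape problem in miniature).
opaque = struct.pack(">10sHI", record["site"].encode(), record["year"], record["population"])

# Self-describing form: the structure is carried alongside the values.
root = ET.Element("record")
for name, value in record.items():
    ET.SubElement(root, name).text = str(value)
xml_bytes = ET.tostring(root)

# A reader with no documentation at all can still recover the field names.
recovered = {child.tag: child.text for child in ET.fromstring(xml_bytes)}
print(recovered)
```

The trade-off, of course, is size: the XML form repeats its field names in every record, which is exactly the kind of redundancy the interstellar-message designers embraced.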

Another idea, proposed by Jeff Rothenberg, is to encode a document as instructions for a “Universal Virtual Machine”. Executing the instructions has the effect of reproducing the original document. The idea is that only the virtual machine has to be migrated to future hardware and operating systems. This is not only ingenious, but it has been tried and seems to work. Now all you have to do is preserve the specification of the virtual machine…
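A toy sketch of the idea in Python (the instruction set here is invented, not Rothenberg’s actual UVM): the archived artefact is the instruction list, and only the interpreter ever needs porting to new hardware.

```python
def run(program):
    """Interpret a program and return the document it encodes."""
    out = []
    for op, *args in program:
        if op == "emit":        # append literal text
            out.append(args[0])
        elif op == "newline":   # append a line break
            out.append("\n")
        elif op == "repeat":    # append text, n times
            out.append(args[1] * args[0])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return "".join(out)

# The preserved artefact: a program that regenerates the document.
program = [
    ("emit", "Domesday Book"),
    ("newline",),
    ("repeat", 13, "="),
    ("newline",),
    ("emit", "A survey of England, 1086."),
]
print(run(program))
```

The catch in the last sentence above applies directly: the meaning of `emit`, `newline` and `repeat` is itself a specification that has to be kept accessible, or the program is just another opaque blob.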

There’s another big problem on the horizon as more and more information goes on-line: copyright. It’s all very well having a utopian view of preserving information for the good of humanity, but someone (or, more likely, some business) probably owns it. And they don’t want you to mess with it. Digital Rights Management (DRM) is the name for technology that allows you to, say, view an e-book on your palmtop, but prevents you from copying it to another device. This is typically done by encrypting the file in question, and using special viewing software to decrypt it at the point of use. The problem should be obvious: as soon as the company stops supporting the DRM software, the information is inaccessible for good. Under UK law, copyright eventually expires. The purpose of this is to ensure that all information finds its way into the public domain for the benefit of future generations. Information secured by current DRM technology never, ever becomes accessible. Big business has every interest in ensuring that it stays that way.
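As a toy illustration of the encrypt-then-decrypt-at-point-of-use mechanism (a sketch only, resembling no real DRM scheme; the key and text are invented), here’s a simple stream cipher in Python. The point is in the last comment: the ciphertext is only readable while the “viewer” and its key survive.

```python
import hashlib
from itertools import count

def keystream(key: bytes):
    """Infinite pseudo-random byte stream derived from the key (toy counter mode)."""
    for block in count():
        yield from hashlib.sha256(key + block.to_bytes(8, "big")).digest()

def transform(data: bytes, key: bytes) -> bytes:
    """XOR with the keystream; applying it twice with the same key round-trips."""
    return bytes(b ^ k for b, k in zip(data, keystream(key)))

ebook = b"Chapter 1. It was a dark and stormy night."
viewer_key = b"held-only-inside-the-DRM-viewer"  # hypothetical embedded key

locked = transform(ebook, viewer_key)      # what ships to the customer
print(transform(locked, viewer_key))       # the vendor's viewer can read it...
# ...but once the vendor folds and the viewer (with its key) disappears,
# `locked` is indistinguishable from random bytes, copyright expiry or not.
```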

These are the problems that Bruce Sterling’s Dead Media Project was set-up to examine. In a nice touch of self-reference, the site itself appears not to have been touched for some time. Not a good omen.

Mozilla, e-cash, DVRs

I’ve just downloaded Mozilla 1.2, and I’m very, very impressed. The start-up time is much improved over the 1.0 release, the popup blocking works, the tabbed windows are great. I’m sure there’s a lot of other great stuff to discover. As soon as I figure out how to import my Favorites from IE, I’m going to make it my default browser.

Science and Belief

The following is a short essay on belief, rationality, science and non-science. It’s part of my renewed effort to create more content. This stuff is important to me. I hope it’s not too heavy. Part two is coming soon.

– # –

Part One

I try not to watch too much TV – much of it bores me, and I’m often dismayed by the bland, stupid triviality of what is broadcast. Every so often, though, along comes a programme that I really enjoy. Last night’s Horizon was excellent. Titled “Homeopathy – The Test”, it did two things that are rarely done in popular culture. First, it objectively investigated something, in this case homeopathy, which is rarely subject to skeptical enquiry. Secondly, it showed how science works.

I was expecting the usual “on the one hand, on the other hand” summary of homeopathy, with maybe a few case studies of people who’d used it, and an only-time-will-tell conclusion about whether it works. But instead, they actually tested it. Right there. On the programme. They identified an experiment that in the past had appeared to confirm that homeopathy worked and had a scientific basis. They assembled a team of scientists who performed the experiment using tissue samples, an automated cell-counter, and a double-blind protocol to avoid personal bias and the placebo effect. In a superb illustration of the scientific method, they showed the scientists defining the theory, doing the experiment, and analysing the results. And the result was: the homeopathic preparation had precisely no effect. This didn’t surprise me. I knew that homeopathy doesn’t have any scientific basis. In fact, it is contrary to fundamental and well-tested knowledge of how the world works. It was an exciting thing to see, though.

Still, there are a lot of people who use homeopathy and seem to feel that it works. Some of them are sufficiently ill, and desperate, that its efficacy is a big deal to them. Similarly, a lot of people read astrological horoscopes and, to varying degrees, use them to guide their lives. And they do this even though astrology, like homeopathy, is contrary to the way we know the universe to work. People say that they believe in homeopathy/astrology/ghosts/clairvoyance/UFOs etc. I find this fascinating, because I don’t understand how the minds of these people work.

When I was about 17 I read Carl Sagan’s wonderful novel Contact. In it, one of the characters (a religious believer) tries to test another character (an astronomer) using the Foucault Pendulum in the Smithsonian Institution. The scientist has to stand at the point where the huge pendulum slows and reverses direction. She then has to wait for it to swing away and return. If she “believes” in science then she won’t step away, because there is no way the pendulum can increase its arc and hit her. In the book she flinches a little, but passes the test. After I read this, I decided to make myself a pendulum, with a length of string and a piece of wood for the weight, and try it. With my back to a wall I pulled the weight back from its rest position, held it against my forehead, and let go. It swung away and then back, not quite touching my forehead as it returned. Just like it should. And I knew why: because the pendulum lost energy due to friction with the air that it forced out of its way. Simple. Explicable. I remember repeating the experiment again, concentrating hard to see if I could somehow make the pendulum speed up and hit me. It didn’t.
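The same test can be run numerically. Here’s a rough Python sketch of a damped pendulum released from rest (the damping constant is an arbitrary illustrative value): the swing angle never exceeds the release angle, because friction only ever removes energy.

```python
import math

g, length, damping = 9.81, 2.0, 0.05     # SI units; damping is illustrative
theta, omega = math.radians(30), 0.0     # released from 30 degrees, at rest
dt = 0.001                               # time step in seconds
max_angle = abs(theta)

for _ in range(200_000):                 # 200 seconds of simulated swinging
    # semi-implicit Euler: update angular velocity, then angle
    omega += (-(g / length) * math.sin(theta) - damping * omega) * dt
    theta += omega * dt
    max_angle = max(max_angle, abs(theta))

# The arc only ever shrinks: the pendulum can't climb back past its release point.
print(max_angle <= math.radians(30) + 1e-9)
```

No amount of concentrating on the release point changes the result, which is rather the point of the anecdote.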

This taught me what in retrospect was a priceless lesson. The Universe just is. The way that it works is just how it works. If you want to, you can find out just how it works by trying something, by doing an experiment. But don’t expect it to care what you want to happen. If you try hard enough, and you’re clever enough, you can understand a little more than you did before. You can do this because the Universe is explicable and consistent, and it doesn’t make special cases for you or anyone else. It’s not arbitrary, and it doesn’t care about anyone’s beliefs or desires.

Part Two

(To follow…)

Writing, magtape

A useful article on how to write a better weblog. In summary: don’t just link, say something new, join the dots, create content. Well, that was my original idea with this blog. I was going to write articles or nothing, and that would help me organise my thoughts and practise writing. Unfortunately, I quickly found out how hard that is. If I spend a day playing with, say, web services, my poor little brain is just too frazzled to be able to extract some useful or interesting commentary from the experience. And even if I could, I find it very difficult to switch from coding to writing – I produce very long sentences with nested, punctuated clauses, and way too many brackets. Code, in other words.