On the last day of the 2005 MySQL conference, I finally heard a speaker who stretched the audience's assumptions and pointed toward a liberating path forward. This is the sign of a good conference, incidentally--most of the sessions deal intensively with the problems of today, but one or two keynotes prepare the listeners for tomorrow.

I wrote in my earlier weblog about this conference that MySQL was becoming conventional. Many people are doing innovative things with it--I sat in today, for instance, on a session about MySQL as an embedded server or library--but the largest attendance has been reserved for traditional topics such as replication and performance tuning. MySQL AB itself is concerned with catching up to its competitors in terms of SQL features that centralize more and more control in the database engine.

Adam Bosworth, in his keynote today, threw all that out and set his ship headed in a different direction. The problem he found with centralizing processing--with stored procedures and triggers and so forth--is that it doesn't scale. His talk also implied that it restricts users from making innovative connections. Google, the most recent landing place in Bosworth's long and impressive career, illustrates an entirely different way to handle data.

Adam Bosworth's view of an open data query protocol

The promise of the Web was to aggregate the contributions of individuals everywhere and make retrieval easy along any lines one chose to use. As the volume of content became unmanageable, XQuery was supposed to provide a Web-aware search mechanism, and Web Services the infrastructure and protocols to connect sites.

XQuery and Web Services were too big and came too late, however. Nobody actually wants to use them, even if they know how.

So the gap has been filled with RSS, the model highlighted by Bosworth for the next stage in search. RSS and Atom are lightweight and easy to understand.
They put control in the hands of the content providers and the potential viewers.

Bosworth's extended vision is for a protocol that provides raw access to data, somewhat as XQuery is supposed to do. It would be a very simple and database-independent protocol that would make all data in the world open. Then, he says, everybody could do what Google does. And more--we could provide distributed updates too.

Where to impose structure

The Google approach to data, carried through in Bosworth's vision, runs head-on up against the ideals of the relational database model. The entire relational approach, from the canon of Third Normal Form (three is a holy number) to the enormously complex collection of analytic functions, subqueries, and other ways to impose structure in SQL, is an attempt to be as precise as possible about the data chosen and returned. Bosworth isn't interested in that. If the user gets a few hundred results and has to scroll through them a little bit, that's fine. We don't need no stinkin' metadata or knowledge management.

The philosophical debate underlying relational database design

Bosworth evoked earlier debates that I've found valuable and aired several concerns of mine; his views of the XML specs and RSS/Atom are familiar. But his brief critique of the trend toward putting more and more features into the database engine--a critique that he whisked through on the way to grander visions--left open a question about the basic philosophy of SQL.

When MySQL was bare-bones and lightweight (which it still is compared to commercial database management systems or PostgreSQL), it put responsibility in the hands of the application programmer. If a value was supposed to be limited to a particular range or two columns were supposed to be entered in tandem, it was the application programmer who made sure of it.

In contrast, traditional database design takes as much control away from the application as possible and puts it in the database.
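
The contrast between the two philosophies can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite, through Python's standard `sqlite3` module, stands in for MySQL, and the `employee` and `phone` tables, the salary cap, and all function names are hypothetical rather than taken from the talk.

```python
import sqlite3

MAX_SALARY = 500_000  # assumed business rule, for illustration only

# Hypothetical schema: the CHECK and the foreign key put the rules in the database.
SCHEMA = """
CREATE TABLE employee (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    salary INTEGER NOT NULL CHECK (salary BETWEEN 0 AND 500000)
);
CREATE TABLE phone (
    employee_id INTEGER NOT NULL REFERENCES employee(id) ON DELETE CASCADE,
    number      TEXT NOT NULL
);
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_employee_app_checked(conn, name, salary):
    """Application-aware style: validate before the row reaches the database."""
    if not 0 <= salary <= MAX_SALARY:
        raise ValueError(f"salary {salary} is outside the allowed range")
    cur = conn.execute(
        "INSERT INTO employee (name, salary) VALUES (?, ?)", (name, salary)
    )
    return cur.lastrowid


def add_employee_db_checked(conn, name, salary):
    """Database-aware style: let the CHECK constraint reject the row,
    then translate the engine's error into something a user can read."""
    try:
        cur = conn.execute(
            "INSERT INTO employee (name, salary) VALUES (?, ?)", (name, salary)
        )
    except sqlite3.IntegrityError as exc:
        raise ValueError(f"salary {salary} was rejected by the database") from exc
    return cur.lastrowid


def fire_employee(conn, employee_id):
    # ON DELETE CASCADE removes the employee's phone rows along with the employee.
    conn.execute("DELETE FROM employee WHERE id = ?", (employee_id,))
```
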
A constraint or trigger or stored procedure or foreign key can make sure that no one gives someone an absurdly high salary or fires an employee while leaving his phone number in the database.

This centralized control is a relic of the 1970s, when corporate staff would sit at command-line processors and type in SQL to do what they wanted. Nowadays, when an application and even a Web interface stand between the user and the database engine, the never-trust-the-user philosophy is less valid. At the very least, an application has to know the rules the database is enforcing and translate error messages into something the user can understand. The wall between application and database engine is porous, so the application can take on more of the validation and logic.

But both philosophies are valid, and now MySQL offers a choice. I suggested to Arjen Lentz, the organizer of this year's conference, that he offer a debate next year between the application-aware philosophy and the database-aware philosophy--when is each appropriate?

Most of us still need to find that phone number for an employee and do other everyday tasks; we'll be using a relational database for that, and MySQL will be providing that service for more and more sites. The people with day jobs who came this year to find out whether MySQL could bring home the bacon got their answers. But MySQL can also support fun applications, and I hope to see more coolness next year.

Andy Oram is an editor for O'Reilly Media, specializing in Linux and free software books, and a member of Computer Professionals for Social Responsibility. His web site is www.praxagora.com/andyo.