Resume Footnotes


1. Caltech does not offer an undergraduate computer science degree as such. Technically, I chose the Engineering and Applied Sciences "Option", with a "Concentration" in Computer Science. Nevertheless, students and faculty considered people like me "computer science majors".

2. It seemed like a good idea at the time. XML was clearly gaining momentum as a universal structured data interchange format; my co-founder and I had significant experience with XML and semi-structured data management in general from Microsoft; the Internet bubble had not yet burst.

3. Originally, XYZFind focused on technology for end-user search, navigation, and discovery over large collections of heterogeneous XML data, applying editorial guidance where available and making educated guesses otherwise. The basics of this interesting work are summarized in the paper we presented at a SIGIR 2000 workshop.

Unfortunately, we found end user search tools hard to sell, wacky new end user search tools harder still to sell, and wacky end user search tools that presuppose structured XML data quite unsalable. So, we productized the underlying semi-structured indexing technology.

Branded an "XML database" and targeted at developers, our product developed customers (and competition). I progressively refactored the system to support real-time indexing, concurrent operation, and lightweight transaction processing. We developed novel semi-structured index compression techniques that enabled XYZFind to win major customers from the competition.

4. It was really gross, too.

5. The group, led by Edward Jung, was originally called "Second Foundation", after the Asimov books. Generally speaking, our mission was to enable applications to expose their semantics via an RDF-like layer. Among other advantages, this would enable the development of generic "user interface drivers". Currently, to support N user interface modes (GUI desktop, HTML thin client, pen tablet, small-screen PDA, voice-only telephony, ...) and M applications (Outlook, Sidewalk, Word, ...), developers must write N times M user interfaces. Given a semantic platform, we could implement a N user interface drivers and M application semantic models and the platform would make everything work. We liked to describe this as the logical extension of the video and printer drivers already present in the Windows platform.

Not incidentally, such a platform would give Microsoft a key advantage in the emerging world of ubiquitous computing devices, allowing Microsoft to capture vast new markets and maintain the exponential revenue and share price growth to which it had become addicted.

"Ordinary" Microsoft developers thought we were lunatics. They weren't all wrong, but they weren't all right, either. We did do some interesting work on the way to our inevitable demise.

6. The ASR's basic data model was a graph or "semantic network". To query a graph structure, we decided that we needed a query language based on graph pattern matching. We designed a unique database interface we termed the "spider" which allows the application to interactively explore the configuration space of possible bindings for a set of constrained variables (expressed, of course, as a graph in the system). Microsoft actually patented this interface. While it is extremely flexible and conceptually quite simple, it was also extremely difficult to implement efficiently. Nobody actually needed that much flexibility; something much simpler would have done just as well.

7. To support user interfaces with direct editing and immediate visualization feedback rather than the "submit button" experience commonly associated with database or client/server applications, we decided that the ASR needed access latency under 10 microseconds for cached data. Database servers typically take a few milliseconds to respond, even to "easy" queries.

To achieve this response time, the ASR needed to operate inside each application's process space and give the application direct access to the database cache. For this reason the ASR ran as a Windows DLL (with COM interfaces) that managed a global shared memory area. However, this left the database at the mercy of the applications; if an application crashed or was killed during a critical operation, the shared memory area may have been left in a bad state.

To solve this problem, I implemented a locking system capable of detecting when the process owning the lock had died. When this happened, the next thread in the lock queue was granted the lock, but notified that the lock was stolen from a cadaver. Every lock had an associated state flag which described what operation was underway and how far it had got. All of the code which used synchronized data structures in the shared memory area was meticulously constructed so that every operation (such as inserting a key into a binary tree index) could always be rolled forward or backward if it was interrupted at any point. The thread inheriting the lock would start that process (and that thread itself might be terminated as well).

That way, even if applications were crashing left and right, those applications still running would clean up the shared memory area "on the fly". In addition to careful reasoning about the recovery of each operation, I built a test framework that would launch and terminate database test applications randomly, striving for coverage of every possible termination point. Starting and killing hundreds of concurrent processes per second on a multiprocessor PC exposed plenty of bugs in Windows, but the database held up well.

Of course, the shared memory area was still entirely susceptible to memory corruption from ill-behaving applications, if they decided to scribble memory before crashing. Building this system was a lot of work; I'm not sure it was worth it, but it was an interesting challenge. As far as I know, this sort of cooperative transaction processing system with on-the-fly recovery from failed peers is unprecedented.

8. The user interface was driven by an "annotated schema" which described the structure of the relevant parts of the Sidewalk database and included tags describing which prompts to use, how results should be presented, and so forth. The interface supported expert users (who could say a phrase like "action movies in Bellevue after 8:00" as soon as the call connected), novice users (guided by a series of prompts to choose a search domain and specify restrictions), and users who had problems with speech recognition (after repeated failure or delay, the system would fall back to DTMF menus and guided entry of names using the keypad alphabet).

Voice user interface is challenging because the effective bandwidth is so low. Speech recognition systems made the problem worse in some ways; then (even worse than now), accuracy rates were low for speaker-independent telephone recognition. We always confirmed the user's state ("There are 27 Chinese restaurants in Bellevue. Would you like to list them all, choose a price, start over, go back, ..."), and made sure to use phrases the user could echo to return to this location or get additional prompting ("Chinese restaurants in Bellevue", "choose a price"). It was also important to choose navigation phrases the speech engine could unambiguously recognize.

The data-driven (or "semantics-based") approach worked well here, because we could apply generic solutions to problems like DTMF fallback without having to specifically handle that case at every point in the user's interaction with the system. It also let us have a more flexible user interface than the usual "phone menu"; the user was free to specify constraints out of order or jump from place to place, and the system would react accordingly, taking all of the available data into consideration before recommending the next step or presenting results.

9. Sidewalk has since been purchased by its former competitor CitySearch.

Second Foundation was involved heavily in Sidewalk's design; we saw it as a testbed for semantics-driven applications. Sidewalk was built on SQL Server, included a very fancy schema and query transformation engine that transformed "conceptual schema" into physical schema, and used data-driven "renderer" components to turn conceptual data into Web pages at a given resolution while carefully separating content and presentation.

It was great fun to have the database available for ad-hoc SQL queries; it included everything from the head chef of every restaurant to the showtimes and running time of every movie showing in the city, all in a wonderfully clean, perfectly normalized schema.

Of course, this employed dozens of top-notch developers and was expensive as hell. Microsoft eventually couldn't justify the investment and sold the lot to CitySearch, who use a much more typical Web design approach: cheesy Perl scripts, ordinary Web page template systems, and a pile of dirty imported data schlepped into whatever table format seemed easiest at the time. It works almost as well. Go figure.

10. TAPI is Microsoft's Telephony API; we used it with Dialogic multi-line telephony hardware to answer the phones, move sound data in and out, and detect touch tones. SAPI is Microsoft's Speech API; we used it with the Microsoft speech recognition engine and Lernout & Hauspie's text-to-speech engine. These days, we'd have built the whole thing with something like VXML, rather than duplicating all the infrastructure work companies like TellMe now specialize in.

11. I converted a dozen poorly understood special cases into three well-understood general cases, rationalizing the way negative assertions and negative preconditions were handled in this engine. (Negation is the bugaboo of rule-based reasoning systems everywhere.)

12. The Rete algorithm is a technique for compiling inference engine rules.

13. MediaView was an internal product used extensively in the multimedia CD-ROM titles so popular a few years ago. It was later renamed "InfoTech" and used in Microsoft's Web properties as well, but eventually lost to Tripoli (now known as "Microsoft Index Server").

14. The first step a traditional natural language parser takes is to identify the "part of speech" (noun, verb, adjective, ...) for each word in the sentence. Dictionaries can be used to look up words, but applications may include their own specialized lexicon. For example, our prototype application offered a natural language interface to Microsoft Cinemania (a CD-ROM title with IMDB-like movie facts); movie titles such as "Star Wars" were considered nouns, even though "Star" by itself could be interpreted as a verb. I wrote a component to scan the database, build an index of phrases, and recognize them when they appeared in the input text. (In fact, the system processed each of the possible interpretations in parallel, preserving ambiguity all the way from the speech recognition system to the application's semantic model.)

15. We had a demonstration that combined GUI, spoken input and natural language typed text into a single user interface. (The user could say "When was he born?" and click on a person's name; the system would resolve the input into a single question and look up the relevant birthdate.) To explain the system's internals and the levels of semantic processing, I created a timeline-oriented display that visualized everything from the audio waveform (at the bottom) to application operations (at the top), representing speech recognition, word identification, natural language parsing, predicate reasoning, and the other phases of processing. The user could manipulate the timeline and click on individual elements of the display to see more information.

Say what you will about this group, but we did have killer demos.

16. Cassini, like the older missions it was modeled after, has almost no local autonomy. The spacecraft is driven by sequences of timed commands uploaded from the ground, which say things like "on 2001-07-17 14:44:10.75 UTC, open valve B17". Opening valve B17 may be part of a complex series of events designed to effect a course correction, adjust cooling systems, activate an internal experiment, or who knows what.

These sequences were and are assembled, processed, verified and compiled for upload by a complex and very old chain of ground software components. A component in this chain is software called SEQTRAN which is effectively a macro processor that turns scheduled logical events (such as firing a thruster) into physical events (such as opening and closing valves) via macro expansion. This software was originally written in assembly, translated to FORTRAN by mechanical substitution, and later converted to C via similar techniques. Function parameters and local variables are almost universally avoided in favor of global variables which are named after the registers on the ancient IBM minicomputer the assembly code was originally written for. Its input and output formats are fixed-width text records, and the documentation still talks about "cards" and "tapes".

My assignment as a summer intern was to experiment with replacing this system. I built a replacement that was (in my opinion at the time) vastly more maintainable and (certainly) tens of times faster and much more flexible. It was considered an interesting proof of concept, but dropped -- after all, who wants to jeopardize a billion-dollar mission with some wacky summer intern's pet project?

Later, as a part-time employee, I wrote various scripts and pieces to work with SEQTRAN, preprocessing its input and massaging its output, automating the process of running the beast, and generally aiding ground operations. Some of that may even still be in use.

17. When I worked on it, Excel was 1,000,000 lines of C code and 300,000 lines of assembler code, all wrapped up into a wonderful spaghetti mess. The whole thing was held together with lavish use of assert() and extensive cross-check verification frameworks. Normal development procedure was to make a change, run the program, watch the assertions light up the debugging console, and go to work fixing everything you could find. The recalculation loop in particular was one big C "switch" statement laced together with "goto" statements and very carefully ordered to allow the optimizer to generate an efficient jump table.

18. This task had two parts, the horrible part and the fun part.

The horrible part (actually, it wasn't so bad) was converting existing code to be thread-safe and reentrant. At the time, Excel made extensive use of global variables to hold state like "the cell currently being calculated". Through a combination of elbow grease, stupid preprocessor tricks and editor macros, I converted much of the relevant code to use thread-local context structures.

The fun part was devising and implementing an algorithm to efficiently and concurrently recalculate a set of interdependent cells with a fixed number of processors. In some cases, dependencies were known only at runtime, since cells could reference computed locations; the algorithm had to adapt to these "faults". As the spreadsheet is recalculated multiple times, the algorithm converged to an optimal recalculation ordering that keeps the processors busy.

At the end of the summer, parallel recalculation actually worked for "simple" spreadsheets (and correctly reverted to single-threaded recalculation for cases it couldn't handle). Management estimated that it would take another three months of effort to pull the feature into "shipping" status, but it didn't make the feature list for the next version.

As far as I know, Excel recalculation is still single-threaded. Multiprocessor desktop PCs are still rare, and Moore's Law means that most people are perfectly happy with the speed of recalculation as it stands.

19. Penelope used GrammaTech's Synthesizer Generator; it was mostly written in Synthesizer Specification Language. This was my first exposure to formal computer science concepts (program proofs, attribute grammars, automated proof verification, etc). I didn't really know what to make of it at the time, but it certainly helped my classwork the next year.

20. See the paper we published at the Tri-Ada conference in 1994.

21. Installing and maintaining systems, network configuration, budgeting hardware acquisition, hiring assistants, rolling out upgrades, and supporting end users as well as the professors and classes that used the systems. All told, the facility had 2,000 users (local and remote) and was occupied nearly 24/7.

22. Anyone running an understaffed and overused system is always trying to do more with less. Above all, we needed to avoid managing each workstation and server individually, so I invested quite a bit of time in distribution infrastructure to automatically generate the appropriate configuration files, cache the most frequently used software and perform routine maintenance on each system. Most large-scale system administrators do something similar.

23. For some reason, remote users always logged in to the same machines (the ones which had memorable or interesting names) despite the fact that they were chronically overloaded. To fix the problem, we instituted a load-balancing redirector (the granddaddy of today's super-slick HTTP load balancing systems) that would pass remote login traffic to the machine with the least load.

24. These courses serve double duty as a first-year curriculum for computer science students and as an introductory "programming how-to" for people in other disciplines. To meet the diverse needs of hundreds of students, undergraduates are heavily involved in course operation. Recitation sections often happen at midnight, and students can usually find the nearest teaching assistant without having to change out of their bathrobe.

I had lots of great ideas for how to make the class more interesting and productive for everyone. Some of them worked and some did not.

25. This was developed on the (not entirely incorrect) theory that students would be more motivated by drawing pretty pictures and making little games than by writing little console-mode phone-book applications.

26. During registration, students log in with a special account that verifies their user information, creates an account, and collects information on their preferred times for recitation, their estimated ability level, other students they would like to have in the same section, other students they would like to avoid, TAs they prefer, TAs they can't stand, and so on. Combined with TA preferences, the system used a simulated annealing algorithm to choose recitation times and assign students to TAs. It worked pretty well. They might even still be using it.

27. The Gale web site is in terrible need of repair, but Gale itself is pretty nifty. It features end-to-end public-key encryption for all users, a self-organizing, self-healing per-domain spanning tree network, and an interesting message categorization and subscription scheme. It's best compared to MIT's Zephyr rather than today's "instant messaging" systems.

28. Laser Tetris was a project for a Caltech class, but I only finished it a year after I graduated. It used a 68HC11 processor for game logic, EEPROM for the high score table, an original Atari joystick as a controller, an AMD Mach215 PAL to time and sequence vector data from RAM to the DACs, and a pair of accelerated galvo mirrors to steer the beam. It worked beautifully (but has since broken, alas). If you think the idea is cool, you should check out Laser MAME, which is everything Laser Tetris ever wanted to be.