Les said, the better: OSC Conference

On Friday, I attended the OSC Conference at UC Berkeley, held by CNMAT. The location is at the top of a very steep hill and I had to lie on the ground and pant for a while before I could drag myself into the auditorium. I hate that hill. Anyway, I showed up part way through the first talk, which was on OSC Application Areas and was an exceedingly brief introduction to OSC. Matt Wright talked a bit about wrapping other protocols in OSC, for example, MIDI over OSC. He also talked a bit about gesture controllers, which I think will be the Next Big Thing, and other things you can find out about by reading his paper on OSC Application Areas.

Keynote

The Keynote Address was by Marc Canter, the founder of MacroMind, the company which became Macromedia. He talked about Digital Lifestyle Aggregation. This was one of those hand-wavy "in the future, everything will work" talks. DLA is a fuzzy idea about user experience that folks have a hard time explaining. However, many people are hard at work on DLA tools. One example of a DLA tool that I can think of is GMail. Email is not ordered hierarchically. Instead, it's all stored in a database, and the user can look at different views of it with powerful searching tools. You can put labels on email and view all messages with a particular label, but the messages themselves are not stored hierarchically, like they are in pine. Any message can have many labels or no label. The speaker mentioned a desktop environment under development which uses this model, called Chandler.

He said the three things to remember are Integration, Aggregation and Customization. He then gave the example of RSS Feeds. Let's say you use LiveJournal. You publish the content of your blog and your friends on LiveJournal read it. But they don't go to your blog to read it. They have a friends page (really an RSS aggregator) which they use to view all of their friends' blogs. They can use tools provided by LiveJournal to change what that page looks like. Canter suggests taking all of this further so that any data can be represented in any format the user would like. Everything would be dumped together but searchable, much like GMail. He suggests that this would be a way for musicians to take content directly to the masses, bypassing record labels, etc. Therefore everything could be shared and open. You, the music publisher, would get to decide what things you wanted to be freely available and what things might require an interaction with PayPal.

This is probably the future of the end-user experience. Somehow, OSC will fit into this brave new future vision.

Implementations of OSC

Open Sound World

Amar Chaudhary talked about Open Sound World, a very interesting music programming language, which I will very shortly download and compile. (There's a new release out, but Mac binaries are not yet available for it.) OSW looks like MAX and more or less acts like MAX, but it has OSC fundamentally integrated into the language. Objects in OSW are called "transforms." A patcher window full of transforms is itself a transform. All programs are fundamentally hierarchical. User-defined transforms are re-usable. This gives me the impression that OSW is more flexible than MAX. Every inlet or outlet of every transform has an OSC address. The hierarchy of the transforms automatically creates the OSC address. It is therefore possible, via OSC commands, to get data to any inlet. This means that if you want to mock something up quickly and then want to, say, write a SuperCollider script, that script can send data directly to your OSW program. However, OSW also has a scripting language built in. You can have your transform and then put it in a for-loop. It supports a plethora of data types. After seeing the demo, it looks like what I wish MAX could have been.

The OSC integration allows for multiple front ends, like our hypothetical SuperCollider script. OSW also extends OSC with a query protocol. This means that you could write a script to discover what transforms were available and play them and reconnect them. Chaudhary demonstrated a Python script which discovered a copy of OSW on another computer via Rendezvous, created a patch on the other computer and then ran it. This means that any front end can be attached to the OSW engine. I found the OSC messages required to be very readable and not cryptic as they are in SuperCollider. It reminded me of calling methods on Java objects.
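
To make "getting data to any inlet" concrete, here is a minimal sketch of what an OSC message looks like on the wire, assuming only the basic OSC encoding rules (null-terminated strings padded to four bytes, a type tag string, big-endian arguments). The address /mypatch/sampler/int1 is a made-up example, not a real OSW patch, and the helper names are mine.

```python
import struct

def osc_string(text: str) -> bytes:
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    raw = text.encode("ascii") + b"\x00"
    return raw + b"\x00" * (-len(raw) % 4)

def osc_message(address: str, *args) -> bytes:
    """Pack an OSC message carrying int32, float32 and string arguments."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)
        elif isinstance(arg, str):
            tags += "s"
            payload += osc_string(arg)
        else:
            raise TypeError(f"unsupported OSC argument: {arg!r}")
    return osc_string(address) + osc_string(tags) + payload

# Hypothetical inlet address; a real patch would define its own hierarchy.
packet = osc_message("/mypatch/sampler/int1", 440.0)
```

Any front end that can build bytes like these and put them in a UDP packet (for instance with Python's socket.sendto) could drive the engine, which is the whole point of the multiple front ends idea.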

Open Sound World is free and open source and deserves a good look.

SuperCollider

James McCartney talked about the use of OSC in SuperCollider. His view of how to address things is very different. He described the SC Server as a Virtual Machine for audio and described the tree structure, which all of the students who took the SuperCollider tutorial at Wesleyan last semester are certainly already familiar with. Briefly, the objects on the server are stored in a tree. Some nodes may have sub-nodes. Those are called groups. Some nodes must be leaves. Those are called synths. A synth is a program which contains unit generators, which are little pre-defined bits of code to process audio. The tree is evaluated depth-first, left to right. The functional units in the VM are the tree, an array of buffers (either audio files or control data) and busses.

OSC in SuperCollider is designed for speed, speed and more speed. It has only a single-level name space (as opposed to the OSW hierarchical namespace, "/mypatch/sampler/int1/"). Every addressable node is given a number. This means that, unlike in OSW, sending a message to a node at the very bottom of the tree takes the same amount of time as sending one to a node at the top of the tree and does not require any pattern matching. The OSC messages also pack a very dense amount of information into a single message. SC Server does not support discovery due to the high speed of changes. A grain-playing synth may only last for 50 milliseconds or less. Trying to find all of those grains would eat up a lot of processor power and not give much benefit.
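
The difference between the two addressing styles can be sketched with a toy model. This is not how either OSW or the SC server is implemented; it only illustrates why a numbered node costs one lookup while a path costs one step per level. The node ID 1000 and the control name "freq" are assumptions for the example; /n_set is the server command for setting a control on a node.

```python
# Toy model: a hierarchical address space as nested dicts,
# versus a flat table keyed by integer node ID.
tree = {"mypatch": {"sampler": {"int1": "inlet-object"}}}
nodes = {1000: "synth-object"}

def resolve_path(root, address):
    """Walk one dict level per path segment, so the cost grows with depth."""
    target = root
    for segment in address.strip("/").split("/"):
        target = target[segment]
    return target

def resolve_id(table, node_id):
    """A single lookup, no matter where the node sits in the tree."""
    return table[node_id]

# An SC-style message names the command once and carries the node number
# as an argument (node ID and control name are hypothetical here).
sc_message = ("/n_set", 1000, "freq", 440.0)
target = resolve_id(nodes, sc_message[1])
```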

Interestingly, SC obeys the time-stamp part of the OSC message protocol. This means that you can compensate for possible lag or jitter (this is where the tempo is scooting around a bit; apparently it is detectable by the human ear even if it's only a few nanoseconds, according to a conversation I was eavesdropping on) by sending OSC messages programmed to execute at some future time. The VM takes care of running them at the correct time.
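
As a rough sketch of what "programmed to execute at some future time" means in bytes: an OSC bundle starts with the string "#bundle", then a 64-bit time tag in NTP format (seconds since 1900 plus a 32-bit fraction), then each message prefixed by its size. The /ping address and the 200 ms lead time below are assumptions for illustration only.

```python
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01

def osc_timetag(unix_seconds: float) -> bytes:
    """Encode Unix time as a 64-bit NTP-style OSC time tag."""
    seconds = int(unix_seconds)
    fraction = int((unix_seconds - seconds) * 2**32)
    return struct.pack(">II", seconds + NTP_EPOCH_OFFSET, fraction)

def osc_bundle(unix_seconds: float, *messages: bytes) -> bytes:
    """Wrap already-encoded OSC messages in a bundle due at unix_seconds."""
    body = b"#bundle\x00" + osc_timetag(unix_seconds)
    for message in messages:
        body += struct.pack(">i", len(message)) + message
    return body

# A pre-encoded message: hypothetical address "/ping" with no arguments.
ping = b"/ping\x00\x00\x00" + b",\x00\x00\x00"

# Ask the receiver to act 200 ms from now, leaving room for network jitter.
bundle = osc_bundle(time.time() + 0.2, ping)
```

The sender picks a lead time larger than the worst jitter it expects; the receiver holds the bundle until the time tag comes due.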

McCartney suggested a few changes to the OSC standard. One was to drop nested bundles. He explained how they added overhead without adding functionality. I suspect this change will be adopted. Another was to add a dictionary type, using parentheses and name, value pairs. Finally, he suggested some sort of authentication scheme for OSC, because right now it allows open access to a high priority thread on the user's machine.

FLOSC

Ben Chun wrote and talked about Flash OSC. This is a Java program that translates OSC to XML so that it can be used in Flash movies. He demonstrated a program where he had written some SuperCollider synthdefs and loaded them. Then he played a Flash movie in which the SC synthdefs had inlets and outlets. He connected them together and sound came out from the SC synth server.

The important thing to get from this (and the OSW discussion) is that when you have OSC you can write any front end. Any program can talk to any other program. Your program can span five different languages, all sending OSC messages to each other. It can span five different computers. You can use any language you want to play the SC synth.

FLOSC is also a full Java implementation of OSC, which means that if, like me, you loooove Java, you can write Java programs to define synths for the SC server and little programs in Java to play those synths. FLOSC is worth checking out for that reason alone.

OSC Device Design Space

Folks gave demonstrations of prohibitively expensive hardware. Gesture controllers are in.

Say you have a bunch of sensors. Your sensors are actually just variable resistors, like potentiometers (knobs), light sensors, bend sensors, etc. All of these are analog resistors that alter their resistance based on bending, or light, or knob position, or whatever. You can attach them to dancers or performers. Then you can use gestures to create interesting musical sounds.

As far as I know, aside from the P5 Glove, the cheapest way to get such data into your computer without homebrewing hardware is the I-CubeX. As far as I know (meaning I don't) that still only speaks MIDI. Newer input devices speak MIDI and OSC. They also cost thousands of dollars. But some of them are network-able and wireless. CCRMA uses some prototype boards for development of gestural controllers. Those look very interesting. When students finish mocking up the devices and finally build them, the cost of the controller is about $20 in parts. That's much more affordable, but you have to do embedded programming stuff.

All of the commercial solutions use Xilinx chips.

The Effects of Latency on Networked Musical Performance

Stanford researchers did an experiment testing acceptable latency times. They got two subjects and put them in nearby, acoustically separated rooms. They had rhythms to clap. They were wearing headphones and clapping into a microphone. One of them would be chosen randomly to start. That person would hear a metronome counting off. The metronome would cease and that person would clap their half of the pattern. The other person would then start to clap. The signal from person A, clapping into a microphone, would go through a Linux box which would add some amount of delay to the signal before person B heard it, and vice versa. This research discovered that the optimum amount of latency is 11 milliseconds. Below that delay, people tended to speed up. Above that delay, they tended to slow down. After about 50 milliseconds (or 70), performances tended to completely fall apart.
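
For a sense of scale, here is a small sketch of the kind of delay the Linux box would add, written as a sample-by-sample delay line. It is not the researchers' software; the 44.1 kHz sample rate is my assumption, and at that rate 11 milliseconds works out to about 485 samples.

```python
from collections import deque

def make_delay(delay_ms: float, sample_rate: int = 44100):
    """Return a function that delays a sample stream by delay_ms milliseconds."""
    length = round(delay_ms * sample_rate / 1000)
    line = deque([0.0] * length)  # pre-filled with silence

    def process(sample: float) -> float:
        # Push the newest sample in and hand back the oldest one.
        line.append(sample)
        return line.popleft()

    return process

# The "optimum" delay from the talk, at an assumed 44.1 kHz sample rate.
delay_11ms = make_delay(11)
delayed = [delay_11ms(float(i)) for i in range(1, 487)]
```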

Clock Synchronization for Interactive Music Systems

Roger Dannenberg talked about clock synching in what was perhaps the most technically heavy of all the talks. If you don't have any significant latency, then you don't need to worry about clock synching. Otherwise, if you want to send packets ahead of time (to compensate for latency), your computers need to agree on what time it is, to a precision finer than the latency and the jitter that you're trying to avoid. Computer clocks are a mite inaccurate, but this doesn't usually pose a problem.

As you know, Macs can get the time via Network Time Protocol from an Apple server. However, you're not always online, except in your performing sub-net. Therefore, you need a scheme where a small set of computers can agree on the time. NTP servers are generally all in hardware and prohibitively expensive and not very giggable. Having a single master computer time clock can also be a problem because if that computer is also performing and it crashes, then the whole network doesn't know the time anymore, so a cooperative system may be best.

How time stuff often works is that a client asks the server what time it is and then waits for a reply. It takes the reply and adds to it one half the time it spent waiting. If it takes the server too long to reply, the reply is ignored.
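
The ask-and-halve scheme described above can be sketched in a few lines. This is only an illustration of the idea, not code from the talk: ask_server is a hypothetical callable standing in for the network request, and the 50 millisecond cutoff is an arbitrary assumption.

```python
import time

def estimate_server_time(ask_server, max_round_trip=0.05, clock=time.monotonic):
    """Return the estimated current server time, or None if the reply was too slow.

    ask_server is a hypothetical callable that performs the request and
    returns the server's timestamp in seconds.
    """
    sent = clock()
    server_time = ask_server()
    round_trip = clock() - sent
    if round_trip > max_round_trip:
        return None  # too slow to trust, so ignore this reply
    # Assume the reply spent half of the round trip in flight.
    return server_time + round_trip / 2
```

A client would call this repeatedly and keep the estimates that come back, since a short round trip leaves less room for the halving assumption to be wrong.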

This wasn't part of the talk, but packets can take different routes on the sending trip and the return trip, so half of the total time may be wrong. Also, cable modems go way faster downstream than upstream, so in a large networked performance it may take the packet longer to get to the time server than to get back.

Towards a more effective OSC Time Tag Scheme

Adrian Freed talked about possible changes to OSC Time Tags. I learned that the very popular P5 Glove samples at 60 Hz. And time tags are good. You can synch across different nodes and you can compensate for latency and jitter. They make creating sequencers much easier and you can record when things happened.

This talk was predicated on a better knowledge of OSC than I possess and so my notes aren't good. I don't know if this was addressed, but it seems to me that it might be good to have both absolute time tags and relative time tags. You would use the absolute ones in real-time performances. You would use relative ones in playback. (One second after playing that note, play this next note.)
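
Here is a sketch of how relative tags could be turned into absolute ones at playback time. To be clear, relative time tags are my own musing and not part of OSC; the function name and the event format are made up for the example.

```python
def to_absolute(start_time, relative_events):
    """Turn (delay_since_previous_event, message) pairs into (absolute_time, message)."""
    schedule = []
    now = start_time
    for delay, message in relative_events:
        now += delay  # each delay counts from the previous event
        schedule.append((now, message))
    return schedule

# "One second after playing that note, play this next note."
score = [(0.0, "note A"), (1.0, "note B"), (0.5, "note C")]
schedule = to_absolute(100.0, score)
```

The same score can then be replayed from any starting time just by changing start_time.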

Setting up OSC sessions using Voice-over-IP protocols

John Lazzaro, who serves on standards bodies and just spent five years on the RFC for MIDI over RTP, talked about integrating OSC into VOIP. Voice over IP refers to the conferencing protocols used by iChat AV and net2phone and a bunch of other systems. It uses the IP network for telephony and video-conferencing. There are two parts to these protocols. The first is SIP. This is everything that happens before you connect to the other person. It's the handshake where the actual communication protocol (RTP) is agreed upon. There are some advantages to using SIP to set up OSC connections. It would take about 2 years to get an official RFC for this. However, he suggested that putting OSC over RTP would be too much work for not much payoff.

Discovering OSC services with ZeroConf

ZeroConf is another name for Rendezvous. Apple has an open source implementation of this. It is possible to find and connect to OSC services with Rendezvous. SuperCollider supports this right now.

Type 	_osc._udp.
Name 	SuperCollider
Port 	57110
Domain 	local.

Your program must first register, then it can discover other OSC services. This is very interesting. Hopefully, I or somebody else can dig up some sample code. Maybe soon we can control installations with our cell phones? How is Rendezvous related to Bluetooth?

What folks are doing with OSC

UC Santa Barbara is building a building of DOOM. A sphere filled with gadgets that speak OSC. It will open in 2006.
Stanford has nifty prototype boards for students to design gestural controllers.
UCLA is doing a VR project which sits on top of MAX. The project is under development and so there's a lot of fudging going on, but they've successfully given some concerts.
Quintet.net, developed at HfMT Hamburg, is a distributed performance environment which seems to do a whole lot of unusual things. It's designed for up to five players separated by great distances.
SonART is a multimedia collaboration tool that will soon have all the features of Photoshop but be networked.
David Wessel suggests that it's possible to improve MAX/MSP programming practice with OSC. This is apparently not a lost cause.

Draft Proposals

Bidirectional XML mapping

This proposed standard would allow users to map from OSC to XML and vice versa. This could be useful because XML is human readable and editable whereas a binary file format (such as the one used in SuperCollider) is not. Also, going back to Digital Lifestyle Aggregation, XML is syndicatable content, meaning you could publish your synthdefs like you publish your blog posts. I think this could be a very exciting thing to do. I am interested in programming an aggregator of XML OSC (in my copious free time). Ideally, a synthesis engine like the SC VM would be integrated into the OS X operating system. I talked to James McCartney about this (he's working for Apple on Core Audio). His objection was that it would no longer be free. But Apple has an open source license, so there may be a way to build the engine into the OS without making it Apple proprietary. Ideally, this would be an open standard, so your synthdefs would work on any compliant OS. This could certainly be integrated into Linux without compromising the GPL-ness of the SC project. Then electronic music content could be syndicated and played by remote users. Obviously, authentication would need to be implemented before anyone would make this a major part of their operating system. I feel, however, that there are definitely interesting possibilities with OSC, XML (or RDF) and DLA.
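
To show what a two-way mapping might look like, here is a minimal sketch. The element and attribute names (message, arg, address, type, value) are invented for this example; they are not the proposed standard's schema, nor FLOSC's. The message content is also just a made-up example.

```python
import xml.etree.ElementTree as ET

# Converters from the XML text back to typed values, keyed by OSC type tag.
CASTS = {"i": int, "f": float, "s": str}

def message_to_xml(address, args):
    """Serialize an OSC-style message to XML. args is a list of (type_tag, value)."""
    root = ET.Element("message", {"address": address})
    for tag, value in args:
        ET.SubElement(root, "arg", {"type": tag, "value": str(value)})
    return ET.tostring(root, encoding="unicode")

def xml_to_message(text):
    """Parse the XML produced by message_to_xml back into (address, args)."""
    root = ET.fromstring(text)
    args = [
        (arg.get("type"), CASTS[arg.get("type")](arg.get("value")))
        for arg in root.findall("arg")
    ]
    return root.get("address"), args

original = ("/n_set", [("i", 1000), ("s", "freq"), ("f", 440.0)])
as_xml = message_to_xml(*original)
round_tripped = xml_to_message(as_xml)
```

Keeping the type tag next to each value is what makes the mapping reversible: without it, "1000" could come back as an int, a float or a string.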

Queries for OSC

Extending OSC to allow queries would let diverse applications share a common interface. A query could get documentation, find out about requested types, etc. The speaker noted that the OSW use of OSC and the SC use of OSC represent two different models of using OSC. It may be that they become frozen as schemas. There's no reason to re-do the same work over and over. So your query could discover which schema you were using, find out appropriate information and allow your script to make sounds with somebody else's OSC-enabled program. Anything can talk to anything else. Potentially very, very powerful.
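
As a toy illustration of discovery, the sketch below answers "what lives at this address, and what is it for?" against an in-memory namespace. Everything here is hypothetical: the namespace contents, the doc strings and the query function are mine, and the real draft proposal may look nothing like this.

```python
# A made-up address space with a documentation string on every node.
NAMESPACE = {
    "mypatch": {
        "doc": "example patch",
        "children": {
            "sampler": {
                "doc": "plays samples",
                "children": {
                    "int1": {"doc": "inlet, expects an int", "children": {}},
                },
            },
        },
    },
}

def query(address):
    """Return (doc, sorted child names) for an address; '/' lists the top level."""
    doc, children = "root", NAMESPACE
    for segment in [s for s in address.split("/") if s]:
        node = children[segment]
        doc, children = node["doc"], node["children"]
    return doc, sorted(children)
```

A script that knows nothing about the program on the other end could start at "/" and walk downward, reading the documentation as it goes.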

CNMAT wishes for work groups

You could be on a small 3 or 4 person committee to design something and write some code for it. Then you'd be cool.

Binary File Format - allows the persistent storage of OSC. This is already used by SC, I presume in saved synthdefs.
Time Tags and Synch
Schemas - covering address space and semantics. OSW and SC have done a lot of work in this direction.
OSC Hardware Kit - this would be cool, but it's beyond me
Regular Expressions and Pattern Matching
Queries - already underway
OSC Web Site - sure, it's not as glorious as writing a query system, but somebody needs to keep up with what folks are doing with OSC and become a clearinghouse for OSC information. The website could be a lot more useful than it is.
Possible data-type workgroup - You've got a favorite data type, like hashtables, that you love and can't live without. But OSC doesn't support it yet. You could make it happen.

Closing

David Wessel gave a very strong push to the idea of interactive music. The major experimental idea of the 20th century was tape music. But now that's the major direction of music. Most people's experiences of music are pre-recorded. OSC and gestural controllers could re-integrate the listener into the musical experience, so they can really hear stuff in a way that the passivity of tape music does not encourage.

He also talked about future directions for OSC.

Monday, 2 August 2004
