As many of you know, early last September we held the first OpenSimulator Community Conference, a purely online event hosted in OpenSimulator itself. It was produced collaboratively by the Overte Foundation and Avacon, with the planning process beginning in February 2013 and increasing in intensity right up until the conference itself. It consisted of 23 separate regions, with a keynote area, breakout session areas, landing zones, a sponsor expo area and a staff area. Planned capacity for the conference was 220 avatars, though avatar numbers in the expo zone were unrestricted. The conference was financed through volunteer effort (time is money!) and sponsorship. Conference registration itself was free.
The event was a great success – everybody that I’ve heard from, whether attendee, volunteer, speaker or sponsor, was very positive about it.
In this series of blog posts, I’m going to talk about the technical, organizational and human sides of the conference, as well as some thoughts about virtual conferences in general. I’m quite well placed to do this as I was both the co-chair of the conference and a major part of the technical effort to provide a stable and high-performance OpenSimulator installation.
First up is the technical side. I hope this will be of both general interest and useful to anybody who is thinking of hosting a similar event in the future. So please feel free to ask any technical questions in the comments.
In this particular post I’m going to talk about the hardware that we used in the conference. All 23 regions were hosted on a single 24-core Intel Xeon X5650 machine with 64G of RAM. This machine was hosted on a high-bandwidth low-latency network (not a home network).
There’s an approximate rule of thumb which says that a machine running OpenSimulator should have one CPU core per region simulated. However, final performance in the conference was very good – it’s likely that we could have managed with considerably fewer cores. It’s very difficult to determine an exact ratio, however, without extremely time-consuming performance testing. It’s also heavily dependent on the number of objects in regions, the number and complexity of scripts, the number of avatars in the region, etc.
We can say that we had more memory than required. Maximum memory use during the conference was approximately 16G, about 700M per region. This was with a separate OpenSimulator instance per region – hosting multiple regions in a single instance would use less memory, though the difference may not be very large. So a good safe rule of thumb is to allocate 1G per region. Each region was hosted in its own simulator to provide fault isolation between regions (e.g. if the Mono runtime failed in one simulator it would only bring down that region).
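To make the memory arithmetic above concrete, here is a minimal Python sketch. The function names are hypothetical and only illustrate the calculation; the 16G peak, the 23 regions and the 1G per region allowance are the figures quoted above, and the result is a rough planning number rather than a guarantee.

```python
def observed_memory_per_region_gb(peak_total_gb: float, regions: int) -> float:
    """Average memory per region, given the peak total seen on the machine."""
    return peak_total_gb / regions


def recommended_memory_gb(regions: int, gb_per_region: float = 1.0) -> float:
    """Memory to allocate using the 'one gigabyte per region' rule of thumb."""
    return regions * gb_per_region


if __name__ == "__main__":
    # Figures from the conference: about 16G peak across 23 regions.
    per_region = observed_memory_per_region_gb(16, 23)
    print(f"Observed: about {per_region:.2f}G per region")
    print(f"Rule of thumb for 23 regions: {recommended_memory_gb(23):.0f}G")
```

On these figures the rule of thumb asks for 23G, comfortably above the observed peak and well below the 64G that the machine actually had.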
In terms of network capacity, we were never going to have any issues. From analysing network data, we can say that an approximate rule of thumb is to have 500 kbit download from the server available for each connected avatar and 50 kbit upload. More details on the OpenSimulator wiki.
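As a rough illustration of that rule of thumb, the Python sketch below scales the per-avatar figures up to a given number of connected avatars. The function and constant names are hypothetical. It also assumes that the 500 kbit and 50 kbit figures are sustained rates, and that "download" means traffic sent from the server to viewers while "upload" means traffic sent from viewers to the server.

```python
# Per-avatar figures from the rule of thumb above (assumed to be sustained rates).
SERVER_TO_VIEWER_KBIT = 500  # downloaded from the server by each avatar
VIEWER_TO_SERVER_KBIT = 50   # uploaded to the server by each avatar


def bandwidth_estimate_mbit(avatars: int) -> tuple[float, float]:
    """Return (server-to-viewers, viewers-to-server) capacity in Mbit."""
    outbound = avatars * SERVER_TO_VIEWER_KBIT / 1000
    inbound = avatars * VIEWER_TO_SERVER_KBIT / 1000
    return outbound, inbound


if __name__ == "__main__":
    # Planned conference capacity was 220 avatars.
    outbound, inbound = bandwidth_estimate_mbit(220)
    print(f"Server to viewers: about {outbound:.0f} Mbit")
    print(f"Viewers to server: about {inbound:.0f} Mbit")
```

For the planned capacity of 220 avatars this works out to roughly 110 Mbit from the server and 11 Mbit to it, which is why a home network connection would not be suitable for an event of this size.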
Using a single machine for the conference was convenient for administration purposes. It also eliminated one potential source of teleport issues, as communication between source and destination simulator was internal to a single machine. However, it also made the conference vulnerable to a failure in that single machine. To counter this, one can either have a duplicate machine available with a copy of the entire conference installation (as we did) or spread the regions and grid services out amongst multiple machines to reduce the consequence of failure in any single machine (though one is still vulnerable to a database server failure).
One could also host duplicate or multiple simulators at different physical locations in case the network at a particular location became unavailable for a certain period of time. But the thing to bear in mind is that all these choices are tradeoffs – greater redundancy involves greater cost, more operational work and in some cases potentially lower reliability (e.g. teleports between different locations not on the same LAN). You can compare this with the reliability of a physical location. For instance, a physical conference centre may have a fire drill which puts everything out of action for a certain period. Or a volcano in Iceland may start spewing ash that halts flights, as happened with the Metameets 2010 conference.
In part 2, I talk about the road to performance, namely how we went from barely being able to accommodate 70 avatars in our keynote area to easily dealing with a 220 capacity by conference time.