First TMF Switch-Off: Organizational Meeting (CFP)

The Measurement Factory

Note: Meeting minutes are available elsewhere.

This page discusses the agenda and logistics for the first switch-off organizational meeting. Attending the meeting is not required to participate in the switch-off. However, if you do not attend, you will miss an opportunity to shape the switch-off rules and workload to your liking. If you cannot attend the meeting, feel free to share your opinions on the Polygraph mailing list or contact us directly.

Table of Contents

1. Logistics
2. Attendees
3. Preliminary agenda
    3.1 Deadlines and location
    3.2 Competition rules
    3.3 Workload
    3.4 Budget

1. Logistics

The Measurement Factory is hosting an organizational meeting for upcoming switch-off participants and anybody interested in the agenda. If you want to come to the meeting, you must e-mail us in advance. At most two representatives per company, please.

The meeting is scheduled for Thursday, May 17th in Denver, Colorado. It is an eight-hour event, starting at 9:30am. The meeting will take place at the Hilton Garden Inn, about a 15-minute drive from Denver International Airport. The hotel provides a free shuttle service from DIA. For more information on the location and shuttle service, please see the hotel's home page. Several other hotels are nearby.

We hope that at least some of the participants will be able to arrive and leave on the same day. For those of you who have to, or want to, spend a night or more in the Denver area, we recommend staying in Denver (~25 driving minutes from the hotel) or, better, Boulder (~55 driving minutes from the hotel).

On-site lunch will be provided. Please let us know if you have any food preferences. We will try to arrange Internet connectivity in the meeting room, but do not count on it.

2. Attendees

We have invited all companies with L4/7 traffic redirection products that we know about, including Alteon/Nortel, Cisco, ClickArray, Enterasys, Extreme, Foundry, HP, Linux Virtual Server, Radware, Riverstone, and Sharinga. Many have already confirmed their attendance. The attendee list will be announced soon.

We have received many requests to increase the meeting attendance limit to two persons per vendor. If you absolutely must send two people, please keep the following caveats in mind.

3. Preliminary agenda

Major agenda items are discussed below. If you want to add an item to the agenda, please let us know ASAP.

3.1 Deadlines and location

Preliminary deadlines need to be discussed and adjusted where necessary. Our objective is to give participants and TMF enough time to prepare for the switch-off, execute the tests, and process the results.

The switch-off is likely to be two to five days long, depending on the number of tests included in the competition.

We do not have a confirmed location yet. If you can provide a warehouse or similar space with lots of power and appropriate cooling for about 200 PCs, please contact us. Providing switch-off facilities saves you shipping and travel costs while ensuring prompt delivery of spares and ``on-site'' support from your best local troubleshooters. If we receive no attractive offers, the switch-off is likely to be held in the Boulder-Denver area.

We have been talking to Network World magazine and several trade show organizers about the possibility of holding the switch-off at a trade show. So far there seem to be too many obstacles that make a trade show switch-off difficult, but we will pursue the opportunity if there is sufficient demand.

3.2 Competition rules

We intend to use our cache-off experience when designing the rules for the event. The proposed rules are outlined elsewhere. No rules have been finalized yet! Meeting participants will discuss specifics and vote on the alternatives if necessary.

We ask all meeting attendees to familiarize themselves with the proposed rules and discuss their effects with company management before the meeting.

3.3 Workload

Web Polygraph tools will be used for the tests. TMF is developing the workload, called SrvLB-L4 (layer 4 server load balancing workload). The first workload draft is available on the Polygraph Web site. The workload borrows many concepts and simulation models from the PolyMix family of workloads used in Web caching competitions. Meeting participants are expected to be familiar with PolyMix-3 characteristics.

There are several important workload features/characteristics that must be discussed at the meeting. The number of features to be added to the current workload will affect the feasibility of the proposed switch-off deadlines.

L4 versus L7

There is huge demand for the layer 7 tests. Polygraph can already test many features specific to L7 load balancers. We could make L7 tests our goal for the first switch-off, but doing so will complicate the workload and is likely to delay the competition. Given the two alternatives (L4 results in July or L7 results in September/October), what should we shoot for?

Some feel that starting with a better understood environment (L4) is prudent. Others say that L4 results are not interesting enough.

Another factor to consider is that L7 workloads are not ``backwards compatible'' with L4 load balancers. Testing L7 features correctly requires a different environment and workload compared to the L4 setup, because real Web site designs often depend on the load balancing layer. Any L7 switch should be able to support an L4 workload, but the reverse is not true. Thus, if L7 features are added to the mix, we will either end up with two incomparable workloads or will have to limit the competition to L7 vendors only.

Traffic aggregation

Polygraph client- and server-side PCs will have 100BaseT interfaces. In real life, high-performance load balancers often receive traffic on Gbit links. Thus, one or more L2 aggregation devices may be needed to aggregate client-side traffic. We propose to allow such devices in the participant zone. Moreover, these devices would not count toward the total price of the equipment under test, provided they do nothing but L2 aggregation.

Similar rules can be applied to the server side of the bench: Participants can also aggregate traffic on the server side. A single L2 aggregation device can be used for both sides if desired.

Participants are responsible for bringing and configuring the aggregation devices.

Minimum number of origin servers

To sustain the peak request rate generated by N client PCs, we would need to deploy at least N server PCs. Some load balancers are unable to handle more than 100 Mbit/sec, which we can generate with one or two client-server pairs. However, balancing one or two servers is not very interesting. Thus, for a small number of client PCs, we probably want to deploy more servers than the peak request rate requires. We need to agree on the minimum number of origin servers that any switch must balance. Four origin servers may be a reasonable minimum.
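For illustration only, here is a minimal Python sketch of the sizing rule described above, assuming one server PC per client PC and the proposed minimum of four origin servers; the function name and figures are placeholders, not agreed numbers.

  MIN_ORIGIN_SERVERS = 4   # proposed minimum; to be agreed at the meeting

  def origin_servers_needed(client_pcs):
      """One server PC per client PC sustains the peak request rate,
      but never balance fewer than the agreed minimum of origin servers."""
      return max(client_pcs, MIN_ORIGIN_SERVERS)

  print(origin_servers_needed(1))    # -> 4: a small entry still balances 4 servers
  print(origin_servers_needed(12))   # -> 12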

Custom origin server modifications

Several load balancers on the market require (or benefit from) custom modifications of server-side kernels or applications. Those modifications vary from very simple ones (e.g., configuring an extra IP alias on the server box) to very complex (e.g., adding kernel-level support for a proprietary transport protocol).

These modifications do happen in practice. Prohibiting custom server-side modifications will reduce the number of load balancing products we can test.

We hesitate to allow server-side modifications at the switch-off for two reasons. First, it is impossible to draw a line on what may be modified on the server; hence, drastic modifications (possibly conflicting with Polygraph operation) would have to be allowed. Second, since the device under test is the load balancer and not the combination of load balancer and origin server, we believe it is unfair to allow some vendors to modify the servers.

Routing issues

In general, real clients and servers do not share the same subnet. Real load balancers are usually deployed in environments with routers and, hence, do not have to perform routing functions.

The switch-off workload should put simulated clients and servers into different subnets. Some device will have to route traffic from simulated clients to simulated origin servers and back. Some load balancers can perform simple routing functions, but others cannot. We can ``help'' those that cannot by configuring appropriate static routes on Polygraph servers. However, we do not know whether we can cover all scenarios. We request that all participants double-check that their equipment can function correctly in at least one of the routing setups allowed at the switch-off.

We could allow zero-cost routers as part of the participant zone (just like the L2 traffic aggregation devices above). However, this would further complicate the setup and may introduce performance overheads that would be impossible to factor out.

WAN simulation

Most requests to origin servers come from remote machines and often have to go through modem connections or congested links with significant packet delays, loss, and bandwidth limitations. Moreover, our tests and reports from caching vendors suggest that the presence of packet-level problems affects some L4/7 devices in a profound way.

We can simulate WAN packet delay, loss, and pipe bandwidth using a FreeBSD kernel feature called DummyNet. We have been using DummyNet successfully for caching workloads. Meeting participants will need to agree on exactly what WAN delays and losses should be simulated (e.g., 100 msec mean packet delay and 0.1% average packet loss with 20Kb/sec bandwidth limits).
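To give a feel for what those example parameters imply, here is a rough Python estimate of the time to fetch a single reply through such a simulated pipe. The 10 KB reply size and the kilobits-per-second reading of 20Kb/sec are assumptions for illustration; TCP slow start, retransmissions, and header overhead are ignored.

  ONE_WAY_DELAY_SEC = 0.100    # 100 msec mean packet delay, per direction
  BANDWIDTH_BPS     = 20_000   # 20 Kbit/sec pipe limit (assumed to mean kilobits)
  REPLY_SIZE_BYTES  = 10_000   # assumed mean reply size; really set by the workload

  def reply_time_estimate():
      """Crude per-reply latency: one round trip plus serialization delay."""
      serialization = REPLY_SIZE_BYTES * 8 / BANDWIDTH_BPS   # seconds on the wire
      round_trip    = 2 * ONE_WAY_DELAY_SEC                  # request out, reply back
      return round_trip + serialization

  print("~%.1f sec per 10 KB reply" % reply_time_estimate())   # ~4.2 sec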

Simulating server failures

Redirecting traffic to healthy servers in the presence of network, hardware, or software failures is a key feature of L4/7 load balancers. We can simulate some failures in a lab environment. For example, a simulated origin server can be configured to stop responding to HTTP and/or ICMP requests. We can also simulate network-level problems by bringing server NICs down. Finally, we can pull network cables out of the switch or server box.

The two big questions to be discussed at the meeting are what kind of failures should be simulated (if any), and whether those failures should be integrated into a baseline test or separated into a special experiment.

Several metrics to quantify and compare failure handling should also be discussed and adopted for future switch-off reporting.
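As a starting point for that discussion, here is a minimal Python sketch of two candidate metrics computed from a hypothetical per-request transaction log; the log format and the metric definitions are assumptions, not part of the workload.

  from dataclasses import dataclass

  @dataclass
  class Request:
      """One record of a hypothetical per-request transaction log."""
      timestamp: float   # seconds since test start
      ok: bool           # True if a correct reply was received

  def failure_metrics(log, fail_start, fail_end):
      """error_rate: fraction of failed requests during the simulated outage;
      recovery_time: seconds from the start of the outage until the first
      successful reply (i.e., traffic was redirected to healthy servers)."""
      window = [r for r in log if fail_start <= r.timestamp <= fail_end]
      errors = sum(1 for r in window if not r.ok)
      error_rate = errors / len(window) if window else 0.0
      recovered = [r.timestamp for r in window if r.ok]
      recovery_time = min(recovered) - fail_start if recovered else None
      return error_rate, recovery_time

  log = [Request(9.0, True), Request(10.2, False), Request(11.5, True)]
  print(failure_metrics(log, fail_start=10.0, fail_end=20.0))   # (0.5, 1.5)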

3.4 Budget

We expect to set up one or two big benches (100-200 PCs) for all participants to share. Tests will be executed sequentially on each bench. The participation fee will be proportional to an entry's peak request rate, so that participants pay only for what they use. The peak request rate will determine the number of client PCs required for the test. The per-client price will depend on the number of benches, the number of tests per participant, and the number of participants. More tests may mean a longer switch-off and/or higher participation costs. More participants may mean lower participation fees.

While there are still plenty of unknowns, we hope to keep average participation costs in the $5-15K/entry ballpark.

Each participant will be responsible for bringing and setting up their equipment.

Besides covering the expenses, our primary objectives are:

  1. Attract both ``small'' and ``large'' participants.
  2. Sustain Web Polygraph development.

We expect a single client PC at the switch-off to generate 500-700 req/sec.
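For a rough sense of how such a fee might scale, here is an illustrative Python calculation; the per-client price is a made-up placeholder, and the real figure will depend on the number of benches, tests, and participants as described above.

  import math

  PER_CLIENT_REQ_RATE = 600   # req/sec, within the 500-700 estimate above
  PER_CLIENT_PRICE    = 500   # USD per client PC; hypothetical placeholder

  def participation_fee(peak_req_rate):
      """Fee grows with the number of client PCs an entry needs."""
      clients = math.ceil(peak_req_rate / PER_CLIENT_REQ_RATE)
      return clients, clients * PER_CLIENT_PRICE

  print(participation_fee(6_000))    # -> (10, 5000)
  print(participation_fee(18_000))   # -> (30, 15000)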