Below are some preliminary test results for SPDY-over-SCTP compared to SPDY-over-TCP.
These results don't represent the full complement of test conditions that I plan
to cover, but they do represent some common scenarios, and are interesting. In
choosing the test conditions, I referred to Jerry Chu's July 2009 IETF
presentation [1] on tuning TCP for loss rate guidance. He reported 0.8% - 2.4%
packet loss on Google front end servers, and I used 1% and 2%. I referred to
the Akamai Q3 2010 report on the state of the Internet [2] for information on
bandwidths. There's a lot of information there, and I haven't yet had time to test a
range of bandwidths (let alone asymmetric bandwidths), but 5 Mbps
seemed to be a good central point to start with and that is what I used. I also
referred to Jerry Chu's IETF presentation [1] for data on RTT distribution. Jerry's
data shows that Google's servers see 50% of connections with RTTs over approximately 70 ms.
I'd like to test RTTs of 20ms, 50ms, 100ms, and perhaps 200ms, but at this point
I've only tested 100ms. Here are some details on the test conditions:
- Chrome browser connecting to flip_in_mem_edsm_server (the flip server) over our lab network (100 Mbps
  Ethernet), with all traffic passing through an intermediate machine running
  dummynet to shape the network conditions. Chrome and the flip server were both run on machines running Ubuntu 9.10.
- SPDY streams are mapped to SCTP streams up to a maximum SCTP stream number,
  beyond which the SCTP stream number wraps back around to 1. This design ensures
  that head-of-line blocking cannot occur between different SPDY streams up to the
  maximum number of SCTP streams. An option was included to allow using SCTP
  stream 0 as a control stream for all SPDY control frames on all SPDY streams.
  This has the advantage of providing a common compression context for
  all headers, but the disadvantage that head-of-line blocking can
  occur between SPDY control frames on different SPDY stream numbers.
  Head-of-line blocking is still avoided between SPDY data frames on different
  SPDY streams.
- A series of 23 test pages was loaded with Chrome's benchmarking tool (100
  iterations each) to obtain an average page load time. Cache and connections
  were cleared before each page load operation. The pages were selected from the
  top 100 of alexa.com's May 5, 2011 top million web pages list [3]. The peculiar
  number of test pages (23) is partly due to problems recording content from some
  sites for replay with flip, partly due to the benchmarking tool balking at some
  of the recorded content, and partly due to time constraints. I intend to
  augment this list with more sites in the future.
- TCP was run in the default configuration for Ubuntu 9.10, which includes
  auto-tuning of send and receive buffer sizes, a 200 ms RTO min, cubic congestion
  control, and apparently some automatic enabling and disabling of delayed ACKs.
- SCTP was run with the following changes to the default configuration for Ubuntu
  9.10: RTO min reduced from 1 sec to 200 ms, send and receive buffer sizes
  increased to 64K, max_burst set very large (4096) so as not to be a factor in
  SCTP's behavior, and delayed SACKs disabled (otherwise cwnd only grows by 50%
  every RTT during slow start).
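
The stream mapping described above can be sketched in a few lines of Python. This is only an illustration, not the code used in Chrome or the flip server: the function name, its parameters, and the assumption that a SPDY stream ID maps directly onto the same-numbered SCTP stream before wrapping are all mine.

```python
CONTROL_STREAM = 0  # SCTP stream optionally reserved for SPDY control frames


def sctp_stream_for(spdy_stream_id, max_sctp_stream,
                    is_control_frame=False, use_control_stream=True):
    """Pick an SCTP stream number for a frame on the given SPDY stream.

    Hypothetical helper: assumes SPDY stream IDs start at 1 and map
    directly onto SCTP streams 1..max_sctp_stream, wrapping back to 1.
    """
    if spdy_stream_id < 1 or max_sctp_stream < 1:
        raise ValueError("stream numbers must be positive")
    if use_control_stream and is_control_frame:
        # All control frames share stream 0 (one compression context,
        # but control frames can block one another).
        return CONTROL_STREAM
    # Data frames (and control frames when the option is off) stay on
    # their own stream, wrapping around to 1 past the maximum.
    return (spdy_stream_id - 1) % max_sctp_stream + 1


if __name__ == "__main__":
    for sid in (1, 2, 4, 5, 9):
        print(sid, "->", sctp_stream_for(sid, max_sctp_stream=4))
```

With a maximum of 4 SCTP streams, SPDY streams 1 and 5 in this sketch share SCTP stream 1, which is where head-of-line blocking between SPDY streams becomes possible again.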
Here is some additional detail and a summary of the test results:
- All results were obtained with a 5 Mbps bandwidth and a 100 ms RTT.
- Reported values are the average page load time across 23 sites, each loaded 100 times.
- SPDY-over-SCTP was run using a control stream; all SPDY control frames were sent
  on SCTP stream 0 (see description above).
- Unlike other major implementations of SCTP (FreeBSD, Mac OS X), Linux SCTP
  (lksctp) does not allow bundling of data with the third leg of the handshake
  (COOKIE_ECHO), and this results in a 1 RTT delay for all page load times on
  Linux. The reported page load times for SCTP have been reduced by 1 RTT
  (100 ms) to reflect the results that would be obtained from a Linux SCTP
  implementation patched to support bundling of data with the COOKIE_ECHO.
- Average page load times for SPDY-over-TCP:
  - 1% loss: 6690 ms
  - 2% loss: 8823 ms
- Average page load times for SPDY-over-SCTP (with a control stream):
  - 1% loss: 6190 ms
  - 2% loss: 7350 ms
- Difference in average page load time (a positive number means SCTP was faster than
  TCP, on average):
  - 1% loss: 500 ms
  - 2% loss: 1473 ms
At 1% loss the average page load time is half a second better when using
SPDY-over-SCTP. Four pages are more than 5% slower using SCTP, eight pages are
more than 5% faster using SCTP, and the remaining 11 pages are within +/-5% of
the TCP page load time. At 2% loss the average reduction in page load time is
nearly 1.5 seconds, and SCTP provides a substantially better user experience
than TCP. Only 1 page is more than 5% slower using SCTP, while 15 pages are
more than 5% faster using SCTP. I have not yet tested higher loss rates, but I expect the
improvement at 2.5% loss to be even larger.
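
The arithmetic behind the summary figures can be reproduced with a short Python sketch. The averages are the ones reported above; the function and dictionary names are my own, and the "measured" SCTP value simply adds back the 1 RTT (100 ms) that was subtracted from the reported SCTP times.

```python
RTT_MS = 100  # round-trip time used in all tests

# Average page load times in ms, as reported above. The SCTP values
# already have 1 RTT subtracted for the COOKIE_ECHO bundling issue.
tcp_ms = {"1%": 6690, "2%": 8823}
sctp_ms = {"1%": 6190, "2%": 7350}


def summarize(tcp, sctp, rtt_ms=RTT_MS):
    """Return per-loss-rate difference, relative gain, and raw SCTP time."""
    rows = {}
    for loss in tcp:
        diff = tcp[loss] - sctp[loss]  # positive means SCTP was faster
        rows[loss] = {
            "diff_ms": diff,
            "pct_faster": round(100.0 * diff / tcp[loss], 1),
            "sctp_measured_ms": sctp[loss] + rtt_ms,
        }
    return rows


summary = summarize(tcp_ms, sctp_ms)
for loss, row in summary.items():
    print(loss, row)
```

Relative to TCP, the reported averages work out to roughly a 7.5% reduction in page load time at 1% loss and roughly 16.7% at 2% loss.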
Below are more detailed results. The reported values are (average TCP page load time) - (average SCTP page load time).
The results are not uniform across the various test pages; some are very
sensitive to loss while others are not, and some significantly benefit from SCTP
while others do not. I plan to expand these results to cover other network
conditions, and it will be important to include a larger set of test pages. A
superficial review of the results doesn't reveal an obvious correlation between
these results and such per-page attributes as number of SPDY sessions, number of
connections or amount of data read. I also want to look into why certain pages
gain much more than others from using SCTP. It may be that there are reasonable
changes that would improve the performance of pages loaded over SCTP, and
thereby provide a more uniform improvement with SCTP.
It is important to point out that lksctp's inability to bundle data with the
COOKIE_ECHO impacts many sites by an additional RTT delay. Most of the test
pages are loaded from multiple domains, and in many cases the completion time
of the page depends on a connection other than the initial connection. In these
cases, lksctp's inability to bundle data with the COOKIE_ECHO further delays the
completion of the page by 1 RTT. I have looked at the way each of these sites
loads and found that 17 of the 23 experience an additional RTT delay. The table
below shows the results from the above table adjusted for this additional RTT
delay, for those sites where it is appropriate. These are the results I would
expect if we were running our tests on FreeBSD or Mac OS X, or if lksctp were
patched to allow bundling data with the COOKIE_ECHO.
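
As a rough cross-check on the size of this adjustment, the effect on the average difference can be estimated from the numbers already given. This is my own back-of-the-envelope sketch, not the per-page table: it assumes each of the 17 affected pages loads exactly one RTT (100 ms) sooner over SCTP at both loss rates, and the function name is hypothetical.

```python
RTT_MS = 100               # round-trip time used in all tests
TOTAL_PAGES = 23           # pages in the test set
PAGES_WITH_EXTRA_RTT = 17  # pages delayed by a second COOKIE_ECHO round trip


def adjusted_average_diff(avg_diff_ms, affected=PAGES_WITH_EXTRA_RTT,
                          total=TOTAL_PAGES, rtt_ms=RTT_MS):
    """Estimate the average TCP-minus-SCTP difference after the adjustment.

    Assumes every affected page gains exactly one RTT over SCTP, so the
    average across all pages shifts by rtt_ms * affected / total.
    """
    return avg_diff_ms + rtt_ms * affected / total


for loss, diff in (("1%", 500), ("2%", 1473)):
    print(loss, round(adjusted_average_diff(diff), 1), "ms")
```

Under that assumption the average gain grows by about 74 ms at each loss rate; the per-page adjusted results remain the authoritative figures.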
References:
- [1] H. K. Jerry Chu, "Tuning TCP Parameters for the 21st Century", 75th IETF, Stockholm, Sweden, July 27, 2009.
- [2] Akamai Technologies Inc., "The State of the Internet Q3 2010", Vol. 3, No. 3, 2010.
- [3] Alexa Internet Inc., "Top 1,000,000 sites", May 5, 2011.
Contact: jtleight@udel.edu