By Agnes Jacob and Kelly Kishore, May 2002  

Introduction

Streaming media technology emerged as the web evolved from static to dynamic content. Many companies, educational institutions, and service and content providers use the web to broadcast live presentations, movies, concerts, educational programs, news, sports, and advertisements. Streaming technology allows this content to be viewed as it is transmitted, without first downloading it in full onto the client's local storage.

There are two types of streaming:

  • True Streaming uses RTP and RTSP (Real-time Transport Protocol and Real Time Streaming Protocol) as the transport protocols. A dedicated server transmits the source (multimedia files or a live broadcast), and the client views it as it arrives.
  • Progressive Download uses HTTP (Hypertext Transfer Protocol) as the transport protocol, with a regular web server. In this method, the video is downloaded onto the client's storage and begins playing before the download is complete.

Multimedia files are usually compressed and encoded in certain formats on the server end and then decompressed and decoded by a media player on the client side. The formats that currently can be streamed include RealNetworks (*.rm), Windows Media (*.wmv, *.asf), QuickTime (*.mov), MPEG (*.mpg), audio files (*.mp3), and other video files (*.mp2).

Either type of streaming, with any multimedia source, requires a scalable and efficient server with fast storage and good network bandwidth. This paper describes how to tune servers running the Solaris Operating Environment to improve performance for streaming media applications. These suggestions are based on observations made while working with ISVs (independent software vendors) developing streaming media applications. The areas to consider when tuning a server for media applications are the network, I/O, and local and remote file systems.

Network Tuning

To take advantage of high-speed networks and thus increase network throughput, the Solaris platform offers tunable parameters for UDP and TCP, which are the underlying protocols for RTSP, RTP, and HTTP. The UDP and TCP tunables that are beneficial to streaming media applications relate to window size, buffer size, and checksumming.

Setting the TCP window size or UDP socket buffer size to match the size of the application's read or write calls minimizes fragmentation in the transport layer. For example, if the application server writes 32 Kbyte at a time and the UDP socket buffer size is set to 8 Kbyte, the server breaks each 32-Kbyte request into four 8-Kbyte packets before sending it over the network. Matching the UDP send and receive buffer sizes, or the TCP window size, to the application's I/O size reduces this fragmentation overhead and thus increases performance.

The tunables for UDP that would modify the socket buffer size include:

  • udp_xmit_hiwat - maximum UDP socket datagram size in bytes (Default=8192)

  • udp_recv_hiwat - maximum UDP socket receive buffer size in bytes (Default=8192)

  • udp_max_buf - controls how large send and receive buffers (in bytes) can be for a UDP socket. This should be set to a value higher than the previous tunables, otherwise an error will be returned. (Default=256*1024)

The tunables for TCP that modify the window size include:

  • tcp_xmit_hiwat - maximum value of the TCP send window size in bytes (Default=16384)

  • tcp_recv_hiwat - maximum value of the TCP receive window size in bytes (Default=24576)

  • tcp_max_buf - controls how large the send and receive buffers can be (in bytes). This should be set to a value higher than the previous tunables, otherwise an error will be returned. (Default=1048576)

You can use ndd to modify these values. For example:

ndd -set /dev/udp udp_xmit_hiwat <value_in_bytes>
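As a minimal sketch of the sizing advice above, the following helper prints (but does not run) the ndd commands that would set both UDP buffer tunables to the application's write size. The function name udp_buffer_cmds is hypothetical, and treating a size equal to udp_max_buf as acceptable is an assumption; it only illustrates checking the requested size against udp_max_buf before applying it.

```shell
# udp_buffer_cmds SIZE MAX_BUF   (hypothetical helper, dry run only)
# Prints the ndd commands that would set both UDP high-water marks to SIZE.
# Returns non-zero if SIZE exceeds MAX_BUF, the current udp_max_buf value.
udp_buffer_cmds() {
    if [ "$1" -gt "$2" ]; then
        echo "error: $1 exceeds udp_max_buf ($2)" >&2
        return 1
    fi
    echo "ndd -set /dev/udp udp_xmit_hiwat $1"
    echo "ndd -set /dev/udp udp_recv_hiwat $1"
}

# 32-Kbyte application writes, checked against the default udp_max_buf (256*1024).
udp_buffer_cmds 32768 262144
```

Printing the commands rather than executing them lets you review the values before running them as root on the server.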

To maintain these settings across reboots, add a line for each setting to the file /etc/rc2.d/S69inet:

  #
  # Set configurable parameters.
  #
  ndd -set /dev/tcp tcp_xmit_hiwat  65536

Another way to gain performance is to turn off this UDP tunable:

  • udp_do_checksum - This enables UDP checksumming to ensure data integrity. (Default=1)

If your network card already performs checksumming in hardware, this feature can be turned off to avoid checking the data twice. Use ndd to turn it off (for example, ndd -set /dev/udp udp_do_checksum 0).

It is important to note that UDP is currently the preferred protocol for streaming media because it does not acknowledge packets or retransmit lost ones. Since the audio or video is played as it is transmitted, retransmission of lost bytes is not necessary. Also, if the stream is to be multicast (sent to multiple clients at once), UDP is the only one of the two protocols that supports this mode.

For Solaris servers with Sun GigaSwift Ethernet adapters (supported only on servers that run UltraSPARC III® processors), there are device driver parameters to consider for possible PCI interface performance improvements. The effects on performance vary among applications, so it is worth trying different values for these parameters.

The device parameters include:

  • tx_dma_weight - multiplication factor for granting priority to the transmit (TX) side during a weighted round-robin arbitration. The weight is an exponent: the priority is multiplied by 2 raised to this value. (Default=0)

  • rx_dma_weight - multiplication factor for granting priority to the receive (RX) side during a weighted round-robin arbitration. The weight is an exponent: the priority is multiplied by 2 raised to this value. (Default=0)

These tunables control priority access to the PCI bus. For example, if tx_dma_weight = 0 and rx_dma_weight = 3, then RX traffic has 2^3 = 8 times the priority of TX traffic.
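The arithmetic behind this example can be sketched in shell. The helper name dma_priority_factor is hypothetical, and the sketch assumes, as the example above suggests, that a weight of n corresponds to a factor of 2 raised to n.

```shell
# dma_priority_factor WEIGHT   (hypothetical helper)
# Prints 2^WEIGHT, the multiplication factor implied by a DMA weight,
# assuming the weight acts as an exponent of 2.
dma_priority_factor() {
    echo $(( 1 << $1 ))
}

dma_priority_factor 0   # prints 1: the default, no extra weighting
dma_priority_factor 3   # prints 8: the rx_dma_weight example
```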

Another device driver parameter adjustment that may improve performance is to turn on infinite_burst:

  • infinite_burst - when enabled, the adapter does not free the bus until the packets are transferred across the bus (Default=0)

For large packets, turning this parameter on could be beneficial. Use ndd to set these device parameters (for example, ndd -set /dev/ce <parameter> <value>), or set the values in the ce.conf file. With ndd, if multiple devices are active, you need to set the instance number first to select the device.

In addition, for good bus performance with Sun GigaSwift Ethernet adapters, use a 66-MHz, 64-bit PCI slot to take advantage of the faster, wider bus. Also, select a slot that does not share bus bandwidth with other slots.

I/O Tuning

Multimedia files can be located on the server itself, on a remote server, or on a SAN (storage area network). For streaming media applications, sequential read/write access is the predominant disk I/O operation, since multimedia files are stored on disk before they are transmitted (except for live broadcasts). The selection and configuration of the storage device therefore affect overall performance. Configuring the disks as a software stripe, or using hardware striping (RAID 0, RAID 5), suits sequential access because the data is spread across several disks and can be retrieved in parallel. If you configure the disks as a stripe, make the stripe size a multiple of the read/write size so that access is more efficient.

The numerous ways to optimize I/O are beyond the scope of this paper; however, these two books are good references for planning disk layout:

  • Configuration and Capacity Planning for Solaris Servers, Brian Wong, Sun Microsystems Press, 1997 

  • Sun Performance and Tuning, Java and the Internet, 2nd Edition, Adrian Cockcroft and Richard Pettit, Sun Microsystems Press, 1998

File System Tuning

The types of file systems that are supported on the Solaris Operating Environment include UFS, VxFS, QFS, and NFS (for remote files). For each of these file system types, there are tuning considerations that relate to directio and blocksize. Also, for NFS, other NFS tunables and parameters can be set for better NFS throughput.

If your streaming media application accesses very large files and does not serve the same file often, mounting the file system with the directio option will increase file access performance. With this option, reads and writes bypass the kernel buffer cache (paged I/O). This reduces CPU overhead considerably, since data moves directly between the disk and the user buffer instead of being copied through the kernel buffer. For an I/O operation to be performed as direct I/O, it must meet the alignment criteria: the read/write request must be aligned to a sector, block size, or stripe size boundary, or the I/O will be buffered. Use mount with the forcedirectio option to mount the file system for direct I/O (for example, in UFS: mount -F ufs -o forcedirectio <device> <mountpt>).
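The alignment criterion can be illustrated with a small check. This is only a sketch: the helper name is_aligned is hypothetical, it assumes that "aligned" means both the file offset and the request length are whole multiples of the boundary, and the 512-byte sector size is an assumed value for illustration.

```shell
# is_aligned OFFSET LENGTH BOUNDARY   (hypothetical helper, values in bytes)
# Succeeds (status 0) when both the file offset and the request length
# are whole multiples of BOUNDARY.
is_aligned() {
    [ $(( $1 % $3 )) -eq 0 ] && [ $(( $2 % $3 )) -eq 0 ]
}

# 512 bytes is an assumed sector size, used here only as an example.
if is_aligned 65536 32768 512; then echo "direct I/O candidate"; fi
if ! is_aligned 1000 32768 512; then echo "would be buffered"; fi
```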

Multimedia files consisting of high bit rate audio and video can be in the gigabyte range. Tuning the file system block size or extent size can improve performance for such large files. Currently the maximum block size or extent size supported for UFS and VxFS, respectively, is 8 Kbyte. With blocks this small, a large file needs more data blocks or extents and an extra layer of indirection to reach the rest of the data, which creates overhead on access. QFS, however, supports variable block sizes up to a maximum of 32 Mbyte. A 64-Mbyte file requires only 2 data blocks in QFS; in UFS or VxFS it would require 8192 data blocks or extents (64 Mbyte / 8 Kbyte) plus another level of indirection. Thus, consider using the largest possible block size when creating the file system. Use the mkfs command with the appropriate option for each file system to set the block size.
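The block counts for the 64-Mbyte example can be worked out with ceiling division. The helper name blocks_needed is hypothetical; the sketch counts data blocks only and ignores indirect blocks and other metadata.

```shell
# blocks_needed FILE_BYTES BLOCK_BYTES   (hypothetical helper)
# Prints the number of data blocks needed to hold the file (rounded up).
blocks_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

file_size=$(( 64 * 1024 * 1024 ))                    # 64-Mbyte file
blocks_needed "$file_size" $(( 32 * 1024 * 1024 ))   # 32-Mbyte blocks: prints 2
blocks_needed "$file_size" $(( 8 * 1024 ))           # 8-Kbyte blocks: prints 8192
```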

Files on a remote server are accessed through NFS (network file system). The overall performance of NFS is affected by the underlying file system type, I/O (disk layout), network bandwidth, and performance of the remote server. In addition, NFS tunables can be adjusted to improve NFS throughput.

One tunable is:

  • nfs3_max_transfer_size - maximum size of the data portion of an NFS version 3 READ, WRITE, READDIR, or READDIRPLUS request (Default=32KB)

It is best to set the maximum transfer size to the size of the data being passed over the network, for example, the read/write I/O request size. Depending on the configuration of the remote server and the network bandwidth, however, set this value as close to the read/write I/O request size as you can without degrading performance on the server. If you modify this parameter, you also need to modify nfs3_bsize; otherwise the over-the-wire request size is limited to nfs3_bsize.

  • nfs3_bsize - the logical block size used by the NFS version 3 client. This block size represents the amount of data that the client attempts to read from or write to the server when it needs to do an I/O. (Default=32KB)

These NFS tunables need to be applied to both the client and server. For UDP, there is a hard limit of 64 Kbyte per datagram (including headers and data) for the max_transfer_size.

To modify these values, set them in the /etc/system file and reboot for the values to take effect. For example, /etc/system could contain:

set nfs:nfs3_max_transfer_size=65536
set nfs:nfs3_bsize=65536
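The interaction between the two tunables can be modeled as taking the smaller of the two values. The sketch below is a simplified model based only on the description above; the helper name effective_nfs_xfer is hypothetical, and it ignores the UDP datagram limit.

```shell
# effective_nfs_xfer MAX_TRANSFER_SIZE BSIZE   (hypothetical helper, bytes)
# Prints the over-the-wire request size implied by the two tunables,
# modeled as the smaller of nfs3_max_transfer_size and nfs3_bsize.
effective_nfs_xfer() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

effective_nfs_xfer 65536 32768   # prints 32768: nfs3_bsize left at its default
effective_nfs_xfer 65536 65536   # prints 65536: both tunables raised together
```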

Also, increasing the number of NFS threads (nfsd) on a server enables the server to handle more NFS requests in parallel. For streaming media applications that access remote files, a higher value provides increased NFS throughput. The right value depends on the size of the server. A rule of thumb is to take the maximum of the following:

  1. Use 16 to 32 NFS threads for each CPU.
  2. Use 2 NFS threads for each active client.
  3. Use 16 NFS threads for each 10 Mbit of network capacity.

To set the number of NFS threads, modify /etc/init.d/nfs.server to start nfsd with the specified number of threads.
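The rule of thumb can be sketched as a small calculation. The helper name nfsd_threads and its argument order are hypothetical; the per-CPU figure defaults to 16, the low end of the 16 to 32 range, and can be overridden with an optional fourth argument.

```shell
# nfsd_threads CPUS ACTIVE_CLIENTS NETWORK_MBIT [THREADS_PER_CPU]
# (hypothetical helper) Prints the largest of the three rule-of-thumb values.
# The body runs in a subshell so its variables do not leak to the caller.
nfsd_threads() (
    per_cpu=${4:-16}
    by_cpu=$(( $1 * per_cpu ))        # 16 to 32 threads for each CPU
    by_clients=$(( $2 * 2 ))          # 2 threads for each active client
    by_network=$(( $3 * 16 / 10 ))    # 16 threads for each 10 Mbit
    max=$by_cpu
    if [ "$by_clients" -gt "$max" ]; then max=$by_clients; fi
    if [ "$by_network" -gt "$max" ]; then max=$by_network; fi
    echo "$max"
)

# 4 CPUs, 50 active clients, 100-Mbit network: candidates are 64, 100, and 160.
nfsd_threads 4 50 100   # prints 160
```

On a fast network the third rule tends to dominate, so treat the result as a starting point to validate against server load rather than a fixed setting.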

Summary

As streaming media technology continues to grow, new demands will arise, and system tuning will change to address them. This paper focuses on parameters you can tune for streaming media applications on servers running the Solaris platform. Other tunable parameters may exist for enhancing performance for these applications; so far, however, these are the tuning suggestions that we have found to produce the greatest improvement in performance for streaming media applications on these servers.

References

  • Sun Performance and Tuning, Java and the Internet, 2nd Edition, Adrian Cockcroft and Richard Pettit, Sun Microsystems Press, 1998 

  • Configuration and Capacity Planning for Solaris Servers, Brian Wong, Sun Microsystems Press, 1997 

  • "NFS Server Performance and Tuning Guide for Sun Hardware," Sun Microsystems, 2000 

  • "Platform Notes: Sun GigaSwift Ethernet Device Driver" from Solaris 8 10/01 on Sun Hardware Collection, Sun Microsystems, 2001 

  • "QFS - Technical Overview v3.4/3.5," LSC, Inc. (now Sun Microsystems), March 9, 2000 

  • "Solaris Tunable Parameters Reference Manual" from Solaris 8 10/01 Update Collection, Sun Microsystems, 2001 

  • "VxFS System Administrator's Guide," VERITAS Corporation, 2000 

About the Authors

Agnes Jacob is a staff engineer in Sun Market Development Engineering, working with ISVs to improve performance of their applications on Sun servers by means of tuning and sizing studies. Her experience includes kernel development in the areas of file system, I/O, and network file system, as well as Java technology.

Kelly Kishore has been working with streaming media technologies since 2000. At Sun, he is currently in Market Development Engineering, working with digital media customers.
