Open Source Networking Solutions

99% of the people who reject using the software until it gets open sourced will never even look at its source code when it's done. Most of the people are not planning to use airbags in cars, but they want them anyway.

From the conversation between Yakov and Murat.

Introduction

The selection of a communication protocol can be as crucial for the success of your RIA as a professionally designed UI. LiveCycle Data Services (LCDS) is an excellent solution for building enterprise-grade, scalable RIAs, but some enterprises just don't have the budget for it. Many smaller IT organizations still use the more familiar HTTP or SOAP Web Services because it's an easy route into the world of RIA with only minor changes on the back end. Now there's a faster, more powerful open source option: in February 2008, Adobe released BlazeDS in conjunction with open sourcing the specification of the Action Message Format (AMF) communication protocol. Offering many of the same capabilities as LCDS, BlazeDS is a Java-based open source implementation of AMF, which sends the data over the wire in a highly compressed binary form.

Large distributed applications greatly benefit from working with strongly typed data. Sooner or later developers will need to refactor the code, and if no data type information is available, changing the code in one place may break the code in another, and the compiler may not help you in identifying such newly introduced bugs.

This chapter will unleash the power of AMF and provide illustrations of how to create a robust platform for the development of modern RIAs without paying hefty licensing fees. It will discuss polling and server-side push techniques for client-server communications, as well as how to extend the capabilities of BlazeDS to bring it closer to LCDS.

BlazeDS vs. LCDS

Prior to Adobe's BlazeDS, Flex developers who wanted to use the AMF protocol to speed up the data communication between Flex and the server side of their application had to select one of the third-party libraries such as OpenAMF, WebORB, or GraniteDS. The release of the open source BlazeDS, however, brought a lot more than just support of AMF.

You can think of BlazeDS as a scaled-down version of LCDS. As opposed to LCDS, BlazeDS doesn't support the RTMP protocol, Data Management Services, or PDF generation, and it has limited scalability. But even with these limitations, its AMF support, its ability to communicate with Plain Old Java Objects (POJOs), and its support of messaging via integration with the Java Message Service (JMS) make BlazeDS a highly competitive player in the world of RIA. These features alone make it a good choice for architecting RIA data communication compared to any AJAX library or a package that just implements the AMF protocol.

Figure 6-1 provides a capsule comparison of BlazeDS and LiveCycle functions. The items shown in grey represent the features available only in LCDS. The features of BlazeDS are highlighted in black.

Figure 6-1. Comparing functionality of BlazeDS and LCDS

One limitation of BlazeDS is that its publish/subscribe messaging is implemented over HTTP using long-running connections rather than over RTMP as in LCDS. Under the HTTP approach, the client opens a connection with the server, which allocates a thread that holds this connection on the server. The server thread gets the data and flushes it down to the client, but then continues to hold the connection. You can see the limit right there: because creating each thread has some overhead, the server can hold only a limited number of threads. By default, BlazeDS is configured to hold 10 threads, but this number can be increased to several hundred depending on the server being used.
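That per-server connection limit lives in the BlazeDS configuration. As a minimal sketch (the channel id and endpoint URL are placeholders; element names follow the BlazeDS services-config.xml schema), a streaming AMF channel definition caps the number of clients it will hold open:

```xml
<!-- Sketch of a streaming AMF channel in WEB-INF/flex/services-config.xml. -->
<channel-definition id="my-streaming-amf"
                    class="mx.messaging.channels.StreamingAMFChannel">
    <endpoint url="http://{server.name}:{server.port}/{context.root}/messagebroker/streamingamf"
              class="flex.messaging.endpoints.StreamingAMFEndpoint"/>
    <properties>
        <!-- Each streaming client ties up a servlet thread on the server;
             raise this value only in line with your container's thread pool. -->
        <max-streaming-clients>10</max-streaming-clients>
    </properties>
</channel-definition>
```

Raising `max-streaming-clients` without also raising the servlet container's thread limits simply moves the bottleneck, which is why the next section turns to the thousands-of-users case.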
Even so, this may not be enough for enterprise-grade applications that need to accommodate thousands of concurrent users.

Real-Time Messaging Protocol (RTMP) is not HTTP based. It works like a two-way socket channel, without the overhead that AMF incurs by being built on top of HTTP. One data stream goes from the server to the client, and the other goes in the opposite direction. Because the RTMP solution requires either a dedicated IP address or port, it is not firewall-friendly, which may become a serious drawback for enterprises that are very strict about security. Adobe has announced its plans to open source RTMP.

With a little help, however, BlazeDS can handle this level of traffic, as well as close some of the other gaps between it and LCDS. For example, the Networking Architecture of BlazeDS section offers a scalable solution based on the BlazeDS/Jetty server. Later in this chapter, you'll learn how to enhance BlazeDS to support data synchronization, PDF generation, and scalable real-time data push. In addition to feature support, you'll also examine the other piece of the puzzle: increasing the scalability of the AMF protocol in BlazeDS.

Why Is AMF Important?

You may ask, "Why should I bother with AMF instead of using standard HTTP, REST, SOAP, or similar protocols?" The short answer is because the AMF specification is open sourced and publicly available (http://download.macromedia.com/pub/labs/amf/amf3_spec_121207.pdf). The longer answer begins with the fact that AMF is a compact binary format that is used to serialize ActionScript object graphs. An object can include both primitive and complex data types, and the process of serialization turns an object into a sequence of bytes, which contains all required information about the structure of the original object.
Because AMF's format is open to all, Adobe as well as third-party developers can implement it in various products to de-serialize such pieces of binary data into an object in a different VM, which does not have to be Flash Player. For example, both BlazeDS and LCDS implement the AMF protocol to exchange objects between Flash Player and the Java VM. There are third-party implementations of AMF to support data communication between Flash Player and such server-side environments as Python, PHP, .NET, Ruby, and others.

Some of the technical merits of this protocol when used for enterprise applications are:

- Serialization and de-serialization with AMF is fast. The client-side implementation of AMF used with BlazeDS (and LCDS) is done in C and is native to the platform where Flash Player runs. Because of this, AMF has a small memory footprint and is easy on the CPU. Objects are created in a single pass; there is no need to parse the data (i.e., XML or strings of characters), which is common for non-native protocols.

- AMF data streams are small and well compressed (in addition to GZip). AMF tries to recognize the common types of data and group them by type, so that every value doesn't have to carry the information about its type. For example, if there are numeric values that fit in two bytes, AMF won't use four as would be required by the declared data type.

- AMF supports the native data types and classes. You can serialize and de-serialize any object with complex data types, including instances of custom classes. Flex uses AMF in such objects as RemoteObject, SharedObject, ByteArray, LocalConnection, all messaging operations, and any class that implements the IExternalizable interface.

- Connections between the client and the server are used much more efficiently. The connections are more efficient because the AMF implementation in Flex uses automatic batching of the requests and built-in failover policies, providing robustness that does not exist in HTTP or SOAP.
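The "fit in fewer bytes" point deserves a closer look. AMF 3 encodes integers as variable-length U29 values: 7 data bits per byte with the high bit used as a continuation flag, except in the fourth byte, which carries a full 8 bits. The following standalone Java sketch (written for this chapter from the AMF 3 specification; it is not BlazeDS code) shows why small numbers cost only one byte on the wire:

```java
public class U29Demo {
    // Encode a non-negative integer (up to 29 bits) the way AMF 3 does.
    static byte[] encodeU29(int v) {
        if (v < 0x80) {                        // 1 byte: 0xxxxxxx
            return new byte[]{(byte) v};
        } else if (v < 0x4000) {               // 2 bytes: 1xxxxxxx 0xxxxxxx
            return new byte[]{(byte) ((v >> 7) | 0x80),
                              (byte) (v & 0x7F)};
        } else if (v < 0x200000) {             // 3 bytes
            return new byte[]{(byte) ((v >> 14) | 0x80),
                              (byte) (((v >> 7) & 0x7F) | 0x80),
                              (byte) (v & 0x7F)};
        } else {                               // 4 bytes; last byte holds 8 bits
            return new byte[]{(byte) ((v >> 22) | 0x80),
                              (byte) (((v >> 15) & 0x7F) | 0x80),
                              (byte) (((v >> 8) & 0x7F) | 0x80),
                              (byte) (v & 0xFF)};
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeU29(5).length);       // small value: 1 byte
        System.out.println(encodeU29(300).length);     // 2 bytes
        System.out.println(encodeU29(100000).length);  // 3 bytes
    }
}
```

A 32-bit int that happens to hold the value 5 thus travels as a single byte instead of four, which adds up quickly across thousands of records.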
The remainder of the chapter will focus on how you can leverage these merits for your own applications, as well as contrast AMF and the technologies that use it with traditional HTTP approaches.

AMF Performance Comparison

AMF usually consumes half the bandwidth of, and outperforms (has the shortest execution time of), other text-based data transfer technologies by three to ten times, depending on the amount of data you are bringing to the client. It also usually takes several times less memory compared to other protocols that use untyped objects or XML. If your application has a server that just sends the client a couple of hundred bytes once in a while, AMF's performance benefits over text protocols are not obvious. To see for yourself, visit http://www.jamesward.com/census, a useful Web site that enables you to compare the data transfer performance of various protocols. Created by James Ward, a Flex evangelist at Adobe, the test site lets you specify the number of database records you'd like to bring to the client, then graphs the performance times and bandwidth consumed for multiple protocols.

Figure 6-2. James Ward's benchmark site

Figure 6-2 shows the results of a test conducted for a medium result set of 5,000 records using out-of-the-box implementations of the technologies with standard GZip compression. Visit this Web site and run some tests on your own. The numbers become even more favorable toward AMF if you run these tests on slow networks and low-end client computers.

The other interesting way to look at performance is to consider what happens to the data when it finally arrives at the client. Since HTTP and SOAP are text-based protocols, they include a parsing phase, which is pretty expensive in terms of time. The RIA application needs to operate with native data types, such as numbers, dates, and Booleans.
Think about the volume of data conversion that has to be performed on the client after the arrival of 5,000 one-kilobyte records. Steve Souders, a Yahoo! expert in performance tuning of traditional (DHTML) Web sites, stresses that major improvements can be achieved by minimizing the amount of data processing performed on the client in an HTML page (see High Performance Web Sites, O'Reilly, 2007). Using the AMF protocol allows you to substantially lower the need for such processing, because the data arrives at the client already strongly typed.

AMF and the Client-Side Serialization

AMF is crucial for all types of serialization and communications. All native data serialization is customarily handled by the class ByteArray. When serialized, the data type information is marked by the name included in the metadata tag RemoteClass. Here is a small example from the Flex Builder NetworkingSamples project that comes with the book. It includes an application, RegisteredClassvsUnregistered.mxml, and two classes, RegisteredClass and UnregisteredClass:

package {
    [RemoteClass(alias="com.RegisteredClass")]
    public class RegisteredClass{
    }
}

package {
    public class UnregisteredClass{
    }
}

Example 6-1. RegisteredClassvsUnregistered.mxml

Example 6-2. Serialization with and without the RemoteClass metatag

In the example above, the function serializeDeserialize() serializes the object passed as an argument into a ByteArray and then reads it back into a variable aa of type Object. The application makes two calls to this function. During the first call, it passes an object that contains the metadata tag marking the object with the data type RegisteredClass; the second call passes the object that does not use this metadata tag.
Running this program through a debugger displays the following output in the console:

[SWF] /NetworkingSamples/NetworkingSamples.swf - 798,429 bytes after decompression
[object RegisteredClass]
[object Object]

Annotating a class with the RemoteClass metadata tag allows Flash Player to store, send, and restore information in a predictable, strongly typed format. If you need to persist this class, say in AIR's disconnected mode, or communicate with another SWF locally via the class LocalConnection, following the rules of AMF communications is crucial. In the example, RemoteClass ensures that during serialization the information about the class will be preserved.

HTTP Connection Management

To really appreciate the advantages of binary data transfers and a persistent connection to the server, take a step back and consider how Web browsers in traditional Web applications connect to servers. For years, Web browsers would allow only two connections per domain. Since Flash Player uses the browser's connections for running HTTP requests to the server, it shares the same limitations as all browser-based applications.

The latest versions of Internet Explorer and Mozilla Firefox increased the default number of simultaneous parallel HTTP requests per domain/window from two to six. It's probably the biggest news in the AJAX world in the last three years. For the current crop of AJAX sites serving real WAN connections, it means faster loading and fewer timeout/reliability issues. By the way, most of the performance gains of Opera and Safari over Internet Explorer and Firefox in the past are attributed to the fact that they allowed and used four connections, ignoring the recommendations of the WWW consortium (which suggested allowing only two connections).

The fact that increasing the number of parallel connections increases network throughput is easy to understand. Today's request/response approach to browser communications is very similar to the village bike concept.
Imagine that there are only a couple of bikes that serve the entire village. People ride and come back to hand the bike to the next person in line. People wait their turns, keeping their fingers crossed that the person ahead of them won't get lost in the woods during his ride. Otherwise, they have to wait until all hope is gone (called a timeout) and the village authorities provide a new bike, circa 1996. Pretty often, by the time the new bike arrives it's too late: the person has decided to engage in a different activity (abandon the site).

As the travel destinations become more distant (WAN), you are exposed to the real-world troubles of commuting: latency (500 ms for a geostationary satellite network), bandwidth limitations, jitter (errors), unrecoverable losses, and so on. Besides that, the users may experience congestion caused by the fact that your ISP decided to make some extra cash by trying to become a TV broadcaster and a VOIP phone company, but lacks the required infrastructure. The applications that worked perfectly on local/fast networks will crumble in every imaginable way.

Obviously, more bikes (read: browser connections) mean that with some traffic planning you can offer a lot more fun to the bikers, that is, get much better performance and reliability. You might even allocate one bike to the sheriff/fireman/village doctor so he can provide information on conditions and on lost or damaged goods carried by the bikers. You can route important goods in parallel so they do not get lost or damaged as easily. You can really start utilizing long-running connections for real data push now.

But first, let's go back ten years and try to figure out how the early adopters of RIA developed with AJAX survived. Even though AJAX as a term was coined only in 2005, the authors of this book have been using the DHTML/XMLHttpRequest combo (currently known as AJAX) since the year 2000.
The Hack to Increase Web Browsers' Performance

At the beginning of this century, most of the enterprises we worked with quietly rolled out browser builds and service packs increasing the number of allowed HTTP connections. This was just a hack. For Internet Explorer, the following changes to the Windows registry would increase the number of browser connections to 10:

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings
MaxConnectionsPer1_0Server    10
MaxConnectionsPerServer       10

With Mozilla's Firefox, you have to recompile the source code of the entire browser. This does solve most performance and reliability issues, but only for a short while. The main reason is that, without imposed limits, software increases in size faster than Moore's law does for electronics. And unlike in the private networks of enterprises, without a proper city framework, rampant requests would cause an overall Internet meltdown, as the initial rollout of a more capable browser would give it an unfair advantage in terms of bandwidth share. If a server receives eight connection requests, it'll try to allocate the limited available bandwidth accordingly, and, say, Mozilla's requests will enjoy better throughput than Internet Explorer's, which on older and slower networks will cause quality of service (QoS) problems. In other words, this solution has a very real potential to cause more of the same problems it's expected to solve.

Other Ways of Increasing Web Browsers' Performance

Most enterprises have to control the QoS of their client communications. For example, a company that trades stock has a service level agreement (SLA) with its clients promising to push new price quotes twice a second. To keep such a promise, the enterprises should create and adopt a number of point-to-point solutions that provide more efficient communication models, which fall into three categories. The first is HTTP batching and streaming of multiple requests in a single HTTP call, together with Comet communications. Comet, a.k.a.
reverse AJAX, allows the Web server to push data to the Web browser, as opposed to the traditional request/response model. AMF performs automatic batching of requests. If your program executes a loop that generates fifty HTTP requests to the server, AMF will batch them and send them as one HTTP request. Imagine that someone wrote a loop in JavaScript that makes an HTTP server request on each iteration. The browser could batch these requests and send, say, ten requests at a time. This is HTTP batching. In this scenario, the browser would assign a message ID to each request included in the batch, and arriving responses would contain correlation IDs that would allow the browser to find the matching requestors.

The second category is binary components that work with two-directional sockets. This is the case used in multimedia streaming, where there are two separate channels and each is used for sending data in one direction: either to or from the server.

The third category is pluggable protocols, which are wrappers for standard protocols. Say you develop some custom protocol called HTTPZ, which to the browsers looks like HTTP, but under the hood uses streaming or even a socket-based protocol like RTMP. The browser believes that it uses HTTP, the Web server receives RTMP, and the translation is done by HTTPZ; every party is happy. The pluggable protocol option did not become popular, even though it allows moving most of the problems from the browser to the OS level. The batching and streaming options, however, did.

Regular HTTP is based on the request/response model, which has the overhead of establishing a connection (and disconnecting) on each request. In the case of streaming, the connection is opened only once (for more information, see the Using Streaming section). HTTP batching and streaming is a combination of a few technologies that closely resembles how car traffic is controlled on some highways: there are dedicated lanes for high-occupancy vehicles (HOV) that move faster during the rush hours.
Such HOV lanes can be compared to the HTTP channels opened for streaming. For example, you can program network communications in such a way that one channel allows only two data pushes per second (a guaranteed QoS), while the other channel tries to push all the data, which may cause network congestion, delays, and queuing. As an example, the Flex/Flash AMF protocol tries to squeeze every bit of bandwidth and optimize queuing of the requests in the most efficient way, both on the client and the server. As a result, your application uses the maximum bandwidth, and request queues are short. The results of such batching were so good that at Farata Systems we started recommending AMF to most of our customers (even to those that have to use WebService or HTTPService objects for communication). Using AMF to proxy requests via an AMF-enabled server delivers results from the HTTP servers more efficiently.

Hint: If a client request uses a specific destination on a proxy server, this destination can be configured to use an AMF channel even if an HTTPService has been used as the means of communication.

With AMF, the data gets loaded faster than with non-batched requests/responses. And it plays nicely with the typical infrastructures that use firewalls, as it piggybacks on the existing browser HTTP requests. However, for critical applications built on plain infrastructures a problem remains: there is no QoS provided by the HTTP protocol, which may become a showstopper. For example, think of a financial application that sends real-time price quotes to its users. The server keeps sending messages regardless of the current throughput of the network, which in the case of network congestion will cause problems such as queue overruns or lost packets.

The binary, always-on (re)connected socket protocols are a more logical and efficient solution. Unlike the request/response model, a typical socket connection is like a two-way highway, with data moving in opposite directions independently.
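That two-way highway can be sketched with plain Java sockets. The following is a hypothetical echo setup written for this chapter (it is not BlazeDS or RTMP code): once the connection is established, each side reads on one stream and writes on the other, independently, over the same socket.

```java
import java.io.*;
import java.net.*;

public class TwoWaySocketDemo {
    // Starts a local server, sends msg over a client socket, and returns the
    // server's reply, showing the two independent streams of one connection.
    static String roundTrip(String msg) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {   // 0 = any free port
            Thread serverThread = new Thread(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(s.getInputStream()));
                     PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        out.println(line.toUpperCase());    // server -> client stream
                    }
                } catch (IOException ignored) { }
            });
            serverThread.start();

            try (Socket client = new Socket("localhost", server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                out.println(msg);                           // client -> server stream
                return in.readLine();
            } finally {
                serverThread.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("price quote"));       // prints: PRICE QUOTE
    }
}
```

Notice that neither side waits for a "turn": the server could just as easily push unsolicited messages down its output stream, which is exactly the property that request/response HTTP lacks.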
But before we fully depart into the Communications 2.0 world, let's understand how HTTP is shaping up these days. The disconnected model of HTTP 1.0 was not practical. The overhead of connecting and disconnecting for each request was not tolerable, and for the last eight years we have not seen a single Web browser using it. It has been completely replaced by HTTP 1.1, the protocol that keeps connections open beyond a single request/response so that the next communications with the server happen faster. Under the hood there are two-way sockets that stay open, but browsers diligently follow the old model. They don't create bidirectional pipe-like connections, as in flash.net.NetConnection. As Web browsers started to host business applications, the need to process real-time data forced people to look into better solutions than polling, and a few server-side push solutions were discovered. While they differed in implementation, the main theme remained the same: the server would get requests and hold them for a long time, flushing packets down when data became available. The packets would reach the browser to be interpreted either by programs upon arrival or executed in the iFrame (if packaged as