<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Bhavin's Blog</title>
	<atom:link href="http://bhavin.directi.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://bhavin.directi.com</link>
	<description></description>
	<lastBuildDate>Fri, 05 Aug 2011 12:27:37 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Unix domain sockets vs TCP Sockets</title>
		<link>http://bhavin.directi.com/unix-domain-sockets-vs-tcp-sockets/</link>
		<comments>http://bhavin.directi.com/unix-domain-sockets-vs-tcp-sockets/#comments</comments>
		<pubDate>Sat, 21 May 2011 07:44:29 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[sockets]]></category>
		<category><![CDATA[tcp]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=744</guid>
		<description><![CDATA[Here are two interesting links I found comparing the features and performance differences between using Unix Domain Sockets and TCP Loopback Sockets
http://lists.freebsd.org/pipermail/freebsd-performance/2005-February/001143.html
Excerpt: IP sockets over localhost are basically looped back network on-the-wire IP. There is intentionally &#8220;no special knowledge&#8221; of the fact that the connection is to the same system, so no effort is [...]]]></description>
			<content:encoded><![CDATA[<p>Here are two interesting links I found comparing the features and performance differences between using Unix Domain Sockets and TCP Loopback Sockets</p>
<p><a href="http://lists.freebsd.org/pipermail/freebsd-performance/2005-February/001143.html">http://lists.freebsd.org/pipermail/freebsd-performance/2005-February/001143.html</a></p>
<p><em>Excerpt: </em>IP sockets over localhost are basically looped back network on-the-wire IP. There is intentionally &#8220;no special knowledge&#8221; of the fact that the connection is to the same system, so no effort is made to bypass the normal IP stack mechanisms for performance reasons. For example, transmission over TCP will always involve two context switches to get to the remote socket, as you have to switch through the netisr, which occurs following the &#8220;loopback&#8221; of the packet through the synthetic loopback interface. Likewise, you get all the overhead of ACKs, TCP flow control, encapsulation/decapsulation, etc. Routing will be performed in order to decide if the packets go to the localhost. Large sends will have to be broken down into MTU-size datagrams, which also adds overhead for large writes. It&#8217;s really TCP, it just goes over a loopback interface by virtue of a special address, or discovering that the address requested is served locally rather than over an ethernet (etc).</p>
<p>UNIX domain sockets have explicit knowledge that they&#8217;re executing on  the same system.  They avoid the extra context switch through the  netisr, and a sending thread will write the stream or datagrams directly  into the receiving socket buffer.  No checksums are calculated, no  headers are inserted, no routing is performed, etc.  Because they have  access to the remote socket buffer, they can also directly provide  feedback to the sender when it is filling, or more importantly,  emptying, rather than having the added overhead of explicit  acknowledgement and window changes.  The one piece of functionality that  UNIX domain sockets don&#8217;t provide that TCP does is out-of-band data. In practice, this is an issue for almost noone.</p>
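<p>To make the comparison concrete, here is a minimal Python sketch (not taken from either linked source) that runs the same echo exchange over a TCP loopback socket and over a Unix domain socket. The helper names <em>echo_once</em>, <em>round_trip</em> and <em>demo</em> are made up for illustration, and the sketch assumes a platform that supports AF_UNIX. It only shows that the calling code is identical apart from the address family and address; it does not measure performance.</p>

```python
import os
import socket
import tempfile
import threading


def echo_once(server_sock):
    """Accept one connection and echo back everything received until EOF."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            conn.sendall(chunk)


def round_trip(family, address, payload):
    """Send a small payload to an echo server of the given family; return the echo."""
    server = socket.socket(family, socket.SOCK_STREAM)
    server.bind(address)
    server.listen(1)
    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()

    client = socket.socket(family, socket.SOCK_STREAM)
    client.connect(server.getsockname())
    client.sendall(payload)          # payload is kept small so this cannot block
    client.shutdown(socket.SHUT_WR)  # signal EOF so the server stops reading

    received = bytearray()
    while True:
        chunk = client.recv(65536)
        if not chunk:
            break
        received.extend(chunk)

    client.close()
    worker.join()
    server.close()
    return bytes(received)


def demo():
    payload = b"x" * 1024
    # TCP over the loopback interface: port 0 lets the OS pick a free port.
    tcp_echo = round_trip(socket.AF_INET, ("127.0.0.1", 0), payload)
    # Unix domain socket: addressed by a filesystem path instead of IP and port.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "echo.sock")
        unix_echo = round_trip(socket.AF_UNIX, path, payload)
    return tcp_echo == payload, unix_echo == payload
```

<p>Wrapping the two <em>round_trip</em> calls in a timer and raising the payload size (with a client that reads while it writes) would turn this into a rough benchmark of the kind the second link describes.</p>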
<p><a href="http://osnet.cs.binghamton.edu/publications/TR-20070820.pdf">http://osnet.cs.binghamton.edu/publications/TR-20070820.pdf</a></p>
<p><em>Excerpt: </em>It was hypothesized that pipes would have the highest throughput due to its limited functionality, since it is half-duplex, but this was not true. For almost all of the data sizes transferred, Unix domain sockets performed better than both TCP sockets and pipes, as can be seen in Figure 1 below. Figure 1 shows the transfer rates for the IPC mechanisms, but it should be noted that they do not represent the speeds obtained by all of the test machines. The transfer rates are consistent across the machines with similar hardware configurations though. On some machines, Unix domain sockets reached transfer rates as high as 1500 MB/s.</p>
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/unix-domain-sockets-vs-tcp-sockets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Achieving 100% uptime through the CRABS model</title>
		<link>http://bhavin.directi.com/achieving-100-uptime-through-the-crabs-model/</link>
		<comments>http://bhavin.directi.com/achieving-100-uptime-through-the-crabs-model/#comments</comments>
		<pubDate>Sun, 24 Oct 2010 11:02:55 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[0-cosmos]]></category>
		<category><![CDATA[TechTalk]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[uptime]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=418</guid>
		<description><![CDATA[As a web 2.0 company today, five nines no longer cuts it with respect to uptime. We do not have the luxury of providing only 99.999% availability. Users expect 100% uptime. This post is a macro model of things that need to be taken care of to achieve 100% uptime. In keeping with the industry&#8217;s love for acronyms, I [...]]]></description>
			<content:encoded><![CDATA[<p>As a web 2.0 company today, five nines no longer cuts it with respect to uptime. We do not have the luxury of providing only 99.999% availability. Users expect 100% uptime. This post is a macro model of things that need to be taken care of to achieve 100% uptime. In keeping with the industry&#8217;s love for acronyms, I call it the CRABS model <img src='http://bhavin.directi.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h2>Capacity</h2>
<p>You must be aware of the exact capacity that your infrastructure can handle in terms of requests, number of users, amount of storage, number of transactions, network throughput and so on. This is applicable to every component within the system. If your architecture comprises a database, an app server, a queue, a mail server, and a memory cache, each of these components has its own capacity limitations. Capacity also depends on the state of the system, time of day, user patterns, etc. For instance, if you are heavily dependent on memory caches, and your application design allows for the possibility of starting out with a cold cache, then the number of requests your application can handle during this time will be lower than the number it can handle with a warm cache.</p>
<p>Knowing the capacity of every component in the system allows you to do the following -<br />
* determine the peak load your system can handle<br />
* put limits into place to ensure your system never gets more requests than it can handle<br />
* determine when the system is reaching close to peak capacity and pre-emptively scale the infrastructure to account for growth</p>
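<p>As a rough illustration of the last two points, the sketch below tracks in-flight requests against a measured capacity figure. The class name <em>CapacityGuard</em>, the 80% threshold and the capacity numbers are all assumptions made for this example; a real system would derive them from load tests of each component.</p>

```python
class CapacityGuard:
    """Tracks in-flight requests against a measured capacity (hypothetical helper)."""

    def __init__(self, peak_capacity, scale_threshold=0.8):
        self.peak_capacity = peak_capacity      # assumed to come from load testing
        self.scale_threshold = scale_threshold  # assumed fraction that triggers scaling
        self.in_flight = 0

    def try_admit(self):
        """Admit a request only if doing so stays within peak capacity."""
        if self.in_flight >= self.peak_capacity:
            return False
        self.in_flight += 1
        return True

    def release(self):
        """Mark one admitted request as finished."""
        if self.in_flight > 0:
            self.in_flight -= 1

    def should_scale(self):
        """True once utilization reaches the pre-emptive scaling threshold."""
        return self.in_flight >= self.peak_capacity * self.scale_threshold
```

<p>A request handler would call <em>try_admit()</em> before doing any work, reject the request if it returns False, and call <em>release()</em> when finished, while a monitoring job watches <em>should_scale()</em>.</p>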
<h2>Redundancy</h2>
<p>Every component must have adequate redundancy in an active-active model. These days a simple n+1 does not cut it, nor does a standby failover. Most redundant clusters carry capacity well beyond that required during peak loads. Additionally, it is no longer acceptable to require even a few minutes of downtime for a standby to start up in case the primary node goes down. And it is certainly not acceptable to lose any data. Downtime of any node or any component is expected to be completely transparent to end users. This becomes difficult when you take into account user sessions, state and data storage, and it requires thought at design time. Applications have to be designed from the ground up to be redundant to the extent that downtime of multiple hardware and software components does not impact the end user in any way. Larger applications take into account geo-redundancy and the possibility of entire datacenters or geographical locations being unavailable for a certain period of time. As many components as possible should run in active-active mode, where failure of one of a set does not result in any impact to the end user. Think of every component (hardware and software) in your setup and allow for several of them to fail at the same time. Ensure adequate capacity and data redundancy.</p>
<h2>Abuse mitigation</h2>
<p>Expect users, hackers, customers, vendors, developers and unrelated 3rd parties to intentionally or unintentionally abuse your system. I divide abuse into the following categories -</p>
<ul>
<li>Denial of Service: Someone sending unwarranted requests to your system uses up its peak capacity, resulting in a denial of service to your other users. These can be application requests or network requests. The requests may be intentional or unintentional, and may be distributed. The requests may even be legitimate. For instance, one may legitimately use your mail system to send out a million emails. Preventing DoS requires identifying all potential scenarios and ensuring none of the services and devices in your infrastructure permit any user or system to send more than a warranted number of requests. Network-based DDoS attacks must be mitigated by using special DDoS mitigation equipment that cleans the traffic</li>
<li>Security breaches: Someone accessing your system with the intention of damaging it by exploiting a vulnerability in the network, application, OS etc. to gain access and disrupt your services. One needs to employ server hardening, firewalls, strict security processes, access policies, intrusion detection systems, OWASP guidelines, application security practices and much more to ensure tight security of one&#8217;s services.</li>
<li>Manual booboos: Many a downtime has been the result of an unsuspecting sysadmin running &#8220;rm -fr&#8221; or a fatigued developer running a &#8220;delete from table&#8221; without a where clause. One can prevent these by defining structured processes and policies.</li>
</ul>
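<p>One common way to ensure that no single user or system sends more than a warranted number of requests is a per-client sliding-window rate limiter. The sketch below is one possible shape for it, not a prescription: the class name <em>PerClientRateLimiter</em> and its parameters are hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow.</p>

```python
from collections import defaultdict, deque


class PerClientRateLimiter:
    """Allows at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # client id -> timestamps of admitted requests

    def allow(self, client_id, now):
        """Return True and record the request if the client is within its quota."""
        recent = self._history[client_id]
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

<p>Because each client has its own history, one client exhausting its quota does not affect anyone else. This only addresses application-level request floods; network-level DDoS traffic still has to be filtered upstream.</p>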
<h2>Bugs</h2>
<p>Another frequent cause of downtime or service unavailability is bugs in the software. Heed the following tips to ensure zero defects in a live scenario -</p>
<ul>
<li>Adequate automated and manual unit and functional testing of the software</li>
<li>Dog-fooding and staggered releases, wherein new versions are always released to limited internal and external audiences before being released to the entire user base</li>
</ul>
<h2>Scalability</h2>
<p>Careful capacity planning does not prevent getting tech-crunched, slash-dotted or dugg. Your application design must support infinite scalability. This again requires careful planning with respect to application design and hardware selection. Vertical and Horizontal partitioning, clustering, stateless configurations and more help in creating a design that scales linearly by adding additional nodes without requiring any downtime. Always think of millions of users.</p>
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/achieving-100-uptime-through-the-crabs-model/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Methods of opening a bi-directional socket from a Browser</title>
		<link>http://bhavin.directi.com/opening-a-bi-directional-socket-from-a-browser/</link>
		<comments>http://bhavin.directi.com/opening-a-bi-directional-socket-from-a-browser/#comments</comments>
		<pubDate>Fri, 04 Jun 2010 22:05:15 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[0-cosmos]]></category>
		<category><![CDATA[TechTalk]]></category>
		<category><![CDATA[bosh]]></category>
		<category><![CDATA[browser]]></category>
		<category><![CDATA[comet]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[socket]]></category>
		<category><![CDATA[websockets]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=397</guid>
		<description><![CDATA[It is no surprise that 6 of the top 10 desktop applications by usage time are browsers (source: Wakoopa). We all have our gripes with a browser as an application container &#8211; sandboxing, cross browser compatibility issues,  no access to native APIs. The developments over the last few years however have been very promising &#8211; [...]]]></description>
			<content:encoded><![CDATA[<p>It is no surprise that 6 of the top 10 desktop applications by usage time are browsers (source: <a href="http://wakoopa.com">Wakoopa</a>). We all have our gripes with a browser as an application container &#8211; sandboxing, cross-browser compatibility issues, no access to native APIs. The developments over the last few years however have been very promising &#8211; Ajax, Flex, HTML5, Web Sockets, Web Hooks, Google Gears &#8211; and with all that&#8217;s afoot, a browser application nowadays provides a near-native experience.</p>
<p>One of my many personal peeves has been the lack of raw socket connection capabilities and bi-directional communication from a browser. This too has changed considerably over the years. This article lists various bi-directional communication methods that one can use from a browser -</p>
<ul>
<li><em>Comet:</em> Comet is more a collection of techniques that provide bi-directional communication between a browser and a server. It is a superset of Long-polling, BOSH, and other such techniques</li>
<li><em>Long-polling:</em> The client sends an HTTP request and the server, instead of responding immediately, holds the connection open until it has data to send. Once the response arrives, the client immediately issues a new request, so the server almost always has an open request over which it can deliver data, thus emulating push. In the closely related HTTP streaming variant, the response is never deemed to have completed, and the server keeps pushing data to the client over the same connection</li>
<li><em>BOSH:</em> A BOSH library uses up to 2 connections to a server &#8211; one connection for the client to send data to the server, and another for the server to send data to the client. The client opens a first connection and sends a request to the server. The server does not respond immediately; it holds this request and uses it to send a response whenever it has data ready. If the client meanwhile needs to send data to the server, it does so through a request on a second connection. The moment the server receives this second request, it sends a response on the first connection, thus reversing the roles of the two connections</li>
<li><em>Flash:</em> One can use Flash to establish a socket connection to a server. This is a far more efficient method for bi-directional communication. However it has certain limitations. Firstly, Flash supports only two types of socket connections &#8211; XMLSocket and a raw TCP socket. So no UDP. Secondly, from a Flash widget, one can only make a socket connection to the domain from where the page was loaded. No cross-domain calls are permitted unless an explicit cross-domain policy file is provided by the server you are making a connection to. Therefore one cannot load a Flash widget from server1 and make a socket connection to server2 unless server2 serves such a policy file. For instance, if one were to write a Flash-based MSN client, the client would not be able to directly connect to the MSN servers. One solution would be to proxy the connection through a TCP proxy installed on your server. However this would mean that you would need server infrastructure to relay the connection. A nice article describing how to achieve this is <a href="http://coderslike.us/2009/01/23/flash-socket-code-and-crossdomain-policy-serving/">available here</a>.</li>
<li><em>Web Sockets:</em> Plagiarising from <a href="http://en.wikipedia.org/wiki/Web_Sockets">Wikipedia</a> &#8211; &#8220;WebSockets is a technology providing for bi-directional, full-duplex communications channels, over a TCP socket and is being standardized by the W3C and IETF&#8221;. Websockets is still limited in the sense that it is not a protocol-independent binary socket connection. A reference implementation (client and server) is available at <a href="http://jwebsocket.org/">http://jwebsocket.org/</a></li>
<li><em>Java Applets:</em> Java applets are far more powerful than Flash when it comes to raw socket capability. You have a choice of protocols (UDP/TCP) and most of the Java stack at your disposal. Java applets too have a sandboxing restriction, though it is easier to work around than Flash&#8217;s. An unsigned applet can only make a socket connection to the server from where the page was loaded, whereas a signed applet can set up a socket connection to any server without having to use a proxy server. To my mind this would be the ideal method if only I had the slightest confidence in the applet working as advertised. In the last few years I have yet to see a single Java applet run without error in my browser. I have never bothered to troubleshoot it, but it does not give me confidence <img src='http://bhavin.directi.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </li>
<li><em>Browser plugin:</em> One can write a browser plugin to perform the socket communication. The plugin would expose a function in JavaScript enabling a web page to use it to make socket connections. There is a certain degree of friction for the user, who would need to download and install the browser plugin. Browser plugins are also difficult to maintain given that there are several browsers, each running on various platforms. Cross-browser, cross-OS compatibility can soon become a nightmare</li>
<li><em>External application:</em> I now come to the most elegant method of achieving powerful bi-directional access between a browser and any server with complete native capabilities. In fact this method is the raison d&#8217;etre for this article. While most of the above methods would work in most scenarios, they still lack the power of a native desktop application (barring the browser plugin). Most of the methods above are sandboxed, inefficient, require server proxies, and cannot access underlying native OS functionality. This brings me to a far simpler yet superior method &#8211; writing a native application that runs on the user&#8217;s machine and exposes a web server (or some socket server) with which the app in the browser can communicate using &#8230; you guessed it &#8230; any of the above methods (Flash/BOSH/Comet/HTTP). Apparently Google&#8217;s video chat plugin works in this manner. All the cool P2P, UDP, ICE, NAT traversal magic is written as an external application that the user downloads. The data is then streamed from this out-of-process app into the browser and can be played using the Flash player. This method in fact reminds me very much of how <a href="http://rhomobile.com">Rhomobile</a> works on the mobile phone. As a part of my research I also came across numerous other applications that use this technique. Another interesting project worth mentioning is <a href="http://www.littleshoot.org/">LittleShoot</a> by <a href="http://twitter.com/adamfisk">Adam Fisk</a>. LittleShoot is an open-source implementation of P2P in the browser. It works by downloading an application that runs on your machine as a service, and then when you visit the LittleShoot website the webpage detects that you have the app installed and can use the app (which is a mini web server with complete OS access) to do pretty much anything.</li>
<li><em>Other methods:</em> I haven&#8217;t researched ActiveX and Silverlight, but I would assume that ActiveX behaves much like Java with respect to socket connections and Silverlight possibly behaves much like Flash <img src='http://bhavin.directi.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> . I also came across this other cool tool &#8211; <a href="http://orbited.org/">Orbited</a> &#8211; which essentially provides a server proxy and a client JavaScript library to simulate a TCP socket implementation within JavaScript. It is essentially a combination of the techniques I have described in the first few methods &#8211; but pre-packaged for you.</li>
</ul>
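<p>The request-holding idea behind long-polling can be sketched without any HTTP machinery. In the Python sketch below, <em>LongPollChannel</em>, <em>client_loop</em> and <em>demo</em> are hypothetical names chosen for illustration: <em>handle_poll</em> stands in for the server-side handler of a single held request, and the client simply re-issues the poll as soon as the previous one returns. A real deployment would put this behind an HTTP server and a browser-side XMLHttpRequest loop.</p>

```python
import queue
import threading


class LongPollChannel:
    """Server-side mailbox for one client (hypothetical helper for illustration)."""

    def __init__(self):
        self._pending = queue.Queue()

    def publish(self, message):
        """Called by the server when it has something to push to the client."""
        self._pending.put(message)

    def handle_poll(self, timeout_seconds):
        """Body of one long-poll request: block until data arrives or time runs out."""
        try:
            return [self._pending.get(timeout=timeout_seconds)]
        except queue.Empty:
            return []  # empty response; the client simply polls again


def client_loop(channel, expected_count):
    """Client side: re-issue the poll as soon as the previous one returns."""
    received = []
    while len(received) != expected_count:
        received.extend(channel.handle_poll(timeout_seconds=1.0))
    return received


def demo():
    channel = LongPollChannel()
    # Simulate the server producing two events shortly after the client starts polling.
    producer = threading.Timer(0.05, lambda: [channel.publish(m) for m in ("a", "b")])
    producer.start()
    received = client_loop(channel, expected_count=2)
    producer.join()
    return received
```

<p>The timeout matters in practice: intermediaries tend to drop idle connections, so the held request returns an empty response after a while and the client reconnects.</p>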
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/opening-a-bi-directional-socket-from-a-browser/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>To Trie or not to Trie &#8211; a comparison of efficient data structures</title>
		<link>http://bhavin.directi.com/to-trie-or-not-to-trie-a-comparison-of-efficient-data-structures/</link>
		<comments>http://bhavin.directi.com/to-trie-or-not-to-trie-a-comparison-of-efficient-data-structures/#comments</comments>
		<pubDate>Tue, 18 May 2010 03:31:32 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[0-cosmos]]></category>
		<category><![CDATA[TechTalk]]></category>
		<category><![CDATA[datastructure]]></category>
		<category><![CDATA[hash]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[tree]]></category>
		<category><![CDATA[trie]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=393</guid>
		<description><![CDATA[Since my discussion thread on the efficiency of the in-memory data structure of ZeroMQ with Martin Sustrik, I have been reading up bit by bit on efficient data structures, primarily from the perspective of memory utilization. Data structures that provide constant lookup time with minimal memory utilization can give a significant performance boost since [...]]]></description>
			<content:encoded><![CDATA[<p>Since my <a href="http://www.mail-archive.com/zeromq-dev@lists.zeromq.org/msg01133.html">discussion thread</a> on the efficiency of the in-memory data structure of <a href="http://zeromq.org">ZeroMQ</a> with Martin Sustrik, I have been reading up bit by bit on efficient data structures, primarily from the perspective of memory utilization. Data structures that provide constant lookup time with minimal memory utilization can give a significant performance boost since access to CPU cache is considerably faster than access to RAM. This post is a compendium of a few data structures I came across and their salient aspects.</p>
<p><strong>Judy arrays</strong> <a href="http://judy.sourceforge.net/doc/10minutes.htm">http://judy.sourceforge.net/doc/10minutes.htm<br />
</a>Excerpt: A Judy tree is generally faster than and uses less memory than contemporary forms of trees such as binary (AVL) trees, b-trees, and skip-lists. When used in the &#8220;Judy Scalable Hashing&#8221; configuration, Judy is generally faster then a hashing method at all populations. A (CPU) <em>cache-line fill</em> is additional time required to do a read reference from RAM when a word is not found in cache. In today&#8217;s computers the time for a cache-line fill is in the range of 50..2000 machine instructions. Therefore a cache-line fill should be avoided when fewer than 50 instructions can do the same job. Judy rarely compromises speed/space performance for simplicity (Judy will never be called simple except at the API). Judy is designed to avoid cache-line fills wherever possible. The Achilles heel of a simple digital tree is very poor memory utilization, especially when the N in N-ary (the degree or fanout of each branch) increases. The Judy tree design was able to solve this problem. In fact a Judy tree is more memory-efficient than almost any other competitive structure (including a simple linked list).</p>
<p><strong>HAT-trie &#8211; a cache-conscious trie</strong> <a href="http://portal.acm.org/citation.cfm?id=1273761">http://portal.acm.org/citation.cfm?id=1273761<br />
</a>Excerpt: Tries are the fastest tree-based data structures for managing strings in-memory, but are space-intensive. The burst-trie is almost as fast but reduces space by collapsing trie-chains into buckets. This is not however, a cache-conscious approach and can lead to poor performance on current processors. In this paper, we introduce the HAT-trie, a cache-conscious trie-based data structure that is formed by carefully combining existing components. We evaluate performance using several real-world datasets and against other high-performance data structures. We show strong improvements in both time and space; in most cases approaching that of the cache-conscious hash table. Our HAT-trie is shown to be the most efficient trie-based data structure for managing variable-length strings in-memory while maintaining sort order.</p>
<p><strong>Burst Trie</strong> <a href="http://goanna.cs.rmit.edu.au/~jz/fulltext/acmtois02.pdf">http://goanna.cs.rmit.edu.au/~jz/fulltext/acmtois02.pdf<br />
</a>Excerpt: Many applications depend on efficient management of large sets of distinct strings in memory. We propose a new data structure, the burst trie, that has significant advantages over existing options for such applications: it requires no more memory than a binary tree; it is as fast as a trie; and, while not as fast as a hash table, a burst trie maintains the strings in sorted or near-sorted order. These experiments show that the burst trie is particularly effective for the skewed frequency distributions common in text collections, and dramatically outperforms all other data structures for the task of managing strings while maintaining sort order.</p>
<p><strong>Radix trie (aka Patricia trie)</strong> <a href="http://en.wikipedia.org/wiki/Radix_tree">http://en.wikipedia.org/wiki/Radix_tree<br />
</a>Excerpt: The radix tree is easiest to understand as a space-optimized trie where each node with only one child is merged with its child. Unlike balanced trees, radix trees permit lookup, insertion, and deletion in O(k) time rather than O(log n)</p>
<p><strong>Ternary Search Trees</strong> <a href="http://en.wikipedia.org/wiki/Ternary_search_tree">http://en.wikipedia.org/wiki/Ternary_search_tree<br />
</a>Excerpt: A trie is optimized for speed at the expense of size. The ternary search tree replaces each node of the trie with a modified binary search tree. For sparse tries, this binary tree will be smaller than a trie node. Each binary tree implements a single-character lookup. It has the typical left and right children which are checked if the lookup character is less or greater than the node&#8217;s character, respectively. A third child is used if the lookup character is found on that particular node. Unlike the other children, it links to the root of the binary search tree for the next character in the string.</p>
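<p>The three-child layout described in the excerpt is easier to see in code. Below is a minimal Python sketch of a ternary search tree supporting insertion and exact-match lookup. The class and method names are my own for illustration, and the sketch ignores balancing, deletion and prefix queries.</p>

```python
class Node:
    def __init__(self, char):
        self.char = char
        self.left = None     # subtree for characters smaller than self.char
        self.middle = None   # subtree for the next character of the key
        self.right = None    # subtree for characters larger than self.char
        self.is_end = False  # True if a stored key ends at this node


class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        if not key:
            raise ValueError("empty keys are not supported in this sketch")
        self.root = self._insert(self.root, key, 0)

    def _insert(self, node, key, index):
        char = key[index]
        if node is None:
            node = Node(char)
        if char < node.char:
            node.left = self._insert(node.left, key, index)
        elif char > node.char:
            node.right = self._insert(node.right, key, index)
        elif index + 1 < len(key):
            # Character matched: descend to the tree for the next character.
            node.middle = self._insert(node.middle, key, index + 1)
        else:
            node.is_end = True
        return node

    def contains(self, key):
        node, index = self.root, 0
        while node is not None and key:
            char = key[index]
            if char < node.char:
                node = node.left
            elif char > node.char:
                node = node.right
            elif index + 1 < len(key):
                node, index = node.middle, index + 1
            else:
                return node.is_end
        return False
```

<p>Each node holds a single character and three pointers, which is where the space saving over a plain trie node (one pointer per possible character) comes from on sparse data.</p>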
<p>Next steps: to trie <img src='http://bhavin.directi.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> and set up benchmarks for some of these on a practical application.</p>
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/to-trie-or-not-to-trie-a-comparison-of-efficient-data-structures/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Using JavaScript to read a user&#8217;s browser history</title>
		<link>http://bhavin.directi.com/using-javascript-to-read-a-users-browser-history/</link>
		<comments>http://bhavin.directi.com/using-javascript-to-read-a-users-browser-history/#comments</comments>
		<pubDate>Sun, 09 May 2010 21:25:01 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[0-cosmos]]></category>
		<category><![CDATA[TechTalk]]></category>
		<category><![CDATA[browser history]]></category>
		<category><![CDATA[javascript]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=390</guid>
		<description><![CDATA[While doing some research I came across this article by Mike Nolet on figuring out the gender of a user based on the websites the user has visited. The article includes a script that does this &#8211; so yes, I am adding &#8220;Allows you to check your testosterone levels&#8221; as a feature of JavaScript.
But on [...]]]></description>
			<content:encoded><![CDATA[<p>While doing some research I came across <a href="http://www.mikeonads.com/2008/07/13/using-your-browser-url-history-estimate-gender/">this article</a> by Mike Nolet on figuring out the gender of a user based on the websites the user has visited. The article includes a script that does this &#8211; so yes, I am adding &#8220;Allows you to check your testosterone levels&#8221; as a feature of JavaScript.</p>
<p>But on a more serious note &#8211; I was impressed (and puzzled) mostly by the fact that his script managed to figure out which websites exist in my browser history. That made me curious. A few clicks and a Google search later, I learned that your browser history is NOT private. There is a nifty JavaScript hack that allows any website to figure out which other websites you have visited in the past, from a list of candidate websites.</p>
<p>I just had to blog about this. The hack relies on the fact that the browser renders an already visited link in a different color. Through JavaScript one can find out the color of any item in the DOM. So in order to find out whether you have visited a particular website, all I need to do is insert that website in the DOM as a link (albeit in an invisible manner) and check its color property. If its color matches that of a &#8220;visited link&#8221; then you have visited that website. Apparently Dell already uses this on their website to determine if a user has visited any of its competitors. Think of the potential uses -</p>
<ul>
<li>You can check if a user coming to your website has already visited any of your competitors, and if so target specific offers to them</li>
<li>If you rank at the 5th position in Google for a keyword you can check if the user has visited any of the previous 4 links</li>
<li>Let&#8217;s say you have an offer coupon that you only want an anonymous user to see once. You may use cookies, but a user could delete their cookies if they are on to you. With this hack you can check whether the user has been to that URL before, provided the user has not deleted their history</li>
</ul>
<p>Espionage, courtesy of JavaScript!</p>
<p>More details available here</p>
<ul>
<li><a href="http://www.merchantos.com/makebeta/tools/spyjax/">http://www.merchantos.com/makebeta/tools/spyjax/</a></li>
<li><a href="http://www.stevenyork.com/tutorial/getting_browser_history_using_javascript">http://www.stevenyork.com/tutorial/getting_browser_history_using_javascript</a></li>
<li><a href="http://www.mikeonads.com/2008/07/13/using-your-browser-url-history-estimate-gender/">http://www.mikeonads.com/2008/07/13/using-your-browser-url-history-estimate-gender/</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/using-javascript-to-read-a-users-browser-history/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>RabbitMQ vs Apache ActiveMQ vs Apache qpid</title>
		<link>http://bhavin.directi.com/rabbitmq-vs-apache-activemq-vs-apache-qpid/</link>
		<comments>http://bhavin.directi.com/rabbitmq-vs-apache-activemq-vs-apache-qpid/#comments</comments>
		<pubDate>Fri, 07 May 2010 04:46:04 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[0-cosmos]]></category>
		<category><![CDATA[TechTalk]]></category>
		<category><![CDATA[queue]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=386</guid>
		<description><![CDATA[We need a simple message queue to ensure asynchronous message passing across a bunch of our server side apps. The message volume is not intended to be very high, latency is not an issue, and order is not important, but we do need to guarantee that the message will be received and that there is [...]]]></description>
			<content:encoded><![CDATA[<p>We need a simple message queue for asynchronous message passing across a bunch of our server-side apps. The message volume is not expected to be very high, latency is not an issue, and order is not important, but we do need a guarantee that every message will be received and that no message is lost irrespective of infrastructure downtime.</p>
<p><a href="http://dhruvbird.blogspot.com">Dhruv</a> from my team had taken up the task of researching various persistent message queue options and compiling notes on them. This is a compendium of his notes (disclaimer &#8211; this is an outline of our experience, there may be inaccuracies) -</p>
<h2>RabbitMQ</h2>
<p><strong>General:</strong></p>
<ul>
<li>Some reading on clustering <a href="http://www.rabbitmq.com/clustering.html">http://www.rabbitmq.com/clustering.html</a></li>
<li>DNS errors cause the DB (Mnesia) to crash</li>
<li>A RabbitMQ instance won&#8217;t scale to LOTS of queues with each queue carrying a fair load, since the metadata for all queues is held in memory. In a clustered setup, each queue&#8217;s metadata (but not the queue&#8217;s messages) is also replicated on every node, so every node in the cluster carries the same per-queue overhead</li>
<li>No ONCE-ONLY semantics. Messages may be sent twice by RabbitMQ to the consumer(s)</li>
<li>Multiple consumers can be configured for a single queue, and they will all get mutually exclusive messages</li>
<li>Unordered; not FIFO delivery</li>
<li>Single socket, multiple channels. Each socket (connection) can have multiple channels and each channel can have multiple consumers</li>
<li>No provision for ETA</li>
<li>Maybe auto-requeue (based on timeout); needs investigation</li>
<li>Only closing the connection NACKs a message. Removing the consumer from that channel does NOT. Hence, all queues being listened to on that channel/connection are closed for the current consumer</li>
<li>NO EXPONENTIAL BACKOFF for failed consumers. Failed messages are retried almost immediately. Hence an error in the consumer logic that crashes the consumer while consuming a particular message may potentially block the whole queue. So the consumer needs to be programmed well, i.e. error free. However, apps are like; well apps&#8230;</li>
<li>Consumer has to do rate limiting by not consuming messages too fast (if it wants to); no provision for this in RabbitMQ</li>
</ul>
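<p>Two of the points above (possible duplicate delivery, and immediate retries with no backoff) push work onto the consumer. The following is a minimal Python sketch of how a consumer might cope with both. The names <code>IdempotentConsumer</code> and <code>backoff_delay</code>, and the idea that every message carries a producer-assigned <code>message_id</code>, are assumptions made for illustration; none of this is part of RabbitMQ or of any client library.</p>

```python
import random


class IdempotentConsumer:
    """Wraps a handler so that a redelivered message is processed only once.

    Assumption: every message carries a unique message_id set by the producer.
    """

    def __init__(self, handler):
        self._handler = handler
        # Illustration only: a real consumer would keep this in a durable store.
        self._seen = set()

    def consume(self, message_id, body):
        """Return True if the handler ran, False if the message was a duplicate."""
        if message_id in self._seen:
            return False
        self._handler(body)
        # Record the id only after the handler succeeds, so that a failure
        # leaves the message eligible for a retry.
        self._seen.add(message_id)
        return True


def backoff_delay(attempt, base=0.5, cap=60.0, jitter=0.0):
    """Seconds to wait before retry number `attempt` (0-based), capped at `cap`."""
    delay = min(cap, base * (2 ** attempt))
    # Optional random jitter spreads out retries from many consumers.
    return delay + random.uniform(0, jitter)
```

<p>A consumer loop could call <code>consume()</code> for each delivery and, when the handler raises, sleep for <code>backoff_delay(attempt)</code> seconds before asking for the message again, so that one poisonous message does not spin the queue in a tight retry loop.</p>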
<p><strong>Persistence:</strong></p>
<ul>
<li>It will use only its own DB; you can&#8217;t configure MySQL or any such thing</li>
</ul>
<p><strong>Clustering and Replication:</strong></p>
<ul>
<li>A RabbitMQ cluster is just a set of nodes running RabbitMQ. No master node is involved.</li>
<li>You need to specify hostname of cluster nodes in a cluster manually on the command line or in a config file.</li>
<li>Basic load balancing by nodes in a cluster by redirecting requests to other nodes</li>
<li>A node can be a RAM node or a disk node. RAM nodes keep their state only in memory (with the exception of the persistent contents of durable queues, which are still stored safely on disk). Disk nodes keep state in memory and on disk.</li>
<li>Queue metadata shared across all nodes.</li>
<li>RabbitMQ brokers tolerate the failure of individual nodes. Nodes can be started and stopped at will</li>
<li>It is advisable to have at least 1 disk node in a cluster of nodes</li>
<li>You need to specify which nodes are part of a cluster during node startup. Hence, when A is the first one to start, it will think that it is the only one in the cluster. When B is started it will be told that A is also in the cluster and when C starts, it should be told that BOTH A and B are part of the cluster. This is because if A or B go down, C still knows one of the machines in the cluster. This is only required for RAM nodes, since they don&#8217;t persist metadata on disk. So, if C is a memory node and it goes down and comes up, it will have to be manually told which nodes to query for cluster membership (since it itself doesn&#8217;t store that state locally).</li>
<li>Replication needs to be investigated (check additional resources). However, from initial reading, it seems queue data replication does not exist</li>
<li>FAQ: &#8220;How do you migrate an instance of RabbitMQ to another machine?&#8221;. Seems to be a very manual process.</li>
</ul>
<p><strong>Transactions:</strong></p>
<ul>
<li>Any number of queues can be involved in a transaction</li>
</ul>
<p><strong>Additional Resources</strong></p>
<ul>
<li><a href="http://somic.org/2008/11/11/using-rabbitmq-beyond-queueing/">http://somic.org/2008/11/11/using-rabbitmq-beyond-queueing/</a></li>
<li><a href="http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-August/004598.html">http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-August/004598.html</a></li>
<li><a href="http://old.nabble.com/Durable-queues---bindings-td22959443.html">http://old.nabble.com/Durable-queues---bindings-td22959443.html</a></li>
<li><a href="http://groups.google.com/group/rabbitmq-discuss/msg/e55ae8c821405044">http://groups.google.com/group/rabbitmq-discuss/msg/e55ae8c821405044</a></li>
<li>RabbitMQ benchmarks (inconclusive): <a href="http://www.sheysrebellion.net/blog/2009/06/">http://www.sheysrebellion.net/blog/2009/06/</a></li>
<li>Some more RabbitMQ benchmarks: <a href="http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-October/005189.html">http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-October/005189.html</a></li>
<li>If you are still thirsty: <a href="http://www.rabbitmq.com/faq.html">http://www.rabbitmq.com/faq.html</a></li>
</ul>
<h2>Apache qpid</h2>
<ul>
<li>Supports transactions</li>
<li>Persistence using a pluggable layer &#8212; I believe the default is Apache Derby</li>
<li>Like the other Java-based product, this is HIGHLY configurable</li>
<li>Management using JMX and an Eclipse Management Console application - <a href="http://www.lahiru.org/2008/08/what-qpid-management-console-can-do.html">http://www.lahiru.org/2008/08/what-qpid-management-console-can-do.html</a></li>
<li>The management console is very feature rich</li>
<li>Supports message Priorities</li>
<li>Automatic client failover using configurable connection properties -
<ul>
<li><a href="http://qpid.apache.org/cluster-design-note.html">http://qpid.apache.org/cluster-design-note.html</a></li>
<li><a href="http://qpid.apache.org/starting-a-cluster.html">http://qpid.apache.org/starting-a-cluster.html</a></li>
<li><a href="http://qpid.apache.org/cluster-failover-modes.html">http://qpid.apache.org/cluster-failover-modes.html</a></li>
</ul>
</li>
<li>A cluster is nothing but a set of machines that have all the queues replicated</li>
<li>All queue data and metadata is replicated across all nodes that make up a cluster</li>
<li>All clients need to know in advance which nodes make up the cluster</li>
<li>Retry logic lies in the client code</li>
<li>Durable Queues/Subscriptions</li>
<li>Has bindings in many languages</li>
<li>For the curious: <a href="http://qpid.apache.org/current-architecture.html">http://qpid.apache.org/current-architecture.html</a></li>
<li>In our tests -
<ul>
<li>Speed: Non-persistent mode: 5000 messages/sec (receive rate), Persistent mode: 1100 messages/sec (receive rate). The send rate is typically a bit higher, but when you start off with an empty queue the two are almost the same for most queue implementations. However, the interesting bit is that even in transacted mode, I saw a lot of message loss if I crashed the broker (by crash I mean Ctrl+C, not even the harsher kill -9 type of thing that I usually do). I stress this because apps can usually hook on to Ctrl+C and save data before quitting, but qpid didn&#8217;t think it prudent to do so. Out of 1265 messages sent (and committed), only 1218 were received by the consumer (before the inflicted crash). Even on restarting the broker and consumer, that didn&#8217;t change. We observed similar behaviour with RabbitMQ in our tests. However, the RabbitMQ docs mention that you need to run in TRANSACTED mode (not just durable/persistent) for guaranteed delivery. We haven&#8217;t run that test yet.</li>
</ul>
</li>
</ul>
<h2>Apache ActiveMQ</h2>
<ul>
<li>HIGHLY configurable. You can probably do anything you want with it</li>
<li>You can choose a message store. 4 are already available</li>
<li>Has lots of clustering options:
<ul>
<li>Shared nothing Master-Slave: ACK sent to client when master stores the message</li>
<li>Shared Database: Acquires a lock on the DB when any instance tries to access the DB</li>
<li>Shared Filesystem: Locks a file when accessing the FS. Issues when using NFS with file-locking; or basically any network based file system since file locking is generally buggy in network file systems</li>
</ul>
</li>
<li>Network of brokers: This is an option that allows a lot of flexibility. However, it seems to be a very problematic/buggy way of doing things since people face a lot of issues with this configuration</li>
<li>Scaling:
<ul>
<li>Default transport is blocking I/O with a thread per connection. Can be changed to use nio</li>
<li>Horizontal scaling: Though they mention this, the way to achieve this is by using a network of brokers</li>
<li>Partitioning: We all know Mr. Partitioning, don&#8217;t we? The client decides where to route packets and hence must maintain multiple open connections to different brokers</li>
</ul>
</li>
<li>Allows producer flow-control!!</li>
<li>Has issues wrt lost/duplicate messages, but there is an active community that fixes these issues</li>
<li>Active MQ crashes fairly frequently, at least once per month, and is rather slow - <a href="http://stackoverflow.com/questions/957507/lightweight-persistent-message-queue-for-linux">http://stackoverflow.com/questions/957507/lightweight-persistent-message-queue-for-linux</a></li>
<li>Seems to have bindings in many languages (just like RabbitMQ)</li>
<li>Has lots of tools built around it</li>
<li>JMS compliant; supports XA transactions: <a href="http://activemq.apache.org/how-do-transactions-work.html">http://activemq.apache.org/how-do-transactions-work.html</a></li>
<li>Less performant as compared to RabbitMQ</li>
<li>We were able to perform some tests on Apache Active MQ today, and here are the results:
<ul>
<li>Non persistent mode: 5k messages/sec</li>
<li>Persistent mode: 22 messages/sec (yes that is correct)</li>
</ul>
</li>
<li>There are multiple persisters that can be configured with ActiveMQ, so we are planning to run another set of tests with MySQL and file as the persisters. However, the current default (KahaDB) is said to be more scalable (and offers faster recoverability) as compared to the older default (file/AMQ Message Store: <a href="http://activemq.apache.org/amq-message-store.html">http://activemq.apache.org/amq-message-store.html</a>).</li>
<li>The numbers are fair. Others on the net have observed similar results: <a href="http://www.mostly-useless.com/blog/2007/12/27/playing-with-activemq/">http://www.mostly-useless.com/blog/2007/12/27/playing-with-activemq/</a></li>
<li>With MySQL, I get a throughput of 8 messages/sec. What is surprising is that it is possible to achieve much better results using MySQL but ActiveMQ uses the table quite unwisely.</li>
<li>ActiveMQ created the tables as InnoDB instead of MyISAM even though it doesn&#8217;t seem to be using any of the InnoDB features.</li>
<li>I tried changing the tables to MyISAM, but it didn&#8217;t help much. The messages table structure has 4 indexes!! Insert takes a lot of time because MySQL needs to update 4 indexes on every insert. That sort of kills performance. However, I don&#8217;t know if performance should be affected when there are only a few (&lt; 1000) messages in the table. Either way, this structure won&#8217;t scale to millions of messages since everyone will block on this one table.</li>
</ul>
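<p>The partitioning option mentioned under Scaling leaves routing to the client. A minimal Python sketch of one way a client might pick a broker is shown below. It assumes the client already holds a fixed list of broker addresses and partitions on some routing key; <code>pick_broker</code> and the example addresses are hypothetical and are not part of ActiveMQ.</p>

```python
import hashlib


def pick_broker(routing_key, brokers):
    """Map a routing key to one broker, stably across processes and restarts."""
    if not brokers:
        raise ValueError("at least one broker is required")
    # A cryptographic digest is used only because it is stable everywhere;
    # Python's built-in hash() is randomised per process for strings.
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


# Hypothetical broker addresses, for illustration only.
BROKERS = ["tcp://broker-a:61616", "tcp://broker-b:61616", "tcp://broker-c:61616"]
chosen = pick_broker("customer-42", BROKERS)
```

<p>Every client that uses the same list and the same key reaches the same broker, which is what keeps messages for one key on one queue. The cost is the one noted above: each client keeps a connection open to every broker it may route to, and changing the broker list remaps most keys.</p>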
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/rabbitmq-vs-apache-activemq-vs-apache-qpid/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>A mini compendium for mobile website development</title>
		<link>http://bhavin.directi.com/a-mini-compendium-for-mobile-website-development/</link>
		<comments>http://bhavin.directi.com/a-mini-compendium-for-mobile-website-development/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 07:09:31 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[0-cosmos]]></category>
		<category><![CDATA[TechTalk]]></category>
		<category><![CDATA[mobile]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=379</guid>
		<description><![CDATA[At Directi, we have been toying with some ideas around making some of our web apps mobile friendly. I spent sometime reading and reviewing various online guides on mobile website development. Here are a few of the good resources I found -


http://mobiforge.com/designing/story/effective-design-multiple-screen-sizes &#8211; Designing a mobile website for multiple screen sizes
http://mobiforge.com/designing/story/mobile-web-design-getting-point-part-i - This article investigates salient [...]]]></description>
			<content:encoded><![CDATA[<p>At <a href="http://directi.com">Directi</a>, we have been toying with some ideas around making some of our web apps mobile friendly. I spent some time reading and reviewing various online guides on mobile website development. Here are a few of the good resources I found -</p>
<div id="_mcePaste">
<ul>
<li><a href="http://mobiforge.com/designing/story/effective-design-multiple-screen-sizes">http://mobiforge.com/designing/story/effective-design-multiple-screen-sizes</a> &#8211; Designing a mobile website for multiple screen sizes</li>
<li><a href="http://mobiforge.com/designing/story/mobile-web-design-getting-point-part-i">http://mobiforge.com/designing/story/mobile-web-design-getting-point-part-i</a> &#8211; This article investigates salient aspects of Google, Facebook and Twitter&#8217;s mobile websites</li>
<li><a href="http://mobiforge.com/designing/story/mobile-web-design-getting-point-part-ii">http://mobiforge.com/designing/story/mobile-web-design-getting-point-part-ii</a> &#8211; This article applies principles from part i towards building an online store</li>
<li><a href="http://mobithinking.com/best-practices/a-three-step-guide-usability-mobile-web">http://mobithinking.com/best-practices/a-three-step-guide-usability-mobile-web</a> &#8211; A Three Step Guide to Usability on the Mobile Web</li>
<li><a href="http://mobithinking.com/">http://mobithinking.com/</a> &#8211; Nice articles on stats, marketing advice etc for mobile devices</li>
<li><a href="http://eng.designerbreak.com/2009/tutorial/create-a-mobile-site/">http://eng.designerbreak.com/2009/tutorial/create-a-mobile-site/</a> &#8211; A tutorial on creating a mobile website</li>
<li><a href="http://www.w3.org/TR/mobile-bp/">http://www.w3.org/TR/mobile-bp/</a> &#8211; W3C guide on Mobile Web Best Practices 1.0</li>
<li><a href="http://deviceatlas.com/">http://deviceatlas.com/</a> &#8211; the most comprehensive data source on handset detection and handset information &#8211; provides APIs and tools</li>
<li><a href="http://ready.mobi/">http://ready.mobi/</a> &#8211; The mobiReady testing tool evaluates mobile-readiness of a website using industry best practices &amp; standards. The free report provides both a score (from 1 to 5) and in-depth analysis of pages to determine how well your site performs on a mobile device</li>
<li><a href="http://www.scribd.com/doc/12641/Mobile-Web-Developers-Guide">A Mobile web developers guide</a></li>
<li><a href="http://www.amazon.com/Mobile-Design-Development-Practical-Techniques/dp/0596155441/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1271312818&amp;sr=1-1">O&#8217;Reilly book &#8211; Mobile Design and Development: Practical Concepts and Techniques for Creating Mobile Sites and Web Apps</a></li>
</ul>
</div>
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/a-mini-compendium-for-mobile-website-development/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>My mini OAuth resource compendium</title>
		<link>http://bhavin.directi.com/my-mini-oauth-resource-compendium/</link>
		<comments>http://bhavin.directi.com/my-mini-oauth-resource-compendium/#comments</comments>
		<pubDate>Wed, 14 Apr 2010 04:15:33 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[0-cosmos]]></category>
		<category><![CDATA[TechTalk]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[nonce]]></category>
		<category><![CDATA[oauth]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=377</guid>
		<description><![CDATA[We are beginning implementation of OAuth in one of our projects. I just finished reading up a ton of resources. In the end I only needed to readup a few. Here they are in the recommended order -

http://hueniverse.com/oauth/ &#8211; The best layman explanation of how OAuth works &#8211; strongly recommended resource. Read every section.
http://oauth.net/ &#8211; [...]]]></description>
			<content:encoded><![CDATA[<p>We are beginning implementation of OAuth in one of our projects. I just finished reading through a ton of resources. In the end I only needed to read a few. Here they are in the recommended order -</p>
<ul>
<li><a href="http://hueniverse.com/oauth/">http://hueniverse.com/oauth/</a> &#8211; The best layman explanation of how OAuth works &#8211; strongly recommended resource. Read every section.</li>
<li><a href="http://oauth.net/">http://oauth.net/</a> &#8211; The official OAuth site, contains the protocol specifications</li>
<li><a href="http://tools.ietf.org/html/draft-hammer-oauth-10">http://tools.ietf.org/html/draft-hammer-oauth-10</a> &#8211; The latest spec</li>
<li><a href="http://oauth.net/code/">http://oauth.net/code/</a> &#8211; Links to ready OAuth libraries in every language</li>
</ul>
<p>OAuth is a fairly simple protocol, especially if you are familiar with the basics of HTTP, nonce, basic encryption/digital signatures etc.</p>
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/my-mini-oauth-resource-compendium/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Selecting a Message Queue &#8211; AMQP or ZeroMQ</title>
		<link>http://bhavin.directi.com/selecting-a-message-queue-amqp-or-zeromq/</link>
		<comments>http://bhavin.directi.com/selecting-a-message-queue-amqp-or-zeromq/#comments</comments>
		<pubDate>Sun, 04 Apr 2010 16:55:18 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=367</guid>
		<description><![CDATA[I have spent a better part of my Sunday researching some of the messaging queue options available out there. My research was divided primarily between AMQP queues and ZeroMQ. For those who came in late - iMatix the company behind ZeroMQ was also involved in defining the AMQP specs. iMatix recently announced that they would no longer continue [...]]]></description>
			<content:encoded><![CDATA[<p>I have spent the better part of my Sunday researching some of the messaging queue options available out there. My research was divided primarily between <a href="http://amqp.org">AMQP</a> queues and <a href="http://zeromq.org">ZeroMQ</a>. For those who came in late - iMatix, the company behind ZeroMQ, was also involved in defining the AMQP specs. iMatix <a href="http://lists.openamq.org/pipermail/openamq-dev/2010-March/001598.html">recently announced</a> that they would no longer continue supporting the AMQP spec and instead focus on ZeroMQ &#8211; their brokerless queue product.</p>
<p>ZeroMQ is a fundamentally different approach to message queues from AMQP. Pieter and folks from iMatix have made several posts on the flaws in the design and philosophy of AMQP (check the resources section below). In fact, during the course of my study today these posts were amongst the most interesting material I read.</p>
<p>Here are some thoughts around considerations on our part with respect to selecting a queue for our application needs -</p>
<h2>Ordered Messages</h2>
<p>Some queues provide guarantees of ordered delivery, while others do not. Ordered delivery may not be required in all applications, and it adds complexity. However, as an example &#8211; if your messages consist of updates to state, then two messages updating the same entity may leave it in an inconsistent state if their order is switched. These complexities would then have to be resolved in your application if the underlying messaging system does not offer ordering guarantees.</p>
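<p>One common way for an application to resolve this on its own is to attach a version number to every update and ignore anything older than what it already holds. The Python sketch below illustrates the idea; <code>VersionedStore</code> and the assumption that producers assign an increasing version per entity are mine, and are not a feature of any of the queues discussed here.</p>

```python
class VersionedStore:
    """Applies state updates only if they are newer than what is already stored.

    Assumption: producers attach an increasing version number per entity.
    """

    def __init__(self):
        self._state = {}  # entity_id -> (version, value)

    def apply(self, entity_id, version, value):
        """Apply an update; return False if it is stale and was ignored."""
        current = self._state.get(entity_id)
        if current is not None and current[0] >= version:
            # An equal or newer update has already been applied.
            return False
        self._state[entity_id] = (version, value)
        return True

    def get(self, entity_id):
        """Return the latest value for the entity, or None if unknown."""
        entry = self._state.get(entity_id)
        return None if entry is None else entry[1]
```

<p>With this in place, receiving version 2 before version 1 leaves the entity at version 2, so the queue itself no longer needs to guarantee order for this kind of &#8220;last write wins&#8221; update. It does not help when every intermediate update matters, for example when messages are increments rather than full state.</p>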
<h2>Persistence</h2>
<p>If the queue itself does not manage persistence, it will have to be handled by your application. Persistence is required for reliability. A queue that manages all messages and state in memory will lose any undelivered messages in case of a node restart/downtime/crash.</p>
<h2>Clustering</h2>
<p>Broker based queues may or may not offer clustering. Some message queues offer fail-overs but not clustering. Clustering allows multiple nodes to be used as a single active-active instance, increasing availability.</p>
<p>I also have a short summary on the 2 options I am currently considering -</p>
<p><strong>ZeroMQ</strong></p>
<ul>
<li>Crazy fast</li>
<li>Brokerless architecture</li>
<li>In-process library</li>
<li>Lower latencies</li>
<li>Very simple to use</li>
<li>No persistence &#8211; requiring higher layers to manage persistence</li>
</ul>
<p><strong>RabbitMQ</strong></p>
<ul>
<li>AMQP compliant</li>
<li>Written in Erlang</li>
<li>Small footprint and seemingly fewer lines of code in comparison to other AMQP compliant queue managers</li>
</ul>
<h2>Resources</h2>
<p>Here is a list of the more interesting resources I read through today:</p>
<ul>
<li>What&#8217;s wrong with AMQP &#8211; <a href="http://www.imatix.com/articles:whats-wrong-with-amqp">http://www.imatix.com/articles:whats-wrong-with-amqp</a> - The most interesting article I read in my research. This is a fairly long article but I would strongly recommend anyone who is planning on using queues to read it end to end. Irrespective of whether we choose ZeroMQ or any AMQP compliant queue, this article provides some good insights.</li>
<li><a href="http://www.imatix.com/articles:introduction-to-restms">http://www.imatix.com/articles:introduction-to-restms</a> &#8211; A nice document by Pieter explaining the RestMS protocol. RestMS can act as a bridge between messaging systems and HTTP/REST based systems</li>
<li><a href="http://www.ipocracy.com/blog:10-principles-for-amqp">http://www.ipocracy.com/blog:10-principles-for-amqp</a> &#8211; Ten ways that AMQP can be made simpler, more backwards compatible, more interesting, and overall more enjoyable and successful for all who work on it and use it</li>
<li><a href="http://www.amqp.org/">http://www.amqp.org/</a></li>
<li><a href="http://www.zeromq.org/">http://www.zeromq.org/</a></li>
<li><a href="http://www.openamq.org/">http://www.openamq.org/</a></li>
<li><a href="http://www.restms.org/">http://www.restms.org/</a></li>
<li><a href="http://www.zeromq.org/area:docs-v20">http://www.zeromq.org/area:docs-v20</a></li>
<li><a href="http://api.zeromq.org/zmq.html">http://api.zeromq.org/zmq.html</a></li>
<li><a href="http://storage.synchost.com/eanderson/2010/2010-02-18%2010.02%20Low%20Latency_%20High%20Throughput_%20Durable_%20RESTful_%20Open_%20Standards_%20___.wmv">http://storage.synchost.com/eanderson/2010/2010-02-18%2010.02%20Low%20Latency_%20High%20Throughput_%20Durable_%20RESTful_%20Open_%20Standards_%20___.wmv</a></li>
<li><a href="http://www.zeromq.org/local--files/area:whitepapers/messaging-2010-02-17.pdf">http://www.zeromq.org/local--files/area:whitepapers/messaging-2010-02-17.pdf</a></li>
<li><a href="http://www.zeromq.org/area:whitepapers">http://www.zeromq.org/area:whitepapers</a></li>
<li><a href="http://www.zeromq.org/faq">http://www.zeromq.org/faq</a></li>
<li><a href="http://www.zeromq.org/local--files/area:studies/solvians.pdf">http://www.zeromq.org/local--files/area:studies/solvians.pdf</a></li>
<li><a href="http://www.zeromq.org/whitepapers:message-matching">http://www.zeromq.org/whitepapers:message-matching</a> &#8211; highly optimized message matching algorithm</li>
<li><a href="http://www.zeromq.org/results:ib-tests-v206">http://www.zeromq.org/results:ib-tests-v206</a> &#8211; 4.7 million messages per second for a 64 byte message</li>
<li><a href="http://www.imatix.com/articles:how-to-build-utterly-reliable-systems">http://www.imatix.com/articles:how-to-build-utterly-reliable-systems</a></li>
<li><a href="http://www.slideshare.net/mattmatt/rabbitmq-and-nanite">http://www.slideshare.net/mattmatt/rabbitmq-and-nanite</a></li>
<li><a href="http://lists.zeromq.org/pipermail/zeromq-dev/2008-December/000246.html">http://lists.zeromq.org/pipermail/zeromq-dev/2008-December/000246.html</a></li>
<li><a href="http://www.zeromq.org/docs:zsock">http://www.zeromq.org/docs:zsock</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/selecting-a-message-queue-amqp-or-zeromq/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
<enclosure url="http://storage.synchost.com/eanderson/2010/2010-02-18%2010.02%20Low%20Latency_%20High%20Throughput_%20Durable_%20RESTful_%20Open_%20Standards_%20___.wmv" length="40151863" type="video/x-ms-wmv" />
		</item>
		<item>
		<title>Crowd Sourcing &#8211; Harnessing the power of the people</title>
		<link>http://bhavin.directi.com/crowd-sourcing-harnessing-the-power-of-the-people/</link>
		<comments>http://bhavin.directi.com/crowd-sourcing-harnessing-the-power-of-the-people/#comments</comments>
		<pubDate>Sun, 14 Mar 2010 09:30:07 +0000</pubDate>
		<dc:creator>Bhavin Turakhia</dc:creator>
				<category><![CDATA[0-cosmos]]></category>
		<category><![CDATA[Random Musings]]></category>
		<category><![CDATA[TechTalk]]></category>
		<category><![CDATA[crowdsourcing]]></category>

		<guid isPermaLink="false">http://bhavin.directi.com/?p=361</guid>
		<description><![CDATA[Most of us have heard of the NetFlix million dollar competition (read here, here and here)  that lasted 3 years, attracted 51,000 contestants from 186 countries, all competing AND co-operating to build a better recommendation engine for NetFlix so that users of NetFlix can get more accurate movie suggestions. The winners &#8211; BellKor&#8217;s Pragmatic Chaos [...]]]></description>
			<content:encoded><![CDATA[<p>Most of us have heard of the NetFlix million dollar competition (read <a href="http://www.wired.com/epicenter/2009/06/1-million-netflix-prize-so-close-they-can-taste-it/">here</a>, <a href="http://www.wired.com/techbiz/media/magazine/16-03/mf_netflix">here</a> and <a href="http://www.usatoday.com/tech/news/2009-09-21-netflix-prize_N.htm">here</a>)  that lasted 3 years, attracted 51,000 contestants from 186 countries, all competing AND co-operating to build a better recommendation engine for NetFlix so that users of NetFlix can get more accurate movie suggestions. The winners &#8211; <a href="http://www2.research.att.com/~volinsky/netflix/bpc.html">BellKor&#8217;s Pragmatic Chaos</a> &#8211; a team from AT&amp;T research took the $1 million prize by providing the winning algorithm. The innovations and ideas generated on this subject during the course of 3 years was a feat unachievable by any single corporate research division.</p>
<p>Crowdsourcing (as coined by Jeff Howe of Wired Magazine) has been gaining considerable traction as a feasible, scalable, practical and even cost-effective method of getting stuff done &#8211; whether it is design, development, ideating, problem solving and more. We are not unfamiliar with the concept &#8211; everyone who has ever used Wikipedia has used a product of crowdsourcing. Over the last several years, many web applications and portals have emerged that have taken crowd sourcing to the next level by webifying the process and making it accessible to the masses. Taking a page from <a href="http://techcrunch.com/2009/06/23/engineers-are-the-best-deal-so-stock-up-on-them/">Auren Hoffman</a> and <a href="http://bnoopy.typepad.com/bnoopy/2005/06/its_a_great_tim.html">Joe Kraus&#8217;</a> articles &#8211; it has never been a better time to be an entrepreneur. What used to take millions of dollars, swanky offices, expensive 64-way sun solaris boxes, and an elite team, can now be achieved by a single person with a smart idea. Think about it. All you need is a great idea. Dont have programmers? Make your way to <a href="http://topcoder.net">TopCoder</a> or <a href="http://rentacoder.com">Rent-a-coder</a> and hire a just-in-time team. Need to give your brand visibility? Head over to <a href="http://crowdspring.com">crowdSpring</a> or <a href="http://99designs.com">99 Designs</a> and get a logo and a look from hundreds of contributors for cheap. Need servers? You can now run on the same scalable infrastructure that <a href="http://aws.amazon.com/ec2/">Amazon</a> and <a href="http://code.google.com/appengine/">Google</a> run on. From design and marketing, to development and deployment &#8211; you can avail the best of the resources realtime without offices, infrastructure, capital or people. Crowd Sourcing and Cloud Computing will take innovation and starting up to a whole new level.</p>
<p>Enough of a digression though <img src='http://bhavin.directi.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  &#8211; having spent the better part of my Sunday researching crowdsourcing, here is a compendium of resources for your benefit:</p>
<p><strong>Articles</strong></p>
<div id="_mcePaste">
<ul>
<li><a href="http://www.wired.com/wired/archive/14.06/crowds.html?pg=1&amp;topic=crowds&amp;topic_set=">http://www.wired.com/wired/archive/14.06/crowds.html?pg=1&amp;topic=crowds&amp;topic_set=</a></li>
<li><a href="http://www.billionswithzeroknowledge.com/2006/11/07/goldsourcing/">http://www.billionswithzeroknowledge.com/2006/11/07/goldsourcing/</a></li>
<li>Look who&#8217;s Crowdsourcing &#8211; <a href="http://www.wired.com/wired/archive/14.06/look.html">http://www.wired.com/wired/archive/14.06/look.html</a></li>
<li><a href="http://www.businessweek.com/innovate/content/jun2009/id20090615_946326.htm">http://www.businessweek.com/innovate/content/jun2009/id20090615_946326.htm</a></li>
</ul>
</div>
<p><strong>Videos</strong></p>
<div id="_mcePaste">
<ul>
<li>Video explaining uTest&#8217;s crowdsourcing offering &#8211; <a href="http://www.utest.com/watch-our-demo">http://www.utest.com/watch-our-demo</a></li>
<li>Nice video explaining the Chaordix Crowdsourcing platform &#8211; <a href="http://www.chaordix.com/why-crowdsource">http://www.chaordix.com/why-crowdsource</a></li>
</ul>
</div>
<p><strong>Books</strong></p>
<div id="_mcePaste">
<ul>
<li>Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business &#8211; <a href="http://www.amazon.com/Crowdsourcing-Power-Driving-Future-Business/dp/0307396207">http://www.amazon.com/Crowdsourcing-Power-Driving-Future-Business/dp/0307396207</a></li>
<li>The Wisdom of Crowds &#8211; <a href="http://www.amazon.com/Wisdom-Crowds-James-Surowiecki/dp/0385721706/ref=pd_sim_b_1">http://www.amazon.com/Wisdom-Crowds-James-Surowiecki/dp/0385721706/ref=pd_sim_b_1</a></li>
</ul>
</div>
<p><strong>Crowdsourcing websites</strong></p>
<div>
<ul>
<li>Lists
<ul>
<li><a href="http://www.openinnovators.net/list-open-innovation-crowdsourcing-examples/">http://www.openinnovators.net/list-open-innovation-crowdsourcing-examples/</a></li>
<li><a href="http://www.readwriteweb.com/archives/crowdsourced_workforce_guide.php">http://www.readwriteweb.com/archives/crowdsourced_workforce_guide.php</a></li>
<li>A collection of &gt;100 successful crowdsourcing examples &#8211; <a href="http://crowdsourcingexamples.pbworks.com/">http://crowdsourcingexamples.pbworks.com/</a></li>
<li><a href="http://innovationzen.com/blog/2006/08/01/top-10-crowdsourcing-companies/">http://innovationzen.com/blog/2006/08/01/top-10-crowdsourcing-companies/</a></li>
<li><a href="http://en.wikipedia.org/wiki/List_of_crowdsourcing_projects">http://en.wikipedia.org/wiki/List_of_crowdsourcing_projects</a></li>
<li><a href="http://econsultancy.com/blog/4355-10-kickass-crowdsourcing-sites-for-your-business">http://econsultancy.com/blog/4355-10-kickass-crowdsourcing-sites-for-your-business</a></li>
<li><a href="http://www.chaordix.com/blog/2009/08/13/crowdsourcing-whos-doing-it/">http://www.chaordix.com/blog/2009/08/13/crowdsourcing-whos-doing-it/</a></li>
<li><a href="http://www.inspiredm.com/2009/07/06/10-crowdsourcing-marketplaces-for-all-the-designers-and-freelancers/">http://www.inspiredm.com/2009/07/06/10-crowdsourcing-marketplaces-for-all-the-designers-and-freelancers/</a></li>
</ul>
</li>
<li>Engineers/Scientists
<ul>
<li><a href="http://www.innocentive.com/">http://www.innocentive.com/</a></li>
<li><a href="http://ninesigma.com">http://ninesigma.com</a></li>
<li><a href="http://yourencore.com">http://yourencore.com</a></li>
</ul>
</li>
<li>Finance
<ul>
<li><a href="http://www.marketocracy.com">http://www.marketocracy.com</a></li>
</ul>
</li>
<li>Consultants
<ul>
<li><a href="http://www.cambrianhouse.com/">http://www.cambrianhouse.com/</a></li>
<li><a href="http://crowdadopter.com">http://crowdadopter.com</a></li>
<li><a href="http://www.romanlogic.com">http://www.romanlogic.com</a></li>
</ul>
</li>
<li>Software
<ul>
<li>Software Testing &#8211; <a href="http://www.utest.com/">http://www.utest.com/</a></li>
<li><a href="http://topcoder.com">http://topcoder.com</a></li>
<li><a href="http://www.rentacoder.com/">http://www.rentacoder.com/</a></li>
</ul>
</li>
<li>Design
<ul>
<li><a href="http://www.crowdspring.com/">http://www.crowdspring.com/</a></li>
</ul>
</li>
<li>Manual tasks
<ul>
<li><a href="https://www.mturk.com/mturk/welcome">https://www.mturk.com/mturk/welcome</a></li>
</ul>
</li>
<li>Translation
<ul>
<li><a href="http://meglobe.com/">http://meglobe.com/</a></li>
</ul>
</li>
</ul>
</div>
]]></content:encoded>
			<wfw:commentRss>http://bhavin.directi.com/crowd-sourcing-harnessing-the-power-of-the-people/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>

