21 May, 2011
Here are two interesting links I found comparing the features and performance of Unix domain sockets and TCP loopback sockets.
Excerpt: IP sockets over localhost are basically looped back network on-the-wire IP. There is intentionally “no special knowledge” of the fact that the connection is to the same system, so no effort is made to bypass the normal IP stack mechanisms for performance reasons. For example, transmission over TCP will always involve two context switches to get to the remote socket, as you have to switch through the netisr, which occurs following the “loopback” of the packet through the synthetic loopback interface. Likewise, you get all the overhead of ACKs, TCP flow control, encapsulation/decapsulation, etc. Routing will be performed in order to decide if the packets go to the localhost. Large sends will have to be broken down into MTU-size datagrams, which also adds overhead for large writes. It’s really TCP, it just goes over a loopback interface by virtue of a special address, or discovering that the address requested is served locally rather than over an ethernet (etc).
UNIX domain sockets have explicit knowledge that they’re executing on the same system. They avoid the extra context switch through the netisr, and a sending thread will write the stream or datagrams directly into the receiving socket buffer. No checksums are calculated, no headers are inserted, no routing is performed, etc. Because they have access to the remote socket buffer, they can also directly provide feedback to the sender when it is filling, or more importantly, emptying, rather than having the added overhead of explicit acknowledgement and window changes. The one piece of functionality that UNIX domain sockets don’t provide that TCP does is out-of-band data. In practice, this is an issue for almost no one.
Excerpt: It was hypothesized that pipes would have the highest throughput due to their limited functionality, since they are half-duplex, but this was not true. For almost all of the data sizes transferred, Unix domain sockets performed better than both TCP sockets and pipes, as can be seen in Figure 1 below. Figure 1 shows the transfer rates for the IPC mechanisms, but it should be noted that they do not represent the speeds obtained by all of the test machines. The transfer rates are consistent across the machines with similar hardware configurations though. On some machines, Unix domain sockets reached transfer rates as high as 1500 MB/s.
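The comparison in the excerpts above is easy to try yourself. Here is a minimal, stdlib-only Python sketch that pumps the same number of bytes through a UNIX domain socket pair and a TCP loopback connection and reports throughput; the write size and transfer volume are arbitrary choices, and absolute numbers will vary widely with kernel and hardware:

```python
import socket
import threading
import time

CHUNK = b"x" * 65536  # 64 KiB writes, an arbitrary choice

def throughput(reader_sock, writer_sock, total_bytes=16 * 1024 * 1024):
    """Pump total_bytes through a connected socket pair; return MB/s."""
    def drain():
        remaining = total_bytes
        while remaining > 0:
            remaining -= len(reader_sock.recv(65536))

    t = threading.Thread(target=drain)
    t.start()
    start = time.perf_counter()
    sent = 0
    while sent < total_bytes:
        writer_sock.sendall(CHUNK)
        sent += len(CHUNK)
    t.join()
    return total_bytes / (1024 * 1024) / (time.perf_counter() - start)

# UNIX domain socket pair (already connected)
uds_r, uds_w = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# TCP pair over the loopback interface
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
tcp_w = socket.create_connection(listener.getsockname())
tcp_r, _ = listener.accept()

uds_rate = throughput(uds_r, uds_w)
tcp_rate = throughput(tcp_r, tcp_w)
print(f"UNIX domain socket: {uds_rate:.0f} MB/s")
print(f"TCP loopback:       {tcp_rate:.0f} MB/s")
```

On most Linux machines the UNIX domain pair comes out ahead, consistent with the excerpts, though the gap depends heavily on the write size.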
4 Apr, 2010
I have spent the better part of my Sunday researching some of the message queue options available out there. My research was divided primarily between AMQP queues and ZeroMQ. For those who came in late - iMatix, the company behind ZeroMQ, was also involved in defining the AMQP spec. iMatix recently announced that they would no longer continue supporting the AMQP spec and would instead focus on ZeroMQ – their brokerless queue product.
ZeroMQ is a fundamentally different approach to message queues from AMQP. Pieter and folks from iMatix have made several posts on the flaws in the design and philosophy of AMQP (check the resources section below). In fact, during the course of my study today, these posts were among the most interesting material I read.
Here are some of the considerations in selecting a queue for our application needs -
Some queues guarantee ordered delivery, while others do not. Ordering may not be required in every application, and it adds complexity. For example, if your messages consist of updates to state, two messages updating the same entity may leave it inconsistent if their order is switched. If the underlying messaging system offers no ordering guarantees, your application will have to resolve these conflicts itself.
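The ordering hazard is easy to demonstrate with a toy sketch (the entity and its fields are made up for illustration):

```python
# Two updates to the same entity: the final state depends on delivery order.
entity = {"status": "active"}
updates = [{"status": "suspended"}, {"status": "deleted"}]

consumer_a = dict(entity)
for u in updates:            # delivery in the order sent
    consumer_a.update(u)

consumer_b = dict(entity)
for u in reversed(updates):  # delivery reordered in transit
    consumer_b.update(u)

print(consumer_a)  # {'status': 'deleted'}
print(consumer_b)  # {'status': 'suspended'} - the two consumers now disagree
```

A common application-level fix is to version each message and discard updates older than the last one applied.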
If the queue itself does not manage persistence, your application will have to handle it. Persistence is required for reliability: a queue that keeps all messages and state in memory will lose any undelivered messages in case of a node restart, downtime, or crash.
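A minimal sketch of what handling persistence in the application can look like: an append-only journal, fsync'd on every write and replayed on startup (the file name and message format here are hypothetical):

```python
import json
import os

JOURNAL = "queue.log"  # hypothetical journal file

def enqueue(msg):
    # Append the message to a durable journal before acknowledging the
    # producer, so undelivered messages survive a restart or crash.
    with open(JOURNAL, "a") as f:
        f.write(json.dumps(msg) + "\n")
        f.flush()
        os.fsync(f.fileno())

def replay():
    # On startup, rebuild the in-memory queue from the journal.
    if not os.path.exists(JOURNAL):
        return []
    with open(JOURNAL) as f:
        return [json.loads(line) for line in f]

enqueue({"id": 1, "body": "hello"})
print(replay())
```

The fsync on every enqueue is what makes this reliable, and also what makes it slow; real brokers batch or relax it, which is exactly the durability/throughput trade-off to evaluate when choosing a queue.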
Broker-based queues may or may not offer clustering. Some message queues offer fail-over but not clustering. Clustering allows multiple nodes to act as a single active-active instance, increasing availability.
I also have a short summary of the two options I am currently considering -
ZeroMQ:
- Crazy fast
- Brokerless architecture
- In-process library
- Lower latencies
- Very simple to use
- No persistence – requiring higher layers to manage persistence
RabbitMQ:
- AMQP compliant
- Written in Erlang
- Small footprint and seemingly fewer lines of code in comparison to other AMQP compliant queue managers
Here is a list of the more interesting resources I read through today:
- What's wrong with AMQP – http://www.imatix.com/articles:whats-wrong-with-amqp – The most interesting article I read in my research. It is fairly long, but I would strongly recommend that anyone planning to use queues read it end to end. Irrespective of whether we choose ZeroMQ or an AMQP-compliant queue, this article provides some good insights.
- http://www.imatix.com/articles:introduction-to-restms – A nice document by Pieter explaining the RestMS protocol. RestMS can act as a bridge between messaging systems and HTTP/REST based systems.
- http://www.ipocracy.com/blog:10-principles-for-amqp – Ten ways that AMQP can be made simpler, more backwards compatible, more interesting, and overall more enjoyable and successful for all who work on it and use it
- http://www.zeromq.org/whitepapers:message-matching – a highly optimized message matching algorithm
- http://www.zeromq.org/results:ib-tests-v206 – 4.7 million messages per second for a 64 byte message
27 Jan, 2010
So – OBVIOUSLY – I called up these guys and asked them to graffiti Directiplex, and THEY ARE COMING TONIGHT. Starting 10:45 pm, if you visit our offices, you'll be able to write / draw ANYTHING on the surface of our office building with lasers.
To share this on Facebook or Twitter you can use - "Come and watch #directiplex get trashed using laser graffiti tonight - http://bit.ly/aNyKP0"
Feeling creative? Be there -
Time: 10:45 pm
Venue: Directiplex
Address: Check http://directi.com/about/offices
Feel free to call your friends / family etc to watch the show as our very own HQ becomes a canvas for creativity.
24 Aug, 2009
This article provides a rough idea of how much money some of the highest-ranking web destinations are making from their users -
Google
- April-June 2009 Revenue: $5.5 billion
- 97% of above revenues are from advertising
- April-June Revenue from Google Properties: $3.6 billion
- Total US revenue April-June Revenue from Google Properties: $2.6 billion
- Number of searches performed by Americans on Google Apr-June: 27.5 billion (approx) (source: comScore)
- Revenue per search: 9.5 cents
- Revenue per 1000 searches: $95
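The per-search figures above are just the ratio of US revenue to US searches; the arithmetic is easy to check:

```python
# Per-search economics for Google's US business, Apr-Jun 2009
# (figures from the list above).
us_revenue = 2.6e9      # US revenue from Google properties, USD
us_searches = 27.5e9    # US searches on Google, per comScore

cents_per_search = us_revenue / us_searches * 100
per_1000_searches = us_revenue / us_searches * 1000
print(f"{cents_per_search:.1f} cents per search")     # 9.5
print(f"${per_1000_searches:.0f} per 1000 searches")  # $95
```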
- IAC total April-June Revenue: $340 million
- Revenue from Media and Ads (Ask.com, Citysearch, Dictionary.com etc): $168 million
- 84% of this is from US: $141 million
- 72% of this is proprietary properties => $101 million
- Bulk of this can be assumed to come from Ask.com (let's say $90 million)
- Number of searches performed by Americans on Ask.com Apr-June: 1.5 billion (approx) (source: comScore)
- Revenue per search: 6 cents
- Revenue per 1000 searches: $60
Facebook
- Dec 2008: 80 billion pageviews
- Registered users: 222 million
- Page views per user: 360 pageviews per user per month (or 12 pageviews per day avg)
- June 2009: 340 million unique visitors (77 million from US)
- May 2009: 87 billion page views (20 billion from US)
- Expected to generate over $500 million in revenue in 2009
- Rough total pageviews in 2009 => 1000 billion
- Rough Revenue per 1000 pageviews: 50 cents
- Breakup of their $550 million revenue – $125 million brand ads, $150 million deal with Msft, $75 million virtual goods, $200 million self-service ads
- April-June 2009 revenue: $160 million
- Revenue per advertising customer: $791
- July 2008 Searches: 7.4 billion
- July 2008 quarter extrapolated: 22.2 billion searches
- Revenue in quarter of July 2008: $135.4 million
- Revenue per search: 0.6 cents
- Revenue per 1000 searches: $6
- Projected revenues in 2008: $100 million
- Revenue from Advertising: 25%
- Funding so far: $103 million
- Unique users as of 2009: 45 million
- March 2008 monthly visitors: 11 million
- March 2008 monthly pageviews: 115 million
- March 2008 avg minutes per visitor: 7.8 min
eBay, Skype and PayPal
- April-June 2009 Revenue: $2 billion
- Marketplaces revenue (ebay.com, shopping.com etc): $1 billion (transaction) + $200 million (advertising)
- Marketplace Gross volume: $13.4 billion (ebay made around 10% of this in its revenues which is impressive)
- Payments revenue (paypal.com, bill-me-later): $630 million (transaction) + $39 million (advertising)
- International component of Payments revenue: $286.2 million (45%)
- Payments Total volume: $16.7 billion (ebay made around 3.9% here – which is surprising considering their paypal rates are much lower)
- Communications revenue (skype): $155 million (transaction) + $14 million (advertising)
- International component of Communications revenue: $128.5 million (83%)
- Skype registered users – 480 million
- Skypeout minutes – 2.9 billion
- Per user revenue – 32 cents per registered user per quarter
- Per user minutes – 6 minutes per user per quarter
- US revenue: $959 million
- International revenue: $1.1 billion
- Skype Q3 2007: 10 million concurrent online users at peak. 4 million at trough.
Apple
- Apr-June 2009 sales: $8.3 billion
- Geo distribution – Americas – $3.8 billion, Europe – $2 billion, others – remaining
- Product distribution – Mac – $3.3 billion, iPod – $1.5 billion, other music products – $1 billion, iPhone and related services – $1.6 billion
- Units of product sold – Desktops – 0.8m, Portables – 1.7m, iPods – 10.2m, iPhones – 5.2m
18 Jun, 2009
I just came across the Nginx XSLT module and had an epiphany. The module accepts an HTTP request, passes it through to the backend server, receives XML from the backend, and converts the XML to HTML by applying the configured XSLT stylesheets.
So one can focus solely on a REST-XML-HTTP API when building out an application, and expose it both as an API and as a web app simply by creating XSLT files that transform the XML into HTML. Kickass!!!
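For illustration, a hypothetical location block wiring this up (the backend address and stylesheet path are made up; xslt_stylesheet and xslt_types are directives of ngx_http_xslt_module):

```nginx
location /app/ {
    proxy_pass http://127.0.0.1:8080;            # backend emits REST XML
    xslt_stylesheet /etc/nginx/xslt/html.xslt;   # XML -> HTML transform
    xslt_types application/xml;                  # response types to transform
}
```

A second location pointing at the same backend but without the xslt_stylesheet directive would serve the raw XML as the API.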