Understanding the Dropbox protocol and quantifying the usage of cloud storage services

Measurement studies show that video consumes a growing share of the bandwidth in the global Internet. Cloud-based storage services such as Dropbox, iCloud, SkyDrive or GDrive are another class of services whose usage is growing. These cloud storage services use proprietary protocols and allow users to exchange files and share folders efficiently. Dropbox is probably one of the most widely known storage services. It relies heavily on Amazon EC2 and other AWS services. The Dropbox application is easy to use, but little is known about the operation of the underlying protocol. In a paper that will be presented this fall at IMC’12, Idilio Drago and his colleagues provide a very detailed analysis of the Dropbox protocol and its usage in home and campus networks [DMMunafo+12].

Several of the findings of this paper are very interesting. First, despite its popularity, Dropbox is still served mainly from servers located in the US. This implies a long round-trip time for the large population of Dropbox users who do not reside in North America. Since Dropbox uses the Amazon infrastructure, it is surprising that it does not seem to use Amazon datacenters outside the US. All the files that you store in your Dropbox folder are likely stored on US servers. Another surprising result is that Dropbox divides the files to be transferred into chunks of 4 MBytes and each chunk needs to be acknowledged by the application. Coupled with the long round-trip time, this results in a surprisingly low transfer rate of about 500 Kbps. Dropbox seems to have addressed this performance issue recently by allowing chunks to be sent in batches.
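The effect of per-chunk acknowledgements on throughput can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from [DMMunafo+12]: the function names, the 4 MiB reading of "4 MBytes", and the simplification that each acknowledgement costs exactly one idle round-trip time are all assumptions, and TCP slow start (which adds further round trips) is ignored.

```python
# Assumption: "4 MBytes" is read as 4 MiB; the real client may differ.
CHUNK_BYTES = 4 * 1024 * 1024


def chunk_count(file_bytes, chunk_bytes=CHUNK_BYTES):
    """Number of chunks needed to carry a file of file_bytes bytes."""
    return -(-file_bytes // chunk_bytes)  # ceiling division


def effective_throughput_bps(n_chunks, chunk_bytes, rtt_s, link_bps, batch_size=1):
    """Average throughput when the sender idles one RTT after each batch.

    Simplified model (an assumption, not the measured behaviour):
    total time = time to push all bits at link_bps
               + one RTT of waiting per acknowledged batch.
    """
    total_bits = n_chunks * chunk_bytes * 8
    n_batches = -(-n_chunks // batch_size)  # ceiling division
    transmit_time = total_bits / link_bps
    wait_time = n_batches * rtt_s
    return total_bits / (transmit_time + wait_time)


if __name__ == "__main__":
    # Illustrative numbers only: 100 small chunks, 100 ms RTT, 10 Mbps link.
    one_by_one = effective_throughput_bps(100, 100_000, 0.1, 10e6)
    batched = effective_throughput_bps(100, 100_000, 0.1, 10e6, batch_size=10)
    print(f"one ack per chunk: {one_by_one / 1e6:.2f} Mbps")
    print(f"ten chunks per ack: {batched / 1e6:.2f} Mbps")
```

In this model the waiting time grows with the number of acknowledgements, so the penalty is largest when little data is sent per acknowledgement and the round-trip time is long. Sending chunks in batches reduces the number of idle round trips.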

[DMMunafo+12] also provides an analysis of the operation of the main Dropbox protocol. Dropbox mainly uses servers hosted in Amazon datacenters for various types of operations. Although Dropbox uses TLS to encrypt the data, the authors used SSLBump running on Squid to perform a man-in-the-middle attack between a Dropbox client and the official servers, which allowed them to observe the decrypted protocol messages.

Figure: An example of a storage operation with Dropbox (source [DMMunafo+12])

Another interesting piece of information provided in [DMMunafo+12] is an analysis of the Dropbox traffic in campus and home networks. This analysis, performed with Tstat, shows that cloud storage services already contribute a large volume of data in the global Internet. The analysis also considers the percentage of clients that are uploading, downloading and silent. Users who have installed Dropbox but are not using it should be aware that the Dropbox client always opens connections to Dropbox servers, even if no data needs to be exchanged. The entire dataset collected for this study is available from http://www.simpleweb.org/wiki/Dropbox_Traces
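As a rough illustration of how clients could be grouped into uploading, downloading and silent, here is a minimal sketch. It does not reproduce the method or the thresholds used in [DMMunafo+12], and it does not parse the Tstat log format: the input is a hypothetical mapping from a client identifier to its total uploaded and downloaded bytes, and the 10 kB cut-off for "silent" is an arbitrary assumption.

```python
from collections import Counter

LABELS = ("uploading", "downloading", "silent")


def classify_client(bytes_up, bytes_down, threshold=10_000):
    """Label one client by its dominant activity.

    threshold is an assumed cut-off below which the traffic is treated
    as control messages only (the client is "silent").
    """
    if bytes_up < threshold and bytes_down < threshold:
        return "silent"
    return "uploading" if bytes_up >= bytes_down else "downloading"


def activity_shares(per_client_bytes, threshold=10_000):
    """Percentage of clients per label.

    per_client_bytes maps a (hypothetical) client id to a tuple
    (bytes_up, bytes_down) aggregated over the observation period.
    """
    counts = Counter(
        classify_client(up, down, threshold)
        for up, down in per_client_bytes.values()
    )
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: 100.0 * counts[label] / total for label in LABELS}


if __name__ == "__main__":
    # Made-up volumes, for illustration only.
    sample = {
        "client-a": (5_000_000, 1_000),
        "client-b": (0, 20_000_000),
        "client-c": (200, 300),
        "client-d": (0, 0),
    }
    print(activity_shares(sample))
```

A silent client in this sketch still appears in the input because it opened connections, which matches the observation that the client contacts the servers even when no data needs to be exchanged.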

[DMMunafo+12] Idilio Drago, Marco Mellia, Maurizio M. Munafò, Anna Sperotto, Ramin Sadre, and Aiko Pras. Inside Dropbox: Understanding Personal Cloud Storage Services. In Proceedings of the 12th ACM SIGCOMM Conference on Internet Measurement, IMC’12. 2012.