Author Archives: paulo

Emulating enums in Python

Enums are common constructs in languages like C++ and Java that can improve readability (and maybe performance?) when dealing with status codes and conditional statements over constant sequences, such as in switch-case statements (which, as a matter of fact, also …
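
As a rough illustration of the idea, here is a minimal sketch of one common way to emulate an enum with class-level constants. It is not necessarily the approach the full post takes; the names `Status` and `describe` are made up for this example. Python 3.4 and later also ship an `enum` module in the standard library.

```python
class Status:
    """Enum-like namespace (hypothetical): each name is a distinct integer."""
    PENDING, RUNNING, DONE = range(3)


def describe(status):
    # Stand-in for a switch-case statement: dispatch through a dict lookup.
    labels = {
        Status.PENDING: "waiting to start",
        Status.RUNNING: "in progress",
        Status.DONE: "finished",
    }
    return labels.get(status, "unknown status")
```

Comparing against `Status.DONE` reads better than comparing against a bare `2`, which is the readability gain the excerpt refers to.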

A Glimpse of Storage Area Networks

The conventional approach to storage in organizations is to have a set of application servers with storage interconnected through a LAN. In this approach, data is shared between application servers using ad hoc methods that vary according to each application’s requirements. Additionally, …

Enabling deduplication in a distributed object storage

In this post I will describe the initial architecture of a distributed object storage system that supports deduplication at both the object level and the block level. This architecture is one of the main contributions of my master’s thesis on distributed …

Using deduplication to reduce storage demands on Cloud Providers

In the paper “The Effectiveness of Deduplication on Virtual Machine Disk Images”, the authors perform an in-depth analysis of several factors that may or may not affect the level of deduplication of virtual machine images. So, what exactly is deduplication? The main …

Leveraging data commonality with content-addressable storage

There are many similarities at the binary level between successive releases of a piece of software. In the figures below, each series represents the percentage of binary data blocks that are exactly the same between a reference release of a piece of software …
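
To make the measurement concrete, here is a minimal sketch of how such a percentage could be computed. It assumes fixed-size blocks compared by SHA-1 digest; the block size, the hash function, and the function names are assumptions for illustration, not details taken from the post.

```python
import hashlib


def split_blocks(data, block_size=4096):
    """Split a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def percent_shared(reference, other, block_size=4096):
    """Percentage of blocks in `other` that also occur in `reference`."""
    ref_digests = {
        hashlib.sha1(block).digest()
        for block in split_blocks(reference, block_size)
    }
    blocks = split_blocks(other, block_size)
    if not blocks:
        return 0.0
    shared = sum(
        1 for block in blocks if hashlib.sha1(block).digest() in ref_digests
    )
    return 100.0 * shared / len(blocks)
```

Passing the bytes of a reference release and of a later release gives one data point of the kind the excerpt describes.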

A Scalable Architecture for a Distributed Content-Addressable Storage System

Definition: Content-Addressable Storage (CAS) is a fancy name for a simple storage technique: instead of indexing stored objects by their location (such as file://home/user/example or http://www.example.com/file.jpg), as traditional storage systems do, index objects by their content. This is typically done by hashing the …
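
The definition can be sketched in a few lines. This is a toy in-memory store, assuming the key is the SHA-256 digest of the object's bytes; the class name `ContentStore` and the choice of hash are hypothetical and say nothing about the architecture in the full post.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: objects are indexed by a content hash."""

    def __init__(self):
        self._objects = {}

    def put(self, data):
        # The key is derived from the content itself, not from a location.
        key = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self._objects.setdefault(key, data)
        return key

    def get(self, key):
        return self._objects[key]

    def __len__(self):
        return len(self._objects)
```

Storing the same bytes twice returns the same key and keeps a single copy, which is why content addressing pairs naturally with deduplication.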

Hello, World!

I’ve always wanted to have a tech blog to write about what I’ve been doing at university, occasional side projects, and some random stuff I come across every day. I never took the time to start it, but now I …
