Author Archives: paulo
Emulating enums in python
Enums are common constructs in languages like C++ and Java that can improve readability (and maybe performance?) when dealing with status codes and conditional statements over constant sequences, such as in switch-case statements (which, as a matter of fact, also … Continue reading
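As a rough illustration of the idea in the title, here is one common way to emulate an enum in Python using plain class attributes. The excerpt above is cut off, so this is only a sketch and not necessarily the approach the full post describes; the `Status` class and the `describe` helper are hypothetical names invented for the example.

```python
class Status:
    """Hypothetical status codes, emulating an enum with class attributes."""
    PENDING, RUNNING, DONE = range(3)


def describe(status):
    """Map a status code to a label, in place of a switch-case statement."""
    if status == Status.PENDING:
        return "pending"
    elif status == Status.RUNNING:
        return "running"
    elif status == Status.DONE:
        return "done"
    raise ValueError("unknown status: %r" % (status,))


print(describe(Status.RUNNING))
```

Newer Python versions (3.4 and later) also ship an `enum` module in the standard library, which may be preferable when it is available.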
A Glimpse on Storage Area Networks
The conventional approach to storage in organizations is to have a set of application servers with storage interconnected through a LAN. In this approach, data is shared between application servers using ad-hoc methods that vary according to each application’s requirements. Additionally, … Continue reading
Enabling deduplication in a distributed object storage
In this post I will describe the initial architecture of a distributed object storage system that supports deduplication at both the object level and the block level. This architecture is one of the main contributions of my master’s thesis on distributed … Continue reading
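To make the object-level versus block-level distinction concrete, here is a minimal single-process sketch. It is not the architecture from the post or the thesis: the `DedupStore` class, the fixed-size chunking, and the tiny block size are all assumptions chosen for illustration.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that deduplicates fixed-size blocks."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # unrealistically small, for illustration
        self.blocks = {}              # block digest -> block bytes (stored once)
        self.objects = {}             # object name -> ordered list of block digests

    def put(self, name, data):
        digests = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if an identical one is not already present.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.objects[name] = digests

    def get(self, name):
        # Reassemble the object from its block digests.
        return b"".join(self.blocks[d] for d in self.objects[name])


store = DedupStore(block_size=4)
store.put("a", b"aaaabbbbaaaa")
store.put("b", b"aaaacccc")
print(len(store.blocks))  # number of unique blocks actually stored
```

In this sketch, two objects with identical content end up with the same list of digests and add no new blocks, which is the object-level case; objects that only share some blocks still save space at the block level.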
Using deduplication to reduce storage demands on Cloud Providers
In the paper “The Effectiveness of Deduplication on Virtual Machine Disk Images”, the authors perform an in-depth analysis of several factors that may or may not impact the level of deduplication of virtual machine images. So, what exactly is deduplication? The main … Continue reading
Leveraging data commonality with content-addressable storage
There are a lot of similarities between successive releases of a piece of software at the binary level. In the figures below, each series represents the percentage of binary data blocks that are exactly the same between a reference release of a software … Continue reading
A Scalable Architecture for a Distributed Content-Addressable Storage System
Definition: Content-Addressable Storage (CAS) – a fancy name for a simple storage technique: instead of indexing stored objects by their location (such as file://home/user/example or http://www.example.com/file.jpg), as done in traditional storage systems, index objects by their content. This is typically done by hashing the … Continue reading
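The definition above can be sketched in a few lines. This is a toy, in-memory illustration, not the scalable architecture the post goes on to describe; the `ContentAddressableStore` class is a hypothetical name, and SHA-256 is an assumed choice of hash function, since the excerpt is cut off before naming one.

```python
import hashlib


class ContentAddressableStore:
    """Hypothetical in-memory CAS: objects are indexed by a hash of their content."""

    def __init__(self):
        self._objects = {}  # content digest -> content bytes

    def put(self, data: bytes) -> str:
        # The key is derived from the content itself, not from a location.
        key = hashlib.sha256(data).hexdigest()
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


store = ContentAddressableStore()
key = store.put(b"hello")
print(key, store.get(key))
```

Because the key depends only on the content, storing the same bytes twice yields the same key and a single stored copy.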
Hello, World!
I’ve always wanted to have a tech blog to write about what I’ve been doing at university, occasional side projects and some random stuff I come across every day. However, I never took the time to start it, but now I … Continue reading