Abstract: Historically, the term "Hadoop" has been considered synonymous with its core technologies: MapReduce and the Hadoop Distributed File System (HDFS). But today the definition of Hadoop is rapidly evolving.
The Hadoop community is generalizing the application runtime model beyond MapReduce. On the storage front, we're seeing the emergence of many alternative Hadoop-compatible file systems. Red Hat has built an interface layer for its Red Hat Storage Server product. This complete implementation of the Hadoop file system interface lets Hadoop-related projects run transparently, directly on a Red Hat Storage Server cluster.
This talk will contrast the architecture of HDFS with alternative Hadoop file systems. It focuses primarily on GlusterFS (the underlying technology for Red Hat Storage Server), but also examines implementations built on top of NoSQL and object storage. We'll talk about:
* The current implementation.
* What we've learned about compatibility, performance, and scalability.
* The roadmap for next-generation implementations.