NoSQL: A Newer Kind of Database
A remarkable number of startups developing Big Data technologies have made their debut in the last few years. Many of them are developing their own flavor of NoSQL. Some of these companies are well established and are thinking about going public. But does that mean you should invest in them when they do? The key is in understanding NoSQL’s strengths, weaknesses, and potential clients.
The Business of NoSQL and Big Data
The term Big Data refers to data sets so large that they become too difficult to process using traditional relational database technology. The vast amount of astronomical data produced when a modern telescope scans the sky is one example. The millions of clicks on a Web retailer’s site are another. The growing volume of information has given Big Data technologies a new importance in the IT world. NoSQL is one such technology. It captures, stores, and facilitates analysis of these massive quantities of data. It is also much better suited to the three Vs of Big Data (volume, velocity, and variety) than its relational counterpart. However, the performance that has made NoSQL databases popular does not come without tradeoffs.
Startups Developing NoSQL
One reason so many NoSQL startups have popped up virtually overnight may be that many NoSQL databases are open source software. Their source code is freely available to all, so engineers can build on an existing base of code rather than reinvent the wheel. They can unleash their creativity by adding new features or by improving on an existing database engine. This can be good for technological advancement and breakthroughs. However, it can also lead to a lot of overlap, as many companies offer the same technology with minor modifications and moderately desirable features. Some of the more well-known and established NoSQL databases include Cassandra, Riak, CouchDB, and MongoDB. A comprehensive list of companies working on NoSQL technologies can be found at nosql-database.org. Some of these startups will go public in the very near future. The trick to picking the winners is to differentiate the ones offering something truly revolutionary from the ones just working on another flavor of the same NoSQL database.
NoSQL vs. Relational
The speed and ease with which NoSQL can capture and process massive volumes of data, and make them available for analysis, make Big Data analytics possible. Does this mean that databases using the relational model, like Oracle, MySQL, and Microsoft SQL Server, are on their way out? On the contrary, these two very different database models complement each other. The most notable differences between NoSQL and relational databases are scalability and schema flexibility.
- Scalability: The horizontal scalability of NoSQL makes it a superior choice for dealing with Big Data. When the workload increases, relational databases typically scale up. This means existing hardware has to be upgraded: powerful, expensive servers are purchased to cope with the increased load. NoSQL databases can scale out rather than up. Scaling out involves adding several inexpensive servers, also known as commodity servers. In this way, a cluster of servers can pool resources to deal with the increased demand. Scaling out is not exclusive to NoSQL. Relational databases can scale out as well, but the cost can quickly become prohibitive. NoSQL was designed to make scaling out as easy, efficient, and cost-effective as possible.
- Schema: In the relational model, the schema is a logical grouping of database objects, acting as a kind of blueprint for the database. Relational databases have very rigid schemas. This is necessary for the accurate and consistent storage of data. However, even minor changes to the schema can be an ordeal that requires careful planning and database downtime. In contrast, NoSQL databases have a very flexible schema, and some have no schema at all. This is how NoSQL can handle unstructured data from the Web and other sources so well. NoSQL databases are very forgiving when it comes to changes to the database structure, which makes such changes less complicated and disruptive. The tradeoff, however, can be a significant reduction in database integrity. This is an area that relational databases have refined to near-perfection over decades.
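The two differences above can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not how any particular NoSQL product works: the node names, the clicks table, and the helper functions are all hypothetical, an in-memory SQLite database stands in for a relational system, and a plain Python list stands in for a schema-free document store.

```python
import hashlib
import sqlite3


def node_for_key(key: str, nodes: list[str]) -> str:
    """Scale-out sketch: choose the server that stores `key` by hashing it.

    Simple modulo placement, used here only for illustration.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def rigid_insert_fails() -> bool:
    """Return True if a relational table rejects a row with an undeclared column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (user_id INTEGER, page TEXT)")
    try:
        # 'referrer' is not part of the declared schema, so SQLite raises an error.
        conn.execute(
            "INSERT INTO clicks (user_id, page, referrer) VALUES (1, '/home', 'ad')"
        )
    except sqlite3.OperationalError:
        return True
    finally:
        conn.close()
    return False


def flexible_insert(store: list[dict], document: dict) -> None:
    """Accept any document. Nothing is validated, so nothing protects integrity."""
    store.append(document)


if __name__ == "__main__":
    # Hypothetical cluster of three commodity servers.
    cluster = ["node-a", "node-b", "node-c"]
    for key in ("user:1", "user:2", "user:3"):
        print(key, "is stored on", node_for_key(key, cluster))

    print("Relational insert rejected:", rigid_insert_fails())

    documents: list[dict] = []
    flexible_insert(documents, {"user_id": 1, "page": "/home"})
    flexible_insert(documents, {"user_id": 2, "page": "/cart", "referrer": "ad"})
    print("Documents stored without a schema:", len(documents))
```

Adding a server in this sketch only means appending a name to the list, which is the appeal of scaling out. Note that simple modulo placement moves most keys to a different server when the list grows; production systems commonly use consistent hashing to limit that reshuffling. The second half shows the tradeoff described above: the relational table refuses the unexpected field, while the document store accepts it without any check.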
NoSQL will not replace the relational database, at least not for the foreseeable future. Both have strengths and weaknesses. If data integrity and consistency across the enterprise are priorities, the maturity of the relational model is unbeatable. If gargantuan amounts of data must be captured and analyzed, the least costly, most efficient way to do this is with Big Data technologies like NoSQL. In fact, it is not uncommon for organizations to have both types of database working side by side.
From an investment point of view, NoSQL is not yet mature enough to completely replace the relational model. Before investing in any of these startups, we must ask ourselves two important questions. First, is their version of NoSQL a truly revolutionary new concept that may one day replace the traditional relational database? Second, how many organizations will see their technology as indispensable? If both questions can be answered favorably, then it could be a very interesting investment.