Writing a database in Ruby

I decided to write a database! I have been using databases for years but how they work is still mostly a mystery to me. For instance, how do database systems do ACID?

Today I am writing about how to answer queries. Familiarity with SQL, Ruby, and functional programming is assumed.

Using Ruby's Lazy Enumerators to Answer Queries

You usually want your database to store a lot of data and still respond to queries quickly. For example, you might want to get the first ten books in your database that Ryan owns.

You do not necessarily want your database to read every one of millions of rows to do this. You just want it to read rows until it finds ten that have the value "Ryan" in the owner field. The challenge is to do this in a modular, composable way.

Ruby's Enumerator is very helpful. Enumerators are often called generators in other languages. You can read more about them here.
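As a quick illustration of the idea (a minimal sketch, not part of the database itself; the name counter is made up for this example), here is an Enumerator that produces numbers only when a consumer asks for them:

```ruby
# A hand-built Enumerator. The block runs only as far as needed
# to hand over the values the consumer asks for.
counter = Enumerator.new do |yielder|
  n = 0
  loop do
    yielder << n   # hand one value to the consumer
    n += 1
  end
end

# The enumerator is infinite, but take stops after three values.
first_three = counter.take(3)   # => [0, 1, 2]
```

Even though the loop never ends by itself, take(3) returns because the block only advances while the consumer keeps asking for values.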

This code defines a class Query that uses Enumerators to help us answer queries.

initialize() creates a new Enumerator. The first time the enumerator is called, it opens a file, reads a record from the file, and returns the record. The next time the enumerator is called, it reads the next record from the file and returns that record. It continues to return the next record until it has run out of records to read.

where() updates the enumerator so it only yields records that meet certain criteria. It works by calling select on the enumerator object. select is sometimes called filter in other languages (such as JavaScript). This select is different from the usual Enumerable#select, though, because it belongs to the Lazy Enumerator class. Instead of building an array of every match, Enumerator::Lazy#select returns a new Lazy Enumerator that only yields records that meet the criteria. This is good because we still only need to have one record in memory at a time instead of reading them all into memory at once.
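To see the difference this makes, here is a small sketch that counts how many rows are actually produced. The rows are generated in memory rather than read from a file, and the names reads and rows are made up for this example:

```ruby
reads = 0

# Stand-in for a large table: every third row belongs to Ryan.
rows = Enumerator.new do |yielder|
  1.upto(1_000_000) do |i|
    reads += 1
    yielder << { id: i, owner: (i % 3).zero? ? "Ryan" : "Sam" }
  end
end

# lazy.select filters rows one at a time; first(2) stops the
# whole pipeline as soon as two matches have been found.
matches = rows.lazy.select { |row| row[:owner] == "Ryan" }.first(2)

# matches holds the rows with id 3 and id 6,
# and reads is now 6 rather than 1_000_000.
```

Without the call to lazy, select would walk all one million rows and build an array of every match before first(2) got a chance to run.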

select() updates the enumerator so it yields only certain attributes of each record. It is similar to where() except it uses Enumerator::Lazy#map.

And finally top() takes the first num records from our enumerator and returns them.
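Put together, the four methods described above might look something like the following. This is a sketch reconstructed from the description, not the original listing. In particular, the file format (one JSON object per line) and the rule that where() matches on equality for every key are assumptions made to keep the example short.

```ruby
require "json"

class Query
  # Assumption: the file at path holds one JSON object per line.
  def initialize(path)
    source = Enumerator.new do |yielder|
      File.open(path) do |file|
        file.each_line do |line|
          # Parse and hand over one record at a time.
          yielder << JSON.parse(line, symbolize_names: true)
        end
      end
    end
    @rows = source.lazy
  end

  # Keep only records whose fields equal every value in criteria.
  def where(criteria)
    @rows = @rows.select do |row|
      criteria.all? { |key, value| row[key] == value }
    end
    self
  end

  # Keep only the named attributes of each record.
  def select(*attributes)
    @rows = @rows.map { |row| row.slice(*attributes) }
    self
  end

  # Pull the first num records through the pipeline and return them.
  def top(num)
    @rows.first(num)
  end
end
```

where() and select() return self so the calls can be chained. Nothing is read from the file until top() asks for records, and reading stops once num records have come out the other end.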

Using this Query class we could translate a SQL query like this


      SELECT TOP 5 title
      FROM book
      WHERE owner = 'Ryan'
  
into this code:

      Query.new("book.db").where({:owner=>"Ryan"}).select(:title).top(5)