Overview
This is part two of five in a tutorial series on testing data-intensive code. In part one, I covered the design of an abstract data layer that enables proper testing, how to handle errors in the data layer, how to mock data access code, and how to test against an abstract data layer. In this tutorial, I'll go over testing against a real in-memory data layer based on the popular SQLite database.
Testing Against an In-Memory Data Store
Testing against an abstract data layer is great for some use cases where you need a lot of precision, you understand exactly what calls the code under test is going to make against the data layer, and you're OK with preparing the mock responses.
Sometimes, it's not that easy. The series of calls to the data layer may be difficult to figure out, or it may take a lot of effort to prepare canned responses that are valid. In these cases, you may need to work against an in-memory data store.
The benefits of an in-memory data store are:
- It's very fast.
- You work against an actual data store.
- You can often populate it from scratch using files or code.
In particular, if your data store is a relational DB, SQLite is a fantastic option. Just remember that there are differences between SQLite and other popular relational DBs like MySQL and PostgreSQL.
Make sure you account for that in your tests. Note that you still access your data through the abstract data layer, but now the backing store during tests is the in-memory data store. Your test will populate the test data differently, but the code under test is blissfully unaware of what's going on.
Using SQLite
SQLite is an embedded DB (linked with your application). There is no separate DB server running. It typically stores the data in a file, but also has the option of an in-memory backing store.
Here is the InMemoryDataLayer struct. It is also part of the concrete_data_layer package, and it imports the go-sqlite3 third-party package, which provides a SQLite driver for Go's standard "database/sql" interface.
package concrete_data_layer

import (
	"database/sql"
	"fmt"
	"time"

	. "abstract_data_layer"
	_ "github.com/mattn/go-sqlite3"
)

type InMemoryDataLayer struct {
	db *sql.DB
}
Constructing the In-Memory Data Layer
The NewInMemoryDataLayer() constructor function creates an in-memory SQLite DB and returns a pointer to the InMemoryDataLayer.
func NewInMemoryDataLayer() (*InMemoryDataLayer, error) {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		return nil, err
	}

	// Fail construction if the schema cannot be created.
	err = createSqliteSchema(db)
	if err != nil {
		db.Close()
		return nil, err
	}

	return &InMemoryDataLayer{db}, nil
}
Note that each time you open a new ":memory:" DB, you start from scratch. If you want persistence across multiple calls to NewInMemoryDataLayer(), you should use file::memory:?cache=shared. See this GitHub discussion thread for more details.
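To make the difference between the two data source names concrete, here is a minimal sketch of a helper that picks between them. The memoryDSN() function and its name parameter are hypothetical and not part of the tutorial's code. The named form relies on SQLite's URI filename syntax (mode=memory together with cache=shared), which lets each test use its own shared database.

```go
package main

import "fmt"

// memoryDSN is a hypothetical helper that builds a data source name for an
// in-memory SQLite database.
//
// shared == false: every new connection gets its own empty database.
// shared == true:  connections in the same process that use the same DSN
//                  see one shared in-memory database.
//
// name is assumed to be a simple identifier (no characters that would need
// URL escaping). An empty name falls back to the anonymous shared database.
func memoryDSN(name string, shared bool) string {
	if !shared {
		return ":memory:"
	}
	if name == "" {
		return "file::memory:?cache=shared"
	}
	return fmt.Sprintf("file:%s?mode=memory&cache=shared", name)
}

func main() {
	fmt.Println(memoryDSN("", false))
	fmt.Println(memoryDSN("", true))
	fmt.Println(memoryDSN("TestAddSong", true))
}
```

The result would be passed as the second argument to sql.Open("sqlite3", ...). A shared in-memory database lives only as long as at least one connection to it stays open, so keep the *sql.DB around for the duration of the test.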
The InMemoryDataLayer implements the DataLayer interface and actually stores the data with correct relationships in its SQLite database. In order to do that, we first need to create a proper schema, which is exactly the job of the createSqliteSchema() function in the constructor. It creates three data tables (song, user, and label) and two cross-reference tables, label_song and user_song.
It adds some constraints, indexes, and foreign keys to relate the tables to each other. I will not dwell on the specific details. The gist of it is that the entire schema DDL is declared as a single string (consisting of multiple DDL statements) that is then executed using the db.Exec() method, and if anything goes wrong, the function returns an error.
func createSqliteSchema(db *sql.DB) error {
	// IF NOT EXISTS on every statement keeps the function safe to call
	// again on a database that already has the schema (shared cache).
	schema := `
	CREATE TABLE IF NOT EXISTS song (
		id INTEGER PRIMARY KEY AUTOINCREMENT,
		url TEXT UNIQUE,
		name TEXT,
		description TEXT
	);
	CREATE TABLE IF NOT EXISTS user (
		id INTEGER PRIMARY KEY AUTOINCREMENT,
		name TEXT,
		email TEXT UNIQUE,
		registered_at TIMESTAMP,
		last_login TIMESTAMP
	);
	CREATE INDEX IF NOT EXISTS user_email_idx ON user(email);
	CREATE TABLE IF NOT EXISTS label (
		id INTEGER PRIMARY KEY AUTOINCREMENT,
		name TEXT UNIQUE
	);
	CREATE INDEX IF NOT EXISTS label_name_idx ON label(name);
	CREATE TABLE IF NOT EXISTS label_song (
		label_id INTEGER NOT NULL REFERENCES label(id),
		song_id INTEGER NOT NULL REFERENCES song(id),
		PRIMARY KEY (label_id, song_id)
	);
	CREATE TABLE IF NOT EXISTS user_song (
		user_id INTEGER NOT NULL REFERENCES user(id),
		song_id INTEGER NOT NULL REFERENCES song(id),
		PRIMARY KEY (user_id, song_id)
	);`

	_, err := db.Exec(schema)
	return err
}
It's important to realize that while SQL is standard, each database management system (DBMS) has its own flavor, and the exact schema definition will not necessarily work as is for another DB.
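As an illustration of how the flavors diverge, the sketch below generates the label table DDL for three dialects. The dialectDDL type, the dialects map, and createLabelTable() are hypothetical names introduced only for this example. It covers just two differences: the syntax for an auto-incrementing primary key, and the fact that MySQL cannot put a UNIQUE constraint on a plain TEXT column without a key length, so a VARCHAR is used there.

```go
package main

import (
	"fmt"
	"strings"
)

// dialectDDL holds the fragments of the label table definition that differ
// between database systems. Hypothetical type, for illustration only.
type dialectDDL struct {
	autoPK     string // auto-incrementing integer primary key column
	uniqueText string // type and constraint for a unique text column
}

var dialects = map[string]dialectDDL{
	"sqlite":   {"id INTEGER PRIMARY KEY AUTOINCREMENT", "TEXT UNIQUE"},
	"mysql":    {"id INT AUTO_INCREMENT PRIMARY KEY", "VARCHAR(255) UNIQUE"},
	"postgres": {"id SERIAL PRIMARY KEY", "TEXT UNIQUE"},
}

// createLabelTable returns the CREATE TABLE statement for the label table
// in the requested dialect, or an error for an unknown dialect.
func createLabelTable(dialect string) (string, error) {
	d, ok := dialects[strings.ToLower(dialect)]
	if !ok {
		return "", fmt.Errorf("unsupported dialect %q", dialect)
	}
	return fmt.Sprintf("CREATE TABLE IF NOT EXISTS label (%s, name %s)",
		d.autoPK, d.uniqueText), nil
}

func main() {
	for _, name := range []string{"sqlite", "mysql", "postgres"} {
		ddl, err := createLabelTable(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(ddl)
	}
}
```

Table names can differ too: user is a reserved word in PostgreSQL, so the user table from the schema above would need to be quoted or renamed there.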
Implementing the In-Memory Data Layer
To give you a taste of the implementation effort of an in-memory data layer, here are a couple of methods: AddSong() and GetSongsByUser().
The AddSong() method does a lot of work. It inserts a record into the song table as well as into each of the cross-reference tables, label_song and user_song, creating any missing labels along the way. If any operation fails, it just returns an error. I don't use any transactions because the data layer is designed for testing purposes only, and I don't worry about partial data in the DB.
func (m *InMemoryDataLayer) AddSong(user User, song Song, labels []Label) error {
	// Insert the song itself.
	s := `INSERT INTO song(url, name, description) values(?, ?, ?)`
	statement, err := m.db.Prepare(s)
	if err != nil {
		return err
	}
	defer statement.Close()

	result, err := statement.Exec(song.Url, song.Name, song.Description)
	if err != nil {
		return err
	}

	songId, err := result.LastInsertId()
	if err != nil {
		return err
	}

	// Look up the user's id by email.
	s = "SELECT id FROM user where email = ?"
	rows, err := m.db.Query(s, user.Email)
	if err != nil {
		return err
	}

	var userId int
	for rows.Next() {
		err = rows.Scan(&userId)
		if err != nil {
			rows.Close()
			return err
		}
	}
	rows.Close()

	// Associate the song with the user.
	s = `INSERT INTO user_song(user_id, song_id) values(?, ?)`
	statement, err = m.db.Prepare(s)
	if err != nil {
		return err
	}
	defer statement.Close()

	_, err = statement.Exec(userId, songId)
	if err != nil {
		return err
	}

	// Prepare the label statements once and reuse them in the loop.
	var labelId int64
	s = "INSERT INTO label(name) values(?)"
	label_ins, err := m.db.Prepare(s)
	if err != nil {
		return err
	}
	defer label_ins.Close()

	s = `INSERT INTO label_song(label_id, song_id) values(?, ?)`
	label_song_ins, err := m.db.Prepare(s)
	if err != nil {
		return err
	}
	defer label_song_ins.Close()

	for _, t := range labels {
		// Reuse an existing label if there is one.
		s = "SELECT id FROM label where name = ?"
		rows, err := m.db.Query(s, t.Name)
		if err != nil {
			return err
		}

		labelId = -1
		for rows.Next() {
			err = rows.Scan(&labelId)
			if err != nil {
				rows.Close()
				return err
			}
		}
		rows.Close()

		// Otherwise, create the label first.
		if labelId == -1 {
			result, err = label_ins.Exec(t.Name)
			if err != nil {
				return err
			}

			labelId, err = result.LastInsertId()
			if err != nil {
				return err
			}
		}

		_, err = label_song_ins.Exec(labelId, songId)
		if err != nil {
			return err
		}
	}

	return nil
}
The GetSongsByUser() method uses a join plus a sub-select on the user_song cross-reference table to return the songs for a specific user. It uses the Query() method and then scans each row to populate a Song struct from the domain object model, returning a slice of songs. The low-level implementation as a relational DB is hidden safely.
func (m *InMemoryDataLayer) GetSongsByUser(u User) ([]Song, error) {
	// The song table stores the title in the "name" column.
	s := `SELECT url, name, description FROM song L
	      INNER JOIN user_song UL ON UL.song_id = L.id
	      WHERE UL.user_id = (SELECT id from user WHERE email = ?)`
	rows, err := m.db.Query(s, u.Email)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var songs []Song
	for rows.Next() {
		var song Song
		err = rows.Scan(&song.Url, &song.Name, &song.Description)
		if err != nil {
			return nil, err
		}
		songs = append(songs, song)
	}

	// Report any error that ended the iteration early.
	if err = rows.Err(); err != nil {
		return nil, err
	}

	return songs, nil
}
This is a great example of utilizing a real relational DB like SQLite for implementing the in-memory data store vs. rolling our own, which would require keeping maps and ensuring all the bookkeeping is correct.
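For contrast, here is a rough sketch of what rolling our own might look like for just the song and user_song parts. The mapStore type, its fields, and the errDuplicate value are hypothetical, and the Song struct is a simplified stand-in for the domain model. Each constraint that the SQL schema declares in one line becomes hand-written bookkeeping.

```go
package main

import (
	"errors"
	"fmt"
)

// Song is a simplified stand-in for the domain model's Song struct.
type Song struct {
	Url, Name, Description string
}

// errDuplicate is a hypothetical error for this sketch only.
var errDuplicate = errors.New("duplicate song")

// mapStore is a hypothetical hand-rolled in-memory store.
type mapStore struct {
	songsByURL  map[string]Song     // replaces the UNIQUE constraint on song.url
	urlsByEmail map[string][]string // replaces the user_song cross-reference table
}

func newMapStore() *mapStore {
	return &mapStore{
		songsByURL:  map[string]Song{},
		urlsByEmail: map[string][]string{},
	}
}

// AddSong enforces URL uniqueness by hand and records the user association.
func (m *mapStore) AddSong(email string, s Song) error {
	if _, exists := m.songsByURL[s.Url]; exists {
		return errDuplicate
	}
	m.songsByURL[s.Url] = s
	m.urlsByEmail[email] = append(m.urlsByEmail[email], s.Url)
	return nil
}

// GetSongsByUser performs the "join" manually, in insertion order.
func (m *mapStore) GetSongsByUser(email string) []Song {
	var songs []Song
	for _, url := range m.urlsByEmail[email] {
		songs = append(songs, m.songsByURL[url])
	}
	return songs
}

func main() {
	store := newMapStore()
	fmt.Println(store.AddSong("gg@gg.com", Song{Url: "u1", Name: "Chacaron"}))
	fmt.Println(store.AddSong("gg@gg.com", Song{Url: "u1", Name: "Chacaron"}))
	fmt.Println(len(store.GetSongsByUser("gg@gg.com")))
}
```

Labels, the label_song table, and user lookups would each need more maps and more invariants to keep in sync, which is the work SQLite takes over.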
Running Tests Against SQLite
Now that we have a proper in-memory data layer, let's have a look at the tests. I placed these tests in a separate package called sqlite_test, and I locally import the abstract data layer (the domain model), the concrete data layer (to create the in-memory data layer), and the song manager (the code under test). I also prepare two songs for the tests from the sensational Panamanian artist El Chombo!
package sqlite_test

import (
	"testing"

	. "abstract_data_layer"
	. "concrete_data_layer"
	. "song_manager"
)

const (
	url1 = "https://www.youtube.com/watch?v=MlW7T0SUH0E"
	url2 = "https://www.youtube.com/watch?v=cVFDlg4pbwM"
)

var testSong = Song{Url: url1, Name: "Chacaron"}
var testSong2 = Song{Url: url2, Name: "El Gato Volador"}
Test methods create a new in-memory data layer to start from scratch and can now call methods on the data layer to prepare the test environment. When everything is set up, they can invoke the song manager methods and later verify that the data layer contains the expected state.
For example, the TestAddSong_Success() test method creates a user, adds a song using the song manager's AddSong() method, and verifies that later calling GetSongsByUser() returns the added song. It then adds another song and verifies again.
func TestAddSong_Success(t *testing.T) {
	u := User{Name: "Gigi", Email: "gg@gg.com"}

	// Setup failures stop the test; continuing would dereference nil values.
	dl, err := NewInMemoryDataLayer()
	if err != nil {
		t.Fatal("Failed to create in-memory data layer")
	}

	err = dl.CreateUser(u)
	if err != nil {
		t.Fatal("Failed to create user")
	}

	lm, err := NewSongManager(u, dl)
	if err != nil {
		t.Fatal("NewSongManager() returned 'nil'")
	}

	err = lm.AddSong(testSong, nil)
	if err != nil {
		t.Error("AddSong() failed")
	}

	songs, err := dl.GetSongsByUser(u)
	if err != nil {
		t.Error("GetSongsByUser() failed")
	}

	// Stop here on a wrong count; indexing below would panic.
	if len(songs) != 1 {
		t.Fatal(`GetSongsByUser() didn't return one song as expected`)
	}

	if songs[0] != testSong {
		t.Error("Added song doesn't match input song")
	}

	// Add another song
	err = lm.AddSong(testSong2, nil)
	if err != nil {
		t.Error("AddSong() failed")
	}

	songs, err = dl.GetSongsByUser(u)
	if err != nil {
		t.Error("GetSongsByUser() failed")
	}

	if len(songs) != 2 {
		t.Fatal(`GetSongsByUser() didn't return two songs as expected`)
	}

	if songs[0] != testSong {
		t.Error("Added song doesn't match input song")
	}
	if songs[1] != testSong2 {
		t.Error("Added song doesn't match input song")
	}
}
The TestAddSong_Duplicate() test method is similar, but instead of adding a new song the second time, it adds the same song, which results in a duplicate song error:
func TestAddSong_Duplicate(t *testing.T) {
	u := User{Name: "Gigi", Email: "gg@gg.com"}

	dl, err := NewInMemoryDataLayer()
	if err != nil {
		t.Fatal("Failed to create in-memory data layer")
	}

	err = dl.CreateUser(u)
	if err != nil {
		t.Fatal("Failed to create user")
	}

	lm, err := NewSongManager(u, dl)
	if err != nil {
		t.Fatal("NewSongManager() returned 'nil'")
	}

	err = lm.AddSong(testSong, nil)
	if err != nil {
		t.Error("AddSong() failed")
	}

	songs, err := dl.GetSongsByUser(u)
	if err != nil {
		t.Error("GetSongsByUser() failed")
	}

	if len(songs) != 1 {
		t.Fatal(`GetSongsByUser() didn't return one song as expected`)
	}

	if songs[0] != testSong {
		t.Error("Added song doesn't match input song")
	}

	// Add the same song again
	err = lm.AddSong(testSong, nil)
	if err == nil {
		// Stop here; calling err.Error() on a nil error would panic.
		t.Fatal(`AddSong() should have failed for a duplicate song`)
	}

	expectedErrorMsg := "Duplicate song"
	errorMsg := err.Error()
	if errorMsg != expectedErrorMsg {
		t.Error(`AddSong() returned wrong error message for duplicate song`)
	}
}
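The test above identifies the failure by comparing error message strings, which breaks as soon as the message is reworded or wrapped with extra context. One possible alternative, sketched here under the assumption that the song manager could expose a sentinel error, is to compare with errors.Is(). The names ErrDuplicateSong, addSong(), and isDuplicate() are hypothetical and do not come from this series' code.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDuplicateSong is a hypothetical sentinel error. The message matches the
// one the test expects, but the series does not show such a variable.
var ErrDuplicateSong = errors.New("Duplicate song")

// addSong is a toy stand-in for the song manager's AddSong(). It wraps the
// sentinel with context, which changes the message but not the identity.
func addSong(seen map[string]bool, url string) error {
	if seen[url] {
		return fmt.Errorf("adding %s: %w", url, ErrDuplicateSong)
	}
	seen[url] = true
	return nil
}

// isDuplicate reports whether err is, or wraps, ErrDuplicateSong.
func isDuplicate(err error) bool {
	return errors.Is(err, ErrDuplicateSong)
}

func main() {
	seen := map[string]bool{}
	fmt.Println(addSong(seen, "url1"))

	err := addSong(seen, "url1")
	fmt.Println(err)
	fmt.Println(isDuplicate(err))
}
```

In a test, the check would then read `if !errors.Is(err, ErrDuplicateSong) { t.Error(...) }`, and it would keep passing even if the message text gains more detail.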
Conclusion
In this tutorial, we implemented an in-memory data layer based on SQLite, populated an in-memory SQLite database with test data, and utilized the in-memory data layer to test the application.
In part three, we will focus on testing against a local complex data layer that consists of multiple data stores (a relational DB and a Redis cache). Stay tuned.