Introduction to Postgres in Node

February 12, 2020

This article is a written version of Rithm School’s Introduction to Postgres in Node.

The goals of this lecture are:

  1. Use pg to connect and execute SQL queries
  2. Understand what SQL injection is and how to prevent it
  3. Examine CRUD on a single resource with Express and pg

The Node SQL Ecosystem

Before we dive into pg as a library, let's first review the options we have when we've determined that we want to use SQL in our Node project.

Let's cover each of our options, from most "full-featured" to least.

ORMs

Previously in our work with Python, we learned about ORMs. ORMs (Object Relational Mappers) are our "heavy-lifters", handling for us the mapping of our relational database to model classes. These relationships are what allow us to write our code in the Object-Oriented way we've become accustomed to.

In Node, a popular choice for this category is Sequelize.

Query Builders

A step below ORMs sit our query builders. Query builders provide us with a thin layer of abstraction over our database, allowing us to use query-like methods in place of raw SQL.

We can even chain methods to create more complex queries, which should feel familiar from the JavaScript (particularly jQuery) we've written previously:

const student = await knex.first("fname", "lname")
                          .from("students")
                          .where("id", id);

Knex, used above, is one of the available options within this category.

Drivers

Finally, we have our third and "lowest level" option, and the one that we'll be covering today: drivers.

Drivers allow us to connect directly to our database and send it raw SQL queries. If we think back to our previous work in Python, the library psycopg2 did exactly this.

The Node driver that we'll be using (and covering in the remainder of this post) is pg.

pg

Installing pg

To install pg, we run the following command within our project:

$ npm install pg

pg + Database Configuration

As we always do, we first need to set up pg to understand what database it's going to be communicating with.

To do so, we'll create a separate db.js file:

/** Database setup for users. */

const { Client } = require("pg");

let DB_URI;

if (process.env.NODE_ENV === "test") {
  DB_URI = "postgresql:///users_test";
} else {
  DB_URI = "postgresql:///users";
}

let db = new Client({
  connectionString: DB_URI
});

db.connect();

module.exports = db;

The code above does several important things for us:

  1. Identifies the database: specifies which database to connect to, based on our current environment.
  2. Establishes the database connection: passing the address of our database to a new instance of Client, we establish a connection to our database.
  3. Exports the database connection: we export the connection we've established for use in the rest of our Node application.
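
The environment check in step 1 can be exercised on its own. Here, dbUri is an illustrative stand-in for the inline logic in db.js, not part of pg:

```javascript
// Stand-alone version of the environment check from db.js above.
// dbUri is our own helper, written only for illustration.
function dbUri(env) {
  return env === "test"
    ? "postgresql:///users_test"
    : "postgresql:///users";
}

console.log(dbUri("test"));      // postgresql:///users_test
console.log(dbUri(undefined));   // postgresql:///users
```

Any value of NODE_ENV other than "test" (including it being unset) falls through to the development database.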

pg Queries

Now that we've established a connection with our database, let's look at how to execute some SQL queries.

To start, at the top of the file we want to make queries, we need to be sure to import our database connection:

const db = require("../db");

In our example, we will be working in our routes.js file, which will hold all of our application's routes.

In the snippet below, we're creating a route to get all of the users in our database. When a user visits our /all endpoint, we run a SQL query and return the results as JSON:

router.get("/all", function (req, res, next) {
  const results = db.query(`SELECT * FROM users`);
  return res.json(results.rows);
});

But there's a problem here! This response won't actually contain any users, even if they exist as rows in our users table.

Database Queries are Asynchronous

In order to return the results of a query, we have to make sure we (a)wait for the operation to be completed before we return it.
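
To see the distinction in plain JavaScript, here's a stand-in for db.query (fakeQuery below is our own illustrative function, not part of pg):

```javascript
// fakeQuery stands in for db.query, purely for illustration.
async function fakeQuery() {
  return { rows: [{ id: 1, name: "Alissa" }] };
}

// Without await, we get a pending Promise -- not the results:
const withoutAwait = fakeQuery();
console.log(withoutAwait instanceof Promise); // true
console.log(withoutAwait.rows);               // undefined

(async () => {
  // With await, we get the resolved value:
  const withAwait = await fakeQuery();
  console.log(withAwait.rows.length); // 1
})();
```

This is exactly why results.rows comes back undefined in the first version of our route.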

With a minor refactor to include async and await, our query will return the results we'd expect:

router.get("/all", async function (req, res, next) {
  const results = await db.query(`SELECT * FROM users`);
  return res.json(results.rows);
});

Query String Interpolation

Sometimes it's the case that we'd like to send dynamic information to our queries.

Perhaps our end user is searching for a particular keyword or column value, and we'd like to incorporate that directly into the query we're executing:

router.get("/search", async function (req, res, next) {
  const type = req.query.type;
  const results = await db.query(
    `SELECT id, name, type
     FROM users
     WHERE type='${type}'`);

  return res.json(results.rows);
});

In the case above, we're accepting a type value that has been submitted by the user. This query would return exactly what we'd expect, but there is a huge problem with this code.

SQL Injection

SQL injection is a technique where an attacker executes undesirable SQL statements on a database.

If we look back to our previous example, as long as the user behaves in the way that we think (i.e., sending in an innocent string value), there is no problem.

However, we should always be coding defensively. If we assume instead that our user is not a good actor, but a malicious one, we discover that our use of direct string interpolation allows a little too much freedom by way of query execution.

For example, if our user searches for a type of "staff", everything works fine.

But, if they instead set the type to "bwah-hah'; DELETE FROM users; --", we have a problem.

The resulting query would be as follows:

SELECT id, name, type
FROM users
WHERE type='bwah-hah'; DELETE FROM users; --'
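
We can reproduce the string-building in plain JavaScript, outside any database, to see exactly what the driver would be handed:

```javascript
// Simulating the interpolation from the /search route above.
const type = "bwah-hah'; DELETE FROM users; --";
const query = `SELECT id, name, type
FROM users
WHERE type='${type}'`;

console.log(query);
// The single quote in the input closes our string literal early,
// smuggling a second, destructive statement into the query text.
```

The trailing -- comments out our original closing quote, so the database sees two complete, valid statements.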

By placing the user input directly into our query, we are also giving them the ability to hijack the query, terminate it early, and finish it off with their own subclause (in this case, DELETE).

How can we avoid being the victims of a SQL injection attack?

Parameterized Queries

By using a feature called parameterized queries, we can ensure that any values injected into our queries that come from our users are sanitized and therefore safe to run against our database.

In the past, when we've worked with an ORM to communicate SQL to our database, we haven't had to worry about this. That's because the ORM, doing many things for us, also took care of this sanitization.

Instead of sending in the user-submitted values directly, we will write placeholders in the query, and then send in an array of the actual values as our second argument, as seen below:

router.get("/users", async function (req, res, next) {
  const type = req.query.type;

  const results = await db.query(
    `SELECT id, name, type
     FROM users
     WHERE type=$1`, [type]);

  return res.json(results.rows);
});

Take note of the parameterized query passed as the first argument to our db.query() method.

We've used a value placeholder, which is designated by the $1. We've also passed in an array of the value, corresponding to the placeholder, as the second argument of the function.

If we were including more than one variable, our placeholders would continue on with $2, $3, etc., and those values would also appear -- in order -- in the array that follows.

We can see this in action in the route snippet below:

router.post("/", async function (req, res, next) {
  const { name, type } = req.body;

  const result = await db.query(
    `INSERT INTO users (name, type) 
     VALUES ($1, $2)
     RETURNING id, name, type`, 
    [name, type]);

  return res.status(201).json(result.rows[0]);
});

Notice that we're using two variables (with placeholders $1 and $2 in our query), whose actual values we're passing in, in order, as the second argument.
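
To make the pairing between placeholders and values concrete, here's a small stand-alone sketch. placeholderMap is a hypothetical helper of our own, written only for illustration; pg performs the actual binding when it sends the query:

```javascript
// Hypothetical helper (not part of pg): shows which value each
// $n placeholder will be bound to.
function placeholderMap(text, values) {
  const map = {};
  for (const ph of text.match(/\$\d+/g) ?? []) {
    map[ph] = values[Number(ph.slice(1)) - 1];
  }
  return map;
}

console.log(placeholderMap(
  "INSERT INTO users (name, type) VALUES ($1, $2)",
  ["Alissa", "staff"]
));
// { '$1': 'Alissa', '$2': 'staff' }
```

Because the values travel separately from the query text, a malicious string stays an inert value and can never terminate the statement.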

💡 In Postgres, we can use the RETURNING clause to specify which columns we'd like back in our response. As with what we've learned about using *, it's always better to be explicit and specific about the values you'd like returned, rather than returning everything in one go.

Committing Database Changes

During our work with SQLAlchemy, we got into the habit of having to explicitly commit our changes. This was because SQLAlchemy put the work that we did into a database transaction.

With pg, this is not the case: each query we run against our database is committed automatically, unless we explicitly open a transaction with BEGIN.

Testing Our Database

As we saw in our setup for pg within our db.js file at the beginning of this post, we want to make sure that we have a separate database for testing purposes.

To do so, we put some conditional logic at the top of our db.js file: if our process is running in a test environment, we use the test database rather than the production one:

const { Client } = require("pg");

const DB_URI = (process.env.NODE_ENV === "test")
  ? "postgresql:///users_test"
  : "postgresql:///users";

let db = new Client({
  connectionString: DB_URI
});

db.connect();

module.exports = db;

💡 If you run your tests and they just "hang" (don't do anything at all), make sure that you've actually created the test database that you're attempting to connect to!
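
Assuming a local Postgres installation and the database names from our first db.js example, the missing databases can be created from the shell (createdb ships with Postgres):

$ createdb users
$ createdb users_test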

Setting Up and Tearing Down Test Database

Before writing any of our actual tests, in our test.js file, we need to be sure that we're setting the NODE_ENV to test, so that we connect with the test database rather than our production database.

After that, we import any dependencies we need, as well as our application and our database connection:

// connect to right DB --- set before loading db.js
process.env.NODE_ENV = "test";

// npm packages
const request = require("supertest");

// app
const app = require("../app");
const db = require("../db");

Once we've set up our database connection, as well as brought in everything we need, we can start writing our tests.

beforeEach

With beforeEach, we specify something that we want to occur before each one of our tests is run. In this case, we'd like to create a single record in the test database that we can work with:

let testUser;

beforeEach(async function() {
  let result = await db.query(`
    INSERT INTO users (name)
    VALUES ('testUser')
    RETURNING id, name`);
  testUser = result.rows[0];
});

afterEach and afterAll

Using afterEach, we clean up our test database by deleting any records created over the course of the previous test:

afterEach(async function() {
  // delete any data created by test
  await db.query("DELETE FROM users");
});

Finally, we use afterAll to close our database connection once all of our tests have finished running:

afterAll(async function() {
  // close db connection
  await db.end();
});

Testing CRUD Actions

Let's look at some example tests, given what we've covered so far.

In the case that we'd like to ensure that our route is returning all users, our test would look like the following:

describe("GET /users", function() {
  test("Gets a list of 1 user", async function() {
    const response = await request(app).get(`/users`);
    expect(response.statusCode).toEqual(200);
    expect(response.body).toEqual({
      users: [testUser]
    });
  });
});

And, if we'd like to test an endpoint for a specific user:

describe("GET /users/:id", function() {
  test("Gets a single user", async function() {
    const response = await request(app).get(`/users/${testUser.id}`);
    expect(response.statusCode).toEqual(200);
    expect(response.body).toEqual({user: testUser});
  });

  test("Responds with 404 if can't find user", async function() {
    const response = await request(app).get(`/users/0`);
    expect(response.statusCode).toEqual(404);
  });
});

To test the creation of a user:

describe("POST /users", function() {
  test("Creates a new user", async function() {
    const response = await request(app)
      .post(`/users`)
      .send({
        name: "Alissa"
      });
    expect(response.statusCode).toEqual(201);
    expect(response.body).toEqual({
      user: {id: expect.any(Number), name: "Alissa"}
    });
  });
});

To test the update of a user:

describe("PATCH /users/:id", function() {
  test("Updates a single user", async function() {
    const response = await request(app)
      .patch(`/users/${testUser.id}`)
      .send({
        name: "Troll"
      });
    expect(response.statusCode).toEqual(200);
    expect(response.body).toEqual({
      user: {id: testUser.id, name: "Troll"}
    });
  });

  test("Responds with 404 if can't find user", async function() {
    const response = await request(app).patch(`/users/0`);
    expect(response.statusCode).toEqual(404);
  });
});

And, last but not least, testing to ensure user deletion works as expected:

describe("DELETE /users/:id", function() {
  test("Deletes a single user", async function() {
    const response = await request(app)
      .delete(`/users/${testUser.id}`);
    expect(response.statusCode).toEqual(200);
    expect(response.body).toEqual({ message: "User deleted" });
  });
});