By the end of this chapter, you should be able to:
One of the most fundamental and important security measures you can take as an API developer is preventing bad user inputs from messing up your API. The overall goal here is that you identify bad inputs as quickly as possible and respond with more 400 - Bad Request responses than 500 - Internal Server Error responses. Recall that a 400 status code is issued when the server proactively identifies that the user has sent some invalid or incorrect data, while a 500 status code indicates that the server itself broke because it was unable to handle the request.
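To make the difference concrete, here is a minimal sketch of catching a bad input before it can crash anything. The helper name parseJsonBody and its return shape are assumptions made for illustration, not part of any framework:

```javascript
// Hypothetical helper: parse a raw request body without letting bad JSON crash the handler.
function parseJsonBody(rawBody) {
  try {
    return { status: 200, data: JSON.parse(rawBody) };
  } catch (err) {
    // JSON.parse throws a SyntaxError on malformed input; report it as the client's mistake.
    return { status: 400, error: 'Request body is not valid JSON.' };
  }
}

const good = parseJsonBody('{"data": {"title": "Power-Up"}}');
const bad = parseJsonBody('{"data": ');
console.log(good.status, bad.status); // 200 400
```

Without the try/catch, the malformed body would surface as an uncaught exception, which is the kind of failure that typically ends up as a 500.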
A server lacking adequate validation can result in:
In this section, we are particularly interested in validating data at any API endpoint that accepts a user-defined JSON payload (for example a POST, PUT, or PATCH to /users). User-written payloads can be large and complex, and they are often entered manually by a user (for example, through form input). Therefore, these payloads are especially prone to errors.
This is where JSON Schema comes in.
There are three main reasons for using a schema validation system:
Before we get into how to use it, let's talk about rolling our own validation first.
Let's assume you have a /books endpoint, and the JSON payload to add a new book looks like this:

    {
      "data": {
        "amazon-url": "http://a.co/eobPtX2",
        "author": "Matthew Lane",
        "isbn-10": "0691161518",
        "isbn-13": "978-0691161518",
        "language": "english",
        "pages": 264,
        "publisher": "Princeton University Press",
        "title": "Power-Up: Unlocking the Hidden Mathematics in Video Games",
        "year": 2017
      }
    }
Your /books POST request handler might look like this:

    router.route('/books').post((req, res, next) => {
      const book = req.body.data;
      if (!book) {
        // pass a 400 error to the error-handler
        let error = new Error('Book payload is required');
        error.status = 400;
        return next(error);
      }
      /* (not implemented) insert the book into the database here */
      return res.status(201).json(book);
    });

In the above example, there is some very light validation going on: it checks that req.body.data is not null or undefined (or any other falsy value). This is the bare minimum amount of validation you would need.
But what if you want title and author to be required fields?

    if (!book.author || !book.title) {
      let error = new Error('Book "author" and "title" are required fields.');
      error.status = 400;
      return next(error);
    }
Not too bad, and we're getting more validation...
But what if users send invalid amazon URLs or ISBNs that are numbers instead of strings?
    // let's assume you've written a validateUrl function
    if (book['amazon-url'] && !validateUrl(book['amazon-url'])) {
      let error = new Error('Amazon URL is not valid.');
      error.status = 400;
      return next(error);
    }
    if (book['isbn-10'] && typeof book['isbn-10'] !== 'string') {
      let error = new Error('ISBN-10 needs to be a string.');
      error.status = 400;
      return next(error);
    }
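The validateUrl function is left unwritten above. One possible version, offered only as a sketch, leans on the URL class built into Node. It checks that the value parses as an http or https URL; it does not check that the URL actually points at Amazon.

```javascript
// One possible validateUrl (an assumption; the original leaves it unimplemented).
function validateUrl(value) {
  if (typeof value !== 'string') return false;
  try {
    const url = new URL(value);
    // only accept web URLs, not things like "ftp:" or "javascript:"
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch (err) {
    // the URL constructor throws a TypeError when the input cannot be parsed
    return false;
  }
}

console.log(validateUrl('http://a.co/eobPtX2')); // true
console.log(validateUrl('not a url')); // false
```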
As you can see in the above examples, if we want to roll our own validation this way, every request handler is just going to have tons of conditional logic checking for all the edge cases. And trust me, there are tons of edge cases! If this backend is powering a web form for example, you can count on getting tons of bad data just from bot spam.
While this can sometimes be a perfectly fine approach, it doesn't scale that well, unless you want to write your own extensive validation framework or constantly be adding more conditionals once you discover more loopholes.
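To get a feel for what a home-grown validation framework involves, here is a small sketch that moves the checks out of the handler and into a list of rules. The names bookRules and collectErrors are hypothetical, and the rules cover only a few of the fields:

```javascript
// Each rule is a plain object: a test that returns true when the book passes, plus a message.
const bookRules = [
  { test: book => Boolean(book.author), message: '"author" is required.' },
  { test: book => Boolean(book.title), message: '"title" is required.' },
  {
    test: book => book['isbn-10'] === undefined || typeof book['isbn-10'] === 'string',
    message: '"isbn-10" needs to be a string.'
  }
];

// Run every rule and collect the messages of the ones that fail.
function collectErrors(book, rules) {
  return rules.filter(rule => !rule.test(book)).map(rule => rule.message);
}

const errors = collectErrors({ title: 'Power-Up', 'isbn-10': 691161518 }, bookRules);
console.log(errors); // [ '"author" is required.', '"isbn-10" needs to be a string.' ]
```

Even this small version needs a new rule for every field and every edge case, which is the maintenance burden a schema is meant to take off your hands.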
JSON Schema is a standard specification for describing JSON documents in a human- and machine-readable format. The exact specification is published at json-schema.org, which also links to more readable guides explaining what the specification means.
We're going to jump right into it using our previous example. Recall the example "Book" JSON payload using Matt's book:
    {
      "data": {
        "amazon-url": "http://a.co/eobPtX2",
        "author": "Matthew Lane",
        "isbn-10": "0691161518",
        "isbn-13": "978-0691161518",
        "language": "english",
        "pages": 264,
        "publisher": "Princeton University Press",
        "title": "Power-Up: Unlocking the Hidden Mathematics in Video Games",
        "year": 2017
      }
    }
Instead of manually writing a JSON schema doc, since we have this nice example already filled out we can head over to jsonschema.net to auto-generate a schema for us. Simply paste the JSON in the box on the left and click "SUBMIT":
In the main box marked "HTML" we have the resulting JSON schema, inferred from our input.
The easiest way to customize it is to click on the "EDIT" tab and adjust the fields. Let's make "data", "author", and "title" required.
Click the save button at the top right (it looks like a floppy disk), and the schema should update.
This is what the resulting schema should look like (on JSONschema.net you can click the copy button and paste it into any .json file):

    {
      "$id": "http://example.com/example.json",
      "type": "object",
      "properties": {
        "data": {
          "$id": "/properties/data",
          "type": "object",
          "properties": {
            "amazon-url": {
              "$id": "/properties/data/properties/amazon-url",
              "type": "string",
              "title": "The Amazon-url Schema ",
              "default": "",
              "examples": ["http://a.co/eobPtX2"]
            },
            "author": {
              "$id": "/properties/data/properties/author",
              "type": "string",
              "title": "The Author Schema ",
              "default": "",
              "examples": ["Matthew Lane"]
            },
            "isbn-10": {
              "$id": "/properties/data/properties/isbn-10",
              "type": "string",
              "title": "The Isbn-10 Schema ",
              "default": "",
              "examples": ["0691161518"]
            },
            "isbn-13": {
              "$id": "/properties/data/properties/isbn-13",
              "type": "string",
              "title": "The Isbn-13 Schema ",
              "default": "",
              "examples": ["978-0691161518"]
            },
            "language": {
              "$id": "/properties/data/properties/language",
              "type": "string",
              "title": "The Language Schema ",
              "default": "",
              "examples": ["english"]
            },
            "pages": {
              "$id": "/properties/data/properties/pages",
              "type": "integer",
              "title": "The Pages Schema ",
              "default": 0,
              "examples": [264]
            },
            "publisher": {
              "$id": "/properties/data/properties/publisher",
              "type": "string",
              "title": "The Publisher Schema ",
              "default": "",
              "examples": ["Princeton University Press"]
            },
            "title": {
              "$id": "/properties/data/properties/title",
              "type": "string",
              "title": "The Title Schema ",
              "default": "",
              "examples": [
                "Power-Up: Unlocking the Hidden Mathematics in Video Games"
              ]
            },
            "year": {
              "$id": "/properties/data/properties/year",
              "type": "integer",
              "title": "The Year Schema ",
              "default": 0,
              "examples": [2017]
            }
          },
          "required": ["author", "title"]
        }
      },
      "required": ["data"]
    }
Great! We now have a massive blob of JSON Schema that we can use for validation (as well as testing, although we will not cover that in this section).
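The generated schema only checks types and required keys. JSON Schema also defines keywords such as format, pattern, and minimum that can express some of the stricter rules from the hand-rolled version. The sketch below shows how three of the properties under "data" might be tightened; the variable name and the specific rules are assumptions for illustration, and whether format is actually enforced depends on the validator you use.

```javascript
// Hypothetical tightened rules for three of the properties under "data".
const stricterProperties = {
  'amazon-url': { type: 'string', format: 'uri' },
  // ten characters: nine digits followed by a digit or the check character X
  'isbn-10': { type: 'string', pattern: '^[0-9]{9}[0-9X]$' },
  pages: { type: 'integer', minimum: 1 }
};

// "pattern" values are regular expressions, so they can be sanity-checked directly.
const isbn10 = new RegExp(stricterProperties['isbn-10'].pattern);
console.log(isbn10.test('0691161518')); // true
console.log(isbn10.test('978-0691161518')); // false
```

If you adopt rules like these, they would replace the matching entries inside "properties" in the .json schema file.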
We'll be using the jsonschema npm package (available on npm and GitHub).
The package basically works like this:
We install this with npm install jsonschema.
Once installed, we can use it in any file like so:
    const express = require('express');
    // import the validate function
    const { validate } = require('jsonschema');
    // require the book schema (a JSON file that we generated on jsonschema.net)
    const bookSchema = require('./bookSchema.json');

    const router = express.Router();

    router.route('/books').post((req, res, next) => {
      // check if the current request.body payload is a valid book
      const result = validate(req.body, bookSchema);
      // jsonschema validation results in a "valid" key being set to "false"
      // if the instance doesn't match the schema
      if (!result.valid) {
        // pass the validation errors to the error handler
        // the "stack" key is generally the most useful
        return next(result.errors.map(error => error.stack));
      }
      // at this point in the code, we know we have a valid payload with a data key
      const book = req.body.data;
      /* (not implemented) insert the book into the database here */
      return res.status(201).json(book);
    });
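If several endpoints need the same check, one option is to wrap it in a small middleware factory. This is a sketch rather than anything the jsonschema package provides: validateBody is a hypothetical name, and the validate argument is assumed to behave like jsonschema's validate(instance, schema), returning an object with a boolean valid and an errors array. A stand-in validator is used here so the example runs without installing anything.

```javascript
// Hypothetical factory: build an Express-style middleware from a validate function and a schema.
function validateBody(validate, schema) {
  return (req, res, next) => {
    const result = validate(req.body, schema);
    if (!result.valid) {
      // hand every validation message to the error handler
      return next(result.errors.map(error => error.stack));
    }
    // the payload matches the schema, so continue to the real handler
    return next();
  };
}

// Stand-in validator (an assumption): only checks that a "data" key exists.
const fakeValidate = (instance, schema) =>
  instance && instance.data
    ? { valid: true, errors: [] }
    : { valid: false, errors: [{ stack: 'instance requires property "data"' }] };

const middleware = validateBody(fakeValidate, {});
const calls = [];
middleware({ body: {} }, {}, arg => calls.push(arg));
middleware({ body: { data: { title: 'Power-Up' } } }, {}, arg => calls.push(arg));
console.log(calls);
```

In a real route this might be used as router.route('/books').post(validateBody(validate, bookSchema), handler), since Express accepts several handlers per route.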
That's all there is to it! With an auto-generated schema from JSONschema.net (with perhaps a few minor tweaks) and the jsonschema npm package, you can easily add robust validation to your Node/Express API to prevent bad inputs.
Final Note: you may have to make your error handler more robust to handle arrays of errors given to you by the validation result. Basically, the validate function will tell you everything wrong with the instance in relation to the supplied schema, so just make sure you have a way to tell the user all of their errors in the error handler:

    app.use((error, req, res, next) => {
      // by default, report the single error message and its status
      let err = error.message;
      let key = 'error';
      let status = error.status || 500;
      // an array means validation errors: report all of them as "errors" with a 400
      if (Array.isArray(error)) {
        err = error;
        key = 'errors';
        status = 400;
      }
      return res.status(status).json({ [key]: err });
    });