Introduction to JSON Schema
An introduction to JSON Schema for those interested in understanding without the gritty technical details.
At its most fundamental, a JSON Schema is a file that describes JSON files in a format that is both human-readable and machine-readable. JSON Schema is written in JSON, which means that, like JSON, the schema is itself data: data about data.
JSON
JSON is a language-independent data interchange format. It is backed by multiple international standards and is in use across many of the applications, client/server systems and services that define the Internet and make it work. JSON is an acronym for JavaScript Object Notation. It represents data using a syntax derived from JavaScript objects, in a language-independent format that can be implemented in many different programming languages. JSON files are conventionally given the file extension .json.
JSON was developed to provide real-time data exchange between clients and servers without the need for installing browser plug-ins like Flash or Java applets. It has moved beyond that initial use case and can now be found in HTTP API requests and responses, output from document databases, API design, and documentation.
JSON supports the following data types:
- string — For character values, denoted by double quotes
- number — For integer and decimal values
- object — For associative arrays
- array — For collections of the other data types
- boolean — For true or false values
- null — For an empty or absent value
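As a minimal sketch, Python's standard json module can show what each of these six types looks like once a document is parsed. The sample document and its key names are made up for illustration only.

```python
import json

# A small JSON document that uses each of the six JSON data types.
text = """
{
  "name": "Jane",
  "age": 27,
  "height": 1.7,
  "address": {"city": "New York"},
  "phones": ["212 555-1234", "646 555-4567"],
  "isAlive": true,
  "spouse": null
}
"""

data = json.loads(text)

# Record the name of the Python type the parser produced for each value.
python_types = {key: type(value).__name__ for key, value in data.items()}

for key, type_name in python_types.items():
    print(f"{key}: {type_name}")
```

JSON has a single number type, so the split into int and float seen here is a choice made by the Python parser, not part of JSON itself.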
The example below is a JSON object describing a person.
{
  "firstName": "Jane",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    }
  ],
  "spouse": null
}
In the example above, you will notice that the data is structured as key/value pairs. What is to the left of the : is the key, and what is to the right is the value for that key. The same data structure is used in JSON Schema, but the specification defines one key/value pair that immediately identifies a JSON file as a JSON Schema.
JSON Schema
{
  "$schema": "http://json-schema.org/draft-07/schema#"
}
In the sample JSON above, the keyword “$schema”, which by convention appears at the top of the file, identifies the file as a JSON Schema. The URI given as its value determines which version (or “dialect”) of JSON Schema the file uses and where that version is defined.
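To illustrate how a tool might use this keyword, here is a small sketch that reads the declared dialect from a JSON document. The function name declared_dialect is hypothetical, and real validators do considerably more than this lookup.

```python
import json
from typing import Optional


def declared_dialect(document_text: str) -> Optional[str]:
    """Return the "$schema" URI of a JSON document, or None if it has none."""
    document = json.loads(document_text)
    if isinstance(document, dict):
        value = document.get("$schema")
        if isinstance(value, str):
            return value
    return None


# Sample inputs, made up for illustration.
schema_text = '{"$schema": "http://json-schema.org/draft-07/schema#"}'
plain_text = '{"firstName": "Jane"}'

print(declared_dialect(schema_text))  # the dialect URI
print(declared_dialect(plain_text))   # None: no "$schema" key present
```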
JSON Schema supports the following data types:
- string — For character values, denoted by double quotes
- number — For integer and decimal values
- integer — For whole-number values only
- object — For associative arrays
- array — For collections of the other data types
- boolean — For true or false values
- null — For an empty or absent value
A central characteristic of JSON Schema, and one that is important to know when first writing schemas, is that unlike a database schema it is a collection of constraints. As you add to a database schema, the database grows; with JSON Schema, adding keywords reduces the set of JSON that is allowed in your application or other use case. For example, an empty JSON Schema ({}) makes all JSON valid and allowed. Each keyword you add narrows that set, and the subset of JSON that remains describes the structure of the JSON files you expect.
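The narrowing effect can be sketched with a toy checker. The is_valid helper below is hypothetical, understands only the "type" and "minimum" keywords, and is not a conforming JSON Schema validator. It is enough to show that each added keyword accepts fewer values.

```python
def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_valid(instance, schema):
    """Toy check supporting only the "type" and "minimum" keywords."""
    type_checks = {
        "string": lambda v: isinstance(v, str),
        "number": is_number,
        "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
        "boolean": lambda v: isinstance(v, bool),
        "object": lambda v: isinstance(v, dict),
        "array": lambda v: isinstance(v, list),
        "null": lambda v: v is None,
    }
    if "type" in schema and not type_checks[schema["type"]](instance):
        return False
    # "minimum" only constrains numbers; other values are unaffected by it.
    if "minimum" in schema and is_number(instance):
        if instance < schema["minimum"]:
            return False
    return True


candidates = ["Jane", -5, 27, True, None, {"age": 27}]

schemas = [
    {},                                 # no constraints
    {"type": "integer"},                # one constraint
    {"type": "integer", "minimum": 0},  # two constraints
]

accepted_counts = []
for schema in schemas:
    accepted = [c for c in candidates if is_valid(c, schema)]
    accepted_counts.append(len(accepted))
    print(schema, "accepts", accepted)
```

The empty schema accepts all six candidates, the second schema accepts only the two integers, and the third accepts only 27.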
We can get a better understanding of JSON Schema by examining how it is used.
API Validation
The most common use of JSON Schema is validation of HTTP requests and responses. When a client makes an API request, the server processes it and returns output in the form of a response.
JSON Schema allows the server to ensure that the JSON it receives is valid and has the expected structure before it processes the request.
The same JSON Schema file, or a different one if needed, can be used to validate the data generated by the server before it returns that data to the client in an HTTP response.
At these two validation steps, JSON Schema checks that the data received in an API request is valid before it is processed, and that the data generated by the server’s processing is valid before it is delivered back to the client in the response.
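The request and response steps described above might be wired together roughly as follows. This is a sketch under stated assumptions: matches is a hypothetical toy check that treats every schema as describing an object and handles only "required" and per-property "type". The two schemas, the handler name and the status codes are also made up for illustration. A real service would use a full JSON Schema validator library instead.

```python
def matches(instance, schema):
    """Toy check: required keys are present and property types are correct."""
    python_types = {"string": str, "integer": int, "boolean": bool}
    if not isinstance(instance, dict):
        return False
    for key in schema.get("required", []):
        if key not in instance:
            return False
    for key, rule in schema.get("properties", {}).items():
        if key in instance:
            value = instance[key]
            expected = python_types[rule["type"]]
            # bool is a subclass of int in Python, so reject it for "integer".
            if not isinstance(value, expected) or (
                expected is int and isinstance(value, bool)
            ):
                return False
    return True


REQUEST_SCHEMA = {
    "required": ["firstName", "age"],
    "properties": {"firstName": {"type": "string"}, "age": {"type": "integer"}},
}

RESPONSE_SCHEMA = {
    "required": ["id", "firstName"],
    "properties": {"id": {"type": "integer"}, "firstName": {"type": "string"}},
}


def handle_create_person(body):
    # Step 1: validate the incoming request before doing any work.
    if not matches(body, REQUEST_SCHEMA):
        return 400, {"error": "request does not match schema"}
    # Step 2: process the request (a stand-in for real business logic).
    result = {"id": 1, "firstName": body["firstName"]}
    # Step 3: validate what the server is about to send back.
    if not matches(result, RESPONSE_SCHEMA):
        return 500, {"error": "response does not match schema"}
    return 200, result


print(handle_create_person({"firstName": "Jane", "age": 27}))
print(handle_create_person({"firstName": "Jane", "age": "27"}))
```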
BENEFITS
Being able to perform validation at these steps improves user experience, integration reliability and security. A JSON Schema validator will identify not only that there is an error in the JSON submitted, but also where the error occurred and the nature of the error. Some examples might be a missing property, or an incorrect type: an integer when a string was expected. This additional information can improve the user experience by showing a user where the error in the input is and what needs to be corrected, instead of simply returning the message: “JSON error”.
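To make the idea of "where and what" concrete, here is a toy error collector. The collect_errors function is hypothetical and covers only "type", "required" and "properties". Real validators report similar information, though the exact shape of their error objects varies by library.

```python
def collect_errors(instance, schema, path="$"):
    """Return (location, message) pairs for a small subset of keywords."""
    errors = []
    type_map = {"string": str, "integer": int, "boolean": bool, "object": dict}
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(instance, type_map[expected])
        # bool is a subclass of int in Python, so reject it for "integer".
        if expected == "integer" and isinstance(instance, bool):
            ok = False
        if not ok:
            found = type(instance).__name__
            errors.append((path, f"expected {expected}, got {found}"))
            return errors
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append((path, f"missing required property '{key}'"))
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                child_path = f"{path}.{key}"
                errors.extend(collect_errors(instance[key], subschema, child_path))
    return errors


schema = {
    "type": "object",
    "required": ["firstName", "lastName"],
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"type": "integer"},
    },
}

# This input omits lastName and sends age as a string.
person = {"firstName": "Jane", "age": "27"}

errors = collect_errors(person, schema)
for location, message in errors:
    print(f"{location}: {message}")
```

Each entry pairs a location in the submitted JSON with a description of the problem, which is the kind of detail a client can show to a user.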
Validating the data before it passes to different components in our system makes the whole system more reliable and reduces risk for new or updated clients interacting with the server.
Finally, validating the JSON input before it is processed by the server can prevent corrupted data from making the server vulnerable. The same is true for responses: validating the JSON in the server’s response can help prevent data leakage if the schema is constrained correctly.
Documentation
The annotations that JSON Schema provides can be used to generate documentation, which is another of the most common uses of the specification, especially in OpenAPI. Below is a snippet of the JSON Schema for the JSON file presented earlier in this post.
{
  "$id": "https://example.com",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name."
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    },
    "isAlive": {
      "type": "boolean",
      "description": "A true or false answer to the question of whether the person is living or deceased."
    },
    "age": {
      "type": "integer",
      "description": "Age in years which must be greater than or equal to zero.",
      "minimum": 0
    }
  }
}
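As a rough sketch of how annotations can become documentation, the snippet below turns the "title" and "description" values of a shortened Person schema into a plain-text summary. The render_docs function and its output layout are assumptions made for illustration. Dedicated documentation generators produce much richer output.

```python
def render_docs(schema):
    """Build a plain-text summary from a schema's annotations."""
    lines = [schema.get("title", "Untitled"), ""]
    for name, rule in schema.get("properties", {}).items():
        type_name = rule.get("type", "any")
        description = rule.get("description", "")
        lines.append(f"- {name} ({type_name}): {description}")
    return "\n".join(lines)


# A shortened copy of the Person schema, written as a Python dict.
person_schema = {
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {
            "type": "string",
            "description": "The person's first name.",
        },
        "age": {
            "type": "integer",
            "description": "Age in years which must be greater than or equal to zero.",
            "minimum": 0,
        },
    },
}

docs = render_docs(person_schema)
print(docs)
```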
API Design
In modern enterprises, applications and services receive data from many sources. Using JSON Schema can make integrations reliable and predictable, because it creates a sort of contract that assures teams other than yours that the data they receive will be in the format expected and agreed on. Compare this to legal contracts: business partners might have conversations about collaborating, but before either side takes action they create a contract specifying all the details of their interactions, to ensure that there is agreement on the structure and terms of their collaboration. A JSON Schema developed in the design phase creates a contract between systems and services which ensures that the data structure and format expected is the structure and format that will be delivered. The value of this step can’t be overstated, as it makes our systems, and therefore our business, more predictable. This step is more complicated than presented here, but we can cover it more fully in a future post.
Automated Testing
Testing differs from validation because it asserts that two systems are able to communicate by agreeing on what interactions can be sent between them and providing concrete examples to test the agreed behaviour (how deep these tests go may be the subject of a separate post). For example, contract testing goes beyond schema testing, requiring both parties to come to a consensus on the allowed set of interactions, which allows those interactions to evolve over time.
Summary
JSON Schema is a powerful tool for ensuring the integrity of design and development through its use in validating the data exchanged between our applications, systems and services. Beyond validation, it is used to generate accurate and complete documentation of our APIs and to test those APIs. JSON Schema improves user experience and the security of the data passing through systems and connecting services.
The breadth of tooling available for JSON Schema makes it the go-to choice for creating the contracts that improve system reliability. And the language independence of JSON Schema makes its implementation possible in most modern programming languages. Again, that is a topic to explore in an intermediate-level post.
Finally, the purpose of JSON Schema is to validate data before it passes to different components in our system, which makes the whole system more reliable and reduces risk for new or updated clients interacting with our servers.