How the Schema-First approach creates a better API, reduces boilerplate, and eliminates human error

In the server team at Cato Networks, we are responsible for building the web console for network and security configuration. Cato Networks is currently experiencing rapid growth: bigger customers require control over the Cato API, and the old solutions, which were built quickly, can no longer handle the scale. In tandem, the development teams are also growing rapidly.

All this led us to move from a large and complex JSON-over-HTTP API, which was used only by our UI, to a public GraphQL API that is exposed to customers while still serving our web application.

We needed to choose a development approach for the API: use code as the single source of truth and have the schema generated from it as an artifact (the Code-First approach), or create a schema and implement code to match its definitions (the Schema-First approach). We decided to go Schema-First, and in this post I'd like to explain why.

FOCUS

When API development starts from writing code, it is hard to stay focused on the API's structure, consistency, and usability. When development starts from describing the API in a schema, abstracted from the actual implementation, the focus is clear. You also have the entire schema in front of your eyes rather than scattered through the codebase, which helps keep it consistent and well organized.

INTERFACE

We treat our schema as an interface between the frontend and the backend, which is visible and clear to both sides.
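As a rough illustration of what such an interface can look like, here is a minimal sketch. The `Site` type, the `site` query, and the stub data are hypothetical and are not taken from the Cato schema; the only point is that both sides code against the same agreed shape.

```typescript
// Hypothetical schema fragment agreed on by both teams (not the real Cato schema).
const typeDefs = `
  type Site {
    id: ID!
    name: String!
  }

  type Query {
    site(id: ID!): Site
  }
`;

// TypeScript shape mirroring the schema; in practice this would be generated.
interface Site {
  id: string;
  name: string;
}

// Signature both sides agree on for the hypothetical `site` query.
type SiteResolver = (args: { id: string }) => Site | null;

// The frontend can develop against a stub that honors the contract...
const stubSite: SiteResolver = ({ id }) => ({ id, name: "Stub site" });

// ...while the backend implements the real resolver with the same signature.
const sites: Record<string, Site> = { "1": { id: "1", name: "HQ" } };
const realSite: SiteResolver = ({ id }) => sites[id] ?? null;

// UI code depends only on the contract, so the stub and the real resolver are interchangeable.
function renderSiteLabel(resolve: SiteResolver, id: string): string {
  const site = resolve({ id });
  return site ? `${site.name} (#${site.id})` : "Unknown site";
}
```

Because `renderSiteLabel` depends only on the agreed signature, swapping `stubSite` for `realSite` requires no change on the consuming side.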
After agreeing on a schema, code on both ends can be written in parallel, without worrying about ending up like a bridge whose two halves fail to meet in the middle.

DECOUPLING

With a Code-First approach, the frontend depends on the backend. Code has to be written on the backend first so that the schema can be generated from it, which slows down both frontend development and schema evolution. In contrast, working Schema-First removes the dependency on the backend team: there is no need to wait for code to be written and a schema to be generated from it. Schema modifications are fast, and the development cycle is shorter.

CODE-GEN

After designing the schema, it is easy to start implementing it, building on generated code on both the frontend and the backend. We hooked code generation into our build process and get server stubs that are perfectly aligned with the schema in terms of arguments, return types, and routing rules. The only thing left to do is to write the actual business logic. There is no need to worry that the server-side code diverges from the schema: upon any change to the schema the code is regenerated, so you can spot problems early, at compile time.

Things are even better on the client side. There are tools that generate a fully functional client library from the schema; we use Apollo's official codegen tool for this purpose. Getting data from the server became a no-brainer: we just call a method on a generated client library that took us zero effort to create.

BEING DECLARATIVE

GraphQL allows you to create custom directives and handlers for them. We used this to move security and validation concerns out of the query resolver code and centralize them, declaratively, inside the schema. For example, we have a @requireAuth directive that can be set on a type, field, or argument to mark it as a restricted part of the API.
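To make the idea more concrete, here is a minimal, dependency-free sketch of how a directive like @requireAuth might be declared in a schema and enforced in one central place. The schema fragment, the directive locations, the `Context` shape, and the `requireAuth` wrapper are all assumptions made for illustration; this is not the actual Cato implementation, and a real server would wire the handler into its GraphQL library's directive support rather than wrap resolvers by hand.

```typescript
// Hypothetical schema fragment: the directive is declared once and applied
// declaratively wherever access must be restricted.
const typeDefs = `
  directive @requireAuth on OBJECT | FIELD_DEFINITION | ARGUMENT_DEFINITION

  type Query {
    version: String!
    accountSecret: String! @requireAuth
  }
`;

// Assumed per-request context; a real server would fill this in from a session or token.
interface Context {
  user?: { id: string };
}

type Resolver<T> = (ctx: Context) => T;

// Central handler: wraps a resolver so the business logic stays free of auth checks.
function requireAuth<T>(resolve: Resolver<T>): Resolver<T> {
  return (ctx) => {
    if (!ctx.user) {
      throw new Error("Not authenticated");
    }
    return resolve(ctx);
  };
}

// Resolvers contain only business logic; the restriction is applied around them.
const resolvers = {
  version: (_ctx: Context) => "1.0",
  accountSecret: requireAuth((_ctx: Context) => "s3cret"),
};
```

In this sketch the wrapping is done manually to keep the example self-contained; the design point is that the access rule lives next to the field definition in the schema, while the resolver body never mentions authentication.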
Here are some self-explanatory examples of our custom-built validation directives:

    @stringValue(regex: String, maxLength: Int, minLength: Int, oneOf: [String!])
    @numberValue(max: Int, min: Int, oneOf: [Int!])
    @email
    @date(format: String, future: Boolean)

This not only reduces the code on the server side, but also gives the frontend hints about the validations it should implement to avoid unnecessary round trips with invalid inputs.

SUMMARY

Almost all of the points described here apply not only to GraphQL APIs but to any schema-based API, such as OpenAPI (aka Swagger) for JSON over HTTP, or gRPC. In any case, Schema-First versus Code-First is a decision that must be made on a per-project basis, taking into consideration the project's specifics and needs. I do hope I've managed to encourage you to at least give the Schema-First approach a chance.