GraphQL: why a schema-first approach?

by Sammy S.

At first sight, GraphQL sounds like just a fancy alternative to REST. Perhaps it is... But, used in the right way, it can be so much more than that. At Digalyze, we believe that a federated GraphQL implementation is the answer that many large corporations have been looking for.

Imagine a world where all electrical plugs were the same in every country; no more need for plug adapters. Now scale that concept up to APIs. This is what a GraphQL schema-first approach proposes: a uniform understanding, throughout a company, of all entities.

In our experience, it is all too common for many different implementations of the same data structure to exist within a single large company. Take a "User" entity, for example: team A might agree that a user should have a first name, a last name, and an address, while team B might have decided that a single "name" field was sufficient, and that the address should simply be a 2-letter country code instead of a free-form country name (e.g. "United States"). If team A one day wants to access team B's API, they will come to the realization that the data structures don't match up, and adapters must now be introduced. Changing team A's structure at that point can turn out to be surprisingly difficult!
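To make the mismatch concrete, here is a hypothetical sketch of the two teams' conflicting types (the field names are illustrative, not taken from any real schema):

```graphql
# Team A's service: split name, free-form country
type User {
  id: ID!
  firstName: String!
  lastName: String!
  address: Address!
}

type Address {
  street: String!
  country: String! # free-form, e.g. "United States"
}
```

```graphql
# Team B's service: single name field, 2-letter country code
type User {
  id: ID!
  name: String!
  countryCode: String! # e.g. "US"
}
```

Neither shape is wrong on its own; the problem is that the two definitions of "User" cannot be exchanged between the services without an adapter.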

In retrospect, we would argue that team A should have been in better communication with team B. But, in reality, this kind of communication is very difficult to achieve. Unless... you are using a GraphQL schema-first approach!

Imagine this solution: you set up a new "golden-source" schema repository, which all teams must contribute to. This repo contains microservice-level GraphQL schemas (e.g. Users, Products, Payments), with each team owning a single microservice/schema. The repo contains no functional code; it exists only to hold the schema definitions. If the shared understanding of the term "User" is to change, it must be updated in this repo first, as a Pull Request. Once that is merged, the responsible team can pull the agreed-upon schema and start developing against it.
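A minimal sketch of one schema file in such a repository might look like this (the file path and field names are illustrative):

```graphql
# schemas/users.graphql -- owned by the Users team,
# merged into the golden-source repo via Pull Request
type User {
  id: ID!
  firstName: String!
  lastName: String!
  countryCode: String! # agreed convention: ISO 3166-1 alpha-2, e.g. "US"
}

type Query {
  user(id: ID!): User
}
```

Because every team reviews changes to this file before merging, the definition of "User" is agreed upon company-wide before any code is written against it.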

Sounds a bit idealistic, right? Actually, no: a lot of concerns can be, and have been, addressed. For example, what about one service being linked to another? The separation of concerns proposed above handles this by "extending" an external type in the relevant microservice. For example, User can be extended to include a "payments" field (which would give us the payments belonging to a user). Done this way, the User team wouldn't see the extension in their schema file; instead, the extension lives in the Payments schema, hence separation of concerns. This is precisely what Apollo Federation is for: tying together microservices that run independently of one another, essentially plugging one microservice into another without the need for an adapter.
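Using Apollo Federation's SDL directives, the Payments team's extension of User could look roughly like this (a sketch in Federation v1 syntax; type and field names are illustrative):

```graphql
# schemas/payments.graphql -- owned by the Payments team
type Payment @key(fields: "id") {
  id: ID!
  amount: Float!
  currency: String!
}

# The Payments service adds a field to User without the
# Users team's schema file ever mentioning payments.
extend type User @key(fields: "id") {
  id: ID! @external # resolved by the Users service
  payments: [Payment!]!
}
```

The gateway stitches the two services together at query time, so a client can ask for a user and their payments in a single query even though the data lives in two independent services.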

What about breaking changes? How will teams communicate them to other teams? What if a breaking change is missed during code review? No probs! The GraphQL SDL is simple enough that such a tool can be written, or, even better, you can use Apollo Graph Manager, which will point out any breaking changes for you on every schema change.

What about deprecation? If you're worried about ending up with clutter from old deprecated fields sitting next to their replacements, there are, once again, very elegant ways of handling this. The first is the "@deprecated(reason: String)" directive, which is quite simple to apply. Next, the simple SDL-to-resolver mapping allows you to add tracing to every single query/mutation/subscription; this way, you can detect which fields are no longer being used in each of your environments, allowing you to clean them up. This is a great example of the communication GraphQL enables between front-end and back-end teams (and, once again, Apollo Engine provides it out of the box).
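In SDL, marking a field as deprecated is a one-line change (continuing the illustrative User example from earlier):

```graphql
type User {
  id: ID!
  # Old field kept temporarily; tooling and IDEs will flag it,
  # and usage tracing tells us when it is safe to delete.
  name: String @deprecated(reason: "Split into firstName/lastName.")
  firstName: String!
  lastName: String!
}
```

Clients introspecting the schema see both the deprecation flag and the reason, so front-end teams can migrate on their own schedule before the field is removed.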

A tool like graphql-code-generator can convert an entire schema into TypeScript, meaning the entire front-end team gets full auto-complete for the backend in their IDEs. The options here are limitless: you could even go as far as failing schema changes on CI if the new implementation would break production.

Apollo extends this tracing idea to also collect usage stats and record query/performance timings, so you can use a VS Code plugin that advises front-end devs, as they write their queries out, with an estimate of how long each query could take (based on historical timings).

Last, but not least, a concern that many people have at first sight with GraphQL: "allowing the front-end to query anything it wants is too much power, whereas a REST endpoint limits the query to what the UI is supposed to consume". Of course, there's a solution for that too: query whitelisting/safelisting. Basically, a tool scans your entire project for GraphQL queries and creates a whitelist of what the UI can consume. It's worth pointing out that this whole workflow is fully automated, and that instead of giving the power to the UI, it gives the power to the UI developer only.
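A whitelisted entry is just an ordinary operation from the UI codebase; a sketch of one that such a tool might extract (operation and field names are illustrative):

```graphql
# Extracted at build time from the UI codebase and registered
# server-side. At runtime the client identifies the operation
# (typically by a hash), so ad-hoc queries not on the
# whitelist are rejected by the server.
query UserPayments($id: ID!) {
  user(id: $id) {
    firstName
    payments {
      amount
      currency
    }
  }
}
```

The developer writes whatever queries the UI needs; the build step, not the running client, decides what the server will accept.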

So, this whole article is beginning to sound like an ad for Apollo Engine! You'd be pleased to know, though, that you could implement all of this tooling yourself if you're using plain GraphQL without Apollo; if you don't need all the bells and whistles, that's very realistic to do. But yes, the Apollo team really is working on some game changers.

In summary, a schema-first GraphQL approach can provide a great communication workflow between teams, allow healthy silos, enable skill-based hiring (e.g. an ElasticSearch expert thriving in the Products team while an OAuth expert thrives in the Users team), deliver fully automated, previously unseen levels of compatibility between front-end and back-end teams, greatly reduce breaking changes, and even improve application performance.

Using AWS EC2 VMs for a better Remote Experience

Recently we have been setting up EC2 Linux instances on AWS and have shifted our development from local to remote machines. We SSH into our VMs with the help of VS Code's Remote - SSH extension. So far, it's been great: it's just like having a local development environment, except with the benefit of everything running in the cloud. Even your extensions (e.g. TypeScript or ESLint) run on the remote machine.

Why? We initially thought it was to explore our inner geekiness :) But we soon discovered it's a great way to keep our development environments completely isolated from our local machines. First of all, the resources of our VMs can be fully dedicated to development, which keeps our CPU fans quiet. We also get to scale our machines up or down on demand. 16GB of RAM not enough? Bring it up to 4000GB of RAM for the next hour if you wish (for $32/hr)! Imagine being able to develop on a used $100 laptop you have lying around, with full computing capacity delivered from a VM. Exposing a port so others can test our local app versions is another perk. Yet another benefit is getting a Linux or Mac development environment while keeping a Windows UI (or swapping them around as you wish). And if you break your VM, simply restore it from an AMI, or set it up from scratch again!

Some of us have even taken this a step further by using Windows EC2 instances as our "virtual desktops". We found that RDP'ing into a Windows VM from our local Windows machines works very well; it practically feels real-time (unless you are doing things that require a GPU). The main benefit really shows when you want to install client-specific tools, VPNs, logins, and bookmarks while leaving your local machine untouched. Also, the idea of being able to switch on and RDP into your remote workspace from any computer in the world truly feels remote.

Starting a new project?

Get in touch