I've recently changed teams at work, and have shifted away from "safe" code bases: ones I've either been working on daily for over a year, or new applications that I've had a hand in from the ground up. Now I'm doing maintenance and adding features to existing code bases written by different teams for different requirements, developed to different coding guidelines. So that's all a bit of a rude awakening that I am still settling into. I'm definitely out of my comfort zone, and having to ask different questions than I usually might. That said: it's a good move and I kinda like that sort of thing, so cool!
For better or for worse, we use a lot of REST web services even for our internal processes. I personally believe this is architecturally questionable, but as no-one thought to ask me the question, I didn't get to partake in that particular discussion. To be fair to the situation, it was a decision made before I was part of the department; and - equally - I'm not entirely convinced of my position that it's a daft idea. Just mostly convinced ;-) Anyway, as the over-used truism declares: we are where we are.
So: I find myself doing some maintenance work on one of our web services. This particular service handles the back-end processing for our web applications that handle image file uploads. As part of the file upload we need to do a few things:
- upload the file and stick it in the appropriate place;
- store some user-provided data about the file (name, description, etc);
- create a few different resizings of the image for various purposes;
- contrive some metadata on all the file variations based on [stuff];
- distribute all the files across our CDN.
The first two of those steps are fast and can be done in real time. The file operations (resizing and distribution) take some time, and obviously the metadata extraction cannot be performed until the files actually exist.
We have these split into two processes: one that always occurs immediately, which uploads the master file and stores it and its user-entered data; then a second process which handles all the slow stuff. This is just so we can provide fast feedback to the UI. Note that conceptually all this is still an atomic operation: both processes need to run.
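To make the shape of that concrete, here's a minimal sketch of the split. The interface and method names are mine for illustration, not from our actual codebase:

```php
<?php
// The two halves of the upload operation. Conceptually it's still one atomic
// operation: a file that's had handleUpload() run against it but not
// processFile() is in an incomplete state.
interface FileOperations
{
    // Fast: store the master file and its user-entered data; returns the new file's ID
    public function handleUpload($fileUpload, array $userData);

    // Slow: create the resizings, contrive the metadata (which needs the files
    // to exist first), and distribute everything across the CDN
    public function processFile($fileId);
}
```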
On the first iteration of the implementation of this, the two processes were fired off by the UI. Basically the upload form was submitted and the main request would handle the upload and the data-write, and then a second request was fired off by AJAX. This is not a robust solution, as it isn't atomic: it's possible for the first call to be made but for the second never to happen, for whatever reason. From an architectural point of view, it's just not up to the view layer to make this call either; the versioning and metadata processing is an adjunct to the main file-handling process on the server: it's more the second half of the same process than a separate process in its own right. Realistically the two are only separated at all because of performance considerations. From the perspective of the consumer of the web service, it should be one call to the API.
This was driven home to us by a second web app needing to do the same operations with the same web service, and the bods on that team knew about the first call but not the second.
So my task was to remediate this.
The solution we adopted was to have the front-end web app make only the one call, POSTing to http://api.example.com/file and passing along the file-upload form information as the POST body: this is what the UI-driven process was already doing. I simply augmented that to put an async job into our queuing system. The queue job receives the ID of the created file, and then it simply makes a second REST call, also POSTing to another end point: http://api.example.com/process. All good.
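Here's a rough sketch of the shape of that. To be clear, it's a sketch, not our actual code: $app['file.service'] and $app['queue'] are hypothetical wirings (our queuing system is an in-house thing, so I'm standing-in a generic push() for it), and the worker snippet assumes Guzzle purely for illustration:

```php
<?php
use Silex\Application;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\JsonResponse;

$app = new Application();

$app->post('/file', function (Request $request) use ($app) {
    // the fast half: store the master file and its user-entered data
    $fileId = $app['file.service']->handleUpload(
        $request->files->get('file'),
        $request->request->all()
    );

    // enqueue the slow half, so the consumer only ever makes this one call
    $app['queue']->push('processFile', ['id' => $fileId]);

    return new JsonResponse(['id' => $fileId], 201);
});

// elsewhere, the queue job handler simply makes the second REST call itself
$processFileJob = function (array $job) {
    $client = new \GuzzleHttp\Client();
    $client->post('http://api.example.com/process', [
        'form_params' => ['id' => $job['id']],
    ]);
};
```

The important bit is that the queue push happens server-side as part of the /file request, so the two halves can't be separated by a flaky or forgetful client.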
A first observation here is that "process" really isn't a great name for a REST end point. An end point should have meaning unto itself, and "process" is meaningless unless one knows it's tightly coupled to an initial end point relating to a /file: "Ah: so it's processing the file". One shouldn't need this sort of context when it comes to a web service end point. Sadly this end point is now out in the wild. Well: here "the wild" is just internal apps, but we can't simply change the URL to something better without scheduling updates to those apps too. I'll be back-logging work to get that sorted out.

Part of my work here was to boyscout-rule the controllers of the "process" end point: moving business logic out of the controller (yeah, don't ask) and into a service tier, which called a repository tier, etc.
Some of the business logic I moved was the decision as to which request parameter to use for the ID. Apparently we have some legacy code somewhere which passes the file ID under another argument name - I don't have the code in front of me right now, so I can't remember what it is - so there was some logic to determine whether to use the legacy argument or the new one. Arguably this is perhaps still the job of the controller - it's dealing with stuff from the request object, which I can sell to myself as being controller logic - but we're pretty stringent about having as little logic in the controller as possible. And as I already needed a service method to do "Other Stuff", I pushed this into a helper in the service too: something like extractDistributionCriteria(array $allCriteria). I don't want to be passing the entire request object into the service, so I extracted the POST params and passed those instead, using Silex's $request->request->all() method, which returns the POST params as an array. Other than that, I was just shuffling logic around, so everything should be grand.
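For illustration, that helper might look something like the below. I must stress the parameter names are placeholders: as I said, I can't remember what the legacy argument is actually called, and I'm guessing "id" for the current one:

```php
class FileService
{
    // Decide which request parameter carries the file ID. "id" and "legacyFileId"
    // are placeholder names: the real legacy argument is whatever that old code passes
    public function extractDistributionCriteria(array $allCriteria)
    {
        if (isset($allCriteria['id'])) {
            return $allCriteria['id'];
        }
        if (isset($allCriteria['legacyFileId'])) {
            return $allCriteria['legacyFileId'];
        }

        throw new \InvalidArgumentException('No file ID present in the request parameters');
    }
}
```

The controller then just calls $service->extractDistributionCriteria($request->request->all()) and stays none the wiser about legacy argument names.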