I got an interesting request from a client. At one of the places I work (I have two jobs, 30 hrs and 20 hrs / week) we provide analytics for a bunch of companies - most of our data is gathered via FOIA (Freedom of Information Act) requests, and we parse it into specific fields and have a great search interface.
Usually, these huge companies approach us and try to get us to sell them continuous access to our data - everything. We turn these requests down, but they are quite frequent.
This other company though - they have a younger guy running development. He approached us and is interested in a Spotify-like model. They want to send us their private documents, and have those indexed by our analysts and stored in an area where they can search just those, or return those private documents mixed in with our normal documents for their users. Like how on Spotify you can have your own songs intermixed with those that they have on their service.
It's a neat project for me, as I'm kind of a database nerd, but it involves a fairly substantial change to how everything works. Luckily, we also do a lot of contract work for companies, and we often work with batches of private data, but we have never built a full-blown interface to work on this private data. This request finally gives us an excuse to build that interface, provided I can get all of the checks and balances in place to make sure that access is never given to the wrong people, and that results a user shouldn't see are never accidentally returned.
I've been going back and forth about how much of the "microservice kool-aid" I need to drink to get this project rolling. I already have a nice authentication system built, but I don't have LDAP or AD or anything in place for federation, if I even want to go that route.
I was thinking about having an authentication service that any number of services would call in a standard fashion. I would then also have a service to return data to the website (and possibly other things). That way the multiple search interfaces that we have would package up their requests in a standard fashion, include the auth token, and send them to that service. The service would unpack things, validate the token, and send requests on to any number of other services, which would perform the actual query. Right now we use Elasticsearch and MariaDB, so we'd have interfaces built for those two products, but in theory this could be anything as long as the exchange formats are agreed upon.
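To make that flow concrete, here is a minimal sketch of the gateway idea, not a real implementation. Everything in it is a placeholder I made up for illustration: the `TOKENS` table stands in for the authentication service, the `fake_elasticsearch` and `fake_mariadb` functions stand in for the real query services, and the request shape (`token`, `sources`, `backends`, `query`) is just one possible exchange format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Principal:
    user: str
    allowed_sources: frozenset  # data sources this user may search


# Hypothetical stand-in for the authentication service. A real one would
# verify a signed token or look up a session rather than read a dict.
TOKENS: Dict[str, Principal] = {
    "tok-alice": Principal("alice", frozenset({"public", "client_a"})),
    "tok-bob": Principal("bob", frozenset({"public"})),
}


def validate_token(token: str) -> Optional[Principal]:
    """Return the principal for a token, or None if the token is unknown."""
    return TOKENS.get(token)


# A backend takes (source, query) and returns a list of hits.
Backend = Callable[[str, str], List[dict]]


def fake_elasticsearch(source: str, query: str) -> List[dict]:
    # Placeholder: a real adapter would query Elasticsearch here.
    return [{"backend": "elasticsearch", "source": source, "q": query}]


def fake_mariadb(source: str, query: str) -> List[dict]:
    # Placeholder: a real adapter would query MariaDB here.
    return [{"backend": "mariadb", "source": source, "q": query}]


BACKENDS: Dict[str, Backend] = {
    "elasticsearch": fake_elasticsearch,
    "mariadb": fake_mariadb,
}


def handle_request(request: dict) -> List[dict]:
    """Validate the token, check every requested source, then fan out."""
    principal = validate_token(request.get("token", ""))
    if principal is None:
        raise PermissionError("invalid token")

    requested = set(request["sources"])
    denied = requested - principal.allowed_sources
    if denied:
        # Reject the whole request instead of silently dropping sources.
        raise PermissionError(f"not allowed: {sorted(denied)}")

    results: List[dict] = []
    for source in sorted(requested):
        for name in request["backends"]:
            results.extend(BACKENDS[name](source, request["query"]))
    return results
```

The one design choice worth pointing out is that the access check happens once, in the gateway, before any backend is touched, and it fails the whole request if any requested source is not allowed. Whether to fail hard or quietly filter is a judgment call; failing hard makes misconfigured clients easier to spot.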
I'm most likely going to create standalone databases for each client that wants to store their data in this fashion, just so that a select * from ... query can't return another client's rows by accident. That puts the strain on the authentication service to make sure we are querying exactly the things the user has access to.
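As a rough sketch of what database-per-client routing could look like: the example below uses in-memory `sqlite3` databases purely as a stand-in for separate MariaDB databases, and the `GRANTS` table, the `documents` schema, and the `search_titles` helper are all hypothetical names invented for the example.

```python
import sqlite3


def make_db(titles):
    """Create an isolated in-memory database holding one documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO documents (title) VALUES (?)", [(t,) for t in titles]
    )
    conn.commit()
    return conn


# One database per data owner. With real MariaDB these would be separate
# databases (ideally with separate credentials), not in-memory sqlite.
DATABASES = {
    "public": make_db(["FOIA response 1", "FOIA response 2"]),
    "client_a": make_db(["Client A private memo"]),
    "client_b": make_db(["Client B private memo"]),
}

# Hypothetical grants: which databases each user may search. In the design
# described above, this answer would come from the authentication service.
GRANTS = {
    "alice": ["public", "client_a"],
    "bob": ["public"],
}


def search_titles(user, term):
    """Search only the databases the user is granted; unknown users get nothing."""
    hits = []
    for db_name in GRANTS.get(user, []):  # deny by default
        rows = DATABASES[db_name].execute(
            "SELECT title FROM documents WHERE title LIKE ? ORDER BY id",
            (f"%{term}%",),
        ).fetchall()
        hits.extend((db_name, title) for (title,) in rows)
    return hits
```

With this layout, `search_titles("alice", "memo")` only ever opens the `public` and `client_a` databases, so a broad query has no way to reach `client_b` rows. The mixed "Spotify-style" view is just the concatenation of hits from every database the user is granted.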
Anyways. Rambling at this point. Is there anything super obvious that I'm missing already?