Architecture advice

Hi All,

I am hoping someone can provide some architecture advice on a project I am working on. First I will outline what we are trying to achieve, then the direction I am currently considering; if anyone has advice or constructive criticism, that would be great.

High Level Objective
Using Particle Borons to collect end-user data and send it via the cellular network to an online database. This data will then need to be cleaned to remove any obvious outliers. We will also be calculating averages and fluctuations in the data over time. The data will then be presented to the individual end users via a dashboard behind a login wall on our website.

Current Proposed Strategy
1. Collect data on the Particle Boron and publish it as an event.
2. Transfer the data from the Particle cloud to Google BigQuery using Stitch and a webhook (a rough sketch of the webhook side follows this list).
3. Use some Google cloud service to filter the data and calculate metrics.
4. Potentially transfer the data to a clean database.
5. Display the relevant data to logged-in users on our website.
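
To make step 2 concrete, here is a minimal Node.js sketch of what the webhook side could look like if we received the Particle webhook ourselves rather than through Stitch. The endpoint path, dataset, table, and column names are all assumptions for illustration; the payload fields (event, data, coreid, published_at) are, I believe, the defaults a Particle webhook sends.

```js
// Minimal webhook receiver: accepts Particle webhook POSTs and streams
// each event into a BigQuery table. All names are placeholders.
const express = require('express');
const { BigQuery } = require('@google-cloud/bigquery');

const app = express();
const bigquery = new BigQuery();
app.use(express.json());

app.post('/particle/ingest', async (req, res) => {
  // Default Particle webhook JSON: event name, payload, device id, timestamp.
  const { event, data, coreid, published_at } = req.body;
  try {
    await bigquery
      .dataset('sensor_data')   // assumed dataset
      .table('raw_readings')    // assumed table
      .insert([{ event, data, device_id: coreid, published_at }]);
    res.status(200).end();
  } catch (err) {
    console.error('BigQuery insert failed:', err);
    res.status(500).end();
  }
});

app.listen(8080, () => console.log('webhook receiver listening on :8080'));
```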

I am a bit unsure whether the Google portion of the above is the best way to go about this. I am also looking into embedded BI software such as Sisense, which, I believe, can connect to a data source, then filter and display the data on our website.

I am hoping someone here can give some guidance as to whether I am going about this the right way, or if I am overly complicating things. Thanks in advance.

The scenario you described will certainly work, as would dozens of similar IoT database and dashboard strategies.

My advice is to consider, when selecting your strategy at the beginning, how easy or hard it will be to scale your project. In other words, develop the project so as to minimize the future time and effort needed to add new accounts/dashboards/Borons as you grow.

If you’re not careful, the time spent adding new customers can add up to substantially more effort than creating the initial backend system. I learned that the hard way.

Thanks for the help. The scaling problem you describe is exactly what I am worried about. I’m very new to web development, so my path towards an end goal is not as structured as I would like it to be, but unfortunately there is no substitute for experience.

If anyone has any high level pointers in the realm of scalability, I am all ears.

You should at least investigate using Google Pub/Sub instead of a webhook to get the data from the devices into the Google cloud. Both methods will work; however, with a bare webhook you need to handle authentication and other things manually.

Using Pub/Sub allows the Google cloud integration to handle the authentication for you. Also, Pub/Sub can buffer the data, so if your data ingestion service goes down, the data won’t be lost.
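
For reference, the receiving side can be a small Node.js pull subscriber. A minimal sketch, assuming you have created a subscription on the integration’s topic (the subscription name is a placeholder, and I’m assuming the integration puts metadata such as the device ID in the message attributes):

```js
// Minimal Pub/Sub pull subscriber for events published through the
// Particle Google cloud integration. Subscription name is a placeholder.
const { PubSub } = require('@google-cloud/pubsub');

const pubsub = new PubSub();
const subscription = pubsub.subscription('particle-telemetry-sub');

subscription.on('message', (message) => {
  const payload = message.data.toString(); // the published event payload
  console.log('event from %s: %s', message.attributes.device_id, payload);
  message.ack(); // acknowledge so Pub/Sub stops redelivering the message
});

subscription.on('error', (err) => console.error('subscriber error:', err));
```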

Thanks for the heads-up. I am currently using a service called Stitch, but Pub/Sub looks very interesting. I am struggling to find a suitable way to get from BigQuery to my hosted MySQL database. All the solutions I come across seem to involve using Node.js or something similar to collect and deliver the data, but from my limited knowledge this means I would have to run the script on my local machine or else spin up a virtual machine for this step.
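
In case it helps anyone at the same step, this is roughly what that Node.js transfer script would look like, as far as I can tell (table, column, and credential details are all placeholders):

```js
// One-shot transfer: query rows out of BigQuery and insert them into a
// hosted MySQL database. All names and credentials are placeholders.
const { BigQuery } = require('@google-cloud/bigquery');
const mysql = require('mysql2/promise');

async function transfer() {
  const bigquery = new BigQuery();
  const [rows] = await bigquery.query({
    query: `SELECT device_id, value, published_at
            FROM \`my-project.sensor_data.raw_readings\`
            WHERE DATE(published_at) = CURRENT_DATE()`,
  });

  const db = await mysql.createConnection({
    host: 'mysql.example.com',
    user: 'app',
    password: process.env.DB_PASSWORD,
    database: 'dashboard',
  });

  for (const row of rows) {
    await db.execute(
      'INSERT INTO readings (device_id, value, published_at) VALUES (?, ?, ?)',
      [row.device_id, row.value, row.published_at.value] // BigQuery timestamps wrap their value
    );
  }
  await db.end();
}

transfer().catch(console.error);
```

Run on a schedule (cron, or something like Cloud Scheduler), this would at least be a single small script rather than a whole always-on service.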

I have been down this path as well and built a SQL-based system that does pretty much what you describe. The issue for me was that you need full-stack skills to do anything, and it is a big undertaking. I moved to Thingsboard.io, and this has made it much simpler to go from design and build to a system with dashboard and widget capabilities, with a minimum of JS to customise the final presentation of the data.

Thanks for the advice; Thingsboard looks interesting. Are you using it instead of publishing to the Particle cloud?

Essentially I am looking for architecture advice. I guess this is typical of inexperienced developers, but I keep branching down a route, coming to an impasse, and backing up to go down a different one.

I have been working from both ends of my architecture, i.e. from the Particle side towards the back end and from the back end towards the Particle side. I have a PHP-based web app up and running, using SQL databases on my web host to register new customers and link them to devices, and I have my Particle application running well and publishing data to the Particle cloud services. I just need to connect both ends of my architecture.

I am now looking to store this data somewhere, do some very simple filtering, and then graph/dashboard it behind the current login wall of my PHP web application.

Option 1: I am able to get the data into a table in Google BigQuery, but I am struggling to set up a Dataflow pipeline to filter this data, and I’m not sure how to create user-specific graphs on my hosted website with data from Google cloud without insisting all users use Google authentication. (A simpler filtering approach is sketched after these options.)

Option 2: I am able to get the data into Azure IoT Hub; from there I am hoping to get it into an SQL database, which I am hoping to connect to through my currently built web application.
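
On the Option 1 filtering problem: a full Dataflow pipeline may be overkill just to drop outliers and compute averages; a plain SQL query run through the BigQuery client can do the same job. A minimal Node.js sketch, with assumed table/column names and an arbitrary outlier cut-off:

```js
// Filter outliers and compute hourly averages directly in BigQuery,
// instead of a Dataflow pipeline. Table/column names are placeholders.
const { BigQuery } = require('@google-cloud/bigquery');

async function hourlyMetrics() {
  const bigquery = new BigQuery();
  const [rows] = await bigquery.query({
    query: `
      SELECT
        device_id,
        TIMESTAMP_TRUNC(published_at, HOUR) AS hour,
        AVG(value)    AS avg_value,
        STDDEV(value) AS stddev_value   -- a measure of fluctuation
      FROM \`my-project.sensor_data.raw_readings\`
      WHERE value BETWEEN -40 AND 85    -- crude cut for obvious outliers
      GROUP BY device_id, hour
      ORDER BY device_id, hour`,
  });
  return rows;
}

hourlyMetrics().then(console.log).catch(console.error);
```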

I guess I’m just talking out loud here, but maybe this might help people who are a few steps behind where I am, if that’s possible :)

Could you elaborate a bit more on the authentication limitations of using a webhook? Is authentication handled if I use the Particle Google integration? Can I still use Particle.publish with the Google Pub/Sub service?

Thanks for the advice.

Yes, you still call Particle.publish from the device, but the event is converted into a Google Pub/Sub message by the Google cloud integration.

If you create a server that accepts HTTP requests from a webhook, it is necessarily exposed to the Internet, so in theory anyone could make the same requests, spoofing data from your devices. You’d need to add some authentication to your webhook to prevent that. Ideally, you’d also encrypt the data using https, which means your server also needs a TLS/SSL server certificate.
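
As a rough illustration (the header name and token are assumptions; Particle webhooks let you add custom HTTP headers, which is one way to carry a shared secret):

```js
// Reject webhook calls that don't carry the shared secret configured
// as a custom header on the Particle webhook. Header name is assumed.
const express = require('express');
const app = express();
app.use(express.json());

app.post('/particle/ingest', (req, res) => {
  if (req.get('X-Webhook-Token') !== process.env.WEBHOOK_TOKEN) {
    return res.status(401).end(); // not from one of our webhooks
  }
  // ... handle the event as usual ...
  res.status(200).end();
});

app.listen(8080);
```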

When you use Google Pub/Sub, the integration handles the authentication to the Google cloud, and it’s done over TLS/SSL.

Not necessarily relevant for your scenario, but Google Pub/Sub is great if you want to run the data manipulation on a server on your own private network. Unlike webhooks, the Pub/Sub client can sit behind a firewall and does not require a static IP address, dynamic DNS, or port forwarding. It also does not require a TLS/SSL server certificate, because the connection is outgoing, to the Google cloud. And Pub/Sub will hold the events if your network goes down, you restart the server, etc.

OK, brilliant, this is very helpful information. I have spent the last 72 hours playing with all the Azure services and might go back to Google cloud.

So currently I have a hosted PHP application that utilises MySQL to store customer data, e.g. contact info and which Particle devices belong to which customers.

Following the Particle Google Cloud tutorial, I have the Particle data going into a Google Pub/Sub topic.

Does anyone have any advice on linking my hosted PHP application to the Particle data in the Pub/Sub topic?

Should I move to hosting my whole application on Google cloud?

I know I can use Node.js to transfer the Particle data to Google Datastore, but I assume this means I would need a personal or virtual machine running that program 24/7?
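
One serverless option I have come across is a Cloud Function subscribed to the Pub/Sub topic, so nothing has to run 24/7. A minimal sketch, with the function name, attribute names, and database details all assumed:

```js
// Cloud Function triggered by the Pub/Sub topic: no always-on machine.
// Writes each Particle event into the hosted MySQL database.
// All names and credentials are placeholders.
const mysql = require('mysql2/promise');

exports.ingestParticleEvent = async (message) => {
  // Pub/Sub delivers the payload base64-encoded in message.data.
  const payload = Buffer.from(message.data, 'base64').toString();
  const deviceId = message.attributes && message.attributes.device_id;

  const db = await mysql.createConnection({
    host: 'mysql.example.com',
    user: 'app',
    password: process.env.DB_PASSWORD,
    database: 'dashboard',
  });
  await db.execute(
    'INSERT INTO readings (device_id, payload) VALUES (?, ?)',
    [deviceId, payload]
  );
  await db.end();
};
```

If I understand the docs correctly, it would be deployed with something like gcloud functions deploy ingestParticleEvent --trigger-topic <topic>, and Google runs it only when events arrive.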

Basically I am hoping to build a minimal viable architecture that is scalable and can be added to or improved over time.

Again thanks for taking the time to give advice.