The previous blog post, “Using BIM and API data to demonstrate opportunities in Ypsilon”, presented the real-time data architecture being researched in the 4APIs project. Multiple sensor data sources provide real-time data about room conditions in Ypsilon. To effectively utilize this continuous stream of IoT data, we must build an architecture that can support it.
As our approach, we have chosen to research and develop a cloud-based solution to capture, store, and serve the data produced by the Ypsilon sensors. More specifically, we have chosen the Microsoft Azure cloud platform for the initial Proof of Concept (PoC). The managed components offered by public cloud providers such as Azure are well suited for various data-related tasks. Furthermore, cloud platforms provide scalability and reliability, both of which become increasingly important as the number of data sources, and of applications that depend on that data, grows.
We base our PoC architecture on the Azure IoT reference architecture as well as on our own tests with dummy data simulated by a Python function. The experiments so far form the first draft of the PoC, which we develop through small incremental changes, monitoring the data flow through each component as it is added. The data flow begins by capturing Ypsilon sensor data with Azure IoT Hub. The data is then stored in Azure Blob Storage, and the stream is divided into a hot path and a cold path.
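To illustrate the kind of simulated data we feed into IoT Hub, here is a minimal sketch of such a Python function. The field names (`roomId`, `temperature`, `humidity`, `co2`) and value ranges are illustrative assumptions, not the actual Ypsilon sensor schema:

```python
import json
import random
from datetime import datetime, timezone

def make_reading(room_id: str) -> str:
    """Build one simulated room-condition reading as a JSON string.

    The schema below is a placeholder; the real Ypsilon sensors
    may report different fields and units.
    """
    return json.dumps({
        "roomId": room_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "temperature": round(random.uniform(19.0, 24.0), 1),
        "humidity": round(random.uniform(30.0, 60.0), 1),
        "co2": random.randint(400, 1200),
    })

# Sending such a payload to IoT Hub would use the azure-iot-device SDK,
# roughly like this (CONN_STR is the device connection string from IoT Hub):
#
#   from azure.iot.device import IoTHubDeviceClient, Message
#   client = IoTHubDeviceClient.create_from_connection_string(CONN_STR)
#   client.send_message(Message(make_reading("A101")))
```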
The hot path streams the data to an outbound API that provides access to the Ypsilon data in real time. In its first draft, the hot path uses an Azure Function that reads the data from Blob Storage. The cold path stores the IoT data in a database so that it can be used for aggregation and analysis; we have chosen Snowflake as our PoC database. Storing the data in the cold path lets us use it in reports, dashboards, and even machine learning.
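The hot-path function essentially has to turn raw blobs back into per-room readings for the API. A minimal sketch, assuming IoT Hub routing writes the messages to Blob Storage as newline-delimited JSON (the real blobs may also be Avro, which would need a different reader):

```python
import json
from typing import Iterable, Iterator

def parse_blob_lines(blob_text: str) -> Iterator[dict]:
    """Parse newline-delimited JSON telemetry from one blob into records."""
    for line in blob_text.splitlines():
        line = line.strip()
        if line:
            yield json.loads(line)

def latest_per_room(records: Iterable[dict]) -> dict:
    """Keep only the newest reading per room for the real-time API.

    Assumes each record carries a 'roomId' and an ISO-8601 'timestamp',
    so string comparison of timestamps matches chronological order.
    """
    latest: dict = {}
    for rec in records:
        room = rec["roomId"]
        if room not in latest or rec["timestamp"] > latest[room]["timestamp"]:
            latest[room] = rec
    return latest
```

Inside an Azure Function, these helpers could be wired to a blob trigger, with the resulting dictionary serialized as the HTTP response of the outbound API.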
We still have many open questions. One of the most important is: what do we mean by real-time? Our initial tests showed that simulated sensor data took roughly 90 seconds to reach Blob Storage, and some additional latency is to be expected at the outbound API. The end-to-end latency can be reduced with other Azure components such as Stream Analytics, but a faster service also increases costs. We continue our tests with Azure components to find a reliable and fast solution that supports both real-time data and analytics on historical data.
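Measuring that 90-second figure consistently requires comparing the device-side send time with the time the message landed in storage. A small helper for that comparison, assuming both timestamps are ISO-8601 strings with explicit UTC offsets:

```python
from datetime import datetime

def path_latency_seconds(sent_iso: str, stored_iso: str) -> float:
    """Seconds between the device-side send timestamp and the moment the
    message was observed in Blob Storage (both ISO-8601 with offsets)."""
    sent = datetime.fromisoformat(sent_iso)
    stored = datetime.fromisoformat(stored_iso)
    return (stored - sent).total_seconds()

# Example: a reading sent at 12:00:00 and stored at 12:01:30
# gives the ~90-second hot-path latency observed in our tests.
```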