During the 18 months we worked on the EcoNet™ mobile applications, the Contractor app and the Nest integration, we faced challenges integrating with Rheem’s existing backend. We created workarounds to mitigate that backend’s inefficiencies and to reduce the legacy technology’s impact on the products we developed. Concurrently, we evaluated the system and planned a multi-faceted solution to improve the power and performance of the platform and to give Rheem’s customers a trusted solution that makes it convenient to control the comforts of their home. We ultimately re-engineered Rheem’s entire backend to improve the overall performance of the EcoNet™ apps, the Contractor app and the Nest integration.
Dart, Aqueduct, PostgreSQL, CouchDB, Erlang, Cowboy, AWS, Simple Email Service, Simple Notification Service, Wink Integration, Alexa Integration and Nest Integration
- Leveraged stable|kernel’s open source server-side framework Aqueduct, written for Google’s Dart
- Designed a distributed system with separate components existing on separate servers to simplify complex information journeys and reduce latency
- Built a communication (or “comm”) server for a direct connection to the equipment
- Engineered a gateway to ensure information requests and transports worked in the most efficient way possible
- Created a key-value store (the equipment database) to provide eventual consistency
- Improved the overall performance of the EcoNet™ mobile apps, Contractor app and Nest integration
- Built a connected system that brings direct, remote control to traditional, previously analog physical equipment (HVAC, water heaters and thermostats)
- Reduced latency by roughly an order of magnitude (reported as a 1000% improvement)
- Reduced Rheem’s hosting costs by 90%
- Load-tested the system with 600,000 equipment connections to ensure its ability to scale to Rheem’s customer base
Re-Engineering Rheem’s Technology Stack
Due to the size of Rheem’s existing backend, actions and data followed a complex journey. A request could sometimes travel through six different pathways, creating several seconds of latency. An improved system would dispatch these requests in milliseconds. After extensive research into different server-side frameworks and languages, we chose to rewrite the majority of the backend in Google’s Dart, a language first unveiled in 2011 at the GOTO conference in Aarhus, Denmark. We chose Dart because it is a simple, elegant language with its own easily deployable, stand-alone virtual machine and a strong set of libraries for building web servers, asynchronous programming and reflection.
We had previously developed Aqueduct, an open-source, server-side web framework written in Dart. Aqueduct promises faster development, experimentation and testing without sacrificing power, which allows for faster feedback cycles and greater stability over time. Having our own server-side framework was another reason we felt Dart was a strong choice for Rheem’s backend system.
Aqueduct also handles authentication, routing, request handling, database querying, documentation generation, testing utilities and other tasks, all integrated within one platform. We counseled Rheem on how Dart could be expected to perform in a system of this size, and we jointly made the decision to move forward with Dart.
Instead of working with the existing monolithic system, we designed a distributed system with separate components running on separate servers. We gave each component a specific focus and responsibility, along with better tools to complete its task. In the original system, the API layer was grouped with other components. To pass messages between the mobile client and the rest of the backend, the API needed a clear communication path. So, our first task was to turn the API layer into its own component.
The next component of the system was a communication (or “comm”) server, which holds a direct connection to the equipment. Separate from the API layer that communicates with the mobile client, the comm server needs to understand the logic of how a homeowner’s equipment continuously interacts with the system.
This is an important piece of the puzzle: it confines a connection problem to the one part that experiences difficulties instead of letting it affect the entire system. We chose Erlang to build the comm server because it allows hundreds of thousands of connections to act independently. In this new system, if something goes wrong with a connection (a power outage, loss of connectivity or bad information), just that one connection is lost while hundreds of thousands of other connections remain intact. As a fail-safe for the worst-case scenario where a server disconnects or fails, we built four different comm servers to scale the connections out. In that situation, only 25% of the connections would be lost for a short period of time. The system can quickly work to correct itself while the other 75% of the system is still fully functioning.
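The isolation idea can be illustrated with a small sketch. This is Python with asyncio, not the Erlang code the comm server is built on, and the names `handle_connection`, `run_connections` and `share_lost` are hypothetical. It only shows the principle that one failing connection is contained to itself, plus the arithmetic behind the 25% figure when connections are assumed to be spread evenly.

```python
import asyncio
from typing import Dict


async def handle_connection(device_id: str, fail: bool) -> str:
    """Simulate one long-lived equipment connection (hypothetical handler)."""
    await asyncio.sleep(0)  # yield control, as real network I/O would
    if fail:
        raise ConnectionError(f"{device_id}: connection lost")
    return f"{device_id}: ok"


async def run_connections(devices: Dict[str, bool]) -> Dict[str, str]:
    """Run every connection independently; one failure never cancels the rest."""
    results = await asyncio.gather(
        *(handle_connection(device, fail) for device, fail in devices.items()),
        return_exceptions=True,  # keep each failure contained to its own task
    )
    return {
        device: ("lost" if isinstance(result, Exception) else "ok")
        for device, result in zip(devices, results)
    }


def share_lost(total_servers: int, failed_servers: int) -> float:
    """Fraction of connections dropped, assuming an even spread across servers."""
    return failed_servers / total_servers


if __name__ == "__main__":
    devices = {"heater-1": False, "hvac-2": True, "thermostat-3": False}
    print(asyncio.run(run_connections(devices)))
    print(share_lost(total_servers=4, failed_servers=1))
```

Erlang gets this isolation from lightweight processes that share no state; the asyncio version above is only an approximation of that behavior for illustration.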
After addressing the API layer and the comm server, we looked inward to the gateway and cache that make up the middle of the system. The gateway creates long-term connections by redirecting equipment information sent through the Wi-Fi adapter to the appropriate comm server. The comm servers and the gateway have a standing agreement; whenever a comm server goes online, it alerts the gateway and gives it permission to start sending long-term connections. Essentially, the gateway decides which comm server has the least burden on it at that instant and then directs the Wi-Fi adapter to send information to that server.
If that comm server doesn’t respond, the gateway sends the information to another comm server. On the client side, all of these actions take place in less than 300 milliseconds, and the client receives a response to its request. We needed these information requests and transports to work in the most efficient way possible, and we engineered the gateway to do just that.
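The gateway’s routing rule, as described above, can be sketched in a few lines. This is a simplified Python model under stated assumptions, not the production gateway: `CommServer`, `Gateway`, `register` and `assign` are hypothetical names, load is assumed to be measured as a plain connection count, and a `responsive` flag stands in for a real health check or timeout.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CommServer:
    name: str
    connections: int = 0      # assumed load metric: open connections
    responsive: bool = True   # stand-in for a real health check


@dataclass
class Gateway:
    servers: Dict[str, CommServer] = field(default_factory=dict)

    def register(self, server: CommServer) -> None:
        """A comm server announces itself when it comes online."""
        self.servers[server.name] = server

    def assign(self) -> Optional[CommServer]:
        """Pick the least-loaded server, falling back to the next one
        whenever a candidate does not respond."""
        for server in sorted(self.servers.values(), key=lambda s: s.connections):
            if server.responsive:
                server.connections += 1
                return server
        return None  # no comm server is currently reachable


if __name__ == "__main__":
    gateway = Gateway()
    for name, load in [("comm-1", 5), ("comm-2", 2), ("comm-3", 3)]:
        gateway.register(CommServer(name, connections=load))
    chosen = gateway.assign()
    print(chosen.name if chosen else "no server available")
```

Sorting by load and walking the list keeps selection and failover in one loop; a real gateway would also need to track disconnects so the counts stay accurate.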
The last and most important component of the backend was a key-value store, the equipment database. Any time the equipment sends new information, the comm server writes it into the equipment database. The database caches the equipment values, shortening the time between a request made by the client and the delivery of the response.
Why is a cache so important? Prior to its inclusion, if a homeowner requested a 3-degree increase in temperature, the mobile client would not display that increase until the actual temperature rose, even though it was the intended result. With the cache in place, the mobile client shows the eventual rise in temperature right away, giving the homeowner peace of mind that they’ve accomplished their desired action. The homeowner understands that the request will not raise the temperature at that exact moment, but that it will eventually occur. In other words, the cache provides eventual consistency: what the client displays and what the equipment reports converge once the equipment catches up.
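A minimal sketch of this behavior follows, assuming the cache keeps two values per piece of equipment: the last value the equipment reported and the value the homeowner asked for. It is an in-memory Python model for illustration only, not the actual equipment database, and `EquipmentCache`, `write_reported`, `request_setpoint` and `read_for_client` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EquipmentState:
    reported: Optional[float] = None  # last value the equipment actually sent
    desired: Optional[float] = None   # requested value, not yet confirmed


class EquipmentCache:
    """In-memory key-value store keyed by a hypothetical equipment ID."""

    def __init__(self) -> None:
        self._store: Dict[str, EquipmentState] = {}

    def _state(self, equipment_id: str) -> EquipmentState:
        return self._store.setdefault(equipment_id, EquipmentState())

    def write_reported(self, equipment_id: str, value: float) -> None:
        """Called whenever the equipment sends a new value."""
        state = self._state(equipment_id)
        state.reported = value
        if state.desired == value:
            state.desired = None  # equipment has caught up; nothing pending

    def request_setpoint(self, equipment_id: str, value: float) -> None:
        """Record the homeowner's request before the equipment confirms it."""
        self._state(equipment_id).desired = value

    def read_for_client(self, equipment_id: str) -> Optional[float]:
        """Show the pending request if there is one, else the reported value."""
        state = self._state(equipment_id)
        return state.desired if state.desired is not None else state.reported


if __name__ == "__main__":
    cache = EquipmentCache()
    cache.write_reported("thermostat-1", 68.0)
    cache.request_setpoint("thermostat-1", 71.0)   # homeowner asks for +3 degrees
    print(cache.read_for_client("thermostat-1"))   # client sees the target at once
    cache.write_reported("thermostat-1", 71.0)     # equipment eventually agrees
    print(cache.read_for_client("thermostat-1"))
```

Keeping the requested and reported values separate lets the client answer immediately while the two values converge in the background.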
The overall performance of the EcoNet™ app, Contractor app and Nest integration improved greatly, in a way that would not have been possible without overhauling the entire technology stack. We cut latency by roughly an order of magnitude (reported as a 1000% improvement) and reduced Rheem’s hosting costs by 90%. We also load-tested the system with 600,000 equipment connections to ensure its ability to scale to Rheem’s customer base.
Our overall goal for Rheem was to deliver their customers a trusted solution that makes it convenient to control the comforts of their home. stable|kernel CEO Joe Conway said, “We asked ourselves, how do we build a system that people can trust? How do we ensure that the client’s actions produce the desired result? And we knew that if the solution didn’t work the very first time, it would likely be the last time a Rheem customer uses it. Our job was to ensure that the product works properly the first time to guarantee continuous use and application.”
The innovation in our work came from piecing all of the components together in a way that worked seamlessly for Rheem. We worked hard to bring to life Rheem’s vision of a connected system that gives direct, remote control over traditional, previously analog physical equipment (HVAC, water heaters and thermostats). Our work also positioned Rheem at the front of its industry in mobile integration and Internet of Things connectivity.
“EcoNet™ Cloud 2.0 is more than just a platform; it’s the Ferrari of the Internet of Things. That’s what stable|kernel has built just for us.” – Arslan Khan, mobile application development manager at Rheem Manufacturing
Our work with Dart on behalf of Rheem caught the attention of Google’s Dart team. They are very excited to see Dart used on a platform like Rheem EcoNet™. The system tests Dart’s full abilities, and Google intends to use Rheem’s story to evangelize the language and framework’s abilities to both the developer world and to CEOs and CTOs looking to improve their technology stacks.