Duetto’s core technology offering has grown rapidly over the past few years as we’ve delivered on the initial promise of a modern Revenue Strategy system for hoteliers through Open Pricing. We accomplished this with most of the same cloud-based architectural decisions we began with, along with a number of changes and improvements made along the way.
[Editor’s note: This is the first entry in a three-part “Building a Better RMS” series, in which Duetto CTO Craig Weissman explains the finer points of the cloud-based architecture that powers Duetto and its GameChanger application.]
It has been some time since I have provided an update on the core features and philosophies of Duetto’s service and technology organization. We feel we have simply built a better overall data warehouse and analytics system for the key data driving revenue and profit within the hospitality industry. We hope to spend much of the remainder of the year describing Duetto’s considerable technical achievements in this “Building a Better RMS” series.
We’ll begin with a brief overview of the key components and design choices in Duetto’s software-as-a-service (SaaS) architecture and deployment processes. I’ll start with the most foundational choice, which was to use Amazon Web Services (AWS). Running our GameChanger app exclusively on AWS makes it more versatile and secure.
Leveraging the Revolution of SaaS
Like many modern technology companies, Duetto has never owned a single piece of physical hardware for deploying its service. We rely exclusively on AWS to deploy and run our service in world-class data centers.
Despite the considerable press that AWS receives for its technological prowess as well as its front-runner business model (and huge revenues), I remain convinced that AWS is underappreciated for the revolution it represents. Simply put, AWS turns hardware into software. Gone are rack-and-stack engineers, co-location fees, provisioning departments and leases, physical drives to replace in person, and so on. AWS is not free by any means, but the freedom it provides to manipulate hardware with a few lines of scripted code is well worth the monthly costs.
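To make the “hardware as software” point concrete, here is a minimal sketch of what provisioning a server can look like with the AWS SDK for Java. The AMI ID, instance type and region below are placeholders for illustration, not our actual configuration:

```java
import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.RunInstancesRequest;
import com.amazonaws.services.ec2.model.RunInstancesResult;

public class ProvisionServer {
    public static void main(String[] args) {
        // Build an EC2 client for a region (placeholder value).
        AmazonEC2 ec2 = AmazonEC2ClientBuilder.standard()
                .withRegion("us-west-2")
                .build();

        // Describe the server we want: image, size, and count.
        RunInstancesRequest request = new RunInstancesRequest()
                .withImageId("ami-12345678")   // hypothetical AMI ID
                .withInstanceType("m3.large")  // a "down the middle" M-class type
                .withMinCount(1)
                .withMaxCount(1);

        // Launch it and print the new instance ID.
        RunInstancesResult result = ec2.runInstances(request);
        System.out.println("Launched: "
                + result.getReservation().getInstances().get(0).getInstanceId());
    }
}
```

A comparably short script can just as easily resize or terminate that instance, which is exactly the flexibility that racks of physical hardware never offered.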
Although the breadth of services offered by AWS is now extremely wide, we continue to use mostly its basic services. This represents a philosophy of staying “down the middle” and even allowing a vendor switch in the future (e.g. to Google’s cloud) if the need were to arise.
These core services include S3 for file storage and backups; EC2, which accounts for the vast majority of our costs, for our actual servers, compute resources and locally attached disk drives; ELB for load balancing; and a small bit of EBS, although we are considering removing that need entirely. We also use a few of Amazon’s higher-level services, namely SES (Simple Email Service) and Route 53 for internal DNS.
Perhaps we should, and will, look into higher-level services such as Redshift or SQS, but to date we have built these services ourselves. For instance, we rely on queueing and scheduling for a large number of background services, but we have tended to write our own versions of these higher-level application services using Java, Spring and Quartz, among other frameworks. As for our core data warehouse and analytics engine, it is written by hand in Java as core Duetto IP.
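As a small illustration of the scheduling side, here is roughly what a Quartz-based background job looks like. The job name and cron expression are purely illustrative and not taken from our codebase:

```java
import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

// A background job of the kind described above; the work inside execute()
// is a placeholder.
public class NightlyRecalcJob implements Job {
    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        // ... rebuild analytics models, refresh caches, etc.
    }
}

class SchedulerBootstrap {
    public static void main(String[] args) throws SchedulerException {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        // Register the job under an illustrative name and group.
        JobDetail job = JobBuilder.newJob(NightlyRecalcJob.class)
                .withIdentity("nightlyRecalc", "analytics")
                .build();

        // Fire it every day at 2 AM (example schedule only).
        Trigger trigger = TriggerBuilder.newTrigger()
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0 2 * * ?"))
                .build();

        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}
```

Keeping this layer in our own Java stack, rather than behind another managed service, is part of the “down the middle” philosophy described above.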
More Secure Data in the Cloud
In terms of topology, all of our services are provisioned in the US West (Oregon) region, which is one of AWS’s largest regions. We deploy virtual servers in various roles using Chef tooling, although we are looking into containerized deployments such as Amazon’s Docker-based offerings. Our current infrastructure is provisioned in what Amazon calls EC2-Classic, but we hope to move to VPC (Virtual Private Cloud), as it offers greater security and other benefits such as the latest and fastest CPU instance types.
Our compute services break down into database servers (running MongoDB, which I’ll write about later) and application servers running our Java stack. We also have a few specialized roles: proxy servers with static IP addresses for outbound communication, “algorithm” servers for analytics and model building (accessed via a service-oriented architecture, or SOA, from our core Java application), and “capture” servers that collect shopping data from customer websites.
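To give a flavor of that service-oriented split, here is a bare-bones sketch of how an application server might call out to an algorithm server over HTTP. The endpoint, payload, and even the choice of HTTP and JSON here are illustrative assumptions rather than a description of our actual internal protocol:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AlgorithmClient {
    private final HttpClient http = HttpClient.newHttpClient();

    // Hypothetical call: ask an internal algorithm service for a forecast.
    public String requestForecast(String hotelId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://algo-internal.example/forecast")) // placeholder host
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"hotelId\":\"" + hotelId + "\"}"))
                .build();

        HttpResponse<String> response =
                http.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```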
The most important principle for all of these services is to avoid any single points of failure (SPOF).
We deploy MongoDB in a clustered manner such that the loss of a node will not impact the service. Similarly, our application servers are designed to be stateless, meaning we can add or remove nodes from our load-balanced pool without any customer impact. Any application server can handle any request type for any of our customers. If a server experiences a hardware failure, only the requests running on it at that moment are affected, and our queueing architecture typically ensures that another worker soon picks up any task that was in process.
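In practice, that clustering means the driver is seeded with every member of the replica set, so a failed node simply drops out while the driver follows the newly elected primary. A minimal sketch with the MongoDB Java driver, using placeholder host names and replica-set name:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

public class MongoConnection {
    public static MongoClient connect() {
        // Seed the driver with all replica-set members (placeholder hosts);
        // if one node goes down, the driver fails over to the new primary.
        return MongoClients.create(
                "mongodb://db1.internal,db2.internal,db3.internal/?replicaSet=rs0");
    }
}
```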
The services for capturing shopping data from customer websites, proxying outbound requests from static IPs and so on are also deployed in a high-availability (HA) manner, with hot standby servers ready in case of failure.
In terms of our backup and disaster recovery (DR) plan, we persist the entire database regularly into S3 and can redeploy all of our services in an automated fashion using our dev-ops and Chef scripts. S3 data is reliably available throughout the US West (Oregon) region, so we can stand up the service in a new availability zone if our primary zone were to experience a major (albeit very unlikely) issue.
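The backup half of that plan is conceptually simple. Here is a minimal sketch of pushing a database dump into S3 with the AWS SDK for Java; the bucket and key names are placeholders, not our real layout:

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import java.io.File;

public class BackupUploader {
    // Upload a local database dump file to S3 (placeholder bucket and key).
    public static void upload(File dumpFile) {
        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withRegion("us-west-2")
                .build();
        s3.putObject("example-db-backups",
                     "nightly/" + dumpFile.getName(),
                     dumpFile);
    }
}
```

Because S3 is durable across the region, the restore side is largely a matter of letting the Chef scripts rebuild the servers and pointing them at the latest dump.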
We believe in optimizing our software and algorithms so that they do not require massive instances with huge amounts of RAM. For our standard app servers, we use “down the middle” M-class instances with 8 gigabytes of RAM. I will describe our server coding philosophies in a later post, but fitting our analytics into this relatively small amount of memory (even while running many requests in parallel) requires a great deal of careful engineering along with Java code profiling and tuning.
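One small, generic example of the kind of instrumentation that supports this sort of tuning is watching heap usage from inside the JVM. This is a standard-library sketch rather than a description of our actual profiling tooling, and the 80 percent threshold is an arbitrary example:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapWatcher {
    // Log a warning when the heap crosses an example threshold.
    public static void logHeapPressure() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        double used = (double) heap.getUsed() / heap.getMax();
        if (used > 0.80) {
            System.out.printf("Heap pressure: %.0f%% of %d MB used%n",
                    used * 100, heap.getMax() / (1024 * 1024));
        }
    }
}
```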
For our database servers we use SSD-backed, I/O-optimized instances. In our test environments we tend to use C3-class machines as our database workhorses; in production we use I2 instances, which combine large SSD drives with a sizeable amount of RAM and CPU. Because we ask Mongo to do little more than store data and read from indexes (as well as churn several operational collections for queues and locks), we tend not to see heavy CPU usage. We also monitor our network usage to ensure that bandwidth will not limit our application’s throughput.
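Those operational collections follow a familiar pattern: an atomic update that lets exactly one worker claim a pending task or lock. A rough sketch with the MongoDB Java driver follows; the collection and field names are hypothetical, not our actual schema:

```java
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.FindOneAndUpdateOptions;
import com.mongodb.client.model.ReturnDocument;
import org.bson.Document;
import java.util.Date;

import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Updates.combine;
import static com.mongodb.client.model.Updates.set;

public class TaskQueue {
    // Atomically claim one pending task so that only this worker runs it.
    public static Document claimNext(MongoCollection<Document> tasks, String workerId) {
        return tasks.findOneAndUpdate(
                eq("status", "pending"),
                combine(set("status", "running"),
                        set("worker", workerId),
                        set("claimedAt", new Date())),
                new FindOneAndUpdateOptions().returnDocument(ReturnDocument.AFTER));
    }
}
```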
In the next entry in this series, I’ll go into greater depth on the databases we run on these servers. MongoDB has been an excellent resource for Duetto, and I’ll share why in detail.