Tuesday, May 22, 2012

#VMware vFabric Blog: Increase performance and scalability of existing applications using SQLFire

The Challenge

Traditional databases should not be used for things they were never designed to do, like supporting thousands of concurrent users.
The main challenge of managing Web applications at cloud scale is performance. Disk-based database architectures are fine when you have a small number of users, but they lack facilities for horizontal scaling and cannot adapt to variable access patterns.
In contrast, SQLFire, an in-memory database from VMware, was designed specifically for these kinds of challenges. With its speed and low latency, SQLFire delivers dynamic scalability and high performance for modern, data-intensive applications, all through a familiar SQL interface.
In this post, I will demonstrate one of the ways SQLFire can increase throughput and decrease latency of your current Web applications.

The Bottleneck

One of the most prevalent bottlenecks in today’s online applications is the database. Web and mobile applications place increasing pressure on the data tier of these solutions, and their current disk-based architectures simply carry too much latency to deal with these kinds of workloads. Traditional databases are the wrong tool for this job.
[Figure: Scaling Web applications creates a database bottleneck]
Disk-based database architectures simply can’t deal with these kinds of read-write access patterns and, beyond a certain level, tend to collapse catastrophically.

The Solution

While it would be easy to assume that the only way to alleviate this bottleneck is to embark on a lengthy effort to rewrite your applications against one of those “big data” solutions everyone seems to be talking about, the truth is that for many enterprise applications this kind of effort would be very expensive and would only further balkanize their data by limiting its accessibility to other data consumers.
To increase throughput and decrease latency of your solution at that scale, you need to bring the data closer to each one of the data consumption points.
[Figure: SQLFire scales horizontally while preserving legacy database access]
SQLFire provides that extra performance with the familiarity of a SQL interface. It accelerates your application, minimizing latency and increasing overall reliability by pooling memory, CPU and network resources across a cluster of machines.
In addition to performance, SQLFire also delivers numerous built-in data synchronization features. For example, to continue supporting legacy data flows in your current solution, SQLFire can also make your new “fast data” available to back-office applications.
SQLFire’s asynchronous “write behind” pattern has been designed for very high reliability and can handle many failure conditions: persistent queues ensure that write operations are performed even when the backend database is temporarily unavailable. SQLFire supports multiple options for how these events are queued, batched, and written to the underlying database.
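To make the write-behind idea concrete, here is a minimal sketch in plain Python. It is not SQLFire code and does not use any SQLFire API: the WriteBehindQueue and FlakyBackend classes are hypothetical stand-ins, and the in-memory deque only plays the role of what would be a persistent queue. The point is the shape of the pattern: writes are accepted immediately, flushed in batches, and kept in the queue when the backend is down.

```python
from collections import deque


class WriteBehindQueue:
    """Toy write-behind buffer: accept writes now, flush to the backend in batches."""

    def __init__(self, backend_write, batch_size=3):
        self.backend_write = backend_write  # callable taking a list of events; may raise
        self.batch_size = batch_size
        self.pending = deque()              # stands in for a persistent queue

    def put(self, event):
        # The caller returns right away; the backend is not touched here.
        self.pending.append(event)

    def flush(self):
        """Try to drain the queue in order; return the number of events written."""
        written = 0
        while self.pending:
            size = min(self.batch_size, len(self.pending))
            batch = [self.pending[i] for i in range(size)]
            try:
                self.backend_write(batch)
            except ConnectionError:
                break  # backend unavailable: keep events queued and retry later
            for _ in batch:
                self.pending.popleft()
            written += len(batch)
        return written


class FlakyBackend:
    """Hypothetical legacy database that can be switched off for the demo."""

    def __init__(self):
        self.available = True
        self.rows = []

    def write(self, batch):
        if not self.available:
            raise ConnectionError("backend down")
        self.rows.extend(batch)
```

If the backend is unavailable when flush() is called, nothing is lost: the events stay queued in their original order and are written by a later flush() once the backend is reachable again.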

The Implementation

Whether you are managing a Java or .NET solution, in most scenarios pointing your application to a SQLFire database is as simple as replacing the application’s database connection string. But before you can do that, you will need to replicate your current data entities into a SQLFire database.
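As a rough illustration of the connection string swap, the sketch below rewrites only the data-source entries of an application's settings. The settings dictionary, the MySQL values and the host names are hypothetical. The SQLFire driver class and URL format shown are my assumption of the thin-client JDBC form (default port 1527), so check them against the SQLFire documentation for your version.

```python
# Hypothetical application settings; only the data-source entries change.
LEGACY_SETTINGS = {
    "driver": "com.mysql.jdbc.Driver",
    "url": "jdbc:mysql://db.example.com:3306/orders",
    "pool_size": 20,
}


def point_to_sqlfire(settings, host, port=1527):
    """Return a copy of the settings whose data source targets SQLFire.

    Assumption: the thin-client driver class and URL format below match
    your SQLFire version; verify both against the product documentation.
    """
    updated = dict(settings)
    updated["driver"] = "com.vmware.sqlfire.jdbc.ClientDriver"
    updated["url"] = f"jdbc:sqlfire://{host}:{port}/"
    return updated


sqlfire_settings = point_to_sqlfire(LEGACY_SETTINGS, "sqlfire.example.com")
```

Everything else in the settings, such as the pool size, is left untouched, which is the whole appeal of this approach.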
Because SQLFire supports standard ANSI SQL-92, replicating a database schema is in most cases pretty simple and fairly automated. Most DBAs already possess numerous commercial tools to perform these tasks.
If necessary, there are also readily available open-source utilities like Apache DdlUtils, or SQLF, the SQLFire command-line tool.
However, to truly benefit from SQLFire’s in-memory performance and linear scalability, we should take some time to design a horizontal data partitioning strategy. This is the act of spreading a large data set (many rows in a table) across multiple servers.
SQLFire uses a partition key to ensure that data is uniformly balanced across all members of a data grid. Correctly identifying the partition key avoids expensive queries across multiple partitions, enables your highly concurrent system to handle thousands of connections, and allows queries to be spread uniformly across the entire data set.
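The routing idea behind a partition key can be sketched in a few lines of Python. This is only a stand-in: the CRC32 hash, the member names and the customer_id column are my assumptions for illustration, and SQLFire's actual placement scheme is not shown here. What the sketch does show is that every row with the same key value maps to the same member, so a query filtered on that key needs to visit only one member.

```python
import zlib
from collections import Counter


def member_for(partition_key, members):
    """Map a partition-key value to one member using a stable hash."""
    digest = zlib.crc32(str(partition_key).encode("utf-8"))
    return members[digest % len(members)]


# Hypothetical three-member data grid and a table partitioned on customer_id.
members = ["server-1", "server-2", "server-3"]
rows = [{"order_id": oid, "customer_id": cid}
        for oid, cid in enumerate(range(1000, 1300))]

# Count how many rows each member would hold.
placement = Counter(member_for(row["customer_id"], members) for row in rows)

# A lookup by customer_id is routed to exactly one member.
target = member_for(1042, members)
```

Printing placement shows how the rows are spread across the members. A key with few distinct values, or one that queries rarely filter on, would defeat the purpose, which is why choosing the key deserves some design time.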
In contrast to highly volatile data, which greatly benefits from partitioning, some of your data elements are pretty static (like lookup tables) and are better replicated across all nodes in the data grid to further increase the performance of individual queries.
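One way the partition-or-replicate choice could be written down is as a clause on the table definition. The helper below is a hypothetical sketch that only builds DDL text. The table and column names are invented, and the PARTITION BY COLUMN and REPLICATE clauses are given as I understand the SQLFire DDL extensions, so verify the exact syntax against the SQLFire reference before running the output.

```python
def create_table_ddl(name, columns, partition_by=None, replicate=False):
    """Build a CREATE TABLE statement with an optional data-placement clause.

    columns is a list of (column_name, sql_type) pairs.
    """
    if partition_by and replicate:
        raise ValueError("choose either partitioning or replication, not both")
    column_sql = ", ".join(f"{col} {sql_type}" for col, sql_type in columns)
    ddl = f"CREATE TABLE {name} ({column_sql})"
    if partition_by:
        # Volatile, high-volume data: spread rows across members by key.
        ddl += f" PARTITION BY COLUMN ({partition_by})"
    elif replicate:
        # Small, static lookup data: keep a full copy on every member.
        ddl += " REPLICATE"
    return ddl


orders_ddl = create_table_ddl(
    "orders",
    [("order_id", "INT PRIMARY KEY"), ("customer_id", "INT")],
    partition_by="customer_id",
)
countries_ddl = create_table_ddl(
    "countries",
    [("code", "CHAR(2) PRIMARY KEY"), ("name", "VARCHAR(64)")],
    replicate=True,
)
```

Here the busy orders table is partitioned on customer_id, while the small countries lookup table is replicated to every member.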
You can apply all these extended DDL attributes to your SQLFire schema after creation, using the “ALTER” command. For more information on these and other ways of managing your SQLFire data, see the Managing Your Data in vFabric SQLFire documentation.

The Source

To go along with this post, I put together a sample application illustrating the integration of SQLFire into an online application. The source code of this application is available on GitHub.

The Demo

To demonstrate the impact of SQLFire, I deployed this sample application to a shared hosting environment. While the specific database performance depends on many variables, the purpose of this short demo is to focus on the relative gains SQLFire delivers to a generic Web application while continuing to support the legacy database.

The Conclusion

Hopefully, you got a glimpse of the scale-out nature of SQLFire and the flexibility of the different use cases it can address. As with any technology, there is no substitute for rolling up your sleeves and trying it yourself. SQLFire licensing allows for up to three nodes to be deployed free of charge for development purposes. Go ahead and see for yourself how much faster SQLFire can make your data.
