Bas Geertsema.net

SaaS and the need for dynamic languages

by Bas 25. August 2009 13:38

A key distinction between SaaS and non-SaaS is the housing of multiple tenants in a single instance. This requires mass-customization techniques. Most of the customization, or variability, has to be defined at runtime; (re)deploying the instance should be kept to a minimum! As a result, the software engineer has to design the application with support for runtime variability. For more advanced customization scenarios such as business logic and workflows, you will need a (Turing-complete) programming language to support this. Statically typed languages are at a serious disadvantage here compared to dynamic (scripting) languages, as the former require compilation, building and linking, which is not nearly as easy as embedding a simple interpreter.

My guess is that statically typed languages will continue to be used for the instance itself, the core application. Building on top of that, all user customizations, or user applications, will mostly be developed using dynamic languages such as JavaScript or Python, as they are much easier to work with at runtime than the current generation of statically typed languages such as Java and C#.
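To make the idea concrete, here is a minimal sketch of a core application that delegates one step to tenant-supplied code loaded at runtime. Everything in it is an assumption for illustration: the tenant names, the `TENANT_SCRIPTS` store, and the `discount` hook are hypothetical, and a real system would need proper isolation, since `exec()` is not a sandbox.

```python
# Hypothetical tenant customizations, stored as plain text (for example in a
# database) and loaded at runtime without redeploying the core application.
TENANT_SCRIPTS = {
    "acme": """
def discount(order_total):
    return order_total * 0.10 if order_total > 100 else 0.0
""",
    "globex": """
def discount(order_total):
    return 5.0
""",
}


def load_tenant_function(tenant_id, name):
    """Run a tenant's script in a fresh namespace and return one function."""
    namespace = {}
    # NOTE: exec() gives no isolation; untrusted tenant code needs a sandbox.
    exec(TENANT_SCRIPTS[tenant_id], namespace)
    return namespace[name]


def price_order(tenant_id, order_total):
    """Core logic that stays fixed; only the discount rule varies per tenant."""
    discount = load_tenant_function(tenant_id, "discount")
    return order_total - discount(order_total)
```

Changing a tenant's rule here means replacing a string, not rebuilding and redeploying the instance, which is the advantage the interpreter-based approach has over a compile, build and link cycle.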

Tags:

Computer | Professional | English

Multi-tenancy.. do you really need structured data?

by Bas 18. August 2009 15:26

One of the big architectural decisions you have to make in multi-tenant software is how you store and partition your data. In many cases you will choose an RDBMS such as SQL Server, MySQL or Oracle. The choice of an RDBMS is well founded; you get powerful querying, transactions, recovery, backups, indexing and management capabilities. Sure, an RDBMS might not easily scale out, but by scaling up you can come a long way.

Once you go the RDBMS route you have to figure out a database schema that will suit your application. Of course it has to support multi-tenancy. And since you have only a single instance of your application, the application itself must make sure that every query touches only the correct tenant's data. In your database design you basically have three options:

- A single database for each tenant

- A shared database, with a separate schema for each tenant

- A shared database, shared schema
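As a rough illustration of the third option, the sketch below keeps all tenants in one table and scopes every query by a tenant column. It uses an in-memory SQLite database purely as a stand-in; the `customer` table, the `tenant_id` column and the `customers_for` helper are hypothetical names, not taken from any real application.

```python
import sqlite3

# Shared database, shared schema: one table for all tenants, one extra column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " tenant_id TEXT NOT NULL,"
    " name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customer (tenant_id, name) VALUES (?, ?)",
    [("acme", "Alice"), ("acme", "Bob"), ("globex", "Carol")],
)


def customers_for(conn, tenant_id):
    """Return customer names for one tenant; the filter is never optional."""
    rows = conn.execute(
        "SELECT name FROM customer WHERE tenant_id = ? ORDER BY name",
        (tenant_id,),
    )
    return [name for (name,) in rows]
```

With the other two options the isolation moves out of the `WHERE` clause and into the connection: the application would pick a database or a schema per tenant instead of filtering rows.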

This has all been well explained in this MSDN document, and that is not what I want to focus on. Instead, I want to share some design struggles I had with this.

I have recently been quite busy figuring out which path to take. The difficult part here is how to deal with tenant-specific customizations. For example, different tenants might have the same business entity extended with different attributes. This does not align well with a relational database, which expects a fixed schema. So you have two options. The first is to design a very generic and flexible schema, in which case all variability is handled by your application layer. But this tends to lead to awkward and inflexible querying and an unclear schema with names like attribute1, attribute2, attribute3, and a lot of meta-data. The second option is to modify the schema at runtime, which is obviously only possible if each tenant has its own schema or database; with a shared schema it is not possible. The runtime modification of database schemas seemed like a sensible approach at first. Just use DDL for schema modifications and introspection, and let some application layer do the translation work. But this quickly turns out to be quite complex and error-prone. For example, what do you do with schema updates in your application? And how do you upgrade existing tenant data to a revised schema?
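To show why the generic-schema route gets awkward, here is a small sketch of one common way to build it, an entity/attribute/value layout, again on in-memory SQLite. The table names and the `find_entities` helper are made up for this example. Note how every attribute in a filter costs another self-join, and how all values end up stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL);
CREATE TABLE entity_attribute (
    entity_id INTEGER NOT NULL REFERENCES entity(id),
    name      TEXT NOT NULL,
    value     TEXT,                 -- every value is stored as text
    PRIMARY KEY (entity_id, name)
);
INSERT INTO entity (id, tenant_id) VALUES (1, 'acme'), (2, 'acme'), (3, 'globex');
INSERT INTO entity_attribute VALUES
    (1, 'color', 'red'), (1, 'size', 'L'),
    (2, 'color', 'red'), (2, 'size', 'M'),
    (3, 'color', 'red');
""")


def find_entities(conn, tenant_id, **filters):
    """Find entity ids matching all filters; one self-join per attribute."""
    sql = "SELECT e.id FROM entity e"
    params = []
    for i, (name, value) in enumerate(filters.items()):
        sql += (f" JOIN entity_attribute a{i} ON a{i}.entity_id = e.id"
                f" AND a{i}.name = ? AND a{i}.value = ?")
        params += [name, value]
    sql += " WHERE e.tenant_id = ? ORDER BY e.id"
    params.append(tenant_id)
    return [row[0] for row in conn.execute(sql, params)]
```

A query that would be a simple `WHERE color = 'red' AND size = 'L'` against a fixed schema becomes generated SQL here, and the database can no longer check types or enforce which attributes exist.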

Dealing with these problems, I figured that the real problem might be that I was locked in this ‘it has to fit in SQL’ mindset. When I came to think of it, a lot of these customer extensions just end up in some forms or reports. They do not interfere with the core functionality of your application. Why, then, should you store this data in a very structured way, with all the associated hassle? Why not use, let’s say, the semi-structured XML data column in SQL Server to store all this tenant-specific, and possibly ambiguous, data? The world-wide web is pretty much semi-structured and ambiguous, but it works pretty well in the end, doesn’t it? And if it works for the web, why should it not work for me?

My current approach therefore is to make an explicit distinction between the structured and semi-structured data I deal with in my application. As it turned out, the structured data is largely the same across the different tenants. This allows me to adopt a shared database, shared schema approach. This might or might not be the best way to go security-wise (tenants should never, ever be able to see each other's data), but at least I do have the option. This distinction also leads to a much more elegant and robust design with less complexity.
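A minimal sketch of that split could look like the following. It is not my actual implementation: SQLite has no XML column type, so a plain TEXT column holding an XML document stands in for the SQL Server xml column, and the `extras` column, the `<field>` element layout and both helper functions are assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE customer (
    id        INTEGER PRIMARY KEY,
    tenant_id TEXT NOT NULL,                   -- structured, shared by all tenants
    name      TEXT NOT NULL,                   -- structured, shared by all tenants
    extras    TEXT NOT NULL DEFAULT '<extras/>' -- semi-structured, per tenant
)""")


def save_customer(conn, tenant_id, name, extras):
    """Store fixed columns as-is and tenant-specific fields as one XML blob."""
    root = ET.Element("extras")
    for key, value in extras.items():
        ET.SubElement(root, "field", name=key).text = str(value)
    xml = ET.tostring(root, encoding="unicode")
    cur = conn.execute(
        "INSERT INTO customer (tenant_id, name, extras) VALUES (?, ?, ?)",
        (tenant_id, name, xml),
    )
    return cur.lastrowid


def load_customer(conn, tenant_id, customer_id):
    """Load one customer, scoped by tenant, and unpack the XML into a dict."""
    row = conn.execute(
        "SELECT name, extras FROM customer WHERE tenant_id = ? AND id = ?",
        (tenant_id, customer_id),
    ).fetchone()
    if row is None:
        return None
    name, xml = row
    fields = {f.get("name"): f.text for f in ET.fromstring(xml).iter("field")}
    return {"name": name, "extras": fields}
```

The table definition never changes when a tenant adds a field; the price is that the extra fields come back as untyped text and are not indexed, which is fine as long as they only feed forms and reports.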

So, the next time you are struggling with a database schema, think it over: does the data really need to be structured? Should it really be indexable and queryable? Or does it allow for a semi-structured approach? Choosing less rigidity and more flexibility might save you a lot of trouble down the road.

Tags:

English | Computer | Professional
