Scalability

In the context of software, scalability means that a system of hardware and software can increase its performance proportionally (ideally linearly) by adding resources or additional nodes/computers within a defined range. (In business administration, however, the term refers more generally to the ability of a business model to expand, for example by entering new markets or through franchising.)

A generally accepted definition of this term is, however, not trivial. It is necessary to specify a concrete range for each particular case (for example, a system that copes well with 100 concurrent accesses does not necessarily scale equally well to 100,000 accesses).

Resources can be, for example, CPU, RAM, hard drives, or network bandwidth.

The scalability of a system is given by the scaling factor (also called speedup).

Vertical vs. horizontal scaling

One can enhance the performance of a system in two different ways:

Vertical scaling (scale-up)

Vertical scaling means improving the performance of a system by adding resources to a single node/computer. Examples would be increasing memory, adding a CPU, or installing a more powerful graphics card.

Typical for this type of scaling is that a system can be made faster regardless of how the software is implemented; that is, not a single line of code needs to be changed to gain performance through vertical scaling. The big disadvantage is that sooner or later a limit is reached at which the computer can no longer be upgraded, namely when the best hardware currently on the market is already in use.

Horizontal scaling (scale-out)

In contrast to vertical scaling, horizontal scaling has no limits (from a hardware perspective). Horizontal scaling means increasing the performance of a system by adding additional computers/nodes. The efficiency of this type of scaling, however, depends strongly on the implementation of the software, since not all software is equally parallelizable.

Types of scalability

There are basically four types of scalability:

Load scalability

Load scalability stands for constant system behavior over large load ranges. This means that a system shows no excessive delay under low, medium, or high load alike, and requests can be processed quickly.

Example: museum cloakroom

In the cloakroom of a museum, where visitors drop off their coats and pick them up again, the first-come-first-served principle applies. There is a limited number of coat hooks and a larger number of visitors. The cloakroom, in front of which the visitors line up in a row, is a carousel. To find a free hook or their own jacket, each visitor searches it linearly.

Our goal now is to maximize the time a visitor can actually spend in the museum.

The performance of this system is dramatically poor under high load. First, searching for a free hook becomes much more expensive the fewer free hooks are available. Second, under heavy load (e.g., in winter) a deadlock is inevitable. In the morning all visitors drop off their jackets; in the evening they all pick them up again. A deadlock is likely to occur around noon and in the early afternoon, when no free coat hooks are left and more and more visitors are waiting at the end of the queue to pick up their jackets.

Visitors who want to pick up their jackets could resolve this deadlock by asking the arriving visitors to let them go ahead in the queue. Since those picking up their jackets only ask to go ahead after a certain timeout, this system performs very poorly.

Increasing the number of coat hooks would only postpone the problem, not fix it. The load scalability is therefore very poor.

Spatial scalability

A system or application exhibits spatial scalability if its memory requirement does not grow to an unacceptable degree with a growing number of elements to be managed. Since "unacceptable" is a relative term, one usually speaks of acceptable in this context if the memory requirement grows at most sub-linearly. To achieve this, a sparse matrix or data compression, for example, can be used. Since data compression takes a certain amount of time, however, this is often in conflict with load scalability.

Spatio-temporal scalability

A system has spatio-temporal scalability if increasing the number of objects it contains does not significantly affect its performance. For example, a search engine with linear complexity has no spatio-temporal scalability, whereas a search engine with indexed or sorted data, e.g., using a hash table or a balanced tree, can very well exhibit spatio-temporal scalability.
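
As a minimal illustration, the following Python sketch contrasts a linear scan, whose lookup cost grows with the number of stored objects, with a hash-based lookup whose cost stays roughly constant:

    import time

    def linear_lookup(records, key):
        # O(n): cost grows with the number of stored objects
        for r in records:
            if r == key:
                return True
        return False

    def hashed_lookup(index, key):
        # O(1) on average: cost stays roughly constant as the index grows
        return key in index

    records = list(range(1_000_000))
    index = set(records)          # hash-based index over the same data

    start = time.perf_counter()
    linear_lookup(records, 999_999)
    print("linear scan:", time.perf_counter() - start)

    start = time.perf_counter()
    hashed_lookup(index, 999_999)
    print("hash lookup:", time.perf_counter() - start)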

Structural scalability

Structural scalability characterizes a system whose implementation does not significantly impede increasing the number of objects within a defined range.

Relationship between the types of scalability

Since a system can of course exhibit several types of scalability, the question arises whether and how these relate to each other.

The load scalability of a system is not necessarily adversely affected by poor spatial or structural scalability. Systems with poor spatial or spatio-temporal scalability may, however, also have poor load scalability due to the overhead of memory management and the high search effort. Systems with good spatio-temporal scalability may still have poor load scalability under certain circumstances, for example if they are not sufficiently parallelized.

The relationship between structural scalability and load scalability is as follows: while the latter has no effect on the former, the reverse can very well be the case.

The different types of scalability are therefore not entirely independent of each other.

Scaling factor

The scaling factor (speedup) describes the actual performance gain per additional unit of resources. For example, a second CPU might deliver 90% additional performance.

Super-linear scalability exists when the scaling factor increases as resources are added.

Linear scalability means that the scaling factor of a system remains the same per added unit of resources.

Sub-linear scalability, in contrast, means that the scaling factor decreases as resources are added.

Negative scalability is reached when performance actually worsens as resources/computers are added. One faces this problem when the administrative overhead caused by the additional computer is greater than the performance gain it provides.

Amdahl's law is a relatively pessimistic model for estimating the scaling factor. Gustafson's law offers an additional method for calculating this factor.
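
As a minimal sketch of both estimates: with a parallelizable fraction p and n processors, Amdahl's law gives a speedup of 1 / ((1 - p) + p / n), while Gustafson's law gives n - (1 - p) * (n - 1).

    def amdahl_speedup(p, n):
        # Amdahl's law: the serial fraction (1 - p) limits the achievable speedup
        return 1.0 / ((1.0 - p) + p / n)

    def gustafson_speedup(p, n):
        # Gustafson's law: the problem size grows with n, so the estimate is more optimistic
        return n - (1.0 - p) * (n - 1)

    # 95 % of the work parallelizable, 16 processors
    print(amdahl_speedup(0.95, 16))     # ~9.1
    print(gustafson_speedup(0.95, 16))  # ~15.25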

System as a layered model

To build a system as scalably as possible, it has proven useful in practice to implement it as a layered model, because with this approach the individual layers are logically separated from each other and each layer can be scaled on its own.

A very popular architecture in the web area is the 3-tier architecture. To achieve high scalability with it, a crucial factor is that each of these three layers scales well.

While the presentation layer can be scaled relatively easily, the logic layer requires a specially designed implementation of the code. In doing so, it should be ensured that the largest possible portion of the logic can be parallelized (see Amdahl's law and Gustafson's law above). The most interesting part, however, is the horizontal scaling of the data storage layer, which is why a separate section (see Scaling the data management layer below) is dedicated to this topic.

Practical methods for improving the scalability of web pages

The scalability of web pages can be improved by increasing performance, because a server can then serve more clients at the same time.

Martin L. Abbott and Michael T. Fisher have compiled 50 rules to be considered with regard to scalability. For websites, among others, the following rules are relevant:

Reducing DNS lookups and number of objects

When observing the loading of a page in any browser with a debugging tool (e.g., Firebug), it is striking that elements of similar size take very different amounts of time to load. On closer inspection, one realizes that some of these elements require an additional DNS lookup. This process of address resolution can be accelerated by DNS caching at various levels (e.g., in the browser, the operating system, or at the Internet service provider). To reduce the number of lookups, one could combine all JavaScript and CSS files into one file each, and combine all images into a single large image, using CSS sprites to display only the desired section. In general, the following rule can be stated: the fewer DNS lookups are needed when loading a page, the better the performance. DNS lookups and connection setup account for a disproportionately large share of the total loading time.

Modern browsers, however, can keep several parallel connections to a server open and download multiple objects simultaneously. According to the HTTP/1.1 RFC, the maximum number of concurrent connections per server should be limited to 2 in the browser. Some browsers, however, ignore this guideline and use up to 6 simultaneous connections or more. If a website now reduces all JavaScript and CSS files as well as all images to a single file each, this relieves the serving server, but at the same time undermines this mechanism of parallel browser connections.

Ideally, one exploits this parallelization in the browser fully while at the same time keeping the number of DNS lookups as low as possible. To achieve this, a website is best distributed over several subdomains (e.g., images are requested from one subdomain while videos are loaded from another). With this approach, a considerable performance increase can be achieved relatively easily. However, there is no general answer to how many subdomains should be used to achieve the best performance. Simple performance tests of the page to be optimized should, however, quickly provide an indication.
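
A minimal sketch of this idea, with made-up subdomain names: each static asset is mapped deterministically to one of a few asset hosts, so the browser can open parallel connections while every asset still resolves to the same host and remains cacheable.

    import zlib

    # Hypothetical static-asset subdomains; the number and names are assumptions
    ASSET_HOSTS = ["static1.example.com", "static2.example.com", "static3.example.com"]

    def asset_url(path: str) -> str:
        """Map an asset path deterministically to one of the asset hosts."""
        host = ASSET_HOSTS[zlib.crc32(path.encode()) % len(ASSET_HOSTS)]
        return f"https://{host}/{path.lstrip('/')}"

    print(asset_url("/img/logo.png"))
    print(asset_url("/js/app.js"))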

Scaling the data management layer

Scaling with respect to database access

The part of a system that is most difficult to scale is usually the database or data storage layer (see above). The origin of this problem can be traced back to the paper "A Relational Model of Data for Large Shared Data Banks" by Edgar F. Codd, which introduced the concept of a relational database management system (RDBMS).

One method of scaling databases is to exploit the fact that most applications and databases have significantly more read than write accesses. A quite realistic scenario described in the book by Martin L. Abbott and Michael T. Fisher is a book reservation platform with a ratio of 400:1 between read and write accesses. Systems of this kind can be scaled relatively easily by creating multiple read-only copies of the data.

There are several ways to distribute copies of this data, depending on how up to date the data of the duplicates really needs to be. Basically, it should not be a problem that this data is synchronized only every 3, 30, or 90 seconds. In the scenario of the book platform, there are 100,000 books, 10% of which are reserved daily. Assuming that the reservations are evenly distributed over the whole day, one reservation takes place approximately every 8.64 seconds. The probability that at the time of a booking (within 90 seconds) another customer wants to book the same book is about 0.104%. Of course this case can and will occur at some point, but it can easily be countered by a final check of the database.
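
These figures can be reproduced with a short back-of-the-envelope calculation; the sketch below assumes the collision chance is taken relative to the roughly 10,000 titles reserved on a given day, which yields the 0.104% stated above.

    books_reserved_daily = 100_000 * 0.10        # 10 % of 100,000 titles -> 10,000 reservations/day
    seconds_per_day = 24 * 60 * 60               # 86,400 s

    interval = seconds_per_day / books_reserved_daily   # ~8.64 s between reservations
    stale_window = 90                                    # replicas may lag by up to 90 s

    overlapping = stale_window / interval                # ~10.4 reservations within the window
    # Collision chance, taken relative to the ~10,000 titles reserved on a given day
    collision = overlapping / books_reserved_daily       # ~0.00104 -> ~0.104 %

    print(f"{interval:.2f} s, {overlapping:.1f} overlapping, {collision:.4%} collision chance")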

One way of implementing this method is to cache the data in a key-value store such as Redis. The cache must be refreshed after its validity expires and thus relieves the database enormously. The easiest way to set up this cache is to install it in an existing layer (e.g., the logic layer). For better performance and scalability, however, a separate layer, or dedicated servers, between the logic layer and the data access layer is used.
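
A minimal sketch of such a read-through cache, assuming the redis-py client, a locally running Redis instance, and a placeholder database query; cached values expire after 90 seconds, matching the synchronization interval discussed above.

    import json
    import redis  # assumes the redis-py client and a locally running Redis instance

    cache = redis.Redis(host="localhost", port=6379)
    TTL_SECONDS = 90  # matches the tolerated staleness discussed above

    def load_book_from_db(book_id: int) -> dict:
        # Placeholder for the actual (slow) database query
        return {"id": book_id, "title": "example"}

    def get_book(book_id: int) -> dict:
        """Read-through cache: serve from Redis if present, otherwise query the database."""
        key = f"book:{book_id}"
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)
        book = load_book_from_db(book_id)
        cache.setex(key, TTL_SECONDS, json.dumps(book))  # expires after 90 s
        return book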

The next step is to replicate the database. Most well-known database systems already offer such a function. MySQL does this with the master-slave principle, where the master database is the actual database with write access and the slave databases are replicated read-only copies. The master database records all updates, inserts, deletes, etc. in the so-called binary log, and the slaves replay them. These slaves are placed behind a load balancer so that the load is distributed accordingly.
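
On the application side, the effect of such a setup can be sketched as a simple read/write splitter (host names are made up for illustration): writes always go to the master, while reads are spread round-robin over the replicas.

    import itertools

    # Hypothetical host names for the master and its read-only replicas
    MASTER = "db-master.example.com"
    REPLICAS = itertools.cycle(["db-slave1.example.com",
                                "db-slave2.example.com",
                                "db-slave3.example.com"])

    def pick_host(sql: str) -> str:
        """Route writes to the master and distribute reads round-robin over the replicas."""
        is_write = sql.lstrip().split(None, 1)[0].upper() in {"INSERT", "UPDATE", "DELETE", "REPLACE"}
        return MASTER if is_write else next(REPLICAS)

    print(pick_host("SELECT * FROM books WHERE id = 42"))            # one of the replicas
    print(pick_host("UPDATE books SET reserved = 1 WHERE id = 42"))  # the master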

With this type of scaling, the number of transactions can be scaled relatively easily: the more duplicates of the database are used, the more transactions can be handled in parallel. In other words, any number of users (depending, of course, on the number of servers) can access the database simultaneously. This method, however, does not help to scale the data itself. To be able to store arbitrarily large amounts of data in the database, a further step is required. This problem is treated in the next section.

Scaling with respect to database entries - denormalization

The goal here is to distribute a database across multiple computers and to be able to expand its capacity with arbitrarily many additional computers. For this purpose, the database must be denormalized to a certain degree. Denormalization means the deliberate reversal of a normalization for the purpose of improving the runtime behavior of a database application.

In the course of denormalization, the database must be fragmented.

Fragmentation

A distinction is made between horizontal and vertical fragmentation.

In horizontal fragmentation (sharding), the set of all records of a relation is split across several tables. If these tables are located on the same server, this usually amounts to partitioning. The individual tables can, however, also be located on different servers. For example, the data for business in the USA can be stored on a server in the USA and the data for business with Europe on a server in Germany. This division is also referred to as regionalization.
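
A hash-based variant of such a split can be sketched as follows (shard host names are made up): each customer ID is mapped deterministically to one of several database servers, so that related records always end up on the same shard.

    import hashlib

    # Hypothetical shard servers; a region-based split would map a country code instead
    SHARDS = ["db-us.example.com", "db-eu.example.com", "db-asia.example.com"]

    def shard_for(customer_id: str) -> str:
        """Deterministically map a customer to one shard via a hash of the key."""
        digest = hashlib.sha1(customer_id.encode()).hexdigest()
        return SHARDS[int(digest, 16) % len(SHARDS)]

    print(shard_for("customer-4711"))  # always the same shard for the same customer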

Horizontal fragmentation does not create redundancy of the stored data, but of the structures. If a relation has to be changed, not just one table but all the tables over which the data of that relation is distributed must be modified. This carries the risk of anomalies in the data structures.

In vertical fragmentation, the dependent attributes (non-key attributes) of a table are divided into two or more groups. Each group becomes its own table, which is complemented by all the key attributes of the original table. This can be useful if the attributes of a relation produce records with a very large record length. If, in addition, the accesses usually involve only a small number of attributes, the few frequently accessed attributes can be combined in one group and the rest in a second group. Frequently performed accesses thereby become faster, because a smaller amount of data has to be read from the hard disk. Rarely performed accesses on the remaining attributes do not become faster, but they do not become slower either.

The record length from which splitting into several smaller tables is worthwhile also depends on the database system. Many database systems store data in the form of blocks with a size of 4 KiB, 8 KiB, or 16 KiB. If the average record length is slightly greater than 50% of a data block, a lot of storage space remains unused. If the average record length is greater than the block size used, data accesses become more expensive. If BLOBs occur in a relation together with other attributes, vertical fragmentation is almost always beneficial.
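
The effect of the record length on block utilization can be checked with a small calculation (block and record sizes chosen purely for illustration): if a record occupies slightly more than half a block, only one record fits per block and almost half of the space remains unused.

    BLOCK_SIZE = 8 * 1024      # 8 KiB blocks, as used by many database systems
    record_size = 4500         # bytes; slightly more than 50 % of a block

    records_per_block = BLOCK_SIZE // record_size        # -> 1
    wasted = BLOCK_SIZE - records_per_block * record_size
    print(records_per_block, f"{wasted / BLOCK_SIZE:.0%} of each block unused")  # 1, 45%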

Partitioning

Partitioning is a special case of horizontal fragmentation.

Large data sets can be administered more easily if the data of a relation is divided into several small parts (partitions) that are stored separately. While one partition of a table is being updated, other partitions of the table can be reorganized at the same time. If an error is discovered in one partition, this single partition can be restored from a backup while programs continue to access the other partitions. Most established database vendors offer partitioning; see, for example, partitioning in DB2 and partitioning in MySQL.
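
As an illustration of range partitioning, the following sketch embeds MySQL-style DDL in Python (table, column, and partition names are made up): a reservations table is split by year, so that each year's data lives in its own partition and can be backed up or reorganized independently.

    # MySQL-style range partitioning; 'conn' is a hypothetical open database connection
    CREATE_PARTITIONED_TABLE = """
    CREATE TABLE reservations (
        id         INT NOT NULL,
        book_id    INT NOT NULL,
        created    DATE NOT NULL
    )
    PARTITION BY RANGE (YEAR(created)) (
        PARTITION p2022 VALUES LESS THAN (2023),
        PARTITION p2023 VALUES LESS THAN (2024),
        PARTITION pmax  VALUES LESS THAN MAXVALUE
    );
    """

    def create_table(conn):
        # Each partition can then be backed up, restored, or reorganized independently
        with conn.cursor() as cur:
            cur.execute(CREATE_PARTITIONED_TABLE)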

Most database systems offer the option of either addressing individual partitions directly or addressing all partitions under a single table name.

Partitioning can speed up data accesses. The essential advantage, however, is the easier manageability of the entire table.
