Reliability Analysis of OpenStack VMs using Python, fabric and R
After having completed part 1 of our series about reliability analysis, we now start with our first reliability measurement experiment. According to reliability theory there are three things we could measure: survival probability, hazard rate and failure rate. The last one is the easiest to measure in practice. Therefore we design an experiment to measure the failure rate of OpenStack VMs under heavy load.
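
For reference, the three quantities have the following standard definitions (textbook reliability theory, not specific to this article; $n(t)$ denotes the number of items still alive at time $t$):

\[
S(t) = 1 - F(t), \qquad h(t) = \frac{f(t)}{S(t)}, \qquad \hat{\lambda} = \frac{n(t) - n(t+\Delta t)}{n(t)}
\]

where $F(t)$ is the probability that an item has failed by time $t$, $f(t) = F'(t)$ its density, and $\hat{\lambda}$ the empirical failure rate over an observation interval.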

Failure rates can be constant, ascending or declining over time. In order to measure the general tendency of a failure rate we have to perform a time series analysis. We start up several OpenStack VMs, put them under stress by running a task on them and then count how many of the VMs are still alive after a given amount of time. The stress task is performed several times on the same VMs and the number of machines that are still alive is counted after each round, giving us a time series of failure rates (a minimal sketch of this bookkeeping follows below).
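
The following sketch illustrates the bookkeeping; the alive counts are made-up values, not measurements:

# Illustrative sketch (Python 2, matching the article's scripts): turning
# repeated "VMs still alive" counts into a failure-rate time series.
# The counts below are hypothetical values.
alive_counts = [60, 58, 55, 54, 52, 49]   # one entry per stress round

failure_rates = []
for prev, curr in zip(alive_counts, alive_counts[1:]):
    # fraction of the VMs alive at the start of a round that died during it
    failure_rates.append((prev - curr) / float(prev))

print(failure_rates)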

As in every experiment, we cannot simply measure the real-world behavior of VMs, since that would be practically impossible. Under normal conditions a server (used in a productive environment) should run for 5-8 years, and we normally don't have time to run an experiment over such a long time frame. How can we then say something about the failure rate or the number of outages that a server is expected to have during that time frame? We simply replace the time frame parameter by a smart grouping of tested items. Instead of running one VM for 60 days (= 2 months) to create an experiment that sufficiently reflects real-world conditions, we can run 60 VMs for just one day. And instead of counting the days until the VM faces an outage, we simply count the number of VMs that are still alive after a certain amount of time.

The same shift from the time dimension to an increased number of tested items is made in other engineering disciplines. In light bulb testing, the experimenters usually do not run one light bulb for 100'000 hours to evaluate its expected lifetime. They test 100 light bulbs and run them for 1'000 hours. Then they count how many light bulbs are still alive after 1'000 hours and deduce the expected lifetime of a single light bulb from the percentage of light bulbs that did not survive the 1'000 hours of the experiment.
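
A minimal sketch of that deduction step, with made-up numbers and under the simplifying assumption of a constant failure rate (exponential lifetime model):

# Worked example (hypothetical numbers): deduce the expected lifetime of a
# single light bulb from a fixed-duration test, assuming a constant failure
# rate (exponential lifetime model).
import math

tested = 100        # light bulbs put on test
failed = 20         # bulbs dead after the test window
hours = 1000.0      # duration of the test

survival = 1.0 - failed / float(tested)       # empirical S(t)
failure_rate = -math.log(survival) / hours    # S(t) = exp(-lambda*t)  =>  lambda
expected_lifetime = 1.0 / failure_rate        # mean time to failure = 1/lambda

print("failure rate: %f per hour" % failure_rate)
print("expected lifetime: %.0f hours" % expected_lifetime)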

The first part of our experiment is getting several VMs in OpenStack up and running. The bottleneck in OpenStack VM creation is usually the number of public IPs, since the world is running out of public IPv4 addresses and OpenStack admins usually severely restrict the number of floating IPs made available to OpenStack users. For the sake of the experiment we will create VMs that don't have a public IP. In order to coordinate the experiment, they only have to be reachable from a single OpenStack VM that can be accessed from the outside world. That way we can run multiple VMs without having to worry about the number of public IPs we have to assign to them.

The best way to programmatically create and run OpenStack VMs is the Python OpenStack API. A manual on how to install the Python OpenStack API can be found here. We have prepared a Python script that can be downloaded from GitHub and used for OpenStack VM creation. The essential piece of code for creating the VMs is the following:

vm_list = []
for i in range(5):
    vm_name = str('Test_VM%s' % i)
    if not (VM_MANAGER.findall(name=vm_name)):
        vm = VM_MANAGER.create(name=vm_name, image=image.id, flavor=flavor.id,
                               security_groups=[sec_group.human_id], key_name=pk.name,
                               nics=nics, availability_zone='nova')
    else:
        vm = VM_MANAGER.findall(name=vm_name)[0]
    while (vm.status != 'ACTIVE'):
        vm = VM_MANAGER.findall(name=vm_name)[0]
        if (vm.status == 'ERROR'):
            print("VM ID: %s name: %s CREATION FAILED!!" % (vm.id, vm.name))
            break
        print("VM ID: %s name: %s in status: %s" % (vm.id, vm.name, vm.status))
        time.sleep(1)
    print("VM ID: %s name: %s CREATION SUCCESSFUL." % (vm.id, vm.name))
    vm_list.append(vm)

The VM is created and the script polls the API until its status is 'ACTIVE'. Remember that you have to change the details of the 'config.ini' file to match your OpenStack credentials.
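
For completeness, here is a minimal sketch of how the VM_MANAGER handle used above could be initialized with python-novaclient. The section and option names of the config.ini file are assumptions for illustration, not the actual layout used by the downloadable script:

# Hypothetical initialization of VM_MANAGER via python-novaclient (Python 2).
# The config.ini section/option names below are assumptions.
import ConfigParser
from novaclient import client

config = ConfigParser.ConfigParser()
config.read('config.ini')

nova = client.Client(2,
                     config.get('openstack', 'username'),
                     config.get('openstack', 'password'),
                     config.get('openstack', 'tenant_name'),
                     config.get('openstack', 'auth_url'))
VM_MANAGER = nova.servers   # exposes the create() and findall() calls used above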

Once we have created our test VMs, we are ready to upload our test program. The test program runs tasks on the VMs to put them under stress. This simulates real-world situations, e.g. many users accessing the same machine at the same time and creating heavy load on the VM. Such a situation could arise if you run a shopping website and offer a special "Black Friday Sale": all shoppers go to your website and buy items, consuming the resources of the VM that drives the website, and as a result of the resource usage the VM becomes unresponsive and finally dies. We want to find out how probable such a situation is for OpenStack VMs, so we simulate it by running a Python program on each VM that creates heavy load on that VM.

We will use a simple Python program that calculates many different Fibonacci numbers in parallel by spawning multiple parallel processes.

The program could be like the following:

import time, random, csv
from multiprocessing import Process, Queue, cpu_count, current_process
import logging

# Log producer/consumer activity to stdout.
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(message)s')
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
logger.addHandler(ch)

fibo_dict = {}
number_of_cpus = cpu_count()
data_queue = Queue()

def producer_task(q, fibo_dict):
    # Put 15 random Fibonacci indices into the work queue.
    for i in range(15):
        value = random.randint(10000, 50000)
        fibo_dict[value] = None
        logger.info("Producer [%s] putting value [%d] into queue... " % (current_process().name, value))
        q.put(value)

def consumer_task(q, fibo_dict):
    # Drain the queue and compute the Fibonacci number for each index.
    while not q.empty():
        value = q.get(True, 0.05)
        a, b = 0, 1
        for item in range(value):
            a, b = b, a + b
        fibo_dict[value] = a
        logger.info("Consumer [%s] getting value [%d] from queue... " % (current_process().name, value))

if __name__ == "__main__":
    start = time.time()
    producer = Process(target=producer_task, args=(data_queue, fibo_dict))
    producer.start()
    producer.join()
    # Spawn one consumer process per CPU core to create heavy load.
    consumer_list = []
    for i in range(number_of_cpus):
        consumer = Process(target=consumer_task, args=(data_queue, fibo_dict))
        consumer.start()
        consumer_list.append(consumer)
    [consumer.join() for consumer in consumer_list]
    runtime = time.time() - start
    print("Runtime: %s" % runtime)
    # Append the measured runtime to a CSV file on the VM.
    data_file = open('/opt/response_time.csv', 'ab')
    data_writer = csv.writer(data_file, delimiter=';', quotechar='|')
    data_writer.writerow((runtime,))


This way we measure the time it takes to completely execute the task. The time value is stored on the VM where the task is executed.

The test program is executed from a remote VM using fabric. A fabric task to install and run the test program could look like this:

from fabric.api import env, execute, task, parallel, get
import cuisine

@task
def update(package=None):
    cuisine.package_update(package)

@task
def upgrade(package=None):
    cuisine.package_upgrade(package)

@task
def install(package):
    cuisine.package_install(package)
    cuisine.package_ensure(package)

@task
def pip_install(package):
    cuisine.package_ensure('python-pip')
    command = str('pip install %s' % package)
    cuisine.sudo(command)

@task
def upload_file(remote_location, local_location, sudo=False):
    cuisine.file_upload(remote_location, local_location, sudo=sudo)
    cuisine.file_ensure(remote_location)

@task
@parallel
def run_python_program(program=None, sudo=False):
    cuisine.file_ensure('/usr/bin/python')
    if sudo:
        cuisine.sudo(('/usr/bin/python %s' % program))
    else:
        cuisine.run(('/usr/bin/python %s' % program))

@task
def collect_response_times():
    get('/opt/response_time.csv', '/home/ubuntu/response_time_' + env.host + '.csv')

env.hosts = <VM_LIST>
env.user = <SSH_USERNAME>
env.password = <SSH_PASSWORD>
env.key_filename = <SSH_KEYFILE>

execute(upload_file, '/opt/testprogram.py', 'test_program.py', sudo=True)
execute(run_python_program, program='/opt/testprogram.py', sudo=True)
execute(collect_response_times)

Note: Don’t forget to replace <VM_LIST> with the list of IPs of the VMs where you run the test program and <SSH_USERNAME>, <SSH_PASSWORD> and <SSH_KEYFILE> with the SSH credentials that allow you to create an SSH connection to the VMs.
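
For illustration, the filled-in settings could look like this (all values below are made up):

# Hypothetical values; substitute your own VM IPs and SSH credentials.
env.hosts = ['192.168.1.11', '192.168.1.12', '192.168.1.13']
env.user = 'ubuntu'
env.password = 'my-ssh-password'
env.key_filename = '/home/ubuntu/.ssh/id_rsa'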

This test runner program runs the test program that calculates the Fibonacci numbers from a remote location. The test program on each VM calculates the numbers and stores the measured runtime in a .csv file. The .csv files on the VMs are then collected by the test runner program running on the remote VM.

That way we generate a sample of execution times and store them on the remote VM. How can we turn this into reliability data? We simply treat particularly long execution times as "failures". The threshold above which an execution is considered to have failed must be determined after a first series of test runs. By repeatedly running the test runner program above we can gather reliability data about the VMs.
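
A minimal sketch of that classification step (the file pattern matches the collect_response_times task above; the threshold value itself is an assumption that has to be calibrated against the first test runs):

# Illustrative sketch (Python 2): classify collected runtimes as failures
# when they exceed a threshold. THRESHOLD is an assumption and must be
# calibrated against a first series of test runs.
import csv, glob

THRESHOLD = 120.0   # seconds

runtimes = []
for path in glob.glob('/home/ubuntu/response_time_*.csv'):
    with open(path, 'rb') as f:
        for row in csv.reader(f, delimiter=';', quotechar='|'):
            if row:
                runtimes.append(float(row[0]))

failures = sum(1 for t in runtimes if t > THRESHOLD)
print("failed runs: %d of %d (%.1f%%)" % (failures, len(runtimes),
                                          100.0 * failures / len(runtimes)))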

In the next article of this series we will learn how to analyze reliability data collected with the programs mentioned in this article.


