http_load - another webserver performance tester

Posted: June 7th, 2008 | Author: sofia | Filed under: open

Http_load is another cool webserver performance tester that gives simple stats on how your webapp is performing.

How to install in OS X

  1. Download from http://www.acme.com/software/http_load/
  2. Open terminal, cd to the directory where the archive is and unzip
    $ tar xvzf http_load-12mar2006.tar.gz
  3. Move to that directory
    $ cd http_load-12mar2006
  4. Run
    $ make
  5. Run
    $ sudo make install

You’re ready! Open up a text editor, write down the URL of the website you want to test (preferably your own), one URL per line, and save it as a .txt file. Then cd to the directory where the .txt is and run
$ http_load -parallel 5 -fetches 100 name_of_file.txt
which means: open up to 5 concurrent connections and fetch the webpage 100 times in total.
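
If you end up running this often, the two steps (write the URL file, build the command) are easy to script. Here is a minimal Python sketch. The helper names `write_url_file` and `http_load_command` are made up for illustration, and it assumes `http_load` is on your PATH when you actually execute the command.

```python
import shlex
import tempfile
from pathlib import Path


def write_url_file(urls, path):
    """Write one URL per line, which is the format http_load reads."""
    path = Path(path)
    path.write_text("\n".join(urls) + "\n")
    return path


def http_load_command(url_file, parallel=5, fetches=100):
    """Build the argument list for the http_load call shown above."""
    return [
        "http_load",
        "-parallel", str(parallel),
        "-fetches", str(fetches),
        str(url_file),
    ]


if __name__ == "__main__":
    # http://localhost/ is only a placeholder; use your own site.
    url_file = write_url_file(
        ["http://localhost/"], Path(tempfile.gettempdir()) / "urls.txt"
    )
    # Print the command instead of running it, so the sketch works
    # even on a machine without http_load installed.
    print(shlex.join(http_load_command(url_file)))
```

To actually run the test, pass the list to `subprocess.run(cmd, capture_output=True, text=True)` and read the `stdout` attribute of the result.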

You’ll get something like this:

100 fetches, 5 max parallel, 1.34237e+07 bytes, in 15.842 seconds
134237 mean bytes/connection
6.31234 fetches/sec, 847351 bytes/sec
msecs/connect: 28.9069 mean, 75.011 max, 14.865 min
msecs/first-response: 435.84 mean, 2484.28 max, 96.082 min
93 bad byte counts
HTTP response codes:
code 200 -- 100

The important bits are the fetches/sec and msecs/first-response lines. At the moment the webserver is handling about 6 requests per second, with a mean initial latency of about 436 milliseconds.
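
If you want to compare runs over time, you can pull those two numbers out of the report with a couple of regular expressions. This is a rough Python sketch that only assumes the output looks like the sample above; `parse_http_load` is a made-up helper name, not something that ships with http_load.

```python
import re

# The report from the run above, pasted in as a string.
SAMPLE = """\
100 fetches, 5 max parallel, 1.34237e+07 bytes, in 15.842 seconds
134237 mean bytes/connection
6.31234 fetches/sec, 847351 bytes/sec
msecs/connect: 28.9069 mean, 75.011 max, 14.865 min
msecs/first-response: 435.84 mean, 2484.28 max, 96.082 min
93 bad byte counts
HTTP response codes:
  code 200 -- 100
"""


def parse_http_load(text):
    """Extract throughput and mean first-response latency from a report."""
    fetches = re.search(r"([\d.eE+-]+) fetches/sec", text)
    first = re.search(r"msecs/first-response: ([\d.eE+-]+) mean", text)
    if fetches is None or first is None:
        raise ValueError("this does not look like http_load output")
    return {
        "fetches_per_sec": float(fetches.group(1)),
        "first_response_ms": float(first.group(1)),
    }


if __name__ == "__main__":
    print(parse_http_load(SAMPLE))
```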

Http_load tells you how your webapp is currently performing, allowing you to test it under different conditions. Basically it’s a benchmarking tool, just like httperf, which I covered in another post. The next step is optimization. Have a look at the 1st part of Getting Rich with PHP 5 (what a crappy title) by Rasmus Lerdorf for tools you can use to profile your code and some tips on optimization. In the example shown he goes from 17 reqs/sec to 1100 reqs/sec.


measuring webserver performance - httperf

Posted: June 7th, 2008 | Author: sofia | Filed under: open

Httperf is a webserver performance tester. There are loads of performance testers out there, but I was up and running with httperf in no time. So here’s a quick getting-started guide:

  1. Download the latest version from ftp://ftp.hpl.hp.com/pub/httperf/
  2. Install
    • $ tar xvzf httperf-0.9.0.tar.gz
    • $ cd httperf-0.9.0
    • $ ./configure
    • $ make
    • $ sudo make install

    Httperf is installed by default in /usr/local/bin/httperf. You then invoke httperf from the command line.

  3. Have a website to test (lol)
  4. Here’s a sample command
    $ httperf --server hostname --port 80 --uri /test.html --rate 150 --num-conns 27000 --num-calls 1 --timeout 5
    Example: You have your site on localhost and for now just wanna test that.

    • $ httperf --server localhost --uri /about.html --num-conns 1000
      - test the page about.html on the localhost server, making 1000 connections in total (with no --rate given, they are made one after another rather than concurrently)
    • $ httperf --server=localhost --wsess=12,8,2 --rate=1 --timeout=5
      • The --wsess switch sets the total number of sessions to generate, the number of calls per session, and the time (in seconds) that separates consecutive calls. If we use --wsess=12,8,2, we’re setting 12 sessions at eight calls per session with two seconds between each call.
      • The --rate switch specifies the fixed rate at which new connections are created, so with one call per connection it is effectively the number of HTTP requests per second sent to the web server. [Update] When used together with --wsess it specifies the number of sessions created per second, not the number of requests -> see comment by John Wilkinson below
      • The --timeout switch sets the maximum number of seconds to wait for a server response before httperf gives up. The default is forever, so it’s good practice to set it just in case the server hangs (tying up your resources as well). If this timeout expires, httperf considers the corresponding call to have failed.
      • The --num-conns switch sets how many HTTP connections will be made in total during the test run. This is a cumulative number, so the higher it is, the longer the test runs.
  5. Analyze the statistics printed to the console.
    There are six groups of statistics: overall results, results pertaining to the TCP connections, results for the requests that were sent, results for the replies that were received, CPU and network utilization figures, as well as a summary of the errors that occurred.
    Example printout:
    Maximum connect burst length: 1
    Total: connections 100 requests 100 replies 100 test-duration 16.385 s

    Connection rate: 6.1 conn/s (163.8 ms/conn, <=1 concurrent connections)
    Connection time [ms]: min 135.5 avg 163.8 max 406.4 median 159.5 stddev 37.4
    Connection time [ms]: connect 19.0
    Connection length [replies/conn]: 1.000

    Request rate: 6.1 req/s (163.8 ms/req)
    Request size [B]: 64.0

    Reply rate [replies/s]: min 5.8 avg 6.1 max 6.2 stddev 0.2 (3 samples)
    Reply time [ms]: response 74.1 transfer 70.8
    Reply size [B]: header 514.0 content 15405.0 footer 1.0 (total 15920.0)
    Reply status: 1xx=0 2xx=100 3xx=0 4xx=0 5xx=0

    CPU time [s]: user 3.52 system 12.78 (user 21.5% system 78.0% total 99.5%)
    Net I/O: 95.3 KB/s (0.8*10^6 bps)

    Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
    Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
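
Since the printout is plain text, the three rate figures can be extracted with a few regular expressions if you want to log them across runs. Here is a small Python sketch, assuming the output has the same layout as the example printout; `parse_httperf` and the `PATTERNS` table are names made up for this illustration.

```python
import re

# One pattern per figure, written against the example printout's layout.
PATTERNS = {
    "connection_rate": r"Connection rate: ([\d.]+) conn/s",
    "request_rate": r"Request rate: ([\d.]+) req/s",
    "reply_rate_avg": r"Reply rate \[replies/s\]: min [\d.]+ avg ([\d.]+)",
}


def parse_httperf(text):
    """Return the connection, request and average reply rates as floats."""
    rates = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"could not find {name} in the output")
        rates[name] = float(match.group(1))
    return rates


if __name__ == "__main__":
    sample = (
        "Connection rate: 6.1 conn/s (163.8 ms/conn, <=1 concurrent connections)\n"
        "Request rate: 6.1 req/s (163.8 ms/req)\n"
        "Reply rate [replies/s]: min 5.8 avg 6.1 max 6.2 stddev 0.2 (3 samples)\n"
    )
    print(parse_httperf(sample))
```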

The connection rate, the request rate and the reply rate are the ones to look at. The better a website is performing (at the rate requested), the closer the connection and reply rates will be to the request rate specified in the initial command (--rate). Normally you do a series of tests, increasing the request rate each time, until you see that the reply and connection rates are no longer keeping up. That’s when you’ve hit your limit, i.e. how many requests per second your webapp is able to handle.
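
Once you have a series of runs, spotting that limit can be done mechanically. The Python sketch below takes (requested rate, measured reply rate) pairs and returns the first requested rate where the replies fall short. The 5% tolerance and the numbers in `runs` are made up for illustration, and `find_saturation` is a hypothetical helper, not part of httperf.

```python
def find_saturation(results, tolerance=0.05):
    """Return the first requested rate the server fails to keep up with.

    results: iterable of (requested_rate, measured_reply_rate) pairs.
    tolerance: allowed shortfall as a fraction (0.05 = 5%, an arbitrary choice).
    Returns None if the server kept up in every run.
    """
    for requested, achieved in sorted(results):
        if achieved < requested * (1 - tolerance):
            return requested
    return None


if __name__ == "__main__":
    # Invented measurements: the reply rate flattens out near 170 replies/s.
    runs = [(50, 50.0), (100, 99.8), (150, 149.1), (200, 171.3), (250, 168.9)]
    print(find_saturation(runs))
```

With these invented numbers the server keeps up through 150 requests per second and falls behind at 200, so the limit is somewhere between the two.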

Also check out autobench for automating the testing process, this example of how httperf was used to benchmark the evolution of a project, and the article from the source, “httperf - A Tool for Measuring Web Server Performance”. Finally, this peepcode looks interesting.

Anyway, if I’ve missed any important information please say so in the comments.

[Update] Ted Bullock, one of the developers of httperf, was kind enough to point me to his quickstart guide, a six page long doc which has much more detailed information :=)


Data portability and data access

Posted: April 6th, 2008 | Author: sofia | Filed under: open

There’s a growing trend for the web to become a (the) platform consumed by web applications themselves. Users can use one login for different sites, sync information between different web applications, and import/export information to/from a site. All this is related to identity management and data portability and, on a second level, to digital communities/social networks.

Skype proposes that community building applications must have a defined set of areas/scenarios upon which they can interact to facilitate the desired interoperability between them. Here they are (skype journal):

Social Stack’s Six Zones of Interoperability

* ID (Account lifecycles, Login)
* Sync (Profile, Contacts, Objects)
* Permission (Policy, Licensing)
* Find (People Search, Discovery, Gatekeepers)
* Action (Group Actions, Relationship Actions)
* Now (Alerting, Presence)

The idea is that there can be one single sign-in (OpenID), and that there must be a standard way to sync information between applications. E.g. if I export my contacts/friends from facebook to hi5 and then remove a friend in facebook, does that friend get deleted in hi5? How about the other way round? There must also be a way to find people across apps: if I have a friend in facebook and I also have a myspace account, could myspace alert me that my friend is in the network as well? Should myspace do it? Maybe the friend wouldn’t like that, because he keeps his work colleagues in facebook and his closer friends in myspace. These are questions that must be solved for dataportability to become a reality. Also check Robert Scoble’s post on this topic.

Dataportability also raises the question of the unnecessary duplication of content around the web. If I have a blog in wordpress, is it really necessary for myspace to store my blog posts and sync them with wordpress? Isn’t that just making things difficult? Two servers now have the same data and must keep it in sync. And what if a user changes a post in the wordpress blog and another user changes the same post in myspace: when syncing, which version wins? Ouch.. version control management. These are real roadblocks to dataportability.

In some cases it may be simpler to just allow for data access. In the example above, why not just let myspace access the wordpress blog through an rss feed and every change made in wordpress immediately gets reflected in myspace? So instead of dataportability - taking my data from one place (exporting) to another place (importing) -, why not just simple data access?
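
To make the data access idea concrete, here is a rough Python sketch of the consuming side: it reads post titles and links out of an RSS 2.0 feed instead of keeping its own copy of the posts. The feed content, the example.com URLs and the `read_posts` helper are all invented for illustration; this is not how myspace or wordpress actually do it.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 feed standing in for a real blog feed.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>First post</title><link>http://example.com/first</link></item>
    <item><title>Second post</title><link>http://example.com/second</link></item>
  </channel>
</rss>"""


def read_posts(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in read_posts(FEED):
        print(title, link)
```

The consuming site would fetch the feed from the blog's URL each time (or on a schedule) and render whatever it finds, so there is only ever one copy of each post to edit.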

Or maybe we need both.

Maybe dataportability and data access have different use cases? You want your data to be portable between competing/equivalent services and you also want to share it with different services. E.g. I want to export my photos from flickr and import them to imageshack, and I also want to show my photos in my wordpress blog whether they’re stored in flickr or imageshack.

Ultimately it will be up to the users to decide if they want to take their data with them or just share it. Developers just need to work out a way to make this possible.