Living in the Compute Cloud – Web 2.0 Expo Berlin

Your site can attract a lot of traffic for many different reasons, and on top of the steady load it can also experience sudden traffic peaks.

To deal with this you can build your own infrastructure, but today there are other options available, such as the services provided by Amazon and by Google.

Amazon Web Services

Amazon offers several platforms:

  • S3 is used for storage
  • EC2 is an on-demand virtual server controlled through a web service API (you can use your favourite Linux distribution). It provides ACLs for port control, lets you choose the datacenter (currently only in the US), and supports snapshot backups to S3
  • SimpleDB is a hash-like database that stores items as attribute/value pairs. It is meant for small items organized into domains; it is redundant and distributed, has no schema, stores everything as a string, allows list values, and is queried with SQL-like statements
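To make the SimpleDB data model concrete, here is a toy model of it in plain Python. This is purely illustrative (it is not the real AWS API): items live in domains, each item is a set of attribute/value pairs, every value is stored as a string, and an attribute may hold a list of values.

```python
# Toy model of the SimpleDB data model (illustrative only, not the AWS API):
# domains hold items, items are attribute/value pairs, all values are strings,
# and an attribute may hold a list of values.

class Domain:
    def __init__(self, name):
        self.name = name
        self.items = {}          # item name -> {attribute: [string values]}

    def put(self, item_name, **attrs):
        item = self.items.setdefault(item_name, {})
        for key, value in attrs.items():
            values = value if isinstance(value, list) else [value]
            # SimpleDB stores everything as a string
            item.setdefault(key, []).extend(str(v) for v in values)

    def query(self, attribute, value):
        """Rough analogue of SELECT ... WHERE attribute = 'value'."""
        return [name for name, item in self.items.items()
                if value in item.get(attribute, [])]

products = Domain("products")
products.put("item1", color=["red", "blue"], price="19")
products.put("item2", color="red", price="25")

print(products.query("color", "red"))   # both items match
```

The real service adds redundancy and distribution on top of this model; the point here is only the schema-less, string-only, list-valued shape of the data.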

Google App Engine

With this solution you run your application directly on Google's infrastructure. There is no concept of hardware: you just deploy an application. For the moment it is limited to Python, and it certainly does not offer the same flexibility as the Amazon solutions. In compensation for not having access to low-level sockets, you get memcache, image, email, URL fetch, and the Google authentication and users APIs. The platform is limiting, but it takes care of the scaling problems.

Bigtable is Google's database solution. It is very similar to SimpleDB (no schema, list values) but also very different (data type support, references and multiple tables, blob files up to 1 MB). A strong limitation is that queries can only run for a couple of seconds, after which they are killed by the system. On the other hand it is very easy to use. In short, you have to accept the limitations.

With Google App Engine you have no background jobs and no way to back up or snapshot your data; emails can only be sent from Google accounts, and you are restricted to pure-Python libraries and the provided APIs.

Considerations and usage suggestions

The impression from this session is that a lot of tricks are needed to use these tools proficiently, even Amazon's. The speaker illustrated some cases, such as uploading user data with authentication.

If my application needs extra capacity for a certain period of time, with Amazon EC2 it is quite easy to start additional instances: it can be as simple as using a time-based system such as cron.
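A cron-driven capacity policy like the one just described can be sketched as follows. Everything here is hypothetical: `desired_instances` is an invented policy function, and a real script run from cron would then call the EC2 API to actually start or stop machines.

```python
# Sketch of a time-based capacity schedule, as one might drive from cron.
# desired_instances is a hypothetical policy; a real script would call the
# EC2 API to start or terminate instances until the pool matches it.

def desired_instances(hour, baseline=2, peak=6, peak_hours=range(9, 18)):
    """Return how many instances we want at a given hour of the day."""
    return peak if hour in peak_hours else baseline

def converge(current, desired):
    """Return how many instances to start (positive) or stop (negative)."""
    return desired - current

# At 10:00 with 2 instances running, we would start 4 more.
delta = converge(current=2, desired=desired_instances(10))
print(delta)  # 4
```

The appeal of the cron approach is its simplicity: no monitoring is needed, only a rough idea of when the peaks occur.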

If what you need is something load balanced, a possible solution is to integrate EC2 usage with a monitoring tool such as Monit. With these tools I can detect when the load is too high and add new instances as needed. Monitoring is the hard part of these solutions, because there is no ready-made tool for it.
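As a sketch of the monitoring-driven approach, here is a hypothetical scaling rule of the kind a tool like Monit could trigger. The thresholds and the hysteresis band are invented for illustration; the band between them prevents the pool from flapping around a single threshold.

```python
# Hypothetical load-based scaling rule: add an instance when average load
# is high, remove one when it is low, and do nothing in between
# (hysteresis keeps the pool from oscillating around one threshold).

def scale_decision(avg_load, instances, high=0.75, low=0.25, min_instances=1):
    if avg_load > high:
        return instances + 1          # overloaded: start one more instance
    if avg_load < low and instances > min_instances:
        return instances - 1          # underused: shut one down
    return instances                  # within band: leave the pool alone

print(scale_decision(0.9, 3))  # 4
print(scale_decision(0.1, 3))  # 2
print(scale_decision(0.5, 3))  # 3
```

A real setup would feed this rule from the monitoring tool's load metrics and translate the result into EC2 API calls.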

Even if the site already has its own working infrastructure, it is possible, if needed, to add extra capacity by connecting to EC2, combining the best of both worlds. However, EC2 is not available in Europe at the moment, so there could be latency problems.

Real life use cases of these platforms:


Final thoughts

  • get accustomed to eventual consistency (there is no guarantee that a query issued a few milliseconds after a write sees the update on all instances)
  • be prepared to leave the relational database behind
  • many people miss strong SLAs; most of the time you can live fine without them
  • hardware is a commodity; only specialize in it if it is really necessary
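The first point, eventual consistency, can be made concrete with a toy model: a write lands on one replica and only reaches the others later, so a read issued immediately afterwards against another replica can return stale data. This is purely illustrative; no real replication protocol is modeled.

```python
# Toy illustration of eventual consistency: writes go to one replica and
# are propagated later, so a read from the other replica may return stale
# data until replication catches up. (Illustrative only.)

class EventuallyConsistentStore:
    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []                    # writes not yet replicated

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_replica(self, key):
        return self.replica.get(key)         # may be stale!

    def replicate(self):
        """Simulate asynchronous propagation eventually delivering writes."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

store = EventuallyConsistentStore()
store.write("counter", "1")
print(store.read_from_replica("counter"))    # None -- not yet propagated
store.replicate()
print(store.read_from_replica("counter"))    # '1' -- now consistent
```

Applications built on SimpleDB or Bigtable have to tolerate exactly this window between the write and the moment all replicas agree.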
Jonathan Weiss
A Ruby consultant and partner at Peritor Wissensmanagement GmbH in Berlin, Germany. For the past few years he has been developing and consulting on large Ruby on Rails projects, focusing on scalability and security. He is an active member of the Ruby and Rails community and is the developer of the open source deployment tool Webistrano. In his spare time he maintains Rubygems and Rails in the FreeBSD Ports system.

RIA and Ajax Security Workshop – Web 2.0 Expo Berlin

A very interesting and informative talk dealing with the new types of attacks that affect web 2.0 applications and RIA in particular.

The session was divided into two parts, the first about AJAX and the second about Rich Internet Applications.

The slides of this talk are available on SlideShare and are impressive in their completeness. Not only do they provide detailed examples for every case illustrated, they also link to a series of articles and web resources.

The main problem with this talk is that it is almost impossible to be specific enough without getting too deep into the details, which resulted in some hard-to-understand parts.


In general, attacking an AJAX application is more difficult than attacking a web 1.0 site. On the other hand, it is also more difficult to protect an AJAX application, because there are more ways to exploit it and new ones are discovered every day.

  • Not all “web 2.0” sites use new technologies (YouTube and MySpace, for example)
  • A single page on MySpace has a lot of includes.
  • Google Maps also has a lot of includes, but of JavaScript code, and Google code can be potentially insecure

Why care about web 2.0 security

  • People have changed how they interact with web sites: the new generations erase privacy barriers and no longer feel the distance
  • Technologies spread from innovators to traditionalists (today AJAX in financial institutions, health care, government) – mainstream
  • Bugs are affecting people now

Discovery and method manipulation

  • Playing with parameters is still an excellent web attack (you ask the application to do the work for you). As business logic gets more complex, so do parameter vulnerabilities
  • Figuring out web apps is a tough part of a pen-test
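As a hedged illustration of parameter manipulation (not an example from the talk), consider a checkout handler: the vulnerable version trusts a client-supplied price, while the safe one resolves the price server-side. All the names here are hypothetical.

```python
# Hypothetical parameter-manipulation example: an attacker can edit any
# request parameter, so a value the server trusts from the client can be
# forged. The fix is to resolve sensitive values server-side.

CATALOG = {"book": 20.0, "dvd": 15.0}

def checkout_vulnerable(params):
    # BAD: trusts the client-supplied price parameter
    return float(params["price"]) * int(params["quantity"])

def checkout_safe(params):
    # GOOD: looks the price up server-side, trusts only the item id
    return CATALOG[params["item"]] * int(params["quantity"])

tampered = {"item": "book", "price": "0.01", "quantity": "2"}
print(checkout_vulnerable(tampered))  # attacker sets their own price
print(checkout_safe(tampered))        # price comes from the server
```

As business logic grows, the set of parameters like these grows with it, which is exactly why this old attack keeps working.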

Two types of Ajax apps

  • client-server proxy (equivalent to SOAP, client hides javascript)
  • client-side rendering (we can see the javascript and know what it does)

Cross Site Scripting

  • Downstream communication methods are much more complicated
  • User-controlled data might end up in arguments of dynamically created JavaScript, inside JavaScript arrays, and so on. As a result, both attack and defence are more difficult

Four bugs

  • downstream JS arrays: dangerous characters
  • an XSS payload can be tucked into many places
  • XSS might already be in the DOM (document.URL, document.location, document.referrer)
  • AJAX uses “backend” requests that were never expected to be seen directly in a browser
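The first bug, dangerous characters in downstream JavaScript arrays, can be sketched on the server side. This is an illustrative Python sketch, not code from the talk: JSON-encoding the user data (plus escaping "</" so a literal "</script>" cannot close the script block) keeps the payload inert, while naive string interpolation lets it break out.

```python
import json

# Sketch of the "downstream JavaScript" problem: user data embedded in a
# dynamically generated script. Naive interpolation lets dangerous
# characters break out of the string literal; JSON-encoding (plus escaping
# "</" so "</script>" cannot terminate the block) keeps it inert.

def js_array_naive(values):
    return "var names = ['" + "', '".join(values) + "'];"

def js_array_safe(values):
    return "var names = " + json.dumps(values).replace("</", "<\\/") + ";"

payload = "'];alert(document.cookie);//"
print(js_array_naive(["alice", payload]))  # payload escapes the array
print(js_array_safe(["alice", payload]))   # payload stays a plain string
```

The same idea applies to any place where user data flows into dynamically created JavaScript, not just arrays.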


Rich Internet Applications

The term is ill-defined and covers many things: AJAX, Flash, offline mode, decoupling from the browser. There is a huge disparity in features and security design.

Why use RIA

  • to increase responsiveness
  • desktop integration
  • to write full desktop apps

RIA Frameworks

No framework is without limits and security problems. The worst seems to be Adobe AIR, because it shows all the limits of the very old ActiveX model.

The frameworks:

  • Adobe AIR
  • Microsoft Silverlight
  • Google Gears
  • Mozilla Prism

Adobe AIR

  • Full-featured
  • Cross-browser, cross-platform
  • Created with Flex, Flash
  • Can be invoked by browser with arguments, like ActiveX or Flash
  • AIR is better thought of as ActiveX than as Flash++ (code runs with full privileges and can install malware)
  • SWF files can import functionality that allows them to interact with AIR applications
  • SWF files can check install status and version
  • By default, code included in AIR application has full rights
  • There is not a “code access security” model such as in Java or .Net
  • AIR has many ways of loading executable content to run, such as HTML/JS and SWF
  • AIR applications can be bundled as binaries
  • Problems: allowing users to install signed applets is dangerous. Allowing self-signed is terrifying
  • Some suggestions to adobe: change default action, disable unsigned install prompts


Microsoft Silverlight

Microsoft shows a lot of sensitivity toward security:

  • Is the Microsoft Flash equivalent
  • Cross browser and cross platform
  • Subset of the .NET framework
  • The security model is based on .NET
  • Calls to system primitives will fail; you need to isolate that code
  • What could go wrong (threading, DoS attacks against local system)

Google Gears

  • Has SQLite embedded
  • Uses a homegrown API for synchronizing data
  • Has a LocalServer
  • Works offline via SQL database, local assets and a local app server
  • Uses same-origin restrictions to limit access to site databases and LocalServer resource capture
  • Provides for parametrized SQL
  • Unfortunately it allows personalization of the opt-in screen

Yahoo! Browserplus

  • A very bad idea
  • Runs as a browser plugin, with a separate helper process
  • It’s very similar to ActiveX concepts
  • Uses an old version of Ruby. Perfectly safe, as long as you don't use strings and arrays

Mozilla Prism

  • Wraps web apps so that they appear as desktop apps
  • Standalone browser instance
  • Problem: the Javascript included with webapps has full XPCOM privileges (but no content scripting privileges)
  • Problem: the sandbox isn’t real


HTML 5 introduces some new concepts related to client-side storage of information.

  • Introduces DOM storage (sessionStorage, localStorage, database storage)
  • The major goals are more storage space and real persistence, because cookies are considered too small and users delete cookies or won’t accept them
  • This method bypasses pesky users, who can however still disable it with a specific about:config directive

Browser based SQL Databases

  • Injection becomes far more damaging (because of lot of privileges)


  • prevent predictably named data stores
  • parametrize SQL statements
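Parametrized SQL, which both Gears provides and the list above recommends, can be illustrated with Python's sqlite3 module (Gears embeds SQLite as well). The table and data below are invented for the example.

```python
import sqlite3

# Parametrized SQL statements illustrated with Python's sqlite3 module.
# The placeholder keeps attacker-controlled input as data, rather than
# letting it become part of the SQL text.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

hostile = "alice' OR '1'='1"

# BAD: string interpolation -- the input rewrites the WHERE clause
leaked = conn.execute(
    "SELECT secret FROM users WHERE name = '%s'" % hostile).fetchall()

# GOOD: bound parameter -- the input is compared as a literal value
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (hostile,)).fetchall()

print(leaked)  # injection succeeded, the secret is returned
print(safe)    # empty: no user is literally named that
```

In a browser-based database the stakes are higher than usual, because as noted above the injected SQL runs with a lot of privileges.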


  • RIA frameworks widely vary in their security models
  • It is highly likely that web developers will introduce interesting flaws into their desktop applications
Alex Stamos is a Founding Partner of iSEC Partners, Inc, a strategic digital security organization. Alex is an experienced security engineer and consultant specializing in application security and securing large infrastructures, and has taught multiple classes in network and application security. He is a leading researcher in the field of web application and web services security and has been a featured speaker at top industry conferences such as Black Hat, CanSecWest, DefCon, SyScan, Microsoft BlueHat and OWASP App Sec. He is a contributing author of “Hacking Exposed: Web 2.0” and holds a BSEE from the University of California, Berkeley.

Lazy sites are the fastest

Over the years I have collected quite a lot of material and documentation on web site performance and optimization, both on the so-called server side and on what is called the front end.

The time will come, I hope soon, to compile an annotated list of all these resources (you can get an idea by visiting the optimization section of my delicious), but for now I will just mention an article that lays out one of the fundamental issues very clearly: Lazy web sites run faster by Gojko Adzic.

To increase the performance of your sites you can invest in hardware: more processors and faster systems, better connectivity, modern infrastructure. You will notice, however, that even then the web server struggles, by its very architecture, to handle a site whose code is not optimized.

You can then dedicate yourself to rewriting (or refactoring) the code to make it faster. But here too you will soon hit a limit.

The secret, according to Gojko, lies instead in designing a system that takes care to

  • delegate the most complex operations to processes running in the background;
  • never communicate with external systems synchronously, no matter how fast they are;
  • be lazy: whatever does not need to be executed right now is better left for later.

I would also add: eliminate useless processing, such as running expensive queries (for example against a database) on every request for content that almost never changes. Here it could be interesting to experiment with some caching mechanism.
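A minimal sketch of such a caching mechanism, assuming a time-to-live policy (the class and its API are invented for illustration):

```python
import time

# Minimal time-to-live cache: an expensive computation (e.g. a database
# query for content that almost never changes) runs once and its result
# is reused until the entry expires.

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}                  # key -> (expiry timestamp, value)

    def get(self, key, compute):
        expiry, value = self.store.get(key, (0, None))
        if time.time() >= expiry:        # missing or expired: recompute
            value = compute()
            self.store[key] = (time.time() + self.ttl, value)
        return value

calls = 0
def expensive_query():
    global calls
    calls += 1
    return "homepage content"

cache = TTLCache(ttl_seconds=60)
cache.get("homepage", expensive_query)
cache.get("homepage", expensive_query)   # served from cache
print(calls)  # the query ran only once
```

The TTL is the knob that trades freshness for saved work: content that "almost never changes" tolerates a long one.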

Thinking back to the bottlenecks of the projects I have seen up close, most of them could have been avoided by deferring operations that were not immediately essential, such as:

  • sending a confirmation email;
  • transforming files (especially XML);
  • communicating with management systems;
  • computing statistics.
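The first item on that list, deferring a confirmation email, can be sketched with a simple in-process job queue. Every name here is invented, and a real deployment would use a proper queueing system rather than a thread; the point is only the shape of the pattern: the request handler enqueues and returns immediately, a background worker does the slow part later.

```python
import queue
import threading

# Sketch of the "be lazy" principle: the request handler only enqueues
# slow, non-essential work (here a pretend confirmation email) and
# returns immediately; a background worker drains the queue later.

jobs = queue.Queue()
sent = []

def worker():
    while True:
        address = jobs.get()
        if address is None:              # sentinel: shut the worker down
            break
        sent.append("confirmation sent to %s" % address)

def handle_signup(address):
    jobs.put(address)                    # fast: just enqueue and return
    return "thanks, check your inbox soon"

thread = threading.Thread(target=worker)
thread.start()
print(handle_signup("user@example.com"))
jobs.put(None)                           # tell the worker to finish
thread.join()
print(sent)
```

The user-visible latency is just the enqueue; the email, the XML transform, or the statistics run can take as long as they need without anyone waiting.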

You can also find thought-provoking examples online. Every time I use Feedburner's “all time” function to analyze the overall traffic of my feeds, I find myself waiting at least ten seconds. The system is probably computing the totals in real time, when it could have done so in advance. There is nothing wrong with waiting, although sometimes, under server load, a timeout is returned instead. Perhaps not the ideal way to handle this feature, even if only a minority of users need it.