Attila Oláh

SRE @ Google Zürich

  • 21 Nov 2010

    2010 Neoplanta Sprint

    Today is the first day of the 2010 Neoplanta Sprint, Serbia. Now I’m not really an expert in Plone, but I have had some experience with Zope 3 BlueBream, and since the sprint is taking place in my favourite city, I decided to take part.

    The event is taking place at Hotel Putnik, in the city centre. I’m arriving this afternoon, around 5pm. Looking forward to meeting new people.

    Novi Sad at night, viewed from Petrovaradin Fortress. Picture: EESTEC.

    Petrovaradin Fortress at night. Picture: me.

    • programming,
    • travel
  • 08 Aug 2010

    Benchmarking urllib2 vs. urllib3

    urllib3 seems to be a long-abandoned project on PyPI. However, it has some features (like re-using connections, aka HTTP Keep-Alive) that are not present in the Python 2 version of urllib and urllib2. Another package that provides HTTP Keep-Alive is httplib2.

    Benchmark results on a single host

    Keep-Alive can significantly speed up your scraper or API client if you’re connecting to a single host, or a small set of hosts. This example shows the times spent downloading random pages from a single host, using both urllib2 and urllib3:

    Figure: urllib2 vs. urllib3 benchmark results (urllib-benchmark-results.png).

    The benchmark script

    Here’s a script that benchmarks urllib2 and urllib3 against the domain theoatmeal.com and writes the results to a CSV file (easy to import into a Google Docs spreadsheet to generate a nice chart).

    If you run it, it will also print a summary of the results, something like this:

    Starting urllib2/urllib3 benchmark...
     * crawling: https://theoatmeal.com/
     * crawling: https://theoatmeal.com/comics/party_gorilla
     * crawling: https://theoatmeal.com/comics/slinky
     * crawling: https://theoatmeal.com/blog/floss
     ...
    Finishing benchmark, writing results to file `results.cvs`
    Total times:
     * urllib2: 183.593553543
     * urllib3: 95.9748189449
    

    As you can see, urllib3 appears to be nearly twice as fast as urllib2 in this run (a ratio of about 1.9 between the total times).

    • programming
    Source
  • 07 Aug 2010

    Writing your own DOM ready listener

    Today I asked a question on StackOverflow about how to attach a function to the browser’s DOM ready event in a cross-browser way, but without exporting any globals (keeping everything in an anonymous function’s closure) and without including any external file. As a result, with some help from a friendly StackOverflow user, I put together a code snippet that:

    • takes a single function as argument,
    • attaches that function to the DOM ready event in all browsers supported by jQuery,
    • is idempotent (will never fire the given function twice),
    • does not export any globals,
    • compiles down to less than 590 bytes (less than 300 bytes gzipped),
    • is based on the jQuery source code (I take no credit for it).

    Since the credit belongs to the jQuery authors, please include jQuery’s license comment if you want to use it.
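
    The snippet itself is not reproduced on this page. As a rough illustration of the idea only, here is a minimal sketch that is not the jQuery-derived original: it assumes a browser that supports document.readyState and addEventListener, so it does not cover the older browsers jQuery handled at the time. The name onReady and the optional doc parameter are hypothetical; doc is there only so the function can be tried outside a browser.

```javascript
// Minimal DOM ready helper (a sketch under the assumptions stated above).
// `onReady` and `doc` are hypothetical names, not part of the original snippet.
function onReady(fn, doc) {
    doc = doc || document;

    // The DOM has already been parsed: call the function straight away.
    if (doc.readyState !== 'loading') {
        fn();
        return;
    }

    var fired = false;

    // Guard so the function runs at most once, however often the event fires.
    var run = function () {
        if (fired) { return; }
        fired = true;
        doc.removeEventListener('DOMContentLoaded', run, false);
        fn();
    };

    doc.addEventListener('DOMContentLoaded', run, false);
}
```

    In a real page the declaration would sit inside an anonymous function, so that nothing is exported to the global namespace, and be called as onReady(function () { /* ... */ });.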

    Update

    Here is a CoffeeScript version:

    • programming
    Source
  • 07 Aug 2010

    Asynchronous JavaScript loading

    If your website has a sufficient amount of static content, it might be a good idea to load all the extra JavaScript files asynchronously. My old blog, for example, shows the static content as soon as possible, allowing its visitors to read the main content (the article) while the not-so-important content (like Facebook Like buttons, “Web 2.0” widgets and all that crap) is on its way from the server.

    I use the following snippet to load external JS:

    (function () {
    
        // Asynchronous JS loader
        var load = (function () {
            // Private members
            var create = function (url) {
                var script = document.createElement('script');
                script.type = 'text/javascript';
                script.async = true;
                script.src = url;
                document.getElementsByTagName('head')[0].appendChild(script);
            };
            // Public: export a Function object
            return function (url) {
                setTimeout(function () { create(url); }, 1);
            };
        })();
    
        // Call the other script files from here
        load('https://www.example.com/foo-script.js');
        load('/media/js/some-local-js-file.js');
    
    })();
    

    I put all that stuff in a closure so nothing gets exported to the global namespace. Note that the setTimeout trick is from here.

    Note also that some JavaScript files will not work if you load them like this. These are the ones that expect to run before the DOM is ready; less.js is one example.

    Good candidates for asynchronous loading are the Facebook JavaScript SDK and Google Analytics.

    Update

    • You can load less.js too, just trigger a less.refresh() after it has been loaded.
    • Have a look at Richard Neil Ilagan’s implementation as well.
    • programming
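
    For the less.js case mentioned in the update above, the loader needs a way to run code once the file has arrived. Here is a minimal sketch, assuming a browser that fires the load event on script elements (older Internet Explorer versions used onreadystatechange instead). The names loadWithCallback, callback and doc are hypothetical; doc exists only so the function can be tried outside a browser.

```javascript
// Sketch: asynchronous loader with an optional completion callback.
// `loadWithCallback`, `callback` and `doc` are hypothetical names.
function loadWithCallback(url, callback, doc) {
    doc = doc || document;
    var script = doc.createElement('script');
    script.type = 'text/javascript';
    script.async = true;
    if (typeof callback === 'function') {
        // Called by the browser once the file has been fetched and executed.
        script.onload = function () { callback(script); };
    }
    script.src = url;
    doc.getElementsByTagName('head')[0].appendChild(script);
    return script;
}

// Usage in a page, assuming less.js is served from this made-up path:
// loadWithCallback('/media/js/less.js', function () { less.refresh(); });
```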
  • 08 May 2010

    Berlin, Delft, and more

    I’ve spent the last two weeks in Berlin, Germany, working on two projects: an ecommerce web application and a Facebook app.

    The former is actually an ongoing project, involving many interesting technologies, such as working with the eBay API, extending the Django admin interface, geocoding (and reverse geocoding), CMS, domain and subdomain management, etc. The other project is a small app that works based on the users’ location.

    As a result of these two weeks, Sproud Ventures UG (the company that sponsored the event) will open-source a Python package for doing geolocation lookups and other useful things in web applications. The package contains a raw WSGI middleware for doing IP-based, keyword-based and coordinate-based lookups. Other handy features include Django template tags and a template context processor. I’ll write about it in detail when it gets released (that is, when I find some time to improve test coverage and review the documentation).

    During my stay I tested a few toolkits and learned some new techniques. Here are the highlights, including some CSS and JavaScript tricks:

    • LESS, a very neat tool for writing structured, object-like CSS. It lets you define your template colours in a separate library, import it and use it in other styles, use variables and basic arithmetic, and then compiles everything into a valid, nicely formatted CSS file. I’ll definitely use it in my future projects.
    • The JavaScript Revealing Module Pattern can be very handy too. I’ve written lots and lots of hacky JavaScript snippets in the past, so now I’m trying to force myself to write more organized code.
    • Always use twod.wsgi when working with Django. Makes life much easier.
    • Use even more third-party tools. There are so many great libraries out there. There are a lot of crappy ones too, but some of them can be improved. +1 for publicly forking projects on GitHub and BitBucket.
    • Use YAML, even more.
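
    As an illustration of the Revealing Module Pattern mentioned in the list above, here is a minimal sketch. The counter module and its method names are hypothetical and are not taken from either project.

```javascript
// Revealing Module Pattern: private state lives in the closure, and only
// the members named in the returned object are visible to callers.
// `counter` and its methods are hypothetical names used for illustration.
var counter = (function () {
    var count = 0; // private, not reachable from outside

    function increment() {
        count += 1;
        return count;
    }

    function reset() {
        count = 0;
    }

    function current() {
        return count;
    }

    // Reveal the public API by pointing at the private functions.
    return {
        increment: increment,
        reset: reset,
        current: current
    };
})();
```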

    Next stop, Delft, The Netherlands. New week, new project, new experiences and friends.

    Hermina and I in the Antwerp zoo.

    • travel
  • 14 Apr 2010

    Simple Python word finder

    A few days ago one of my customers asked me to put together a very simple Python script that would search through a text and find all the words that contain both letters and numbers. As simple as it may be, I’ve decided to make it a little more robust by wrapping it in a class that can be configured to split words and check patterns for each word.

    This is just too simple to be released as a module, but I’ll put the code here in case anyone needs it. I’m placing it in the public domain.

    • programming
    Source
Attila Oláh //
atl@google.com