For a new hosting service I am using the new WordPress 3.x multi-site features extensively. For new customers I added a custom theme based on the default Twenty Ten theme. As the main audience aren’t native speakers (of English, that is), I also need translations for my custom theme. I followed the path twentyten shows by copying my .mo file into a subdirectory of the theme called languages. Unfortunately the file is not loaded automagically, so you need some custom code, which I added to my custom theme’s functions.php file.

const MY_THEME = 'name of your theme';
function mytheme_setup() {
	// The first occurrence of MY_THEME is the name of your text domain as used in the templates
	load_theme_textdomain( MY_THEME, str_replace('twentyten', MY_THEME, TEMPLATEPATH) . '/languages' );
}
// Tell WordPress to run mytheme_setup() when the 'after_setup_theme' hook is run.
add_action( 'after_setup_theme', 'mytheme_setup' );

If you are defining your own version of twentyten_setup() just add the line with load_theme_textdomain(…) in there. OOP in WordPress would be nice here and make things so much easier!
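To make the path handling in the str_replace() call above easier to follow, here is a minimal sketch of what it computes. The helper name and the example path are made up for illustration; in WordPress the real value comes from TEMPLATEPATH, and the sketch assumes the theme's directory name is the same as its text domain.

```php
<?php
// Hypothetical helper mirroring the str_replace() call above: it derives the
// custom theme's languages directory from the parent theme's path.
function child_theme_languages_dir($templatePath, $parentSlug, $childSlug)
{
    // Note: str_replace() replaces every occurrence of $parentSlug in the path.
    return str_replace($parentSlug, $childSlug, $templatePath) . '/languages';
}

// Example path only; the real value depends on your installation.
echo child_theme_languages_dir('/var/www/wp-content/themes/twentyten', 'twentyten', 'mytheme'), "\n";
```

Because every occurrence is replaced, a path that contains "twentyten" somewhere else would be altered as well. WordPress also offers get_stylesheet_directory(), which returns the directory of the active (child) theme directly and may be a sturdier choice.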


The Seagull framework which we love to use for our projects offers two built-in methods for session management: files and database.

In our podcast project we started out using file-based sessions years ago. A while back we switched to so-called extended sessions, which are saved in the database. Not so long ago we switched back to files again, as the session requests to the database alone became a serious bottleneck in our installation.

We relaunched our podcast service at the beginning of September with sessions still stored in files. This might have worked well if we had not switched to serving the website’s resources, including the sessions, through a highly available NFS4 server. The server with the active NFS4 export randomly hit its maximum capacity, making the service unusable.

I could have tried to switch back to the database handler, as we have new, much more powerful machines. But I did not even bother, as I assumed I’d eventually run into the same problems as before. Instead I researched sessions saved in memory. I knew that PHP offers shared-memory sessions; I tried those a while ago with no luck. During my research I came across memcached sessions. As I am already using memcache to store objects in the application, I thought this would be ideal. And as it turned out today, when I applied the following changes to our live system, it is!

To make Seagull work with memcached sessions only two minor changes to the code base had to be made. In the SGL core library Session.php change the constructor as follows:

        if ($conf['session']['handler'] == 'database') {
             $ok = session_set_save_handler(
                array(& $this, 'dbOpen'),
                array(& $this, 'dbClose'),
                array(& $this, 'dbRead'),
                array(& $this, 'dbWrite'),
                array(& $this, 'dbDestroy'),
                array(& $this, 'dbGc')
                );
        } elseif ($conf['session']['handler'] == 'memcache') {
           session_save_path($conf['session']['save_path']);
        } else {
            session_save_path(SGL_TMP_DIR);
        }

The second change is in the _init function:

            if ($conf['session']['handler'] == 'file') {
                //  manually remove old session file, see http://ilia.ws/archives/47-session_regenerate_id-Improvement.html
                $ok = @unlink(SGL_TMP_DIR . '/sess_'.$oldSessionId);
            } elseif ($conf['session']['handler'] == 'database') {
                $value = $this->dbRead($oldSessionId);
                $this->dbDestroy($oldSessionId);
                $this->dbRead(session_id());          // creates new session record
                $this->dbWrite(session_id(), $value); // store old session value in new session record
            } elseif ($conf['session']['handler'] == 'memcache') {
                // do nothing - just do not complain or fail
            } else {
                die('Internal Error: unknown session handler');
            }

So in both places just add the elseif branch for memcache. That’s it!

To make Seagull use the memcache session handler adjust your config accordingly, e.g. my local one looks like the following:

$conf['session']['handler'] = 'memcache';
$conf['session']['save_path'] = 'tcp://127.0.0.1:11211?persistent=1&weight=1&timeout=1&retry_interval=15';

On the live system we use several memcache servers, which you can address with a comma-separated list of servers, e.g.

$conf['session']['handler'] = 'memcache';
$conf['session']['save_path'] = 'tcp://127.0.0.1:11211?persistent=1&weight=1&timeout=1&retry_interval=15,tcp://127.0.0.1:11212?persistent=1&weight=2&timeout=1&retry_interval=50';
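With more than a couple of servers the save_path string gets hard to read. The following is a minimal sketch of a helper that assembles it from an array; the function name and the array layout are assumptions for illustration and not part of Seagull. Note that the Session.php change above only sets the save path, so it presumably relies on session.save_handler being set to memcache elsewhere, e.g. in php.ini.

```php
<?php
// Hypothetical helper (not part of Seagull): builds a session save_path
// string for the memcache session handler from a list of servers.
function build_memcache_save_path(array $servers)
{
    $parts = array();
    foreach ($servers as $server) {
        $url = 'tcp://' . $server['host'] . ':' . $server['port'];
        $options = isset($server['options']) ? $server['options'] : array();
        if ($options) {
            // Append the per-server options as a query string.
            $url .= '?' . http_build_query($options, '', '&');
        }
        $parts[] = $url;
    }
    // Multiple servers are separated by commas.
    return implode(',', $parts);
}

// Reproduces the two-server example from the config above.
echo build_memcache_save_path(array(
    array('host' => '127.0.0.1', 'port' => 11211, 'options' => array(
        'persistent' => 1, 'weight' => 1, 'timeout' => 1, 'retry_interval' => 15)),
    array('host' => '127.0.0.1', 'port' => 11212, 'options' => array(
        'persistent' => 1, 'weight' => 2, 'timeout' => 1, 'retry_interval' => 50)),
)), "\n";
```

The result can be assigned to $conf['session']['save_path'] in place of the hand-written string.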

Make sure your memcache server(s) listen(s) on the correct IP address and port. Otherwise you will get a blank screen and/or a nasty error message.

Now you should be ready to go. Enjoy a speed you have never seen before in your PHP application!


As the title says, the following shell construct is one way to find a blank line at the beginning of a PHP file.

find . -name \*.php -exec grep -x "^$" -m 1 -n -H {} \; | grep ":1:"

With find we search recursively from the current directory for all files ending in .php. We search the files found with the first grep and a regular expression ("^$") for an empty line, where only the first match interests us (-m 1). The -n switch prefixes each match with its line number, and the -H switch prints the name of the file it was found in. We pass the output on through a pipe ( | ). With the second grep we keep only the results whose blank line is on line 1.
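Where find and grep are not available, the same check can be sketched in PHP itself. This is only an illustration under assumptions: the function names are made up, and unlike grep -x "^$" the sketch also treats a first line consisting of a Windows line break (\r\n) as blank.

```php
<?php
// Returns true when the first line of the file is empty.
function starts_with_blank_line($path)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return false;
    }
    $firstLine = fgets($handle);
    fclose($handle);
    // fgets() returns false for an empty file; a blank first line is just a line break.
    return $firstLine !== false && rtrim($firstLine, "\r\n") === '';
}

// Walks $dir recursively and collects all .php files starting with a blank line.
function find_php_files_with_leading_blank_line($dir)
{
    $matches = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile()
            && $file->getExtension() === 'php'
            && starts_with_blank_line($file->getPathname())
        ) {
            $matches[] = $file->getPathname();
        }
    }
    sort($matches);
    return $matches;
}
```

Calling find_php_files_with_leading_blank_line('.') returns the affected files below the current directory as a sorted array of paths.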


Running a successful website is a constant struggle for performance and speed improvements. Read how we have used the Seagull framework to build our portal podcast.de. As a start-up we provide a web-based service to find, comment on, play and recommend audio and video podcasts. At the moment the service is intended for a German-speaking audience only, but we are prepared for internationalisation thanks to Seagull.

In February 2006 I published the first version of podcast.de, based on an early 0.4 release of Seagull. It took me over fourteen months to modify Seagull to fit my needs. I wrote several new modules and hacked in lots of things like clean URLs. Many of these things are now part of the core (and were not written by me). One thing of mine did make it into the release: the export module, which generates RSS feeds. RSS feeds are what our system is really about.

Not knowing any better, I built the portal on a MySQL database. An XML database might have been a better choice, at least in some areas. The biggest table holds over 500,000 entries. These are all the references to podcast episodes and their metadata. In the meantime we upgraded from MySQL 4.x to the latest stable 5.x release. Database performance has been an issue from the start; at the beginning it just was not noticeable enough. The database layout should have been designed to be less complex. With the amount of data and the overall usage growing, we ran into home-grown issues. With indexes, query optimisations, better hardware, newer MySQL releases, a migration to split sequence tables, database caching, MySQL tweaking and, most importantly, the replacement of most of the DataObjects code, we got fairly good performance on the database side.

Caching is really the one life-saver every webmaster should look out for. Seagull has built-in template caching and, nowadays, library caching support, which is nice. We used the SGL_Cache class for general-purpose data caching as well. With a rising number of pages and a climbing user base, the file-based cache turned into a bottleneck, as the hard drive had to seek through several thousand files. A different file system for smaller files might have helped for a while. We chose to replace the file-based caching with memory-based caching. It is just a five-line hack in SGL_Cache and a little wrapper! In our case we decided to use memcache. Memcache(d) is commonly used. The PHP part is installable through PEAR, and with PHP 5.x, which we switched to during our development cycle, you have an OO interface. Shared memory would have been an alternative. At first memcache worked very well, but once a few thousand objects were stored it slowed down. For a while we had to flush the cache every now and then, because we could not find anything on the net that explained this behaviour. Migrating to a new server with a newer Linux and a newer memcached installed solved the problem. We could not replace memcached earlier because of dependency issues.
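The wrapper mentioned above is not shown in this post, so the following is only a sketch of what such a wrapper could look like, not the code we run. The class names are made up, the get/save/remove method names are modelled on PEAR's Cache_Lite, and whether SGL_Cache expects exactly this interface is an assumption. The backend is any object offering get(), set() and delete() the way PHP's Memcache class does; a small in-memory stand-in is included so the sketch runs without a memcached server.

```php
<?php
// Hypothetical wrapper: gives a memcache-style backend the get/save/remove
// methods of a file-based cache. $backend is assumed to provide
// get($key), set($key, $value, $flag, $expire) and delete($key).
class MemoryCacheWrapper
{
    private $backend;
    private $lifetime;

    public function __construct($backend, $lifetime = 3600)
    {
        $this->backend  = $backend;
        $this->lifetime = $lifetime;
    }

    // Memcached keys are limited in length and must not contain whitespace,
    // so the id and group are hashed into one fixed-length key.
    private function key($id, $group)
    {
        return md5($group . ':' . $id);
    }

    public function get($id, $group = 'default')
    {
        return $this->backend->get($this->key($id, $group));
    }

    public function save($data, $id, $group = 'default')
    {
        return $this->backend->set($this->key($id, $group), $data, 0, $this->lifetime);
    }

    public function remove($id, $group = 'default')
    {
        return $this->backend->delete($this->key($id, $group));
    }
}

// In-memory stand-in for a memcache connection; returns false on a miss.
class ArrayBackend
{
    private $data = array();

    public function get($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : false;
    }

    public function set($key, $value, $flag = 0, $expire = 0)
    {
        $this->data[$key] = $value;
        return true;
    }

    public function delete($key)
    {
        unset($this->data[$key]);
        return true;
    }
}
```

With the memcache extension installed, a connected Memcache object could be passed to the constructor instead of ArrayBackend.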

Another performance issue is rendering the navigation. This got a bit better over time. The solution again is to cache most parts of it. There are still some other minor issues with the DB-based navigation. I have not tried the file-based one yet. With DB-based navigation you are not able to add an alias with two slashes. That is the biggest drawback for us at the moment. I wrote my own alias strategy for this, which works well. Proper I18N might be the next issue once we need it.

I am still not friends with emailing in Seagull. By default Seagull sends HTML mails. I am not a big fan of that. I want to send text and HTML mails combined. Ages ago I sent in a patch which did not get accepted. A couple of months back an almost identical patch got sent in. That one did not get accepted either. Anyhow, we are using this modification successfully.

Now that I have named all the issues with Seagull, let's have a look at the good parts. The overall structure is complex but fairly clean and intuitive once you start working with it. The overall progress might be slow, but the framework is steadily improving thanks to Demian, Lakiboy and all the other active developers. The framework is flexible and versatile. We are using it with several different interfaces (web, mobile, IPTV, search). I hooked up the Zend libs successfully without problems. We have several crawlers running using cron jobs and the CLI interface. Migrating from 0.4 to 0.6 was an effort, but possible thanks to the loosely coupled core.

Since the last relaunch of podcast.de at the end of September '07 we have been using AJAX intensively. We moved the database to its own server. We have the static properties on another separate server. We tried to use lighttpd as the HTTP server instead of Apache's HTTP server to reduce the load. We failed. I could not get it stable under load.

These days we welcome over 10,000 unique visitors a day. Almost 25,000 people have registered for our service. We serve over two million pages a month with one Seagull server. We have 18 custom Seagull modules, 6 different themes (not all in use yet) and three years of development spent on the system. The database now has 112 tables with around 1 GB of data. The next step in performance upgrades will be a second database server as a read-only slave. Before that we have to modify the SQL query execution to distinguish between read and write operations. At the moment there is room for more, so pay us a visit at www.podcast.de and let us know how you like it!
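Distinguishing between read and write operations could start with something as simple as the following sketch. It is an idea under assumptions, not what runs on podcast.de: the function names are made up, and the keyword list is a guess at which statements are safe to send to a read-only slave.

```php
<?php
// Hypothetical helper: decides whether a statement may go to a read-only slave.
function is_read_query($sql)
{
    // Skip leading whitespace and opening parentheses, e.g. "(SELECT ...) UNION ...".
    $sql = ltrim($sql, " \t\r\n(");
    return (bool) preg_match('/^(SELECT|SHOW|DESCRIBE|EXPLAIN)\b/i', $sql);
}

// Returns the slave connection for reads and the master connection for everything else.
function pick_connection($sql, $master, $slave)
{
    return is_read_query($sql) ? $slave : $master;
}
```

A prefix check like this is deliberately naive. Statements such as SELECT ... FOR UPDATE, and reads that must see a write made a moment earlier, would still need to go to the master because a slave can lag behind.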


Inspired by a long talk yesterday at Berlin2.0, I decided to dig into Google's AdWords API offerings today. Getting an API code is terribly difficult. First you have to create a My Client Center account if you do not have one yet. Then create your AdWords API account and have a credit card ready. You need it to get the developer and application tokens.

I had hoped to find a toolkit from my favorite package source, PEAR, or for the Zend Framework, but since there is an offering on code.google.com, nobody has bothered to develop anything. I downloaded the Google APIlity PHP Library for AdWords and the APIlity MySQL Schema to modify them for my own purposes.
