
PHP, APC and sessions

While playing with Redis/Rediska and sessions, I wanted more numbers to compare this solution to a traditional MySQL-based approach, which also made me revisit the idea of a CouchDB-based session handler for Zend_Session.

Implementing this handler, I ran into a weird issue:

Fatal error: Undefined class constant 'ALLOW_ALL' in /usr/home/till/foo/trunk/library/Zend/Uri/Http.php on line 447
Call Stack
#   Time    Memory  Function    Location
1   0.7357  3914816 Foo_Session_SaveHandler_Couchdb->write( )   ../Couchdb.php:0
2   0.7358  3916600 Foo_Couchdb->query( )   ../Couchdb.php:94
3   0.7361  3969464 Zend_Http_Client->__construct( )    ../Couchdb.php:368
4   0.7361  3969544 Zend_Http_Client->setUri( ) ../Client.php:250
5   0.7362  3976568 Zend_Uri::factory( )    ../Client.php:267
6   0.7365  4003352 Zend_Uri_Http->__construct( )   ../Uri.php:130
7   0.7367  4006216 Zend_Uri_Http->valid( ) ../Http.php:154
8   0.7368  4006216 Zend_Uri_Http->validateHost( )  ../Http.php:281

The funny thing is that APC had been added to the mix at the same time (apc_store() and apc_fetch(), to cache the configuration), and when I disabled it, the error disappeared.

Talking to Gopal, one of the leads of APC (btw, cheers for helping!), on IRC (#[email protected]), I thought at first that the issue was autoload-related, and we thought the order in which the extensions are loaded might make a difference. From Rasmus' comment, I later discovered bug #16745 with a proposed workaround: call session_write_close().

On a sidenote: I'm still not sure why the error is expected behavior for some people, yet it works with some PHP and APC versions and breaks with others. From what I gathered, it broke for me with PHP 5.2.6, 5.2.11 and 5.3.2. I tried all of them with the latest version of APC (3.1.3p1).

Here's how I fixed it for myself

I have a Lagged_Application class to bootstrap my application. Lagged_Application is kind of like Zend_Application sans a lot of safety nets and magic. Since it does a lot less, it's also quite a bit faster. To get an idea, check out my Google Code repository (which, alas, holds a rather outdated version of it).

I added the following function to it:

<?php
class Lagged_Application
{
    // (...)

    // Write and close the session while all objects still exist.
    public function shutdown()
    {
        session_write_close();
    }
}

My index.php looks like the following:

<?php
include 'library/Lagged/Application.php';
$app = new Lagged_Application;
$app->setEnvironment('production');
$app->bootstrap();

register_shutdown_function(array($app, 'shutdown'));

Somewhat related — shutdown() could be a good start to tear down other objects as well, when needed.

More?

Now that this issue is fixed, I also think the infamous "Fatal error: Exception thrown without a stack frame in Unknown on line 0" originates from the same problem, at least when sessions and APC are around. But I should dig a little deeper to verify this.

All in all, it's a pretty weird issue and IM(very)HO, objects shouldn't be torn down or some sort of before hook should be executed to avoid this. But that's especially easy to say if you don't do C. :-)

Fin

That's all. I sure hope this saves someone else some time.

Foursquare: How private is private?

Location is one of my hobbies. Even though I don't map items for openstreetmap and the like, I still try out at least every location-related startup there is.

Foursquare, as you probably know, is a location-based game: you get points and badges for checking into locations. The points are aggregated into a weekly leaderboard (of penis envy) and everyone gets a fresh start every Monday morning.

Check-in

Foursquare has different check-in modes. One is the regular, where your location gets published to your friends (and also Twitter/Facebook if those are linked up) and the other is called "off the grid" — supposedly not even your friends know where you're at.

Think of a possible scenario — cheating on your diet? You can still check into McDonald's and get the points but your boyfriend wouldn't know you did it.

Downsides

Is off the grid really off the grid? Far from it.

If you play Foursquare on a more national or global scale (e.g. between cities) even though you check in off the grid, your general location is updated on your profile.

So let's say I went from Berlin to Munich and didn't want anyone else to know. I still check in off the grid at the airport in Munich (to get a stupid badge or whatever) and my Foursquare profile would not show where exactly I checked in (e.g. the airport), but it would say "Till (Munich)".

From what I noticed the other week, if you checked into a venue off the grid, your icon would still show up on the venue's page on Foursquare, which doesn't really sound like what's advertised either.

So how is that check-in actually private? Well, not at all.

Location without a check-in

But wait, it gets even better!

I noticed that Foursquare's Android and Blackberry applications update your location without a check-in. From what I gathered, it's enough to look at your friend list, and it's sure as hell enough to scan for places around you (to get caught ;-)).

Friend list

The friend list always shows people from the city you're in. So as soon as you open the application, it displays those. Whenever you leave the city, it'll say something like "Friends in other cities" and, voilà, your profile got updated.

All of this is powered by the GPS tracking in your nifty phone. Pretty cool, eh?

Scanning places

Last weekend I went to Chemnitz to attend a family thing. I didn't check in anywhere; I only briefly scanned to see who and what was around me. Still, my profile got updated.

Foursquare: my history

The above shows two check-ins, Jet is in Berlin (gas station) and Aral is near Dresden (another gas station — but don't worry, I just bought a magazine and took my dog for a walk — didn't need to fuel up).

And here's a shot of my profile, which clearly states I'm in Chemnitz:

Foursquare: my profile

I guess I should have expected it, but I'm still not sure if I like it. The upside is, it's pretty accurate too (note, sarcasm)!

And in case anyone has doubts: I'm sure someone with elite jedi powers from Foursquare can verify that I didn't cheat.

Share

I think the biggest misconception here is that I expected Foursquare to share my location only when I do something using the application or the website. I'd really like it to be a more active thing when my profile is populated with data. On the other hand, that's probably inconvenient as hell for … Foursquare?

Furthermore

I haven't really checked into this any further, but does anyone know if the Foursquare applications use background data?

I would like to know how much of my location is shared on a regular basis and also how granular the gathered data is. For example, Foursquare only updates the city/country on your profile, but do they really keep latitude and longitude?

Fin

Not the usual programming bs. :-) And that's all for today.

If in doubt about your data, you should disable location based services.

The very least you can do is to learn enough about them in order to understand (and comprehend) what's happening with your data.

jQuery post requests with a json response, sans eval()

I know some of you out there are probably tired of jQuery and people raving about its goodness, but bear with me! Because jQuery never ceases to amaze me, especially when I haven't looked at it (or client-side JavaScript code in general) in a good year or so.

Refactoring

I've been refactoring some of my old JavaScript libs on a project and I noticed that I had used evil eval() all over the place to parse the JSON from our API. Guess we all know that this is not just a security issue since it allows code execution, but also a performance hit. And here's how to get around it. :-)

A code snippet to automatically parse a JSON response from a $.post request:

    $.post(
      '/the-url-post-to',
      data,
      function(json) {
        // handle response
      },
      'json'
    );

Note: The 4th parameter 'json' in $.post(). Magic.
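To illustrate why skipping eval() matters, here is a minimal sketch in plain JavaScript. It assumes a native JSON.parse is available (modern browsers, Node.js). The helper name parseResponse and the sample payloads are made up for this example; this is not jQuery's internal code, just the general idea of a strict parser.

```javascript
// Hypothetical helper: parse a response body without eval().
function parseResponse(body) {
  try {
    return { ok: true, data: JSON.parse(body) };
  } catch (err) {
    // JSON.parse throws a SyntaxError on anything that is not valid JSON,
    // including executable code that eval() would happily run.
    return { ok: false, error: err.name };
  }
}

// Made-up payloads: one valid JSON document, one piece of code.
const good = parseResponse('{"status": "ok", "items": [1, 2, 3]}');
const bad = parseResponse('alert("pwned")');

console.log(good.data.status);   // ok
console.log(bad.ok, bad.error);  // false SyntaxError
```

A strict parser only ever builds data, so a malicious or broken response fails loudly instead of running in the page.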

Fin

That's all.

PHP: So you'd like to migrate from MySQL to CouchDB? - Part III

This is part three of a beginner series for people with a MySQL/PHP background. Apologies for the delay, this blog entry has been in draft since the 13th of December of last year (2009).

Follow these links for the previous parts:

Recap

Part I introduced the CouchDB basics, which included basic requests using PHP and cURL. Part II focused on create, read, update and delete operations in CouchDB. I also introduced my nifty PHP CouchDB library called ArmChair!

ArmChair is my own very simple and (hopefully) easy-to-use approach to accessing CouchDB from PHP. The objective is to develop it with each part of this series to make it a more comprehensive solution.

Part III

Part three will target basic view functions in CouchDB — think of views as a WHERE-clause in MySQL. They are similar, but also not. :-)

Map-Reduce-Thingy

If you read up on CouchDB before coming to this blog, you will probably have heard of map-reduce. There, or maybe elsewhere. A lot of people attribute Google's success to map-reduce, because it lets them process a lot of data in parallel (across multiple cores and/or machines) in relatively little time.

I guess the PageRank in Google Search or Google Analytics are examples of where it could be used.

In the following, I'll try to explain what map-reduce is. For people without a science degree. (And that includes me!)

Map

Generally, map-reduce is a way to process data. It's made of two things, map and reduce.

The idea is that the map-function is very robust and it allows data to be broken up into smaller pieces so it can be processed in parallel. In most cases the order data is processed in doesn't really matter. What generally counts is that it is processed at all. And since map allows us to run the processing in parallel, it's easier to scale out. (That's the secret sauce!)

And when I write scale-out, I'm not suggesting you build a cluster of 1000 servers in order to process a couple thousand documents. In this case it's already sufficient to utilize all cores in my own computer when the map task is run in parallel.

In CouchDB, the result of map is a list of keys and values.

Reduce

Reduce is called once the map-part is done. It's an optional step in terms of CouchDB — not every map requires a reduce to follow.

Real world example

  • take a simple photo application (such as flickr) with comments
  • use map to sort through the comments and emit the names of users who left one
  • use reduce to only get unique references and see how many comments were left by each user

In SQL:

SELECT user, count(*) FROM comments GROUP BY user
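The same idea can be sketched in plain JavaScript, the language CouchDB views are written in. This is a toy simulation, not CouchDB itself: the document fields (type, user) and the runView() helper are assumptions made up for the example. In a real view, emit() is provided by CouchDB as a global and the reduce function has the signature (keys, values, rereduce).

```javascript
// Sample documents; the "type" and "user" fields are assumptions for this sketch.
const docs = [
  { _id: 'c1', type: 'comment', user: 'till', text: 'Nice shot!' },
  { _id: 'c2', type: 'comment', user: 'anna', text: 'Where is this?' },
  { _id: 'c3', type: 'comment', user: 'till', text: 'Berlin.' },
  { _id: 'p1', type: 'photo', user: 'anna' },
];

// Map: called once per document, emits one key/value pair per comment.
function map(doc, emit) {
  if (doc.type === 'comment') {
    emit(doc.user, 1);
  }
}

// Reduce: receives all values emitted for one key and sums them up.
function reduce(values) {
  return values.reduce((total, n) => total + n, 0);
}

// Toy runner standing in for CouchDB: collect the emitted rows,
// group them by key, then reduce each group.
function runView(allDocs) {
  const groups = new Map();
  for (const doc of allDocs) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduce(values);
  }
  return result;
}

const counts = runView(docs);
console.log(counts); // { till: 2, anna: 1 }
```

Each call to map() only looks at a single document, which is what makes it possible to spread the work across cores or machines. The grouping step plays the role of GROUP BY in the SQL query.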

Why the fuss?

Just so people don't feel offended. Map-reduce is slightly more complicated than my example SQL-query but it's also not some secret-super-duper thing. Its strength is really parallelization which requires the ability to break data into chunks to process them. The end.

EC2 security group owner ID

I recently had the pleasure of setting up an RDS instance, and it took me a while to figure out what the --ec2-security-group-owner-id parameter needs to be populated with when you want to allow access to your RDS instance from instances with a certain security group.

To cut to the chase, you need to log into AWS and then click the following link — done.