Posted: August 30th, 2012 | Author: Giv | Filed under: Python | 1 Comment »
Update: 13 Feb 2013
Very excited to see the Smithsonian’s Cooper Hewitt labs use my library for their collection
http://labs.cooperhewitt.org/2013/giv-do/
——————————————————————–
I’ve started a new project on Github called RoyGBiv. Fork it!
It allows you to feed it an image and return a bunch of info about its colors. Currently you can get a list of its most prominent colors/palette and the average color used. Soon I’m (with your help) hoping to add more features.
Try it out here
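To give a rough idea of what "most prominent colors" and "average color" mean, here is a minimal pure-Python sketch. It is not RoyGBiv's code or API: the function names `average_color` and `prominent_colors`, the bucket size and the hard-coded pixel list are all assumptions made for illustration. In practice you would read the pixels from a real image with an imaging library.

```python
from collections import Counter


def average_color(pixels):
    """Return the mean (r, g, b) of a sequence of RGB tuples."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixels given")
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) // n for i in range(3))


def prominent_colors(pixels, count=3, bucket=32):
    """Return the `count` most common colors after coarse quantization.

    Each channel is snapped to the centre of a `bucket`-wide bin so that
    near-identical shades are counted together.
    """
    def quantize(value):
        return min(255, (value // bucket) * bucket + bucket // 2)

    tally = Counter(tuple(quantize(c) for c in p) for p in pixels)
    return [color for color, _ in tally.most_common(count)]


# Hypothetical pixel data: mostly red, some blue, one white pixel.
pixels = [(250, 10, 10)] * 6 + [(10, 10, 240)] * 3 + [(255, 255, 255)]
print(average_color(pixels))              # (178, 34, 103)
print(prominent_colors(pixels, count=2))  # [(240, 16, 16), (16, 16, 240)]
```

Counting quantized colors is only one simple way to build a palette; a real library may well use something smarter, such as clustering.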
Posted: November 30th, 2011 | Author: Giv | Filed under: Django, MongoDB, Python, Tutorials | 9 Comments »
Traditional relational databases (mySQL, PostgreSQL, etc.) and noSQL systems are not mutually exclusive. I have several Django applications that are happily using mySQL. If your site is not scaling because of your database, you are doing it wrong! noSQL will not help you until you start caching some of those expensive queries using something like Memcached.
I use MongoDB alongside mySQL for all the dirty work like storing stats for later processing. There’s no point in polluting mySQL with this sort of data, especially when you’re dealing with millions of entries.
This post is intended for absolute beginners who use Django traditionally and are curious about how they can integrate a secondary storage service into their apps. I’m assuming you have already installed MongoDB on your dev environment. You will also need to install the MongoEngine library for Python.
Let’s start.
You already know how to create data models in Django, but let’s say we want to store an activity feed for your users every time they do something on your site. We begin by creating a data model with MongoEngine, much as you would with Django’s ORM; the difference is that you don’t need to run “syncdb” to create your tables. Mongo’s collections (similar to SQL tables) are schemaless, so these models can be changed later without you having to worry about running migration scripts.
Let’s create a simple collection for storing user activities. Create a file where you normally keep your Django models and call it mongomodel.py
```python
from mongoengine import *

# connect to a db (no need to create this - it will be created automagically)
connect('useractivity')


class Author(Document):
    pk = IntField()
    name = StringField(max_length=200, required=True)


class Activity(Document):
    message = StringField(max_length=200)
    author = ReferenceField(Author, reverse_delete_rule=CASCADE)
```
“What’s this??!! Django already has a User model, why do I need another in Mongo?” Well, you don’t, but say you want your activity to say something like: “Joe uploaded a photo” and you want Joe’s name to be linked to his profile page. We keep a reference to his mySQL id in case we need to look up other info or construct a URL.
You’ll also notice in the Activity model we are referencing the Author model. This is like a foreign key that will allow us to create relationships, similar to SQL. The CASCADE option will make sure if the user is deleted, all activities are also cleared out.
Ok, let’s start using this puppy! Using the example above we want to create an activity for Joe next time he uploads a photo. First, import mongomodel.py whenever you’re planning to interact with Mongo. In my photo upload view function I will create an activity like so:
```python
# After photo upload is complete
from main.mongomodel import *

# first create a user object - you can grab data from request object
the_author = Author(pk=request.user.id, name=request.user.first_name)
the_author.save()

# now create the activity
activity = Activity(message='uploaded a new photo', author=the_author)
activity.save()
```
That’s it. If you decide later you also want to add the name of the file uploaded you can simply add a new field to your Activity model and it will just work, plus it will be backwards compatible, i.e. older records without this field will not complain. Lovely.
Displaying the activity is just as simple. In your view function pull out the record and push down to your template:
```python
from django.shortcuts import render_to_response
from main.mongomodel import *


def activity_feed(request):  # hypothetical view name
    # get all activities
    activities = Activity.objects
    # push down to template
    return render_to_response('activities.html', {'activities': activities})
```
Now in your template loop and output like any other model:
```html
<ul>
{% for a in activities %}
  <li><a href="{% url main.views.profile a.author.pk %}">{{ a.author.name }}</a> {{ a.message }}</li>
{% endfor %}
</ul>
```
I’ve used the user’s mySQL primary key to construct his profile URL.
This is a very basic example but hopefully you can see the advantage of offloading some of the data storage to Mongo. You may ask “but what if the user changes his name? Won’t the data in the activity be out of sync?” Yes, it will, but you can very easily add a simple method to your Django user model that updates the Mongo records whenever the user’s details change.
Good luck.
Posted: August 1st, 2011 | Author: Giv | Filed under: Python | 5 Comments »
This is a short post. I spent too long working this out so hopefully this post will help a future Google search.
If you’re using the Boto python wrapper for the Amazon S3 service, you can quickly generate temporary URLs for your private files.
```python
from boto.s3.connection import S3Connection

s3 = S3Connection('YOUR_KEY', 'YOUR_SECRET', is_secure=False)
url = s3.generate_url(60, 'GET', bucket='YOUR_BUCKET', key='YOUR_FILE_KEY', force_http=True)
```
This will give you a URL to your private file on S3 that will only work for 60 seconds. It will look something like this:
http://mycoolbucket.s3.amazonaws.com/myfile.jpg?
Signature=ABC123DEF456&
Expires=1312216031&
AWSAccessKeyId=ABCDEFGHIJKLMNOP
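If you are curious where the `Signature` and `Expires` values come from, the sketch below builds the same kind of URL by hand using only the standard library. It follows the legacy AWS Signature Version 2 query-string scheme, which is what the three parameters above correspond to. The function name `sign_s3_url` and the `now` argument are my own for illustration; newer AWS regions require Signature Version 4 instead, so treat this as an explanation of the idea and let Boto do the real work.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote


def sign_s3_url(access_key, secret_key, bucket, key, expires_in=60, now=None):
    """Build a temporary GET URL using legacy AWS Signature Version 2."""
    now = int(time.time()) if now is None else int(now)
    expires = now + expires_in  # absolute Unix time at which the URL stops working
    resource = "/%s/%s" % (bucket, quote(key))
    # Verb, Content-MD5 (empty), Content-Type (empty), Expires, resource path.
    string_to_sign = "GET\n\n\n%d\n%s" % (expires, resource)
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return ("http://%s.s3.amazonaws.com/%s?Signature=%s&Expires=%d&AWSAccessKeyId=%s"
            % (bucket, quote(key), quote(signature, safe=""), expires, access_key))


# Placeholder credentials; `now` is pinned so the result is repeatable.
url = sign_s3_url("YOUR_KEY", "YOUR_SECRET", "YOUR_BUCKET", "myfile.jpg",
                  expires_in=60, now=1312215971)
print(url)
```

Because the expiry time is part of the signed string, nobody can extend the lifetime of the URL by editing the `Expires` parameter without also knowing your secret key.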
Posted: June 13th, 2011 | Author: Giv | Filed under: Python | 7 Comments »
The project I’m currently working on requires cropping of hundreds of portraits from the First World War archives at the Imperial War Museum.
Running a batch script on a directory of images is straightforward, except that my script is pretty dumb and always does a centre crop to create a square image. Unfortunately some of these images are not suitable for centre cropping:

Some of these portraits are quite long in height so a centre crop often results in the decapitation of the subject!
The logical thing to do here is to have your script first detect where the face is and then make a more intelligent crop to ensure the face remains in the new image. But surely face recognition requires super computers and several PhDs? Yes, it does. But we don’t really care who the subject is, we just need to know where the face is (or at least something that looks like a face). What we need is face detection, not recognition.
I was surprised to come across this little beauty: OpenCV, an open library for vision processing and luckily there’s a nice Python binding for it.
I tried out a sample from Robert Martin McGuire’s blog and was amazed at how simple and effective it was.
Robert’s script prints two coordinates for each detected face: the top-left and bottom-right corners of a rectangle around it. If your image has more than one person in it (or things that look like faces, more on that later), it prints one pair of coordinates per face.
Here’s the same image after running it through our face detection script:

Perfect! Now we can adjust our cropping script to ensure that the face is within the bounds.
I tried this using really high resolution images and the script detected several faces in the image where there was only one. The problem is that if you have a lot of detail in your image like background artifacts and smudges there is likely to be some pattern that matches those of a face. For best results you may want to work with smaller images.
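One way to work with smaller images without losing the original is to run detection on a downscaled copy and then map the rectangle back to full size. Here is a small sketch of that arithmetic. The helper names `fit_within` and `to_original` and the 900 pixel limit are assumptions of mine, not something from Robert's script, and you would still need an imaging library to do the actual resize.

```python
def fit_within(width, height, max_side=900):
    """Return (new_width, new_height, scale) so the longer side is <= max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height, 1.0  # already small enough, leave untouched
    scale = max_side / float(longest)
    new_width = max(1, int(round(width * scale)))
    new_height = max(1, int(round(height * scale)))
    return new_width, new_height, scale


def to_original(rect, scale):
    """Map an (x1, y1, x2, y2) rectangle found on the small copy back to full size."""
    return tuple(int(round(v / scale)) for v in rect)


# A hypothetical 3000x4500 scan, shrunk so its longer side is 900 pixels.
new_w, new_h, scale = fit_within(3000, 4500, max_side=900)
print(new_w, new_h)  # 600 900

# A face found at (100, 50)-(220, 170) on the small copy, in full-size coordinates.
print(to_original((100, 50, 220, 170), scale))  # (500, 250, 1100, 850)
```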
You can get this script from Robert’s site but here it is for all you lazy people. Make sure you’ve installed all necessary libraries. On Debian/Ubuntu you should be able to use this:
$ sudo apt-get install python-opencv libcv-dev python-imaging
Test out the script like this:
$ python thescript.py original.jpg output.jpg
If you get errors, chances are it’s not finding the Haar cascade XML files (the script looks for haarcascade_frontalface_alt.xml under /usr/share/opencv/haarcascade/). I had to copy these manually to get it to work. Note: this script doesn’t do any cropping. It just shows you where the face is, and you will need to do the cropping yourself with some trial and error.
```python
import os
import sys
from opencv.cv import *
from opencv.highgui import *
import Image, ImageDraw


def print_rectangle(x1, y1, x2, y2):  # function to modify the img
    # Note: this reopens the original each time, so if several faces are
    # found only the last rectangle ends up in the output file.
    im = Image.open(sys.argv[1])
    draw = ImageDraw.Draw(im)
    draw.rectangle([x1, y1, x2, y2])
    im.save(sys.argv[2])


def detectObjects(image):
    """Converts an image to grayscale and prints the locations of any
    faces found"""
    grayscale = cvCreateImage(cvSize(image.width, image.height), 8, 1)
    cvCvtColor(image, grayscale, CV_BGR2GRAY)

    storage = cvCreateMemStorage(0)
    cvClearMemStorage(storage)
    cvEqualizeHist(grayscale, grayscale)

    cascade = cvLoadHaarClassifierCascade(
        '/usr/share/opencv/haarcascade/haarcascade_frontalface_alt.xml',
        cvSize(1, 1))
    faces = cvHaarDetectObjects(grayscale, cascade, storage, 1.2, 2,
                                CV_HAAR_DO_CANNY_PRUNING, cvSize(50, 50))

    if faces.total > 0:
        for f in faces:
            x1, y1, x2, y2 = f.x, f.y, f.x + f.width, f.y + f.height
            print("[(%d,%d) -> (%d,%d)]" % (f.x, f.y, f.x + f.width, f.y + f.height))
            print_rectangle(x1, y1, x2, y2)  # call to a python pil


def main():
    image = cvLoadImage(sys.argv[1])
    detectObjects(image)


if __name__ == "__main__":
    main()
```
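Once you have the rectangle, the crop itself is mostly arithmetic. The sketch below is one possible approach rather than the script I ended up using: it takes the largest square that fits the image, centres it on the face, and then slides it back inside the image bounds. The function name `square_crop_box` is made up for this example, and it assumes the face rectangle is no bigger than that square.

```python
def square_crop_box(image_size, face):
    """Return (left, top, right, bottom) of a square crop that contains `face`.

    image_size is (width, height); face is (x1, y1, x2, y2) as printed by
    the detection script above.
    """
    width, height = image_size
    x1, y1, x2, y2 = face
    side = min(width, height)  # largest square that fits in the image

    # Centre the square on the middle of the face...
    cx = (x1 + x2) // 2
    cy = (y1 + y2) // 2
    # ...then clamp it so it never sticks out of the image.
    left = min(max(cx - side // 2, 0), width - side)
    top = min(max(cy - side // 2, 0), height - side)
    return left, top, left + side, top + side


# A tall 400x900 portrait with the face near the top. A plain centre crop
# would be (0, 250, 400, 650) and would cut the head off.
print(square_crop_box((400, 900), (120, 60, 280, 220)))   # (0, 0, 400, 400)

# The same portrait with the face lower down.
print(square_crop_box((400, 900), (100, 500, 300, 700)))  # (0, 400, 400, 800)
```

The returned tuple is in the (left, upper, right, lower) order that PIL's `Image.crop()` expects, so you can pass it straight in and then resize the result to whatever thumbnail size you need.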
Posted: February 26th, 2011 | Author: Giv | Filed under: MongoDB, PHP, Tutorials | 4 Comments »
This is not another SQL vs noSQL rant. I’m not here to defend one or the other and there are plenty of articles written about this very topic. I just wanted to share my personal experience with using MongoDB in Zend Framework using the Morph library.
The most painful part of programming for me is the CRUD of persistent data storage. I can’t think of anything more tedious than creating complex SQL schemas, modelling my data, creating getters and setters etc. I know what data I need to store, retrieve and manipulate and I just want to dive into the code without having to waste hours setting up the database. But the real pain starts during iterations where I have to update the schema and deploy the changes to all environments – or worse, roll back schema changes.
After discovering Django’s modelling layer I realised what I was missing out on. It made perfect sense to me. Make your models first and have the SQL managed by the framework automatically. Of course this was still quite annoying because model/schema changes were quite difficult to do but at least there was no need for CRUD because you get it all for free. No more complex join queries!
Once I started using Google App Engine’s Datastore I was free of SQL’s constraints. I created my data models and if I needed to add an extra field to my model I could do so without having to worry about running ALTER TABLE commands.
I’ve spent the last 2 years working with another noSQL system, CouchDB and again, I’m not here to compare the two but MongoDB just seems more suitable for my needs. I have now integrated MongoDB into my Zend Framework projects and it’s hard to imagine how I ever lived without it. It’s ridiculously easy to set up, no CRUD and it’s incredibly intuitive. Here’s how:
1. Download and run MongoDB service
Once you have downloaded the binaries, create a directory where you want your data stored. Remember MongoDB creates individual BSON (binary JSON) documents for each record. You can start the service by running:
shell> mongod --dbpath=/path/to/my/mongodata
2. Install the MongoDB PHP extension
After downloading the extension, install it and add it to your php.ini.
3. Download the Morph library
Add the Morph lib to your PHP include path (or autoload in ZF)
You are ready to go. Let’s try and use it in a simple ZF project. Let’s say we want to create a shopping cart that holds a product name, quantity, price and colour.
Create a “Cart.php” class in /models directory and extend Morph_Object.
```php
class Application_Model_Cart extends Morph_Object
{
    public function __construct($id = null)
    {
        parent::__construct($id);
        $this->addProperty(new Morph_Property_String('title'))
             ->addProperty(new Morph_Property_Integer('quantity'))
             ->addProperty(new Morph_Property_Float('price'))
             ->addProperty(new Morph_Property_String('colour'));
    }
}
```
Believe it or not, you’ve just finished modelling your data and CRUD and you’re ready to store and retrieve data immediately without having to run any SQL generation scripts. Brilliant.
To store data in say a controller, you would do something like this:
```php
// instantiate mongo and select your db
$mongo = new Mongo();
Morph_Storage::init($mongo->selectDB('shoppingDB'));

// instantiate your cart
$cart = new Application_Model_Cart();

// add data
$cart->title = 'Something cool';
$cart->quantity = 12;
$cart->price = 15.50;
$cart->colour = "Red";

// save the data
$cart->save();
```
You’ve just saved your first record and didn’t have to create individual getters and setters for your fields.
Getting the data out is just as simple:
```php
// instantiate mongo and select your db
$mongo = new Mongo();
Morph_Storage::init($mongo->selectDB('shoppingDB'));

// instantiate your cart
$cart = new Application_Model_Cart();

// create a query
$query = new Morph_Query();
$query->property('price')->greaterThan(1.0);

// find records matching query
$result = $cart->findByQuery($query);

// send results down to view layer
$this->view->result = $result;
```
Check the documentation for the full list of query options. Now loop through the results in your view layer and output the data:
```php
<? foreach($this->result as $item): ?>
<li><?=$item->title?></li>
<? endforeach ?>
```
The best part is that since our data store is schema-less, you are free to manipulate your data structure like adding a new field. You do this by simply adding a new property to the Cart class. That’s it.
I hope this post demonstrates the benefits of using noSQL in your PHP projects. Naturally you can use the above in any PHP framework. I just used ZF as an example.
Happy noSQL coding!