’Sup. I’m a software developer at Khan Academy and a recent graduate of UCLA. I like web applications and distributed systems and I care about using them to solve problems in ways that make people happy. This is my website.
The App Engine NDB documentation does a good job of explaining the benefits of the new interface, but it doesn’t really have anything for application developers who want to upgrade their existing models. As I discussed in my previous post, the Khan Academy engineering team recently went through this process and came out with a lot of experience about what works and what doesn’t for an established codebase. What follows is our refined plan of attack, distilled into a field guide that you can use to transition over your own application.
N.B.: The difficulty of making this transition is proportional to the size and complexity of your application. If it’s small enough that you can convert everything at once, great. If not, be prepared to do a fair bit of debugging to smooth things out. Our experience taught us that some parts of this are going to be rocky no matter what, but that the flexibility offered by the upgrade is worth it.
1. Change your models to subclass from ndb.Model and use NDB properties and APIs
class Video(ndb.Model). If only this were all it took!
PostalAddressProperty are now simply
ListProperty. Instead, add repeated=True to the property constructor. For example, what was once db.ListProperty(bool) will now be ndb.BooleanProperty(repeated=True).
KeyProperty does not automatically fetch the referred-to entity from the datastore. You could write a custom ndb.Property subclass to emulate the old ReferenceProperty:

    from google.appengine.ext import ndb


    class ReferenceProperty(ndb.KeyProperty):
        def _validate(self, value):
            # Accept only model instances, not raw keys.
            if not isinstance(value, ndb.Model):
                raise TypeError('expected an ndb.Model, got %s' % repr(value))

        def _to_base_type(self, value):
            # Store the entity's key in the datastore.
            return value.key

        def _from_base_type(self, value):
            # Fetch the referred-to entity when the property is read.
            return value.get()

ndb.Property. Good news: it’s pretty trivial and the way to do custom properties is vastly simplified in NDB. See my custom
entity.key.delete(), etc. Refer to the cheat sheet. Making sure you’ve covered these changes everywhere they need to happen is the most difficult part of the conversion process.
_use_cache = False class variable to each model as necessary. More sophisticated policy functions are available as well, but those are best left for final tweaking. This is less of an “I need to be afraid of a potential slowdown” thing and more of an “I want to preserve my existing performance characteristics at the risk of not getting potential improvements, because it will make me feel safer about this” thing.
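The property changes in this step can be collected into a small lookup table to keep next to you while converting. This is only a sketch: `DB_TO_NDB`, `LIST_ITEM_TYPES`, and `suggest_ndb` are hypothetical names of my own, the table works on plain strings rather than importing the SDK, and it covers just a handful of properties, so treat the cheat sheet as the real reference.

```python
# Hypothetical lookup table: a few old db declarations and their NDB equivalents.
DB_TO_NDB = {
    "db.PostalAddressProperty": "ndb.StringProperty",
    "db.StringListProperty": "ndb.StringProperty(repeated=True)",
    "db.ReferenceProperty": "ndb.KeyProperty",
}

# Item types accepted inside db.ListProperty(<type>) and the NDB property to repeat.
LIST_ITEM_TYPES = {
    "bool": "ndb.BooleanProperty",
    "int": "ndb.IntegerProperty",
    "float": "ndb.FloatProperty",
}


def suggest_ndb(declaration):
    """Return a suggested NDB declaration string, or None if it is not in the table."""
    declaration = declaration.strip()
    prefix = "db.ListProperty("
    if declaration.startswith(prefix) and declaration.endswith(")"):
        # db.ListProperty(X) becomes the matching NDB property with repeated=True.
        item_type = declaration[len(prefix):-1].strip()
        base = LIST_ITEM_TYPES.get(item_type)
        return "%s(repeated=True)" % base if base else None
    return DB_TO_NDB.get(declaration)


list_suggestion = suggest_ndb("db.ListProperty(bool)")
address_suggestion = suggest_ndb("db.PostalAddressProperty")
```

A `None` result means the sketch has no opinion, which is the cue to look the property up by hand.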
2. Change all code that uses the newly converted models to use the NDB interface
ndb.Query instances. But I don’t recommend punting: the NDB query syntax is pretty sexy and this is one of the least error-prone parts of the conversion. Do note, however, that calling methods like filter on a query instance doesn’t modify it in-place; each call returns a new query, so you need to assign the result back to your variable:

    from google.appengine.ext import db, ndb


    class OldBananaStand(db.Model):
        contains_money = db.BooleanProperty()


    class NewBananaStand(ndb.Model):
        contains_money = ndb.BooleanProperty()


    # db queries are mutated in place by filter().
    old_ones = OldBananaStand.all()
    old_ones.filter('contains_money =', True)  # => ok!

    # NDB queries are immutable: filter() returns a new query.
    new_ones = NewBananaStand.query()
    new_ones.filter(NewBananaStand.contains_money == True)  # => nope
    new_ones = new_ones.filter(NewBananaStand.contains_money == True)  # => ok!

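If the reassignment rule feels arbitrary, it may help to see the behavior in isolation. The following is a minimal sketch that runs without the App Engine SDK: `ImmutableQuery` is a hypothetical stand-in of my own, not the real ndb.Query, and it models only the one property that matters here, namely that filter returns a new object and leaves the original alone.

```python
class ImmutableQuery(object):
    """Hypothetical stand-in for a query whose filter() returns a new object."""

    def __init__(self, filters=()):
        self.filters = tuple(filters)

    def filter(self, condition):
        # Build and return a new query; self is left untouched.
        return ImmutableQuery(self.filters + (condition,))


query = ImmutableQuery()

# The returned query is thrown away, so nothing changes.
query.filter("contains_money == True")
count_after_discard = len(query.filters)

# Keeping the returned query is what actually applies the filter.
query = query.filter("contains_money == True")
count_after_reassign = len(query.filters)
```

A quick way to find the bug in converted code is to search for filter calls whose result is never assigned or returned.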

    from google.appengine.ext import db, ndb
    from google.appengine.datastore import entity_pb


    def db_entity_to_protobuf(e):
        return db.model_to_protobuf(e).Encode()


    def protobuf_to_db_entity(pb):
        # precondition: model class must be imported
        return db.model_from_protobuf(entity_pb.EntityProto(pb))


    def ndb_entity_to_protobuf(e):
        return ndb.ModelAdapter().entity_to_pb(e).Encode()


    def protobuf_to_ndb_entity(pb):
        # precondition: model class must be imported
        return ndb.ModelAdapter().pb_to_entity(entity_pb.EntityProto(pb))

3. Test, test, test
db.Key in your client-side template but the corresponding server-side endpoint is querying for an
4. Deploy the mechanical translation and squash any remaining bugs
If you’ve made it this far, you’re in great shape. You have a solid NDB foundation and now the more advanced features are available for you to play with.
5. Start using the asynchronous API
@ndb.tasklet. A tasklet returns a future, which you can get the result of by calling get_result, naturally. By convention, I append _async to the names of newly tasklet-ized functions. But what if that function needs to be called from existing synchronous code? A future is of little use there. You could upgrade your synchronous code to always call get_result after calling a tasklet, but a slightly nicer solution is this conditionally async decorator that introduces a make_sync keyword argument:

    from google.appengine.ext import ndb


    def tasklet(func):
        """Tasklet decorator that lets the caller specify either async or sync
        behavior at runtime.

        If make_sync is False (the default), the tasklet returns a future and
        can be used in asynchronous control flow from within other tasklets
        (like ndb.tasklet). If make_sync is True, the tasklet will wait for its
        results and return them, allowing you to call the tasklet from
        synchronous code (like ndb.synctasklet).
        """
        @ndb.utils.wrapping(func)
        def tasklet_wrapper(*args, **kwds):
            arg_name = "make_sync"
            sync_by_default = False
            make_sync = kwds.get(arg_name, sync_by_default)
            if make_sync:
                taskletfunc = ndb.synctasklet(func)
            else:
                taskletfunc = ndb.tasklet(func)
            # Strip the flag so the wrapped function never sees it.
            if arg_name in kwds:
                del kwds[arg_name]
            return taskletfunc(*args, **kwds)
        return tasklet_wrapper

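To show what this looks like from the caller's side, here is a rough analogue that runs without the App Engine SDK. It is a sketch under stated assumptions, not the decorator above: it swaps NDB tasklets for a standard-library thread pool, and `maybe_sync`, `double_async`, and `_pool` are hypothetical names of my own. Only the make_sync calling convention is the same.

```python
import functools
from concurrent.futures import ThreadPoolExecutor

# Assumption: a thread pool stands in for NDB's event loop in this sketch.
_pool = ThreadPoolExecutor(max_workers=2)


def maybe_sync(func):
    """Return a future by default, or the finished result when make_sync=True."""
    @functools.wraps(func)
    def wrapper(*args, **kwds):
        # Remove the flag so the wrapped function never sees it.
        make_sync = kwds.pop("make_sync", False)
        future = _pool.submit(func, *args, **kwds)
        return future.result() if make_sync else future
    return wrapper


@maybe_sync
def double_async(x):
    return x * 2


# Asynchronous call site: hold on to the future and collect the result later.
future = double_async(21)

# Synchronous call site: get the value back directly.
sync_value = double_async(21, make_sync=True)

async_value = future.result()
_pool.shutdown()
```

The appeal of the flag is that one function serves both kinds of caller, so existing synchronous code can be left alone until you are ready to convert it.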
ndb.Return exception. This is a good example from the App Engine documentation:

    # from https://developers.google.com/appengine/docs/python/ndb/async
    @ndb.tasklet
    def get_cart_async(acct):
        cart = yield CartItem.query(CartItem.account == acct.key).fetch_async()
        yield ndb.get_multi_async([item.inventory for item in cart])
        raise ndb.Return(cart)


    @ndb.tasklet
    def get_offers_async(acct):
        offers = yield SpecialOffer.query().fetch_async(10)
        yield ndb.get_multi_async([offer.inventory for offer in offers])
        raise ndb.Return(offers)


    @ndb.tasklet
    def get_cart_plus_offers(acct):
        cart, offers = yield get_cart_async(acct), get_offers_async(acct)
        raise ndb.Return((cart, offers))

Finally, here are a couple of additional anecdotes that are somewhat specific to our codebase but worth sharing nonetheless:
isinstance checks to deal with both types appropriately, but in practice that’s really ugly.
yield op.db.Put(entity) to accumulate too many NDB entities and fail with a “datastore RPC too large” error. Luckily for you, this bug has since been fixed and doesn’t exist in later revisions.
Don’t panic. Welcome to the future!