Usage¶
Using DynamORM is straightforward. Define your models with some specific metadata to represent the DynamoDB Table structure as well as the document schema. You can then use class-level methods to query for and get items, represented as instances of the class, as well as class-level methods to interact with specific documents in the table.
Note
Not all functionality is covered in this documentation yet. See the tests for all “supported” functionality (like: batch puts, unique puts, etc).
Setting up Boto3¶
Make sure you have configured boto3 and can access DynamoDB from the Python console.
import boto3
dynamodb = boto3.resource('dynamodb')
list(dynamodb.tables.all()) # --> ['table1', 'table2', 'etc...']
Configuring the Boto3 resource¶
The above example relies on the files ~/.aws/credentials and ~/.aws/config to provide access information and region selection. You can provide explicit configuration for boto3 sessions and boto3 resources as part of your Table definition.
For example, if you develop against a local dynamo service your models may look something like:
class MyModel(DynaModel):
    class Table:
        session_kwargs = {
            'region_name': 'us-east-2'
        }
        resource_kwargs = {
            'endpoint_url': 'http://localhost:33333'
        }
You would obviously want the session and resource configuration to come from some sort of configuration provider that could provide the correct options depending on where your application is being run.
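One way such a configuration provider could look is sketched below. It reads the region and endpoint from environment variables and returns the two dictionaries. The helper name `dynamo_kwargs` and the variable names `DYNAMO_REGION` and `DYNAMO_ENDPOINT_URL` are hypothetical and not part of DynamORM.

```python
import os


def dynamo_kwargs(environ=None):
    """Build (session_kwargs, resource_kwargs) from environment variables.

    DYNAMO_REGION and DYNAMO_ENDPOINT_URL are hypothetical names chosen for
    this sketch; use whatever your configuration provider exposes.
    """
    environ = os.environ if environ is None else environ
    session_kwargs = {'region_name': environ.get('DYNAMO_REGION', 'us-east-2')}
    resource_kwargs = {}
    endpoint = environ.get('DYNAMO_ENDPOINT_URL')
    if endpoint:
        # Only override the endpoint when one is configured (e.g. local dev).
        resource_kwargs['endpoint_url'] = endpoint
    return session_kwargs, resource_kwargs


local_session, local_resource = dynamo_kwargs({'DYNAMO_ENDPOINT_URL': 'http://localhost:33333'})
prod_session, prod_resource = dynamo_kwargs({'DYNAMO_REGION': 'eu-west-1'})
```

The returned dictionaries could then be assigned to session_kwargs and resource_kwargs on the inner Table class.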
Using Dynamo Local¶
If you’re using Dynamo Local for development you can use the following config for the table resource:
MyModel.Table.get_resource(
    aws_access_key_id="-",
    aws_secret_access_key="-",
    region_name="us-west-2",
    endpoint_url="http://localhost:8000"
)
Defining your Models – Tables & Schemas¶
Models represent tables in DynamoDB and define the characteristics of the Dynamo service as well as the Marshmallow or Schematics schema that is used for validating and marshalling your data.
class dynamorm.model.DynaModel(partial=False, **raw)
DynaModel is the base class all of your models will extend from. This model definition encapsulates the parameters used to create and manage the table as well as the schema for validating and marshalling data into object attributes. It will also hold any custom business logic you need for your objects.
Your class must define two inner classes that specify the Dynamo Table options and the Schema, respectively.
The Dynamo Table options are defined in a class named Table. See the dynamorm.table module for more information.
Any Local or Global Secondary Indexes you wish to create are defined as inner tables that extend from either the LocalIndex or GlobalIndex classes. See the dynamorm.table module for more information.
The document schema is defined in a class named Schema, which should be filled out exactly as you would fill out any other Marshmallow Schema or Schematics Model.
For example:
# Marshmallow example
import os

from dynamorm import DynaModel, GlobalIndex, ProjectAll

from marshmallow import fields, validate, validates, ValidationError


class Thing(DynaModel):
    class Table:
        name = 'things'
        hash_key = 'id'
        read = 5
        write = 1

    class ByColor(GlobalIndex):
        name = 'by-color'
        hash_key = 'color'
        read = 5
        write = 1
        projection = ProjectAll()

    class Schema:
        id = fields.String(required=True)
        name = fields.String()
        color = fields.String(validate=validate.OneOf(('purple', 'red', 'yellow')))
        compound = fields.Dict(required=True)

        @validates('name')
        def validate_name(self, value):
            # this is a very silly example just to illustrate that you can fill out the
            # inner Schema class just like any other Marshmallow class
            if value.lower() == 'evan':
                raise ValidationError("No Evan's allowed")

    def say_hello(self):
        print("Hello. {name} here. My ID is {id} and I'm colored {color}".format(
            id=self.id,
            name=self.name,
            color=self.color
        ))
Table Data Model¶
The inner Table class on DynaModel definitions becomes an instance of our dynamorm.table.DynamoTable3 class.
The attributes you define on your inner Table class map to underlying boto data structures. This mapping is expressed through the following data model:
Attribute | Required | Type | Description |
---|---|---|---|
name | True | str | The name of the table, as stored in Dynamo. |
hash_key | True | str | The name of the field to use as the hash key. It must exist in the schema. |
range_key | False | str | The name of the field to use as the range_key, if one is used. It must exist in the schema. |
read | True | int | The provisioned read throughput. |
write | True | int | The provisioned write throughput. |
stream | False | str | The stream view type, either None or one of: 'NEW_IMAGE', 'OLD_IMAGE', 'NEW_AND_OLD_IMAGES', 'KEYS_ONLY' |
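To make the mapping concrete, the sketch below translates the attributes from the table above into the keyword arguments that boto3's `create_table` call accepts. It is an illustration of the request shape only, not how DynamoTable3 is implemented. The function name `create_table_kwargs` is hypothetical, and treating every key as a string ('S') by default is an assumption.

```python
def create_table_kwargs(name, hash_key, read, write, range_key=None, stream=None,
                        attribute_types=None):
    """Translate Table-style attributes into boto3 ``create_table`` arguments."""
    # Assumption: every key is a string ('S') unless a type is supplied.
    attribute_types = attribute_types or {}
    keys = [(hash_key, 'HASH')]
    if range_key is not None:
        keys.append((range_key, 'RANGE'))

    kwargs = {
        'TableName': name,
        'KeySchema': [{'AttributeName': k, 'KeyType': t} for k, t in keys],
        'AttributeDefinitions': [
            {'AttributeName': k, 'AttributeType': attribute_types.get(k, 'S')}
            for k, _ in keys
        ],
        'ProvisionedThroughput': {
            'ReadCapacityUnits': read,
            'WriteCapacityUnits': write,
        },
    }
    if stream is not None:
        # A stream view type implies that the stream is enabled.
        kwargs['StreamSpecification'] = {
            'StreamEnabled': True,
            'StreamViewType': stream,
        }
    return kwargs


things = create_table_kwargs('things', 'id', read=5, write=1, stream='NEW_IMAGE')
```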
Indexes¶
Like the Table definition, Indexes are also inner classes on DynaModel definitions, and they require the same data model with one extra field.
Attribute | Required | Type | Description |
---|---|---|---|
projection | True | object | An instance of a projection class: ProjectAll, ProjectKeys, or ProjectInclude. |
Creating new documents¶
Using objects:
thing = Thing(id="thing1", name="Thing One", color="purple")
thing.save()
thing = Thing()
thing.id = "thing1"
thing.name = "Thing One"
thing.color = "purple"
thing.save()
Using raw documents:
Thing.put({
    "id": "thing1",
    "name": "Thing One",
    "color": "purple"
})
In all cases, the attributes go through validation against the Schema.
thing = Thing(id="thing1", name="Thing One", color="orange")
# the call to save will result in a ValidationError because orange is an invalid choice.
thing.save()
Note
Remember, if you have a String field it will use unicode (py2) or str (py3) on any value assigned to it, which means that if you assign a list, dict, int, etc. then the validation will succeed and what will be stored is the representative string value.
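To see what would be stored in that case, the plain-Python sketch below applies str() to a few non-string values. It does not call DynamORM or a schema library; it only shows the string representations the note refers to.

```python
# What str() produces for non-string values; a String field that coerces with
# str() would store these representations rather than reject the input.
samples = [['a', 'b'], {'k': 1}, 42]
stored = [str(value) for value in samples]
print(stored)
```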
Fetching existing documents¶
Get based on primary key¶
To fetch an existing document based on its primary key you use the .get class method on your models:
thing1 = Thing.get(id="thing1")
assert thing1.color == 'purple'
To do a Consistent Read just pass consistent=True:
thing1 = Thing.get(id="thing1", consistent=True)
assert thing1.color == 'purple'
Querying¶
A Query operation uses the primary key of a table or a secondary index to directly access items from that table or index.
Like a get operation this takes arguments that map to the key names, but you can also specify a comparison operator for that key using the “double-under” syntax (<field>__<operator>). For example, to query a Book model for all entries whose isbn field starts with a specific value you would use the begins_with comparison operator:
books = Book.query(isbn__begins_with="12345")
You can find the full list of supported comparison operators in the DynamoDB Condition docs.
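The “double-under” syntax can be illustrated with a small parser. This is a sketch under assumptions, not DynamORM's own parsing code: the function names are hypothetical, a key without a double underscore is treated as an equality test, and the text after the last double underscore is assumed to always be an operator.

```python
def split_condition(key, default_operator='eq'):
    """Split '<field>__<operator>' into (field, operator)."""
    field, sep, operator = key.rpartition('__')
    if not sep:
        # No double underscore: the whole key is the field name.
        return key, default_operator
    return field, operator


def parse_conditions(**kwargs):
    """Turn keyword arguments into (field, operator, value) triples."""
    return [split_condition(key) + (value,) for key, value in kwargs.items()]


conditions = parse_conditions(isbn__begins_with='12345', author='Mr. Bar')
```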
Scanning¶
The Scan operation returns one or more items and item attributes by accessing every item in a table or a secondary index.
Scanning works exactly the same as querying.
# Scan based on attributes
Book.scan(author="Mr. Bar")
Book.scan(author__ne="Mr. Bar")
Read Iterator object¶
Calling .query or .scan will return a ReadIterator object that will not actually send the API call to DynamoDB until you try to access an item in the object by iterating (for book in books:, list(books), etc…).
The iterator objects have a number of methods on them that can be used to influence their behavior. All of the methods described here (except .count()) are “chained methods”, meaning that they return the iterator object such that you can chain them together.
next_10_books = Book.query(hash_key=the_hash_key).start(previous_last).limit(10)
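The lazy, chainable behavior can be sketched in plain Python. The class below is a hypothetical stand-in, not the ReadIterator implementation: each chained method records an option and returns the same object, and the (fake) request only runs once the object is iterated.

```python
class LazyRead:
    """Collects options through chained calls and fetches only when iterated."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that performs the real request
        self._options = {}

    def limit(self, count):
        self._options['limit'] = count
        return self              # returning self is what allows chaining

    def start(self, key):
        self._options['start'] = key
        return self

    def __iter__(self):
        # The request happens here, not when the object is built.
        return iter(self._fetch(**self._options))


calls = []


def fake_fetch(limit=None, start=None):
    """Stand-in for an API call; records its arguments in `calls`."""
    calls.append((limit, start))
    items = ['book-1', 'book-2', 'book-3']
    begin = items.index(start) + 1 if start else 0
    return items[begin:][:limit]


books = LazyRead(fake_fetch).start('book-1').limit(1)
calls_before_iteration = len(calls)
first_page = list(books)
```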
Returning the Count (.count())¶
Unlike the rest of the methods in this section, .count() does not return the iterator object. Instead it changes the Select parameter to COUNT and immediately sends the request, returning the count.
books_matching_hash_key = Book.query(hash_key=the_hash_key).count()
Requesting consistent results (.consistent())¶
Queries & scans return eventually consistent results by default. You can use .consistent() to request strongly consistent reads, which reflect all writes that completed successfully before the read was issued.
Book.query(hash_key=the_hash_key).consistent()
Changing the returned attributes (.specific_attributes())¶
By default, query & scan operations will return ALL attributes from the table or index. If you’d like to return only a subset of the attributes you can pass a list to .specific_attributes([...]). Each attribute passed in should match the syntax from Specifying Item Attributes in the docs.
Book.query(hash_key=the_hash_key).specific_attributes(['isbn', 'title', 'publisher.name'])
Paging (.last, .start() & .again())¶
A single Query operation will read up to the maximum number of items set (if using the Limit parameter) or a maximum of 1 MB of data and then apply any filtering to the results.
When you query a table with many items, or with a limit, the iterator object will set its .last attribute to the key of the last item it received. You can pass that key into a subsequent query via the .start() method, or if you have the existing iterator object simply call .again().
# Continue from an iterator you still hold
books = Book.scan()
print(list(books))
if books.last:
    print("The last book seen was: {}".format(books.last))
    print(list(books.again()))

# Or resume later from a key you saved earlier
last = get_last_from_request()
books = Book.scan().start(last)
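The paging contract described above can be modelled with an in-memory list. This is only a sketch: `scan_page` is a hypothetical stand-in for one API call, and a fixed page size plays the role of the 1 MB or Limit boundary. Each call returns a page plus the key to resume from, and `last` is None once there is nothing more to read.

```python
TABLE = [{'isbn': str(n)} for n in range(1, 8)]


def scan_page(start=None, page_size=3):
    """Return (items, last) for one page; last is None on the final page."""
    begin = 0
    if start is not None:
        # Resume right after the item whose key was returned as `last`.
        begin = TABLE.index(start) + 1
    items = TABLE[begin:begin + page_size]
    more = begin + page_size < len(TABLE)
    last = items[-1] if more and items else None
    return items, last


pages = []
items, last = scan_page()
pages.append(items)
while last is not None:
    # Keep following `last` until the table is exhausted.
    items, last = scan_page(start=last)
    pages.append(items)
```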
Limiting (.limit())¶
A Limit sets the maximum number of items to evaluate (not necessarily the number of matching items). If DynamoDB processes the number of items up to the limit while processing the results, it stops the operation and returns the matching values up to that point.
Use the .limit() method on the iterator object to apply a Limit to your query.
books = Book.scan().limit(1)
assert len(books) == 1
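Because the limit counts evaluated items rather than matching ones, a limited read combined with a filter can return fewer matches than expected, or none. The in-memory sketch below illustrates that ordering; the `scan` function here is a hypothetical stand-in and not the DynamORM method.

```python
BOOKS = [
    {'isbn': '1', 'author': 'Mr. Foo'},
    {'isbn': '2', 'author': 'Mr. Bar'},
    {'isbn': '3', 'author': 'Mr. Bar'},
]


def scan(limit=None, **equals):
    """Evaluate at most `limit` items, then keep those matching the filter."""
    evaluated = BOOKS if limit is None else BOOKS[:limit]
    return [
        book for book in evaluated
        if all(book.get(field) == value for field, value in equals.items())
    ]


unlimited = scan(author='Mr. Bar')
limited = scan(limit=1, author='Mr. Bar')
```

Here the limited call evaluates only the first item, which does not match, so it returns an empty list even though two matching items exist.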
Reversing (.reverse() - Queries Only)¶
To have the index scanned in reverse for your query, use .reverse().
Note
Scanning does not support reversing.
books = Book.query(hash_key=the_hash_key).reverse()
Recursion (.recursive())¶
If you wish to get ALL items from a query or scan without having to deal with paging yourself, then you can use the .recursive() method to have the iterator handle the paging for you.
books = Book.scan().recursive()
Q objects¶
dynamorm.table.Q(**mapping)
A Q object represents an AND’d together query using boto3’s Attr object, based on a set of keyword arguments that support full access to the operations (eq, ne, between, etc.) as well as nested attributes.
It can be used as input to both scan operations and update conditions.
See the dynamorm.model.DynaModel.scan() docs for more examples.
Indexes¶
By default the hash & range keys of your table make up the “Primary Index”. Secondary Indexes provide different ways to query & scan your data. They are defined on your Model alongside the main Table definition as inner classes inheriting from either the GlobalIndex or LocalIndex classes.
Here’s an excerpt from the model used in the readme:
class Book(DynaModel):
    # Define our DynamoDB properties
    class Table:
        name = 'prod-books'
        hash_key = 'isbn'
        read = 25
        write = 5

    class ByAuthor(GlobalIndex):
        name = 'by-author'
        hash_key = 'author'
        read = 25
        write = 5
        projection = ProjectAll()
With the index defined we can now call Book.ByAuthor.query or Book.ByAuthor.scan to query or scan the index. The query & scan semantics on the Index are the same as on the main table.
Book.ByAuthor.query(author='Some Author')
Book.ByAuthor.query(author__ne='Some Author')
Indexes use “projection” to determine which attributes of your documents are available in the index. The ProjectAll projection puts ALL attributes from your Table into the Index. The ProjectKeys projection puts just the keys from the table (and also the keys from the index itself) into the index. The ProjectInclude('attr1', 'attr2') projection allows you to specify which attributes you wish to project.
Using the ProjectKeys or ProjectInclude projection will result in partially validated documents, since we won’t have all of the required attributes.
A common pattern is to define a “sparse index” with just the keys (ProjectKeys), load the keys of the documents you want from the index and then do a batch get to fetch them all from the main table.
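The keys-only pattern can be sketched with in-memory data. Nothing below calls DynamORM: the dictionary `TABLE`, the list `BY_AUTHOR`, and the helper `project_keys` are hypothetical stand-ins for the main table, a ProjectKeys index, and the projection itself.

```python
TABLE = {
    '111': {'isbn': '111', 'author': 'Some Author', 'title': 'First'},
    '222': {'isbn': '222', 'author': 'Other Author', 'title': 'Second'},
    '333': {'isbn': '333', 'author': 'Some Author', 'title': 'Third'},
}


def project_keys(item, table_keys=('isbn',), index_keys=('author',)):
    """Keep only the table keys and index keys, as a keys-only projection does."""
    return {name: item[name] for name in table_keys + index_keys}


# The "index" holds keys-only entries for every item in the table.
BY_AUTHOR = [project_keys(item) for item in TABLE.values()]

# Step 1: query the index for the keys of the documents we want.
wanted = [entry['isbn'] for entry in BY_AUTHOR if entry['author'] == 'Some Author']

# Step 2: fetch the full documents from the main table by key (a batch get).
full_documents = [TABLE[isbn] for isbn in wanted]
```

The index entries never carry `title`, which is why the second step against the main table is needed.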
Updating documents¶
There are a number of ways to send updates back to the Table from your Model classes and indexes. The Creating new documents section already showed you the dynamorm.model.DynaModel.save() method for creating new documents. save can also be used to update existing documents:
# Our book is no longer in print
book = Book.get(isbn='1234567890')
book.in_print = False
book.save()
When you call .save() on an instance the WHOLE document is put back into the table as save simply invokes the dynamorm.model.DynaModel.put() function. This means that if you have large models it may cost you more in Write Capacity Units to put the whole document back.
You can also do a “partial save” by passing partial=True when calling save, in which case the dynamorm.model.DynaModel.update() function will be used to only send the attributes that have been modified since the document was loaded. The following two code blocks will result in the same operations:
# Our book is no longer in print
book = Book.get(isbn='1234567890')
book.in_print = False
book.save(partial=True)
# Our book is no longer in print
book = Book.get(isbn='1234567890')
book.update(in_print=False)
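The difference between a full put and a partial save comes down to tracking which attributes changed after loading. The class below is a hypothetical illustration of that bookkeeping and makes no claim about how DynamORM tracks changes internally.

```python
class TrackedDocument:
    """Remembers the loaded values so a partial save can send only the changes."""

    def __init__(self, **loaded):
        # Bypass __setattr__ while setting up the bookkeeping attributes.
        object.__setattr__(self, '_loaded', dict(loaded))
        object.__setattr__(self, '_current', dict(loaded))

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for document attributes.
        try:
            return self._current[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._current[name] = value

    def changes(self):
        """Attributes whose value differs from what was loaded."""
        return {
            name: value for name, value in self._current.items()
            if name not in self._loaded or self._loaded[name] != value
        }


book = TrackedDocument(isbn='1234567890', title='A Book', in_print=True)
book.in_print = False
full_put = dict(book._current)   # what a whole-document put would send
partial_update = book.changes()  # what a partial save would send
```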
Doing partial saves (.save(partial=True)) is a very convenient way to work with existing instances, but using dynamorm.model.DynaModel.update() directly also allows you to send Update Expressions and Condition Expressions with the update. Combined with consistent reads, this allows you to do things like acquire locks that ensure race conditions cannot happen:
import time

from dynamorm import DynaModel

# Assumes Marshmallow fields; Schematics types would be imported differently
from marshmallow.fields import Boolean, Integer, String


class Lock(DynaModel):
    class Table:
        name = 'locks'
        hash_key = 'name'
        read = 1
        write = 1

    class Schema:
        name = String(required=True)
        updated = Integer(required=True, default=0)
        key = String()
        is_locked = Boolean(default=False)

    @classmethod
    def lock(cls, name, key):
        inst = cls.get(name=name, consistent=True)

        if inst is None:
            inst = Lock(name=name)
            inst.save()

        if not inst.is_locked:
            # Only succeeds if nobody has updated the lock since we read it
            inst.update(
                is_locked=True,
                key=key,
                updated=time.time(),
                conditions=dict(
                    updated=inst.updated,
                )
            )
        return inst

    @classmethod
    def unlock(cls, name, key):
        inst = cls.get(name=name, consistent=True)

        if key == inst.key:
            inst.update(
                is_locked=False,
                key=None,
                updated=time.time(),
                conditions=dict(
                    updated=inst.updated,
                )
            )
        return inst


lock = Lock.lock('my-lock', 'my-key')
if lock.key != 'my-key':
    print("Failed to lock!")
else:
    print("Lock acquired!")
Just like Scanning or Querying a table, you can use Q objects for your update expressions.
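The reason the conditional update prevents races can be shown with an in-memory stand-in. In DynamoDB the condition check and the write happen atomically on the server; the sketch below only imitates that behavior with a dictionary. `STORE`, `conditional_update`, and `ConditionFailed` are hypothetical names.

```python
class ConditionFailed(Exception):
    """Raised when the stored document no longer matches the condition."""


STORE = {'my-lock': {'name': 'my-lock', 'updated': 0, 'is_locked': False, 'key': None}}


def conditional_update(name, conditions, **changes):
    """Apply `changes` only if every condition still matches the stored item."""
    item = STORE[name]
    for field, expected in conditions.items():
        if item.get(field) != expected:
            raise ConditionFailed(field)
    item.update(changes)
    return item


# Two workers read the same version of the lock...
seen_by_a = dict(STORE['my-lock'])
seen_by_b = dict(STORE['my-lock'])

# ...but only the first conditional write succeeds.
conditional_update('my-lock', {'updated': seen_by_a['updated']},
                   is_locked=True, key='worker-a', updated=1)
try:
    conditional_update('my-lock', {'updated': seen_by_b['updated']},
                       is_locked=True, key='worker-b', updated=2)
    second_write_succeeded = True
except ConditionFailed:
    second_write_succeeded = False
```

The second worker's condition refers to a stale `updated` value, so its write is rejected and the lock stays with the first worker.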
Relationships¶
Relationships leverage the native tables & indexes in DynamoDB to allow more concise definition and access of related objects in your Python code.
You define relationships alongside your Schema and Indexes on your model, and must provide the query used to map the related models together. You can also supply a “back reference” query so that the other side of the relationship also has a relationship back to the defining model.
DynamORM provides the following relationship types:
dynamorm.relationships.OneToOne - Useful when you have a large number of attributes to store and you want to break them up over multiple tables for performance in querying.
dynamorm.relationships.OneToMany / dynamorm.relationships.ManyToOne - Useful when you have an instance of one model that has a collection of related instances of another model. You use OneToMany or ManyToOne depending on which side of the relationship you are defining the attribute on. You’ll use both interchangeably based on how your models are laid out since you need to pass a reference to the other model into the relationship.
Here’s an example of how you could model the Forum Application from the DynamoDB Examples:
from dynamorm import DynaModel, GlobalIndex, ProjectKeys
from dynamorm.relationships import ManyToOne, OneToMany

# Assumes Marshmallow fields; Schematics types would be imported differently
from marshmallow.fields import String


class Reply(DynaModel):
    class Table:
        name = 'replies'
        hash_key = 'forum_thread'
        range_key = 'created'
        read = 1
        write = 1

    class ByUser(GlobalIndex):
        name = 'replies-by-user'
        hash_key = 'user_name'
        range_key = 'message'
        projection = ProjectKeys()
        read = 1
        write = 1

    class Schema:
        forum_thread = String(required=True)
        created = String(required=True)
        user_name = String(required=True)
        message = String()


class User(DynaModel):
    class Table:
        name = 'users'
        hash_key = 'name'
        read = 1
        write = 1

    class Schema:
        name = String(required=True)

    replies = OneToMany(
        Reply,
        index='ByUser',
        query=lambda user: dict(user_name=user.name),
        back_query=lambda reply: dict(name=reply.user_name)
    )


class Thread(DynaModel):
    class Table:
        name = 'threads'
        hash_key = 'forum_name'
        range_key = 'subject'
        read = 1
        write = 1

    class ByUser(GlobalIndex):
        name = 'threads-by-user'
        hash_key = 'user_name'
        range_key = 'subject'
        projection = ProjectKeys()
        read = 1
        write = 1

    class Schema:
        forum_name = String(required=True)
        user_name = String(required=True)
        subject = String(required=True)

    user = ManyToOne(
        User,
        query=lambda thread: dict(name=thread.user_name),
        back_index='ByUser',
        back_query=lambda user: dict(user_name=user.name)
    )
    replies = OneToMany(
        Reply,
        query=lambda thread: dict(forum_thread='{0}\n{1}'.format(thread.forum_name, thread.subject)),
        back_query=lambda reply: dict(
            forum_name=reply.forum_thread.split('\n')[0],
            subject=reply.forum_thread.split('\n')[1]
        )
    )


class Forum(DynaModel):
    class Table:
        name = 'forums'
        hash_key = 'name'
        read = 1
        write = 1

    class Schema:
        name = String(required=True)

    threads = OneToMany(
        Thread,
        query=lambda forum: dict(forum_name=forum.name),
        back_query=lambda thread: dict(name=thread.forum_name)
    )
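The Thread.replies relationship depends on a composite key: the forum name and subject are joined with a newline to form forum_thread, and split apart again for the back reference. The sketch below exercises that round trip using plain objects in place of model instances. The function names are hypothetical, and it assumes neither value contains a newline.

```python
from types import SimpleNamespace


def thread_to_reply_query(thread):
    # Same shape as the `query` lambda on Thread.replies.
    return dict(forum_thread='{0}\n{1}'.format(thread.forum_name, thread.subject))


def reply_to_thread_query(reply):
    # Same shape as the `back_query` lambda on Thread.replies.
    forum_name, subject = reply.forum_thread.split('\n')
    return dict(forum_name=forum_name, subject=subject)


# SimpleNamespace objects stand in for Thread and Reply instances.
thread = SimpleNamespace(forum_name='Amazon DynamoDB', subject='How do I page?')
reply = SimpleNamespace(**thread_to_reply_query(thread))
round_trip = reply_to_thread_query(reply)
```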