Django custom creation manager logic for temporal

Posted 2019-09-04 12:03

Question:

I am trying to develop a Django application that has built-in logic around temporal states for objects. The goal is to have a single object represent a resource while letting the attributes of that resource change over time. For example, a desired use case is to query the owner of a resource at any given time (last year, yesterday, tomorrow, next year, ...).

Here is what I am working with...

from django.db import models


class Resource(models.Model):
    id = models.AutoField(primary_key=True)


class ResourceState(models.Model):
    id = models.AutoField(primary_key=True)

    # Link the resource this state is applied to
    resource = models.ForeignKey(Resource, related_name='states', on_delete=models.CASCADE)

    # Track when this state is ACTIVE on a resource.
    # NULL means open-ended (None == infinity), so both bounds must be nullable.
    start_dt = models.DateTimeField(null=True, blank=True)
    end_dt = models.DateTimeField(null=True, blank=True)

    # Temporal fields, can change between ResourceStates
    owner = models.CharField(max_length=100)
    description = models.TextField(max_length=500)

I feel like I am going to have to create a custom interface to interact with this state. Some example use cases (interface is completely up in the air)...

# Get all of the states that were ever active on resource 1 (this is already possible)
Resource.objects.get(id=1).states.all()

# Get the owner of resource 1 from the state that was active yesterday, this is non-standard behavior
Resource.objects.get(id=1).states.at(YESTERDAY).owner

# Create a new state for resource 1, active between tomorrow and infinity (None == infinity)
# This is obviously non standard if I want to enforce one-state-per-timepoint
Resource.objects.get(id=1).states.create(
    start_dt=TOMORROW,
    end_dt=None,
    owner="New Owner",
    description="New Description"
)

I feel the largest amount of custom logic will be required to do creates. I want to enforce that only one ResourceState can be active on a Resource for any given timepoint. This means that to create some ResourceState objects, I will need to adjust/remove others.

>>> resource = Resource.objects.get(id=1)
>>> resource.states.all()
[ResourceState(start_dt=None, end_dt=None, owner='owner1')]
>>> resource.states.create(start_dt=YESTERDAY, end_dt=TOMORROW, owner='owner2')
>>> resource.states.all()
[
    ResourceState(start_dt=None, end_dt=YESTERDAY, owner='owner1'),
    ResourceState(start_dt=YESTERDAY, end_dt=TOMORROW, owner='owner2'),
    ResourceState(start_dt=TOMORROW, end_dt=None, owner='owner1')
]
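The adjust/remove step described above can be sketched in plain Python, independent of Django. This is only a sketch under stated assumptions: the `State` dataclass and the `insert_state` helper are hypothetical names, intervals are treated as half-open (`start_dt` inclusive, `end_dt` exclusive), and `None` stands for an open-ended bound on either side. In a real application the same logic would run against `ResourceState` rows, presumably inside a transaction.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class State:
    """Hypothetical stand-in for a ResourceState row."""
    start_dt: Optional[datetime]  # None means "since forever"
    end_dt: Optional[datetime]    # None means "until forever"
    owner: str


def _before(start: Optional[datetime], end: Optional[datetime]) -> bool:
    """True if a start bound lies strictly before an end bound (None is infinite)."""
    return start is None or end is None or start < end


def _overlaps(a: State, b: State) -> bool:
    """Two half-open intervals overlap if each starts before the other ends."""
    return _before(a.start_dt, b.end_dt) and _before(b.start_dt, a.end_dt)


def insert_state(states: List[State], new: State) -> List[State]:
    """Return a new list in which `new` wins wherever it overlaps existing states."""
    result = []
    for s in states:
        if not _overlaps(s, new):
            result.append(s)  # untouched
            continue
        # Keep the part of `s` that lies before `new` starts, if any.
        if new.start_dt is not None and (s.start_dt is None or s.start_dt < new.start_dt):
            result.append(replace(s, end_dt=new.start_dt))
        # Keep the part of `s` that lies after `new` ends, if any.
        if new.end_dt is not None and (s.end_dt is None or s.end_dt > new.end_dt):
            result.append(replace(s, start_dt=new.end_dt))
        # A state fully covered by `new` contributes nothing and is dropped.
    result.append(new)
    # Sort by start, placing open-ended (None) starts first.
    return sorted(result, key=lambda s: (s.start_dt is not None, s.start_dt or datetime.min))


YESTERDAY = datetime(2019, 9, 3)
TOMORROW = datetime(2019, 9, 5)
timeline = insert_state([State(None, None, "owner1")], State(YESTERDAY, TOMORROW, "owner2"))
for state in timeline:
    print(state)
```

With the single open-ended `owner1` state as input, the result is the same three-way split shown in the session above: `owner1` up to yesterday, `owner2` from yesterday to tomorrow, and `owner1` again from tomorrow onward.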

I know I will have to do most of the legwork around defining the logic, but is there any intuitive place where I should put it? Does Django provide an easy place for me to create these methods? If so, where is the best place to apply them? Against the Resource object? Using a custom Manager to deal with interacting with related 'ResourceState' objects?

Re-reading the above, it is a bit confusing, but this isn't a simple topic either! Please let me know if you have any ideas for how to do something like this.

Thanks a ton!

Answer 1:

Too long for a comment, and purely some thoughts rather than a full answer, but having dealt with many date-effective records in financial systems (not in Django), some things come to mind:

My gut would be to start by putting it on the save method of the resource model. You are probably right in needing a custom manager as well.

I'd probably also flirt with the idea of an is_current boolean field in the state model, but it would need care around future-dated state records. If there is only one active state at a time, I'd also examine whether an end date is needed at all. Having both start and end definitely makes raw SQL queries (if ever needed) easier: date() between state.start and state.end <- this gives the current record; sub in any date to get that date's effective record.

Also, give some consideration to the open-ended end date, where you don't know the end date. Your queries will have to handle the nulls properly. You may also need to consider the open-ended start date (say, for a load of historical data where the original start date is unknown). I'd suggest staying away from using some super early date as a fill-in (same for a date far in the future for unknown end dates). If you end up with lots of transactions, your query optimizer may thank you; however, I may be old and this may not matter anymore.
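To make the null handling concrete, here is a minimal plain-Python sketch of the "effective record on a given date" lookup. The names `StateRow` and `state_at` are hypothetical, and the sketch assumes half-open intervals (start inclusive, end exclusive) rather than an inclusive BETWEEN, so that a timestamp shared by two adjacent states matches only the later one. In Django ORM terms, the same condition could presumably be written as `Q(start_dt__isnull=True) | Q(start_dt__lte=when)` combined with `Q(end_dt__isnull=True) | Q(end_dt__gt=when)`.

```python
from datetime import datetime
from typing import NamedTuple, Optional, Sequence


class StateRow(NamedTuple):
    """Hypothetical stand-in for one date-effective state record."""
    start_dt: Optional[datetime]  # None: original start date unknown / open-ended
    end_dt: Optional[datetime]    # None: end date unknown / open-ended
    owner: str


def state_at(rows: Sequence[StateRow], when: datetime) -> Optional[StateRow]:
    """Return the row effective at `when`, or None if no row covers it."""
    for row in rows:
        # A null bound never excludes a timestamp on that side.
        started = row.start_dt is None or row.start_dt <= when
        not_ended = row.end_dt is None or when < row.end_dt
        if started and not_ended:
            return row
    return None


rows = [
    StateRow(None, datetime(2019, 9, 3), "owner1"),
    StateRow(datetime(2019, 9, 3), datetime(2019, 9, 5), "owner2"),
    StateRow(datetime(2019, 9, 5), None, "owner1"),
]
current = state_at(rows, datetime(2019, 9, 4))
print(current.owner if current else "no state")
```

Using real nulls for the unknown bounds, as suggested above, keeps the check explicit and avoids sentinel dates such as year 1 or year 9999 leaking into results.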

If you like to read about this stuff, I'd recommend a look at 1.8 in https://www.amazon.ca/Art-SQL-Stephane-Faroult/dp/0596008945/ and chapter 6:

"But before settling for one solution, we must acknowledge that valuation tables come in all shapes and sizes. For instance, those of telecom companies, which handle tremendous amounts of data, have a relatively short price list that doesn't change very often. By contrast, an investment bank stores new prices for all the securities, derivatives, and any type of financial product it may be dealing with almost continuously. A good solution in one case will not necessarily be a good solution in another.

Handling data that both accumulates and changes requires very careful design and tactics that vary according to the rate of change."