Differences between revisions 1 and 5 (spanning 4 versions)

Persistence Layer

Launchpad suffers greatly from having no simple persistence layer. It has a SQL layer (Storm), but its regular business logic is intermingled with storage: we cannot test business logic without exercising the storage layer, we cannot test the storage layer without mixing in business logic tests, and we cannot migrate components of our model to different storage systems. Performance suffers: there is no differentiation between object access and data retrieval, so any failure to preload data results in thousands of backend queries. Our test performance suffers because we exercise the full stack every time we do anything. Lastly, our business logic and storage logic have to change in lockstep because there is no separation between them.

As a Technical Architect
I want a dedicated, substitutable persistence layer
so that writing fast code, testing and evolution of our schema are easier to do than today.

This is not an ORM nor a separate library.

Rationale

Solving the performance problems endemic in Launchpad requires a code structure that fits the needs of our data storage layer. Much of the Zope machinery we use is hostile to that structure: it assumes individual Python object and attribute access is cheap or free, which it isn't in our current conflated structure.

We need to achieve a high performance Launchpad to:

Satisfy the needs of our users and stakeholders.
Make good use of the substantial hardware investment made in running Launchpad.

And we need to fix our testing story to be able to test much more rapidly than we do today.

The lack of a persistence layer is one driving factor for the problems we suffer, and working around it adds a significant burden to writing efficient code or tests.

Because of these considerations we're going to solve this infrastructure-level issue now, so that we can (eventually) focus on interesting problems.

Stakeholders

Developers [folk that write code]

Constraints and Requirements

Must

Provide an easy-to-use layer which exposes plain old Python objects on the top and works with Storm / pgsql [as appropriate] underneath.
Provide an alternative backend for testing which we can run our developer-run tests (including integration, excluding persistence layer tests) against.
Permit disabling DB queries once we hit the render pipeline.
Provide a way of loading a comprehensive object graph in an optimal manner.
Be incrementally adoptable.
Encapsulate all details of how we persist the object graph.
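
The requirements above could be read roughly as in the sketch below. This is only an illustration under assumptions, not a proposed design: the names `Person`, `PersonStore`, `InMemoryPersonStore`, `load_graph`, `disable_queries` and `QueriesDisabled` are all hypothetical, and a real implementation would also need a Storm-backed class offering the same interface.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Protocol


@dataclass
class Person:
    """Plain old Python object: it knows nothing about storage."""
    name: str
    member_names: List[str] = field(default_factory=list)


class QueriesDisabled(RuntimeError):
    """Raised if the store is touched after queries were switched off."""


class PersonStore(Protocol):
    """Hypothetical interface that business logic would depend on."""

    def add(self, person: Person) -> None: ...
    def load_graph(self, names: Iterable[str]) -> Dict[str, Person]: ...
    def disable_queries(self) -> None: ...


class InMemoryPersonStore:
    """Test backend with the same interface and no database behind it."""

    def __init__(self) -> None:
        self._people: Dict[str, Person] = {}
        self._queries_allowed = True

    def disable_queries(self) -> None:
        # Intended to be called once rendering starts.
        self._queries_allowed = False

    def _check(self) -> None:
        if not self._queries_allowed:
            raise QueriesDisabled("store used after queries were disabled")

    def add(self, person: Person) -> None:
        self._check()
        self._people[person.name] = person

    def load_graph(self, names: Iterable[str]) -> Dict[str, Person]:
        """Return the named people and their direct members in one call."""
        self._check()
        wanted = set(names)
        for name in list(wanted):
            person = self._people.get(name)
            if person is not None:
                wanted.update(person.member_names)
        return {n: self._people[n] for n in wanted if n in self._people}


store = InMemoryPersonStore()
store.add(Person("alice"))
store.add(Person("team", member_names=["alice"]))
graph = store.load_graph(["team"])  # everything the page needs, up front
store.disable_queries()             # rendering may only use `graph` now
```

Because business logic would only see `PersonStore`, swapping the in-memory backend for a database-backed one should not require touching that logic, which is what makes incremental adoption plausible.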

Nice to have

Not requiring a cache in the layer
Being able to get a transcript of an entire problem web request (like our timelines, but with parameters and trivially replayable) would be nice.

Must not

Leak SQL concepts through the layer.
- e.g. If we end up talking about anything other than an object graph in our business logic (e.g. tables, views, etc.) then we've failed. For example, saying things like "get all Person objects that are teams", or talking about "group by", joins and so forth.
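
As an illustration of business logic that speaks only in terms of an object graph, here is a minimal sketch. `ParticipationGraph`, `participants` and `can_edit` are hypothetical names invented for this example; how the edges are actually stored (tables, joins, or something else) is deliberately invisible to the caller.

```python
from typing import Dict, Set


class ParticipationGraph:
    """Hypothetical facade: callers see nodes and edges, not tables."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = {}

    def add_member(self, team: str, member: str) -> None:
        self._members.setdefault(team, set()).add(member)

    def participants(self, team: str) -> Set[str]:
        """Everyone reachable from `team` by following membership edges."""
        seen: Set[str] = set()
        pending = [team]
        while pending:
            current = pending.pop()
            for member in self._members.get(current, ()):
                if member not in seen:  # also guards against cycles
                    seen.add(member)
                    pending.append(member)
        return seen


def can_edit(graph: ParticipationGraph, owner: str, user: str) -> bool:
    """Business rule stated purely in terms of the graph."""
    return user == owner or user in graph.participants(owner)


graph = ParticipationGraph()
graph.add_member("devs", "core")   # a team nested inside another team
graph.add_member("core", "alice")
```

The rule in `can_edit` never mentions filtering, grouping or joining; it only follows edges, so the storage strategy behind `participants` can change without the rule changing.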

Workflows

Writing tests of UI / business logic.
Changing the data storage/retrieval functions.

Success

Bugs are at: Bugs for this LEP (persistencelayer)

How will we know when we are done?

None of our tests of Launchpad functionality require a real database to execute.
All of our pages execute in 5-10 queries.
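
One way the query target could be checked in tests is sketched below. It assumes the persistence layer can report how many backend round trips it has made; `CountingBackend`, `query_budget` and `QueryBudgetExceeded` are hypothetical names, and the limit of 10 is taken from the upper end of the 5-10 range above.

```python
from contextlib import contextmanager
from typing import Dict, Iterable, Iterator, List


class CountingBackend:
    """Hypothetical backend that counts every round trip it serves."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self._rows = rows
        self.query_count = 0

    def fetch_many(self, keys: Iterable[int]) -> Dict[int, str]:
        self.query_count += 1
        return {k: self._rows[k] for k in keys if k in self._rows}


class QueryBudgetExceeded(AssertionError):
    """Raised when a block of code issues more queries than allowed."""


@contextmanager
def query_budget(backend: CountingBackend, limit: int) -> Iterator[None]:
    """Fail if the wrapped block makes more than `limit` queries."""
    start = backend.query_count
    yield
    used = backend.query_count - start
    if used > limit:
        raise QueryBudgetExceeded(f"{used} queries used, budget is {limit}")


def render_bulk(backend: CountingBackend, keys: Iterable[int]) -> List[str]:
    # One round trip for the whole page.
    return list(backend.fetch_many(keys).values())


def render_lazy(backend: CountingBackend, keys: Iterable[int]) -> List[str]:
    # One round trip per item: the pattern the budget is meant to catch.
    return [backend.fetch_many([k])[k] for k in keys]


backend = CountingBackend({i: f"bug {i}" for i in range(50)})
with query_budget(backend, 10):
    page = render_bulk(backend, range(50))
```

Wrapping `render_lazy` in the same `query_budget(backend, 10)` block would raise `QueryBudgetExceeded`, since it makes one query per item.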

How will we measure how well we have done?

It Will Be Obvious.

- Revision 1 as of 2010-11-24 01:34:42
  Size: 2317
  Editor: lifeless
  Comment: start draft
+ Revision 5 as of 2010-11-27 05:04:51
  Size: 3779
  Editor: lifeless
  Comment: details
-Deletions are marked like this.
+Additions are marked like this.
 Line 6:
-'''I want ''' a real persistence layer<<BR>>
+'''I want ''' a dedicated, substitutable persistence layer<<BR>>
 Line 9:
-This is not an ORM nor a separately library.
+This is not an ORM nor a separate library.
 Line 13:
-''Why are we doing this now?''
+Solving the performance problems endemic in Launchpad requires a code structure that fits the needs of our data storage layer; much of the Zope machinery we use is hostile to that structure : it assumes individual Python object and attribute access is cheap or free - which it isn't in our current conflated structure.
 Line 15:
-''What value does this give our users? Which users?''
+We need to achieve a high performance Launchpad to:
 1. Satisfy the needs our users and stakeholders.
 1. make good use of the substantial hardware investment made in running Launchpad

And we need to fix our testing story to be able to test much more rapidly than we do today.

The lack of persistence layer is one driving factor for the problems we suffer, and working around that adds significant burden to writing efficient code or tests.

Because of these considerations we're going to solve this infrastructure level issue now, so that we can (eventually) focus on interesting problems.
-Line 19:
+Line 27:
-''Who really cares about this feature? When did you last talk to them?''
+. Developers [folk that write code]
-Line 25:
+Line 33:
-''What MUST the new behaviour provide?''
+. Provide an easy to use layer which exposes plain old python objects on the top and works with Storm / pgsql [as appropriate] underneath.
 1. Provide an alternative backend for testing which we can run our developer-run tests (including integration, excluding persistence layer tests) against.
 1. Permit disabling DB queries once we hit the render pipeline.
 1. Provide a way of loading a comprehensive object graph in an optimal manner.
 1. Be incrementally adoptable.
 1. Encapsulate all details of how we persist the object graph.
-Line 29:
+Line 42:
+. Not requiring a cache in the layer

 1. Being able to get a transcript of a an entire problem web request (like our timelines, but with parameters and trivially replayable) would be nice.
-Line 31:
+Line 48:
-''What MUST it not do?''

== Subfeatures ==

''Other LaunchpadEnhancementProposal``s that form a part of this one.''
+. Leak SQL concepts through the layer.
    e.g. If we end up talking about anything other than an object graph in our business logic (e.g. table, views, etc.) then we've failed.  For example, if we say things like "get all Person objects that are teams", or talking about things like "group by", joins and so forth.
-Line 39:
+Line 53:
-''What are the workflows for this feature? Even a short list can help you and others understand the scope of the change.'' 
''Provide mockups for each workflow.''

'''''You do not have to get the mockups and workflows right at this point. In fact, it is better to have several alternatives, delaying deciding on the final set of workflows until the last responsible moment.'''''
+. Writing tests of UI / business logic.
 1. Changing the data storage/retrieval functions.
-Line 46:
+Line 58:
-'''Bugs are at:''' [[https://launchpad.net/launchpad-project/+bugs?field.tag=$LEPNAME|Bugs for this LEP]] ''link to a search for a bug tag or milestone. Use launchpad-project rather than a sub-project.''
+'''Bugs are at:''' [[https://launchpad.net/launchpad-project/+bugs?field.tag=persistencelayer|Bugs for this LEP (persistencelayer)]]
-Line 50:
+Line 62:
+. None of our tests of Launchpad functionality require a real database to execute.
 1. All of our pages execute in 5-10 queries.
-Line 52:
+Line 67:
+It Will Be Obvious.
-Line 53:
+Line 70:
-''Put everything else here. Better out than in.''


Diff for "LEP/PersistenceLayer"
