Diff for "LEP/WebservicePerformance/ClientSyntax"

Differences between revisions 2 and 3
Revision 2 as of 2010-11-24 23:09:55
Size: 8560
Editor: gary
Comment:
Revision 3 as of 2011-01-26 16:10:01
Size: 8609
Editor: benji
Comment: Reformat very long lines and superfluous whitespace prior to editing.

See https://dev.launchpad.net/Foundations/Webservice/ProposalQnA for background discussion on this proposal.

Writing scripts against Launchpad

The basic pattern for using the Launchpad scripting interface (launchpadlib) is to GET references to what you want, usually via a query; and then to act on those references. You can DEREFerence them (that is, get their details), change the objects to which they refer (PATCH them), or DELETE them.

In this introduction, note that functions in all-capitals ("GET", "DEREF", "PATCH", "DELETE") denote code that connects to Launchpad over the network. Efficient code will do this as infrequently as possible, batching work together. For example, calling a network-traversing function inside a loop is often the wrong approach.
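To make the cost of per-item network calls concrete, here is a minimal sketch that counts simulated round trips. `FakeServer`, `titles_one_by_one`, and `titles_batched` are hypothetical stand-ins written for this illustration; they are not part of launchpadlib or of this proposal.

```python
class FakeServer:
    """Hypothetical stand-in for Launchpad that counts network round trips."""

    def __init__(self, titles):
        self.titles = titles      # maps a reference (here an int id) to a title
        self.round_trips = 0

    def deref(self, refs):
        # One call is one simulated round trip, however many refs it carries.
        self.round_trips += 1
        return [self.titles[ref] for ref in refs]


def titles_one_by_one(server, refs):
    # Anti-pattern: a network-traversing call inside a loop.
    return [server.deref([ref])[0] for ref in refs]


def titles_batched(server, refs):
    # Preferred: a single call that carries every reference.
    return server.deref(refs)
```

Both helpers return the same titles; the difference is that the first costs one round trip per reference and the second costs one in total.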

Querying for references

SUMMARY: This section shows how to query Launchpad for collections of references. You can refine your query, ask for sorted results, and ask for only a subset of the query.

Launchpadlib exposes several top-level collections of objects (bugs, people, etc.) that you can query.

For example, here's how we could get references to all the bugs in Launchpad:

    from launchpadlib import Launchpad, GET
    launchpad = Launchpad(...)
    bug_query = launchpad.bugs
    bug_refs = GET(bug_query)

Note how GET accepts a query object and returns a list of references.

The bug_refs variable will contain references to all Launchpad bugs, if there are fewer than 10,000. [We will provide a channel to "bless" users so they can get more. Perhaps Canonical employees are auto-blessed.] Otherwise the request will generate a traceback, and you need to change your code to make a smaller query.
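The size cap could behave roughly as in the sketch below. Only the 10,000 figure and the idea of "blessed" users come from the text; `QueryTooLargeError`, `get_refs`, and the `blessed` flag are hypothetical names chosen for this illustration.

```python
MAX_REFS = 10000  # cap taken from the paragraph above


class QueryTooLargeError(Exception):
    """Hypothetical error raised when a query matches too many objects."""


def get_refs(matching_refs, blessed=False):
    # Materialize the result so that len() works on any iterable.
    refs = list(matching_refs)
    # The text allows results with fewer than 10,000 references.
    if len(refs) >= MAX_REFS and not blessed:
        raise QueryTooLargeError(
            "query matched %d objects; narrow the query" % len(refs))
    return refs
```

A caller that hits the error would be expected to add restrictions or a limit rather than retry the same query.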

References are returned in a Python list.

    count = len(bug_refs)
    first_fifty = bug_refs[:50]
    first = bug_refs[0]

We may want to pare down the list of bugs to just the ones that have been marked as "won't fix". Query objects (e.g., launchpad.bugs) can have "restrictions" applied to them that select which items will be included when the query is executed (via GET):

    from launchpadlib import restrict
    bug_query = launchpad.bugs
    bug_query = restrict(bug_query.status, "won't fix")
    bug_refs = GET(bug_query)

Since launchpadlib knows that bug status can only have one of a small set of valid values, using an invalid value will generate an error.
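Client-side validation of this kind might look like the following sketch. The status list is an illustrative subset, the case-insensitive matching is an assumption, and `restrict_status` is a hypothetical helper rather than the proposed `restrict` function itself.

```python
# Illustrative subset only; the full list of statuses is an assumption here.
VALID_STATUSES = ("New", "Incomplete", "Confirmed", "Won't Fix")


def restrict_status(value):
    """Return a (field, value) restriction, or raise on an unknown status."""
    by_lower = {name.lower(): name for name in VALID_STATUSES}
    try:
        canonical = by_lower[value.lower()]
    except KeyError:
        raise ValueError("invalid bug status %r; expected one of: %s"
                         % (value, ", ".join(VALID_STATUSES)))
    return ("status", canonical)
```

Failing before any network call is made means a typo in a status name costs no round trip.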

You can use the "AnyOf" modifier to make a more inclusive filter.

    from launchpadlib import AnyOf
    bug_query = launchpad.bugs
    bug_query = restrict(bug_query.status, AnyOf("won't fix", "Incomplete"))
    bug_refs = GET(bug_query)

[How about "refine" instead of "restrict"? We talk/think about refining queries pretty often.] [I also prefer "refine". I forget what Leonard's argument was for "restrict": he had one.]

Other modifiers (such as LessThan, GreaterThan, LessThanOrEqualTo, GreaterThanOrEqualTo, and Between) are also available.
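One way to picture these modifiers is as small predicate objects. The sketch below evaluates them locally for illustration only; in the proposal they would presumably be sent to the server as part of the query. The `matches` method and the inclusive bounds of `Between` are assumptions.

```python
class AnyOf:
    def __init__(self, *choices):
        self.choices = choices

    def matches(self, value):
        return value in self.choices


class LessThan:
    def __init__(self, bound):
        self.bound = bound

    def matches(self, value):
        return value < self.bound


class GreaterThanOrEqualTo:
    def __init__(self, bound):
        self.bound = bound

    def matches(self, value):
        return value >= self.bound


class Between:
    def __init__(self, low, high):
        self.low = low
        self.high = high

    def matches(self, value):
        # Inclusive on both ends; the proposal does not say which it is.
        return self.low <= value <= self.high
```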

There are two other features to making queries within the GET function. First, you can pass a field name to sort by; and, second, you can pass a start and a limit.

Note that the sorting functionality is limited to particular attributes. See the Fine Documentation to determine what fields are supported.

This would get references to the most recently created 50 bugs.

    from launchpadlib import Descending
    bug_refs = GET(bug_query, sort_on=Descending(bug_query.creation_date), limit=50)

[Future iterations of the webservice may allow sub-sorting (e.g., sort_on=[Descending(bug_query.creation_date), Ascending(bug_query.title)]); for now, only sorting at one level is supported.]

[We're assuming sorting on indexed columns is cheap. If not we might not be able to do this.]
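The intended semantics of sorting combined with a start and a limit can be modelled locally as below. This is a sketch over plain dictionaries; `apply_sort_and_slice` and its parameter names are hypothetical, and the real work would happen on the server.

```python
def apply_sort_and_slice(rows, sort_key, descending=False, start=0, limit=None):
    """Sort rows on one field, then return at most `limit` rows from `start`."""
    ordered = sorted(rows, key=lambda row: row[sort_key], reverse=descending)
    end = None if limit is None else start + limit
    return ordered[start:end]
```

Sorting happens before slicing, so `limit=50` with a descending creation date yields the 50 newest items rather than 50 arbitrary items in sorted order.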

The only other way of generating references is described below in the section titled "Named Objects".

Dereferencing

SUMMARY: This section shows how to dereference collections of references. You can ask for dereferencing of the returned objects, to make a single dereferencing call over the network get more of what you need at once, potentially increasing your program's efficiency and speed.

References to bugs are nice, but not something we can work with directly. To get all the data about the bugs, we can call the DEREF primitive on the references.

    bugs = DEREF(bug_refs)

Now we can iterate over some bugs and inspect their values:

    for bug in bugs:
        print(bug.title)

DEREF represents a single web call. However, you typically won't use it, because it is not required to return the data for all of the references, only the first "batch," where the size of the batch is determined by the server.

[As an internal optimization, only the first N (where N is around 1000) references will actually be sent to the server in the above example.]

Since this is a hassle, the DEREF function is rarely used. The batchDEREF function (which is built on top of the primitive DEREF) is used instead. The batchDEREF function requests batches of results* [reference is to note about non-transactionality] but hides the batching operation so client code only sees an iterable of results.

    from launchpadlib import batchDEREF
    for bug in batchDEREF(bug_refs):
        print(bug.title)
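The relationship between the two functions can be sketched as a generator that keeps calling a single-batch primitive until every reference has been covered. `batch_deref` and the `deref` callable passed to it are hypothetical; the sketch assumes the primitive returns objects for a leading slice of the references it is given, with the slice size chosen by the server.

```python
def batch_deref(refs, deref):
    """Yield dereferenced objects one at a time, hiding the batching."""
    remaining = list(refs)
    while remaining:
        batch = deref(remaining)   # one simulated web call
        if not batch:
            # Guard against looping forever if nothing comes back.
            raise RuntimeError("server returned an empty batch")
        for obj in batch:
            yield obj
        # Drop the references that this batch covered.
        remaining = remaining[len(batch):]
```

Because the results arrive over several calls, objects may change on the server between batches, which is the non-transactionality the text alludes to.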

In the above example all the attributes of the bug objects were returned. It would be better to ask only for the attributes we're interested in. That's done by passing a "select" argument to batchDEREF.

    for bug in batchDEREF(bug_refs, select=(bug_query.title,)):
        print(bug.title)
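The effect of a selection on the returned data can be illustrated with plain dictionaries. `project` is a hypothetical helper; it only shows that unselected attributes are left out of what comes back.

```python
def project(record, select):
    """Return a copy of record containing only the selected attribute names."""
    return {name: record[name] for name in select}
```

For a bug with a long description and many attributes, selecting only the title keeps each response small.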

[We've also talked about a structDEREF. Not described here.]

In some situations there may not be a top-level collection of the items we're interested in. We may instead want "deep" information about an item. For example, if we wanted to know the names and two-letter ISO 3166-1 codes of all the countries in which there exist mirrors for any distribution:

    distro_query = launchpad.distributions
    distro_refs = GET(distro_query)
    distros = DEREF(distro_refs, select=(
        distro_query.cdimage_mirrors_collection.country.name,
        distro_query.cdimage_mirrors_collection.country.iso3166code2))
    for distro in distros:
        print(distro.cdimage_mirrors_collection.country.iso3166code2,
              distro.cdimage_mirrors_collection.country.name)
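A "deep" selection such as the one above could be resolved by following a path of attribute names and fanning out whenever a collection is encountered. The sketch below does this over nested dictionaries and lists; `pluck` and the sample data layout are assumptions made for illustration, not the proposed implementation.

```python
def pluck(obj, path):
    """Follow a sequence of keys through nested dicts, mapping over lists."""
    if not path:
        return obj
    if isinstance(obj, list):
        # A collection: apply the remaining path to each member.
        return [pluck(item, path) for item in obj]
    return pluck(obj[path[0]], path[1:])
```

Applied to a distribution whose `cdimage_mirrors_collection` holds several mirrors, the path `("cdimage_mirrors_collection", "country", "name")` yields one country name per mirror.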

Note that the collection of items to be dereferenced may be heterogeneous, in which case the selection requests may be heterogeneous. They will be applied as appropriate. [We think heterogeneous requests will actually be easier to implement than enforcing homogeneity, and they can encourage fewer dereferencing requests, and we believe that we will still be able to make efficient queries for them.]

The last dereferencing feature is described in the section below titled "Named Objects."

Patching

SUMMARY: both queries and references can be passed to PATCH, along with descriptions of what to change.

XXX Limit?
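Since the proposal does not yet show a PATCH call, the following is only a guess at the shape of the operation: one description of changes applied to every referenced object in a single call. `patch`, the `store` dictionary, and the `changes` argument are all hypothetical.

```python
def patch(store, refs, changes):
    """Apply the same attribute changes to every referenced object.

    `store` stands in for the server: it maps references to attribute dicts.
    Returns the number of objects that were changed.
    """
    count = 0
    for ref in refs:
        store[ref].update(changes)
        count += 1
    return count
```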

Deleting

SUMMARY: both queries and references can be passed to DELETE.

XXX Limit?

Named objects

Sometimes objects have well-known names (users have user names, bugs have bug numbers, etc.) and we want to refer to those objects by name. We can do that:

    me_ref = launchpad.people['benji']

Note that launchpad.people['benji'] results in a reference, while launchpad.people is a query (which you can GET to turn into references).

[The phrasing above suggests to me that RUN might be a better name than GET. On the other hand, RUN implies an arbitrary operation, while GET implies actually getting a result.]

If we DEREF one of these single items, we get all of its top-level attributes.

    me = DEREF(me_ref)
    print(me.name)

You can also pass any iterable of these references, even combined with references from a query response if so desired, to DEREF, PATCH, or DELETE.
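Combining named references with references from a query, so that one call covers them all, might be done as in this sketch. `merge_refs` is a hypothetical helper, and references are represented here as plain strings.

```python
from itertools import chain


def merge_refs(*ref_iterables):
    """Concatenate iterables of references, dropping duplicates in order."""
    seen = set()
    merged = []
    for ref in chain.from_iterable(ref_iterables):
        if ref not in seen:
            seen.add(ref)
            merged.append(ref)
    return merged
```

Removing duplicates before the call avoids asking the server for the same object twice when a named object also appears in the query results.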

Re-constraining a collection for DEREF, PATCH, and DELETE

SUMMARY: DEREF, PATCH, and DELETE can accept an alternate query argument when you pass in a collection. This means that only members of the collection that match the query will be dereferenced, patched, or deleted.

[XXX Probably not implemented in first iteration; this just records an interesting idea.]
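As a rough local model of the idea, the sketch below filters a collection with a predicate before acting on it. In the proposal the filtering would presumably happen on the server; `deref_matching` and the use of a Python callable as the `query` are assumptions for illustration only.

```python
def deref_matching(collection, query=None):
    """Return the members of collection that satisfy query (all if None)."""
    if query is None:
        return list(collection)
    return [item for item in collection if query(item)]
```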

LEP/WebservicePerformance/ClientSyntax (last edited 2011-01-26 19:28:01 by benji)