Diff for "RNRDesign"


Differences between revisions 9 and 10
Revision 9 as of 2010-02-12 19:07:32
Size: 11520
Editor: barry
Comment:
Revision 10 as of 2010-02-16 16:16:24
Size: 4479
Editor: barry
Comment:
Line 3: Line 3:
On 2010-02-03 mvo and barry sketched out a design to support ratings and
reviews for Lucid (an LTS). Here are our notes.
Plan: implement ratings and reviews in Software Center for Lucid.
Line 6: Line 5:
'''''This is out of date. We probably won't use much if any of this, but I'm not deleting the page yet. -BAW'''''
Line 10: Line 8:
We're going to (ab)use Launchpad answers as the database for ratings and
reviews. Every reviewable application will be linked to a question and we'll
store individual reviews as comments in the question. User reviews will be
inputted on the desktop through Software Center and will be submitted directly
to Launchpad via the API. We'll deploy a new service, likely called
`reviews.ubuntu.com` that will provide application reviews via HTTP GET as one
big file of XML. `reviews.ubuntu.com` will talk to LP API to access and
collate question comments to prevent hammering the LP database, because it can
cache the read-only XML review files.
*Ratings and reviews* is a feature targeted for Software Center (SC) in Lucid.
It will allow end users to rate and review applications in Lucid, and to view
the ratings and reviews submitted by all Ubuntu users. The user will
experience both aspects fully within the SC application, with data storage and
retrieval using an external service.
Line 20: Line 14:
== Mapping applications to questions ==
Outside of SC, moderators will be able to control the visibility of any review
by accessing them through the external service's web ui. Moderation does not
occur through the SC ui.
Line 22: Line 18:
Questions have a number and are associated with a language. We need to map
reviewable applications on the desktop to a question number. We'll do this
with a 4-tuple of:
The current design consists of a [[http://www.djangoproject.com/|Django]]
application hosted inside the `one.ubuntu.com` web service. SC will
communicate with this application with simple HTTP POST and GET, utilizing
Ubuntu One's (U1) desktop `OpenID` login and RESTful OAuth service for
authentication. The data will be stored in the Django database.

For the purposes of this spec, we'll refer to the web service as
`reviews.ubuntu.com` (r.u.c) although in reality, the service will probably be
hosted at `one.ubuntu.com/reviews`.


=== Data retrieval ===

Rating and review data will be collated on the r.u.c server by a cronjob,
generating a static XML file every 5 minutes or so (we can optimize for the
case where no changes have occurred). Because access to this file is
anonymous (i.e. over plain http://) and static, it is highly cachable. This
is critical as we expect vastly more downloads of review data than
submissions.


=== Review submission ===

Reviewers must be authenticated, so their reviews will be submitted to r.u.c
over https. Their submissions will be signed so that r.u.c can verify the
user id or display name of the submitter. Only one review per application
per submitter will be allowed.


== Identifying reviews ==

Reviews are identified (aside from authentication data) by four pieces of
information:
Line 33: Line 59:
This mapping will not be stored explicitly in LP. Instead, `reviews.u.c` will
maintain the mapping and use HTTP and URL trickery to expose this mapping to
Software Center.
'''''It is recognized that many applications are essentially unchanged between
distro releases, such that a review of e.g. gedit in Karmic may be applicable
to gedit in Lucid. We need a plan for deciding when and how to utilize
reviews from a previous distro release in the next distro release.'''''
Line 37: Line 64:
 * Curtis:
    * We commonly use `SourcePackageName` which we use to create an instance of a `DistributionSourcePackage`.
    * Answers claims to use `SourcePackage`, but that is wrong because that limits the answer to a single series. Answers apply to packages in multiple series 99% of the time. The implementation often uses a `DistributionSourcePackage` because that is the sane object to return.
    * So I ask if a review for a version of "gedit" in Hardy does not apply to a review in Karmic? There was only one user-visible change, and that led to 2 bugs being reported. Few users were affected by the spell checker change, so I am sure a review for Hardy and Karmic are equal.

 * Barry:
     * Perhaps a question for mvo or mpt, but I would rather have a review for gedit-in-hardy when looking at S/C in karmic than nothing at all because gedit hasn't been reviewed in karmic yet. However, I think the XML coming from `r.u.c` should indicate which Ubuntu series the review was done in so that it could be selectively displayed or hidden in the S/C ui.
Line 47: Line 67:
When Sheila wants to view the reviews for Emacs, she uses Software Center.
S/C generates a URL from the 4 pieces of information above, plus the language
she wants to see the reviews in. The URL is something like:
Sheila wants to view the reviews for Emacs, so she starts up SC. SC needs to
refresh its cache of reviews so it does an anonymous http GET from
Line 52: Line 71:
http://reviews.ubuntu.com/lucid/emacs/emacs/23.1/en
http://one.ubuntu.com/reviews/data.xml
Line 55: Line 74:
If some reviews exist for this application, `r.u.c` will know what question
number this is associated with (because it's already retrieved that mapping).
`r.u.c` will respond with an XML file containing the entire current review
stream for the application. The response will include the question number,
which is required for submitting reviews. Software Center will parse the
returned XML file and present it nicely to Sheila in her S/C interface.
SC then uses this new XML file to parse the reviews for Emacs, for display in
the SC ui.
Line 62: Line 77:
If no review exists yet for the application, `r.u.c` needs to inform S/C of
this, but this introduces a race condition. For example, if Bob wants to
review the same version of Emacs as Sheila, who wins? Here are some
alternative approaches (comments and other ideas welcome):
== Submitting a review ==
Line 67: Line 79:
=== Issue a 404 ===
Sheila wants to review Emacs for Lucid, so she starts up SC. This is Sheila's
first review so SC performs a dbus authentication against the U1 desktop
service. Sheila enters her credentials and completes the authentication
process.
Line 69: Line 84:
`r.u.c` could issue a 404 which S/C would take to mean there are no reviews of
that application yet. S/C would then allow Sheila to review the app and it
would submit a new question to LP with her initial review. How `r.u.c`
discovers this new question and associates it with the application review is
discussed below. If Bob also submits a first review before `r.u.c` discovers
Sheila's review, we'll now have two questions in LP which contain the reviews
for Emacs.

We would have to expose an API in LP that `r.u.c` would call to merge the two
questions. Probably the question with the lowest number would win. LP would
merge the comments from the second question into the first, and then mark the
second as `invalid`. `r.u.c` would know that the application is mapped to the
first question.

The window of opportunity for this race is probably fairly small, since there
are 30,000 reviewable applications in Ubuntu, but maybe only a few thousand
very common ones. As the review database warms up, there will be fewer
popular applications that have not yet been reviewed.

=== Pre-populate on first request ===

Another idea is that `r.u.c` could pre-populate the LP database whenever a
review for a non-reviewed application is requested. For example, when Sheila
initiates the first review of Emacs, `r.u.c` would synchronously create a new
question for this review. Thus when Bob wants to review Emacs while Sheila is
still typing hers, Bob's review will end up on the same question.

The downside of this approach is that we might have lots of questions without
review comments. E.g. what if both Sheila and Bob abort their review before
submitting it? We've now got an entry in the LP database for Emacs but with
no content. We're also concerned that this will hammer the database more as
it warms up with new reviews.

== Adding a review ==

Bob wants to add a review for application Gnome-do, for which there is a
robust comment history already. Bob's S/C makes a request to:
Sheila then enters her review and 5-star rating of Emacs in Lucid into the SC
ui. When she clicks the Submit button, SC retrieves her access token from
the gnomekeyring and uses it to sign, via HMAC-SHA1, a request to
Line 108: Line 89:
http://reviews.ubuntu.com/lucid/gnome-do/gnome-do/0.8.3.1/en
https://one.ubuntu.com/reviews/new
Line 111: Line 92:
and gets a mass of XML in response. This is displayed in the S/C u/i. The
question number for this review is given in the response. Bob uses S/C to
enter his review of Gnome-do and hits submit. S/C will authenticate Bob to
`login.ubuntu.com` via Open``ID and create an O``Auth application key for
submitting his review. S/C will use launchpadlib to submit Bob's review as a
comment on the question. It may provide some local hacks to display Bob's
review immediately but other people will not see Bob's review for a little
while.
In addition, SC creates a signed request against the `one.ubuntu.com` REST API
requesting the user id and/or display name information. It passes this
request and the signed review to r.u.c. r.u.c then invokes the request on
behalf of the end user, retrieving some JSON data containing the reviewer's
user name (or display name). r.u.c now has an authenticated review that it
checks for uniqueness, and stores in its database keyed by the four pieces of
information described above.
Line 122: Line 102:
We do not yet have moderation for question comments exposed in the LP ui. Our
intent is to enable this as the way special people can remove spam comments.
The idea is to add a new team, e.g. `~software-center-moderators` as a LP
celebrity, and to extend permission to edit (or maybe just disable) existing
comments to this team. Thus trusted members of the Ubuntu community can be
added to the team to moderate reviews.
TBD
Line 129: Line 104:
Currently an API exists to edit bug comments, but there is not yet any ability
to edit question comments. This would need to be added as well.

== Limiting reviews to one-per-person ==

The above approach does not yet support limiting reviews to one-per-person.
We could potentially build this into the submission API as a validity check
for new review comments.

== reviews.ubuntu.com ==

This is a new service we'd have to roll out that would scan LP for new review
questions and comments, and build static XML files for vending to the vast
Ubuntu usership. The advantage of this is that we can vend these XML files
statically, so take advantage of load balancing, caching, etc. This will
greatly reduce the read pressure on the LP database for review comments, as
only `r.u.c` will generally query the relevant APIs.

`r.u.c` will probably run a cron script that will scan LP for new questions
above a watermark, looking for questions that are specifically formatted as
reviews. It can look for questions assigned to `~software-center-moderators`
that have a status of `review` which we will probably want to add.

The `review` status will be used to hide those questions from the web ui,
unless specifically searched for of course. This means we won't have to
overload the `invalid` status.

So `r.u.c` will keep a watermark of the highest question number it's seen. It
will do two cron tasks:

 * Scan for updates to existing review questions. `r.u.c` has a list of
 questions with `review` status so it needs to request updated comments for
 each of those questions. `r.u.c` can then append the review XML and cache it
 for any future requests.
 * Scan for new review questions. `r.u.c` maintains a watermark of the
 highest question number it's seen to date. It then needs to request a list of
 new questions, with numbers higher than its watermark and a status of
 `review`. These it adds to its database mapping application 4-tuple to
 question number.

== Question format ==

Questions with status `review` are specially formatted for use by S/C. Any
improperly formatted question will be ignored, as will any improperly
formatted comment.

Question summaries will be formatted using RFC 822 style key: value pairs:

{{{
Application: distro/pkgname/appname/appversion
Summary: Review of application Foo 5.8.1 in Lucid
}}}

Comments will have the following RFC 822 style key: value pairs:

{{{
Rating: 4
Summary: Great app, I love it!
Text:
 Gnome-do is the best thing I've ever used.

 My only complaint is that the icon is not purple enough. Please
 make it more purply.
}}}

Normal comment metadata, such as the author and date can be used directly.
Line 198: Line 107:
Each reviewed application will be vended by `r.u.c` as a single XML file. The
exact format of that XML file is TBD, but will be generated from a collation
of RFC 822 summaries and comments for each question.

== API ==

The following APIs need to be added to Launchpad to support the functionality
described above.

 * Create new question tied to `(distro, pkgname, appname, appversion)`
 * Create new comment for `(distro, pkgname, appname, appversion)`
 * Get all comments for `(distro, pkgname, appname, appversion)`. Open
 question is what format this will be returned as. It must be as efficient as
 possible, but perhaps `r.u.c` can be the component that formats the response
 into the expected XML.
 * Mark question as `review` (or maybe this happens when new question is
 added) and `invalid` (for spam but maybe this happens through the normal LP
 web ui).
 * Get all summaries for `review` status questions with id's > watermark.
TBD
Line 221: Line 112:
we've baked that in until the next LTS. Because using Answers is a bit of a
hack, it means we'll have to live with this hack for a long time, unless we
can abstract away the fact that we're using Answers underneath the hood.

This spec does an effective job of that for retrieving review data, because
we're only relying on `reviews.ubuntu.com`, a specific URL scheme (which of
course could be redirected at some point), and an XML structure.

However, for *submitting* reviews, we're exposing the use of Answers to the
client. One solution for that is to define a rather generic `ISubmitReview`
interface for the API above. That way we can implement reviews using a
totally different mechanism without having to live with a crufty client for
years.
we've baked that in until the next LTS. To any extent possible, we should
isolate the client API that SC utilizes behind an abstract interface, so that
we can change the underlying implementation and storage without requiring
changes to SC or backward compatibility issues.

Ratings and reviews implementation

Plan: implement ratings and reviews in Software Center for Lucid.

Overview

*Ratings and reviews* is a feature targeted for Software Center (SC) in Lucid. It will allow end users to rate and review applications in Lucid, and to view the ratings and reviews submitted by all Ubuntu users. The user will experience both aspects fully within the SC application, with data storage and retrieval using an external service.

Outside of SC, moderators will be able to control the visibility of any review by accessing them through the external service's web ui. Moderation does not occur through the SC ui.

The current design consists of a Django application hosted inside the one.ubuntu.com web service. SC will communicate with this application with simple HTTP POST and GET, utilizing Ubuntu One's (U1) desktop OpenID login and RESTful OAuth service for authentication. The data will be stored in the Django database.

For the purposes of this spec, we'll refer to the web service as reviews.ubuntu.com (r.u.c) although in reality, the service will probably be hosted at one.ubuntu.com/reviews.

Data retrieval

Rating and review data will be collated on the r.u.c server by a cronjob, generating a static XML file every 5 minutes or so (we can optimize for the case where no changes have occurred). Because access to this file is anonymous (i.e. over plain http://) and static, it is highly cachable. This is critical as we expect vastly more downloads of review data than submissions.
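
As a rough illustration of the cronjob step, the sketch below regenerates the static file and skips the write when nothing has changed. It is only a sketch: the XML format is still TBD, so the element and attribute names are placeholders, and `render_reviews` and `publish_if_changed` are hypothetical helper names, not part of any existing code.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def render_reviews(reviews):
    """Serialize review dicts to XML bytes (element names are placeholders)."""
    root = ET.Element("reviews")
    for review in reviews:
        node = ET.SubElement(root, "review", {
            "package": review["package"],
            "rating": str(review["rating"]),
        })
        node.text = review["text"]
    return ET.tostring(root, encoding="utf-8")


def publish_if_changed(path, payload):
    """Atomically replace the file at path; return False if it is unchanged."""
    if os.path.exists(path):
        with open(path, "rb") as existing:
            if existing.read() == payload:
                return False  # the "no changes have occurred" optimization
    # Write to a temporary file in the same directory, then rename it into
    # place, so that readers never observe a half-written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(payload)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file as owner-only
    os.replace(tmp_path, path)
    return True
```

Leaving the file untouched when the content is identical also keeps its modification time stable, which should help any HTTP caches sitting in front of it.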

Review submission

Reviewers must be authenticated, so their reviews will be submitted to r.u.c over https. Their submissions will be signed so that r.u.c can verify the user id or display name of the submitter. Only one review per application per submitter will be allowed.

Identifying reviews

Reviews are identified (aside from authentication data) by four pieces of information:

  • package_name - the binary package name
  • application name - the application name. This is necessary because binary packages can contain more than one application (this may be "")
  • application version - the (major.minor?) version number of the application on the desktop
  • distro name - e.g. Lucid
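
One way to picture the four-part key, together with the one-review-per-application-per-submitter rule, is the sketch below. The names `ReviewKey` and `add_review` and the plain dict used as storage are assumptions for illustration; the real service would use the Django database.

```python
from typing import NamedTuple


class ReviewKey(NamedTuple):
    """The four pieces of information that identify what is being reviewed."""
    package_name: str   # the binary package name
    app_name: str       # may be "" for single-application packages
    app_version: str    # e.g. "23.1"
    distro: str         # e.g. "lucid"


def add_review(store, key, submitter, review):
    """Store a review, refusing a second one from the same submitter."""
    slot = (key, submitter)
    if slot in store:
        raise ValueError("%s has already reviewed %s" % (submitter, key.package_name))
    store[slot] = review


store = {}
emacs = ReviewKey("emacs", "emacs", "23.1", "lucid")
add_review(store, emacs, "sheila", {"rating": 5, "text": "Does everything."})
```

Because the key is a tuple, a review of the same package in a different distro release is a distinct entry, which is exactly where the open question below about reusing reviews across releases comes in.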

It is recognized that many applications are essentially unchanged between distro releases, such that a review of e.g. gedit in Karmic may be applicable to gedit in Lucid. We need a plan for deciding when and how to utilize reviews from a previous distro release in the next distro release.

Downloading reviews for display

Sheila wants to view the reviews for Emacs, so she starts up SC. SC needs to refresh its cache of reviews so it does an anonymous http GET from

http://one.ubuntu.com/reviews/data.xml

SC then uses this new XML file to parse the reviews for Emacs, for display in the SC ui.
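
A minimal sketch of the client-side parsing step follows. Since the XML format is TBD, the `<review>` element and its attributes are invented for the example, and `reviews_for` is a hypothetical helper; SC would run something like it against the cached copy of data.xml.

```python
import xml.etree.ElementTree as ET

# Stand-in for the downloaded data.xml; the real layout is not yet defined.
SAMPLE = b"""<reviews>
  <review package="emacs" app="emacs" version="23.1" distro="lucid" rating="5">Does everything.</review>
  <review package="gedit" app="" version="2.30" distro="lucid" rating="4">Simple and fast.</review>
</reviews>"""


def reviews_for(xml_bytes, package_name):
    """Return (rating, text) pairs for one package from the combined file."""
    root = ET.fromstring(xml_bytes)
    return [
        (int(node.get("rating")), (node.text or "").strip())
        for node in root.iter("review")
        if node.get("package") == package_name
    ]


emacs_reviews = reviews_for(SAMPLE, "emacs")
```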

Submitting a review

Sheila wants to review Emacs for Lucid, so she starts up SC. This is Sheila's first review so SC performs a dbus authentication against the U1 desktop service. Sheila enters her credentials and completes the authentication process.

Sheila then enters her review and 5-star rating of Emacs in Lucid into the SC ui. When she clicks the Submit button, SC retrieves her access token from the gnomekeyring and uses it to sign, via HMAC-SHA1, a request to

https://one.ubuntu.com/reviews/new

In addition, SC creates a signed request against the one.ubuntu.com REST API requesting the user id and/or display name information. It passes this request and the signed review to r.u.c. r.u.c then invokes the request on behalf of the end user, retrieving some JSON data containing the reviewer's user name (or display name). r.u.c now has an authenticated review that it checks for uniqueness, and stores in its database keyed by the four pieces of information described above.
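
The signing step might look roughly like the sketch below. It is a simplified, OAuth 1.0-style HMAC-SHA1 signature written with the standard library only, not the exact scheme the U1 service uses; the function names, parameter names, and secrets are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(method, url, params, consumer_secret, token_secret):
    """Return a base64 HMAC-SHA1 signature over method, URL and parameters."""
    # Sort the parameters so client and server build the same base string.
    normalized = "&".join(
        "%s=%s" % (quote(key, safe=""), quote(str(value), safe=""))
        for key, value in sorted(params.items())
    )
    base_string = "&".join(
        quote(part, safe="") for part in (method.upper(), url, normalized)
    )
    signing_key = "%s&%s" % (quote(consumer_secret, safe=""),
                             quote(token_secret, safe=""))
    digest = hmac.new(signing_key.encode("utf-8"),
                      base_string.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_signature(signature, method, url, params, consumer_secret, token_secret):
    """Server side: recompute the signature and compare in constant time."""
    expected = sign_request(method, url, params, consumer_secret, token_secret)
    return hmac.compare_digest(expected, signature)


review = {"package_name": "emacs", "distro": "lucid", "rating": "5"}
signature = sign_request("POST", "https://one.ubuntu.com/reviews/new",
                         review, "example-consumer-secret", "example-token-secret")
```

If any field of the review is altered in transit, the recomputed signature no longer matches, which is what lets r.u.c trust that the review came from the holder of the access token.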

Moderation

TBD

XML format

TBD

Concerns

Lucid is an LTS so once we decide on the external API for ratings and reviews, we've baked that in until the next LTS. To any extent possible, we should isolate the client API that SC utilizes behind an abstract interface, so that we can change the underlying implementation and storage without requiring changes to SC or backward compatibility issues.
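
To make the isolation idea concrete, here is one possible shape for such an abstract interface. The class and method names (`ReviewBackend`, `fetch_reviews`, `submit_review`) are hypothetical, and the in-memory implementation exists only to show that SC-side code can be written without knowing what sits behind the interface.

```python
from abc import ABC, abstractmethod


class ReviewBackend(ABC):
    """What SC needs from a review service, independent of how it is stored."""

    @abstractmethod
    def fetch_reviews(self, package_name):
        """Return a list of (submitter, rating, text) tuples."""

    @abstractmethod
    def submit_review(self, package_name, submitter, rating, text):
        """Record a review for the given package."""


class InMemoryBackend(ReviewBackend):
    """Stand-in implementation; a real one would talk to r.u.c over HTTP."""

    def __init__(self):
        self._reviews = []

    def fetch_reviews(self, package_name):
        return [(submitter, rating, text)
                for package, submitter, rating, text in self._reviews
                if package == package_name]

    def submit_review(self, package_name, submitter, rating, text):
        if not 1 <= rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        self._reviews.append((package_name, submitter, rating, text))


backend = InMemoryBackend()
backend.submit_review("emacs", "sheila", 5, "Does everything.")
```

Swapping the storage or transport would then mean providing another `ReviewBackend` subclass, leaving the SC ui code untouched.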

RNRDesign (last edited 2010-03-09 20:16:47 by barry)