Ratings and reviews implementation
Plan: implement ratings and reviews in Software Center for Lucid.
Overview
*Ratings and reviews* is a feature targeted for Software Center (SC) in Lucid. It will allow end users to rate and review applications in Lucid, and to view the ratings and reviews submitted by all Ubuntu users. The user will experience both aspects fully within the SC application, with data storage and retrieval using an external service.
Outside of SC, moderators will be able to control the visibility of any review by accessing it through the external service's web ui. Moderation does not occur through the SC ui.
The current design consists of a Django application hosted inside the one.ubuntu.com web service. SC will communicate with this application via simple HTTP POST and GET requests, using Ubuntu One's (U1) desktop OpenID login and RESTful OAuth service for authentication. The data will be stored in the Django database.
For the purposes of this spec, we'll refer to the web service as reviews.ubuntu.com (r.u.c) although in reality, the service will probably be hosted at one.ubuntu.com/reviews.
Data retrieval
Rating and review data will be collated on the r.u.c server by a cronjob, generating a static XML file every 5 minutes or so (we can optimize for the case where no changes have occurred). Because access to this file is anonymous (i.e. over plain http://) and the file is static, it is highly cacheable. This is critical, as we expect vastly more downloads of review data than submissions.
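The collation step above can be sketched as follows. This is a minimal illustration, not the real r.u.c code: the element and attribute names are assumptions, since the actual XML format is still TBD in this spec.

```python
# Sketch of the r.u.c collation cronjob: serialize all visible reviews
# into one static XML document. Element/attribute names are illustrative
# assumptions; the real XML format is TBD.
import xml.etree.ElementTree as ET

def collate(reviews):
    """reviews: iterable of dicts carrying the four identifying fields
    plus reviewer, rating, summary and text."""
    root = ET.Element("reviews")
    for r in reviews:
        el = ET.SubElement(root, "review",
                           package_name=r["package_name"],
                           app_name=r["app_name"],
                           app_version=r["app_version"],
                           distro=r["distro"],
                           reviewer=r["reviewer"],
                           rating=str(r["rating"]))
        ET.SubElement(el, "summary").text = r["summary"]
        ET.SubElement(el, "text").text = r["text"]
    return ET.tostring(root)

# A cronjob would write collate(...) out as data.xml every 5 minutes,
# skipping the write when nothing has changed.
```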
Review submission
Reviewers must be authenticated, so their reviews will be submitted to r.u.c over https. Their submissions will be signed so that r.u.c can verify the user id or display name of the submitter. Only one review per application per submitter will be allowed.
Identifying reviews
Reviews are identified (aside from authentication data) by four pieces of information:
- package name - the binary package name
- application name - the application name; this is necessary because binary packages can contain more than one application (this may be "")
- application version - the (major.minor?) version number of the application on the desktop
- distro name - e.g. Lucid
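The four identifying pieces plus the one-review-per-application-per-submitter rule can be sketched as a storage schema. The spec calls for a Django model; plain stdlib sqlite3 is used here only to keep the example self-contained, and all column names are illustrative assumptions.

```python
# Sketch of the r.u.c review store. The real service would use a Django
# model; sqlite3 keeps this example self-contained. Column names are
# illustrative, not part of the spec.
import sqlite3

def make_store():
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE reviews (
            package_name TEXT NOT NULL,
            app_name     TEXT NOT NULL DEFAULT '',  -- '' if the package has one app
            app_version  TEXT NOT NULL,
            distro       TEXT NOT NULL,             -- e.g. 'lucid'
            reviewer     TEXT NOT NULL,             -- verified U1 user id
            rating       INTEGER NOT NULL,          -- 1-5 stars
            summary      TEXT NOT NULL,
            text         TEXT NOT NULL,
            hidden       INTEGER NOT NULL DEFAULT 0,  -- set by moderators
            -- one review per application per submitter:
            UNIQUE (package_name, app_name, app_version, distro, reviewer)
        )""")
    return db

def add_review(db, pkg, app, version, distro, reviewer, rating, summary, text):
    """Insert a review; a duplicate raises sqlite3.IntegrityError."""
    db.execute(
        "INSERT INTO reviews VALUES (?, ?, ?, ?, ?, ?, ?, ?, 0)",
        (pkg, app, version, distro, reviewer, rating, summary, text))
```

The UNIQUE constraint is what enforces the one-review-per-submitter rule at the storage layer, so the uniqueness check does not depend on application code alone.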
It is recognized that many applications are essentially unchanged between distro releases, such that a review of e.g. gedit in Karmic may be applicable to gedit in Lucid. We need a plan for deciding when and how to utilize reviews from a previous distro release in the next distro release.
Downloading reviews for display
Sheila wants to view the reviews for Emacs, so she starts up SC. SC needs to refresh its cache of reviews, so it performs an anonymous HTTP GET of
http://one.ubuntu.com/reviews/data.xml
SC then parses this XML file to extract the reviews for Emacs, for display in the SC ui.
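The client side of this flow could look like the sketch below. The URL and the XML element/attribute names are assumptions (the real XML format is TBD); the fetch and the parse are split so the parse can be exercised without network access.

```python
# Sketch of the SC client side: anonymously fetch the static review
# file, then filter it for one application. URL and XML names are
# illustrative assumptions.
import urllib.request
import xml.etree.ElementTree as ET

DATA_URL = "http://one.ubuntu.com/reviews/data.xml"  # hypothetical

def fetch_review_data(url=DATA_URL):
    """Anonymous, cacheable GET of the whole static review file."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def reviews_for(xml_bytes, package_name, distro):
    """Return (rating, summary) pairs for one application."""
    root = ET.fromstring(xml_bytes)
    return [(int(r.get("rating")), r.findtext("summary"))
            for r in root.iter("review")
            if r.get("package_name") == package_name
            and r.get("distro") == distro]
```

Because the whole data set arrives in one file, SC can cache it locally and answer per-application queries without further server round trips.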
Submitting a review
Sheila wants to review Emacs for Lucid, so she starts up SC. This is Sheila's first review, so SC performs a D-Bus authentication against the U1 desktop service. Sheila enters her credentials and completes the authentication process.
Sheila then enters her review and 5-star rating of Emacs in Lucid into the SC ui. When she clicks the Submit button, SC retrieves her access token from the GNOME keyring and uses it to sign (via HMAC-SHA1) a request to
https://one.ubuntu.com/reviews/new
In addition, SC creates a signed request against the one.ubuntu.com REST API asking for the user id and/or display name. It passes this request and the signed review to r.u.c, which invokes the request on behalf of the end user, retrieving JSON data containing the reviewer's user name (or display name). r.u.c now has an authenticated review, which it checks for uniqueness and stores in its database keyed by the four pieces of information described above.
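The HMAC-SHA1 step of the signing can be sketched as follows. This is a deliberately simplified illustration: real OAuth 1.0 signing first builds a normalized signature base string from the HTTP method, URL, and all request parameters, and only then applies the HMAC shown here. All names are illustrative.

```python
# Much-simplified sketch of the HMAC-SHA1 step used by OAuth 1.0 to sign
# the review submission. Real OAuth signing constructs a normalized
# signature base string first; only the HMAC step is shown here.
import base64
import hashlib
import hmac

def sign_request(token_secret, consumer_secret, base_string):
    """Return the base64-encoded HMAC-SHA1 signature of base_string,
    keyed by 'consumer_secret&token_secret' as in OAuth 1.0."""
    key = ("%s&%s" % (consumer_secret, token_secret)).encode()
    digest = hmac.new(key, base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()
```

r.u.c can recompute this signature server-side with the same secrets, which is how it verifies that the submission really came from the authenticated user.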
Moderation
TBD
XML format
TBD
Concerns
Lucid is an LTS, so once we decide on the external API for ratings and reviews, we've baked that in until the next LTS. To any extent possible, we should isolate the client API that SC utilizes behind an abstract interface, so that we can change the underlying implementation and storage without requiring changes to SC or introducing backward compatibility issues.
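The isolation suggested above can be sketched as a small client interface that SC codes against. The class and method names here are hypothetical, not part of the spec; the point is only that the r.u.c-backed implementation could later be swapped for a different backend without touching SC.

```python
# Sketch of an abstract review-client interface for SC. Names are
# hypothetical; the r.u.c implementation below is a stub showing where
# the real HTTP calls would go.
from abc import ABC, abstractmethod

class ReviewClient(ABC):
    @abstractmethod
    def get_all_reviews(self):
        """Return the full review data set (e.g. parsed from data.xml)."""

    @abstractmethod
    def submit_review(self, package_name, app_name, app_version,
                      distro, rating, summary, text):
        """Submit one signed review for the authenticated user."""

class RucReviewClient(ReviewClient):
    """Implementation backed by the r.u.c web service (stubbed)."""

    def get_all_reviews(self):
        # Would GET the static XML file and parse it.
        return []

    def submit_review(self, *args, **kwargs):
        # Would sign and POST the review to the submission URL.
        pass
```

If the backend changes after Lucid, only a new ReviewClient subclass ships; the SC ui code that calls the interface stays unchanged.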