Mail Archive
A service that archives mailing list emails and provides an API for other applications to retrieve the messages.
Contact: Curtis (irc: sinzui)
On Launchpad: ml-archive-sucks bug tag
This is not a web/html archive. It is not a secondary means of subscribing to lists or forwarding messages (but future extensions could do this).
Rationale
1. Performance problems
Users frequently report that messages take days to appear in the HTML archive.
2. Availability problems
Private archives are not accessible because of an OpenID authentication regression.
3. Integration problems
The index and message pages do not look or behave like Lp app pages, which causes confusion.
We want a fast and reliable means to store Mailman messages, and to show those messages in Launchpad.
Stakeholders
Canonical groups such as OEM, DX, UX, and U1 use private teams and mailing lists, but the archive is not visible. The stakeholder archive is one such list: it cannot be accessed to review previous discussion about this very issue.
High-profile projects like OpenStack do not believe their messages are being delivered because they do not appear in the archive yet.
User stories
As a Mailman instance
I want messages archived quickly
so that I can keep the ArchRunner queue at zero
This could also be phrased from the sender's point of view: as a sender to the list, I want to see my message in the archive to be certain it was forwarded to other users.
As a team member
I want the footer of the email to include a link to the message at Lp
so that I can refer other users to the message
As a team member
I want the message pages to include standard Lp links
so that I can navigate to users, bugs, and other areas of Lp
As a team admin
I want to use my existing MBoxes from other archivers
so that I can keep the list history
As a private team member
I want to see the messages in the archive
so that I can review previous conversations
As a team admin
I want to hide messages in the archive from non-admins
so that spam, abuse, and user data are not shown in web pages
The LOSAs use a custom script to do this at the users' request; mhonarc's own "delete" is not reliable.
Constraints and Requirements
Must
1. Integrate with Mailman's archiver mechanism.
2. Append new messages to the MBox quickly
MBox is the standard for storing messages. Users expect that format when importing or exporting their list data. MBox is not a fast format for managing date and thread views of the messages, or retrieving messages, which is a common performance issue with mailing list archivers.
3. Allow us to import the existing MBox data.
4. Permit exporting MBoxes.
5. Provide a web service that permits Lp to:
- Get a list of months when messages were sent to the list.
- Get a list of messages by date or thread for a month
- Get a message
- Allow the team admin to toggle message visibility
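As a rough illustration of requirements 2 to 4, appending to an MBox can be sketched with Python's standard mailbox module. This is only a sketch under stated assumptions, not the archiver's implementation: the function name append_message is hypothetical, and nothing in this spec says the service must use the mailbox module.

```python
import mailbox


def append_message(mbox_path, message):
    """Append one email message to the mbox at mbox_path and return its key."""
    # mailbox.mbox creates the file if it does not exist yet.
    box = mailbox.mbox(str(mbox_path))
    box.lock()  # guard against another process writing at the same time
    try:
        key = box.add(message)
        box.flush()  # make sure the appended message reaches the disk
    finally:
        box.unlock()
        box.close()
    return key
```

Note that mailbox.mbox scans the existing file to build its table of contents before it adds a message, so a sketch like this only stays quick while each MBox is small. Date and thread views would still need a separate index.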
Nice to have
1. Support a predictable id to store and retrieve messages by.
Messages forwarded by mailman could include a link to where the message will be in the archive. Importing or hiding messages will not change the id used to retrieve the message.
2. Use ReST/JSON as the webservice protocol and format.
3. Provide data to show the volume of messages per month, week, and day.
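One possible way to get a predictable id is to derive it from the Message-ID header, so that Mailman can compute the link before the message is archived and an import produces the same id again. The sketch below assumes a SHA-1 digest encoded as base32; the function names archive_id and permalink are hypothetical, and the URL shape simply follows the launchpad-dev example given under Success.

```python
import base64
import hashlib


def archive_id(message_id):
    """Return a stable, URL-safe id derived from a Message-ID header value."""
    # Ignore surrounding whitespace and angle brackets so that
    # "<abc@example.com>" and "abc@example.com" give the same id.
    normalized = message_id.strip().strip("<>")
    digest = hashlib.sha1(normalized.encode("utf-8")).digest()
    # A 20-byte SHA-1 digest encodes to exactly 32 base32 characters.
    return base64.b32encode(digest).decode("ascii")


def permalink(team, message_id):
    """Build an archive URL for a message (assumed URL layout)."""
    return ("https://launchpad.net/~%s/+mailing-list-archive/+message/%s"
            % (team, archive_id(message_id)))
```

Because the id depends only on the header, hiding or re-importing a message would not change it. Messages without a Message-ID, or duplicates of one, would need a separate rule that this sketch does not cover.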
Must not
1. Delay archiving a message from mailman to do secondary work.
2. Serve data/pages directly to users.
Out of scope
1. Import an MBox from the Lp webapp
2. Forward a message to a user as if they were subscribed
3. Provide a feed of the latest messages in the archive
Subfeatures
1. Provide a library to manage how the commands/features work with the archive data, indexes, and messages.
2. Provide a command line tool for mailman and admins to work with the archive.
3. Provide a web service that Lp can integrate with.
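To make the split between the three subfeatures more concrete, the library could expose the operations that the command line tool and the web service both call. The following is an in-memory sketch only; the class names StoredMessage and Archive and all method names are hypothetical, and a real library would read from MBoxes and indexes rather than a dict.

```python
from dataclasses import dataclass, field


@dataclass
class StoredMessage:
    message_id: str
    month: str  # 'YYYY-MM'
    subject: str
    hidden: bool = False


@dataclass
class Archive:
    """In-memory stand-in for the index of one list's archive."""
    messages: dict = field(default_factory=dict)

    def add(self, message):
        self.messages[message.message_id] = message

    def months(self):
        """Months in which at least one message was sent, oldest first."""
        return sorted({m.month for m in self.messages.values()})

    def messages_for_month(self, month, include_hidden=False):
        """Messages for one month; hidden ones only when asked for."""
        return [m for m in self.messages.values()
                if m.month == month and (include_hidden or not m.hidden)]

    def get(self, message_id):
        return self.messages[message_id]

    def set_hidden(self, message_id, hidden):
        """Toggle visibility without changing the id of the message."""
        self.messages[message_id].hidden = hidden
```

In this arrangement the web service would be a thin ReST/JSON layer over these calls, and the team-admin check for set_hidden would live in Lp rather than in the library.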
Success
How will we know when we are done?
1. Users can see list messages in the Lp app with bugs and people linked, e.g. https://launchpad.net/~launchpad-dev/+mailing-list-archive/+message/nnn
2. Private teams can see the messages sent to their list.
3. List emails include a permalink to the message at Lp in the footer.
How will we measure how well we have done?
1. The Mailman ArchRunner queue will have fewer than 10 messages at any one time.
2. A message sent to a list is accessible in Launchpad within a minute of it arriving in the archive.
3. Members of teams with large lists can find a message in less than two minutes if they know the subject and the date +/- 1 day.
Thoughts?
Background
Mailman's internal archiver is Pipermail. It maintains a canonical representation of all messages in mbox format and generates HTML using templates. It supports monthly mbox archiving, which reduces the burden of generating pages for all the messages. Pipermail is considered to be under-developed and needs features to support modern needs (see http://wiki.list.org/display/DEV/ModernArchiving ). The Mailman config can be set to use the internal archiver (see http://terri.zone12.com/doc/mailman/mailman-admin/node27.html ).
There are no mail archivers that meet Lp's needs. Most large-scale hosters write their own service or make extensive customisations to the mediocre archivers to meet their needs.
* MBox is the standard for storing a collection of messages.
- Importing and exporting mbox format is a requirement, but it is not necessarily the mechanism for managing indexes or serving individual messages quickly.
- A common strategy to ensure quick archiving is to create monthly mboxes for each list. This makes monthly and date presentations easy too. This complicates thread indexes since they might span many mboxes.
* ReST/JSON is desirable for the webservice API because we could use AJAX to interact with it.
- We do not intend to permit browsers to have direct access to the data because we *think* we want to enhance the message data with links to real users.
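The monthly MBox strategy and the thread problem it creates can be sketched as follows. This assumes each message is filed by the month of its Date header, normalised to UTC, and that threads are reconstructed from In-Reply-To headers alone; both are assumptions for illustration, and the function names month_key and thread_roots are hypothetical.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def month_key(message):
    """Return 'YYYY-MM' for a message, based on its Date header."""
    sent = parsedate_to_datetime(str(message["Date"]))
    if sent.tzinfo is not None:
        # Normalise to UTC so the month does not depend on the sender's offset.
        sent = sent.astimezone(timezone.utc)
    return sent.strftime("%Y-%m")


def thread_roots(messages):
    """Map every Message-ID to the earliest known ancestor in its thread."""
    parent = {}
    for message in messages:
        reply_to = message.get("In-Reply-To")
        parent[str(message["Message-ID"]).strip()] = (
            str(reply_to).strip() if reply_to else None)

    def root(message_id):
        seen = set()
        # Follow In-Reply-To links until the parent is missing or unknown;
        # the seen set stops the walk if the headers form a loop.
        while parent.get(message_id) in parent and message_id not in seen:
            seen.add(message_id)
            message_id = parent[message_id]
        return message_id

    return {message_id: root(message_id) for message_id in parent}
```

A reply sent just after midnight UTC on 1 January lands in a different monthly MBox from the December message it answers, so thread_roots has to be given messages from more than one month. That is the complication noted above, and it suggests the thread index would live outside the monthly files.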
Diagram of interaction
Mailman
   .
   . <create-archive> <add-message>
   .
   v
mail-archive-command (Posix)
   |
   | -- mailarchivelib
   |
Mail-Archive-Service (ReST/JSON)
   ^
   . <get-month-indexes> <get-date-index> <get-thread-index> <get-message>
   . <hide-message> <import-mbox*> <forward-message*>
   .
Launchpad

* actions are not essential