= Mail Archive =
A service that archives mailing list emails and provides an API for
other applications to retrieve the messages.
'''Contact:''' Curtis (irc: sinzui) <<BR>>
'''On Launchpad:''' [[https://bugs.launchpad.net/launchpad/+bugs?field.tag=ml-archive-sucks|ml-archive-sucks bug tag]]
This is not a web/html archive. It is not a secondary means of subscribing
to lists or forwarding messages (but future extensions could do this).
== Rationale ==
1. Performance problems<<BR>>
Users frequently report that messages take days to appear in the HTML archive.
2. Availability problems<<BR>>
Private archives are not accessible because of an OpenID authentication
regression.
3. Integration problems<<BR>>
The index and message pages do not look or behave like Lp app pages, which
causes confusion.
We want a fast and reliable means to store Mailman messages, and to
show those messages in Launchpad.
== Stakeholders ==
Canonical groups such as OEM, DX, UX, and U1 use private teams and mailing
lists, but their archives are not visible. The stakeholder list is one such
case: its archive cannot be accessed to review previous discussion about
this very issue.
High-profile projects like OpenStack do not believe their messages are being
delivered because they do not appear in the archive yet.
== User stories ==
'''As a ''' Mailman instance<<BR>>
'''I want ''' messages archived quickly<<BR>>
'''so that ''' I can keep the ArchRunner queue at zero<<BR>>
This could also be phrased from the sender's side: as a sender to the list,
I want to see my message in the archive to be certain it was forwarded to
other users.
'''As a ''' team member<<BR>>
'''I want ''' the footer of the email to include a link to the message at Lp<<BR>>
'''so that ''' I can refer other users to the message<<BR>>
'''As a ''' team member<<BR>>
'''I want ''' the message pages to include standard Lp links<<BR>>
'''so that ''' I can navigate to users, bugs, and other areas of Lp<<BR>>
'''As a ''' team admin<<BR>>
'''I want ''' to use my existing MBoxes from other archivers<<BR>>
'''so that ''' I can keep the list history<<BR>>
'''As a ''' private team member<<BR>>
'''I want ''' to see the messages in the archive<<BR>>
'''so that ''' I can review previous conversations<<BR>>
'''As a ''' team admin<<BR>>
'''I want ''' to hide messages in the archive from non-admins<<BR>>
'''so that ''' spam, abuse, and user data are not shown in web pages<<BR>>
LOSAs use a custom script to do this at the users' request; mhonarc's
own "delete" is not reliable.
== Constraints and Requirements ==
=== Must ===
1. Integrate with Mailman's archiver mechanism.
2. Append new messages to the MBox quickly<<BR>>
MBox is the standard for storing messages. Users expect that format when
importing or exporting their list data. MBox is not a fast format for
managing date and thread views of the messages, or retrieving messages,
which is a common performance issue with mailing list archivers.
3. Allow us to import the existing MBox data.
4. Permit exporting MBoxes.
5. Provide a web service that permits Lp to:
1. Get a list of months when messages were sent to the list.
2. Get a list of messages by date or thread for a month
3. Get a message
4. Allow the team admin to toggle message visibility
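The four operations above could be sketched as a small in-memory interface. This is only an illustration of the shape of the service, not a design decision: the class, method, and field names (`ArchiveSketch`, `ArchivedMessage`, `thread_id`, `hidden`) are hypothetical, and a real implementation would sit on top of the MBox data and its indexes.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ArchivedMessage:
    """One archived message; all field names are illustrative."""
    message_id: str
    subject: str
    date: datetime
    thread_id: str
    hidden: bool = False


class ArchiveSketch:
    """In-memory stand-in for the archive behind the web service."""

    def __init__(self):
        self._messages = {}

    def add(self, message):
        self._messages[message.message_id] = message

    def months(self):
        """Return the sorted (year, month) pairs that contain messages."""
        return sorted(
            {(m.date.year, m.date.month) for m in self._messages.values()})

    def messages_for_month(self, year, month, order="date",
                           include_hidden=False):
        """List a month's messages ordered by date or grouped by thread."""
        found = [
            m for m in self._messages.values()
            if (m.date.year, m.date.month) == (year, month)
            and (include_hidden or not m.hidden)
        ]
        if order == "thread":
            return sorted(found, key=lambda m: (m.thread_id, m.date))
        return sorted(found, key=lambda m: m.date)

    def get(self, message_id):
        """Return a single message by its id."""
        return self._messages[message_id]

    def set_hidden(self, message_id, hidden):
        """Toggle visibility; intended for team admins only."""
        self._messages[message_id].hidden = hidden
```

Hiding a message only flips a flag here, so the id used to retrieve it does not change.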
=== Nice to have ===
1. Support a predictable id to store and retrieve messages by<<BR>>
Messages forwarded by mailman could include a link to where the message
will be in the archive. Importing or hiding messages will not change
the id used to retrieve the message.
2. Use ReST/JSON as the webservice protocol and format.
3. Provide data to show the volume of messages per month, week, and day.
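One possible way to get the predictable id mentioned in the first item is to derive it from the Message-ID header alone, so that Mailman can compute the link before the message is archived. The sketch below assumes that approach; the function name `stable_message_id` and the choice of SHA-1 with base32 encoding are illustrative assumptions, not something this page specifies.

```python
import base64
import hashlib
from email import message_from_string


def stable_message_id(raw_message):
    """Derive an archive id from the Message-ID header only.

    The body and other headers are ignored, so importing or hiding
    the message later cannot change the id.
    """
    msg = message_from_string(raw_message)
    message_id = msg["Message-ID"]
    if message_id is None:
        raise ValueError("message has no Message-ID header")
    # Normalise by dropping surrounding whitespace and angle brackets.
    normalised = message_id.strip().strip("<>")
    digest = hashlib.sha1(normalised.encode("utf-8")).digest()
    # A 20-byte digest encodes to 32 base32 characters without padding.
    return base64.b32encode(digest).decode("ascii")
```

The result is safe to use in a URL path segment, for example as the final part of a message permalink.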
=== Must not ===
1. Delay archiving a message from mailman to do secondary work.
2. Serve data/pages directly to users.
=== Out of scope ===
1. Import an MBox from the Lp webapp
2. Forward a message to a user as if he was subscribed
3. Provide a feed of the latest messages in the archive
== Subfeatures ==
1. Provide a library that manages how the commands/features work with the
archive data, indexes, and messages.
2. Provide a command line tool for mailman and admins to work with the
archive.
3. Provide a web service that Lp can integrate with.
== Success ==
=== How will we know when we are done? ===
1. Users can see list messages in the Lp app with bugs and people linked,
e.g. https://launchpad.net/~launchpad-dev/+mailing-list-archive/+message/nnn
2. Private teams can see the messages sent to their list.
3. List emails include a permalink to the message at Lp in the footer.
=== How will we measure how well we have done? ===
1. The Mailman ArchRunner queue will have fewer than 10 messages at any one
time.
2. A message sent to a list is accessible in Launchpad within a minute of it
arriving in the archive.
3. Members of teams with large lists can find a message in less than
two minutes if they know the subject and the date +/- 1 day.
== Thoughts? ==
=== Background ===
Mailman's internal archiver is Pipermail. It maintains a canonical
representation of all messages in mbox format. It generates html
using templates. It supports monthly mbox archiving which reduces
the burden of generating pages for all the messages. Pipermail is
considered to be under-developed and needs features to support modern
needs. See http://wiki.list.org/display/DEV/ModernArchiving
The mailman config can be set to use the internal archiver, see
http://terri.zone12.com/doc/mailman/mailman-admin/node27.html
There are no mail archivers that meet Lp's needs. Most large-scale hosters
write their own service or make extensive customisations to the mediocre
archivers to meet their needs.
* MBox is the standard for storing a collection of messages.
* Importing and exporting mbox format is a requirement, but it is
not necessarily the mechanism for managing indexes or serving
individual messages quickly.
* A common strategy to ensure quick archiving is to create monthly
mboxes for each list. This makes monthly and date presentations
easy too. This complicates thread indexes since they might span
many mboxes.
* ReST/JSON is desirable for webservice API because we could use
AJAX to interact with it.
* We do not intend to permit browsers to have direct access to the data
because we *think* we want to enhance the message data with
links to real users.
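The monthly-mbox strategy described in the list above can be sketched with Python's standard `mailbox` module. This is a minimal illustration under stated assumptions, not the planned implementation: the function name `append_to_monthly_mbox`, the `archive_dir/list_name/YYYY-MM.mbox` layout, and the use of the Date header to pick the month are all hypothetical, and the message is assumed to carry a parseable Date header.

```python
import mailbox
import os
from email import message_from_string
from email.utils import parsedate_to_datetime


def append_to_monthly_mbox(archive_dir, list_name, raw_message):
    """Append one message to the list's mbox for the month it was sent.

    Returns the path of the mbox file that received the message.
    """
    msg = message_from_string(raw_message)
    # Assumes a valid Date header; a real archiver would need a fallback.
    sent = parsedate_to_datetime(msg["Date"])
    list_dir = os.path.join(archive_dir, list_name)
    os.makedirs(list_dir, exist_ok=True)
    path = os.path.join(
        list_dir, "%04d-%02d.mbox" % (sent.year, sent.month))
    box = mailbox.mbox(path)  # created if it does not exist yet
    box.lock()  # guard against concurrent writers
    try:
        box.add(msg)  # appends to the end of the file
        box.flush()
    finally:
        box.unlock()
        box.close()
    return path
```

Appending only touches the current month's file, which keeps the archiving step cheap. Thread indexes that span several months would have to be maintained separately.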
=== Diagram of interaction ===
{{{
Mailman
.
.
.
v
mail-archive-command (Posix)
|
|
-- mailarchivelib
|
|
Mail-Archive-Service (ReST/JSON)
^
.
.
.
Launchpad
* actions are not essential
}}}