Re: Re: MathGroup /: Descriptive headings

*To*: mathgroup at smc.vnet.net
*Subject*: [mg52010] Re: [mg51971] Re: MathGroup /: Descriptive headings
*From*: DrBob <drbob at bigfoot.com>
*Date*: Sun, 7 Nov 2004 01:03:26 -0500 (EST)
*References*: <200411060707.CAA25940@smc.vnet.net>
*Reply-to*: drbob at bigfoot.com
*Sender*: owner-wri-mathgroup at wolfram.com

I can volunteer (on a limited basis, if it helps and doesn't get too onerous) to post notebooks on my website for people who can't do it themselves. In many cases -- maybe most cases -- I'd just use the Copy as InputForm palette and return the code to them.

As for spam, David Park and I both use SpamArrest http://www.spamarrest.com/ to eliminate spam. It really works. The only downside to it for me -- and I don't think this should apply to you, the moderator -- is that I have to check the stopped mail I _want_ from senders who aren't human.

[SpamArrest would not work on my systems which are not Windows. I only use UNIX systems. I consider Windows systems to be a security risk I do not want anywhere near my work. I use spamassassin and my own hand built procmail filters. - Moderator]

You shouldn't want mail like that at all, at the mathgroup address.

Bobby

On Sat, 6 Nov 2004 02:07:32 -0500 (EST), Steven M. Christensen <steve at smc.vnet.net> wrote:

> See my comments within the message below.
>
> Steve Christensen
>
>> In article <cmfd5u$7ta$1 at smc.vnet.net>,
>> "Steven M. Christensen" <steve at smc.vnet.net> wrote:
>>
>> > I want to take the opportunity to reply to Paul's suggestion in
>> > as much detail as possible.
>> >
>> > I am sorry I was not at the event at the Wolfram Technology
>> > Conference when this was discussed.
>> >
>> > First, here are the steps I take each day to moderate this group.
>> > Figuring out where in these steps to put in categorization would need
>> > to fit into this.
>> >
>> > 1. I get perhaps 2500-3000 emails a day, every day. Of these, perhaps
>> > 500 are not spam. Because the Mathgroup addresses are easily found
>> > by spammers, there is no way around getting a lot of spam.
>>
>> Do you mean that the spammers are forging email addresses of MathGroup
>> participants and using these to post messages to MathGroup
>> (mathgroup at wolfram.com)? I can see how that would make things more difficult
>> to filter.
>
> Yes, this happens all the time. Spam comes to mathgroup via mailing
> list messages, newsgroup posts, spammers who have just found addresses
> in the newsgroups and archives.
>
>> If I understand you correctly, requiring individuals to "register" with
>> you, possibly listing multiple email addresses, and bouncing email that
>> is not from registered participants, with a message telling them how to
>> register, would not work.
>
> No, this would not work. I even get spam from wolfram.com addresses
> even though I know it did not come from there. I sometimes get
> spam from myself!
>
>> Because I usually post from a news reader, my messages have the
>> following field:
>>
>> Newsgroups: comp.soft-sys.math.mathematica
>>
>> Could this be used as a filter (or do spammers forge this as well)?
>
> Spammers forge every element of posts.
>
>> [As an aside, a solution to SPAM needs to be found. To me, it should
>> cost money, only of the order of a couple of cents, to send any email
>> message. You would need to purchase a valid one-off "e-stamp", using
>> some form of encryption technology, from some site (I'm surprised that
>> the automatic billing sites have not already done this). Then only valid
>> e-stamps would be routed through the network. There are, of course, many
>> issues with this proposal ...]
>>
>> > Further, because MathGroup users often, unfortunately,
>> > send html email or other attachments, maybe 10-20 of their mails get
>> > filtered by my, fairly sophisticated but not perfect, spam filters into
>> > my spam folder.
>>
>> To me, one of the major limitations of MathGroup is that we cannot
>> attach Notebooks (without including them in the body of the message).
>
> Attaching notebooks causes numerous problems.
>
> 1. Notebooks as attachments are very often rejected by spam filters
> either at ISPs, moderation level, or end users.
>
> 2. Can a Windows user really trust that a notebook attachment is not
> a virus or worm?
> If I were using a Windows machine and saw an
> attachment, I would not open it.
>
> 3. Many notebooks are very long and some mail systems will not be able
> to handle them. Rules about attached notebooks would have
> to be devised. Not a simple matter given that I get so many
> posts that can't follow even simpler rules.
>
> It is far simpler to have someone put their notebook on a server somewhere
> where it can be downloaded and then include a link within the post.
>
>> > 2. Of the 500 good emails that get past my spam filters, I then have to
>> > filter out those mails that are for Mathgroup. Then, I have to
>> > go through the spam folder to find any MathGroup posts that might be
>> > there. So, there are usually about 70 emails relevant to MathGroup.
>> > Some, maybe 10 do not follow the rules - flames, licensing questions,
>> > discussions of other systems, really trivial items, totally
>> > non-Mathematica related. In the end, there are 30-60 emails to read
>> > in more detail.
>>
>> Actually, if the Subject line included question categories as is being
>> proposed, couldn't you use this as the primary filter (or again, do
>> spammers forge this as well)?
>
> Again, spammers will grab email addresses, Subject lines, even
> content sometimes. Most of that comes to me where I filter it.
> But I have had some reports that people get email from mathgroup
> that I did not send.
>
>> > 3. Once I decide that the posts are OK, I run them through a number of
>> > UNIX scripts and do some more editing to take out unneeded mail headers
>> > etc.
>> >
>> > 4. Then the mails are run through scripts that send them to the
>> > newsgroup and the mailing list. One of the scripts adds the
>> > numbers to the Subject line of the mail that goes to
>> > the mailing list. Note that the [ ] are really needed.
>>
>> As I read MathGroup in a newsreader or sometimes via Google at
>>
>> http://groups.google.com/groups?q=comp.soft-sys.math.mathematica
>>
>> I do not see the numbers or the []. Google seems to handle threading
>> better than my newsreader.
>
> The [mg ... ] numbers only go out to the mailing list to help with
> filtering. They will not be seen in the newsgroup or on google.
>
>> The numbers do not appear at
>>
>> http://forums.wolfram.com/mathgroup/archive/2004/Nov/
>>
>> until you click on a particular message so I'm not sure exactly how they
>> are useful (but then again, I avoid mailing lists and prefer to use
>> newsgroups or the web). (And I wonder why the Mathgroup archive is not
>> threaded?)
>
> The archive gets its messages from the mailing list and I also think
> it just uses a mail to html script and not a threaded system. I do
> not do the archive.
>
>> > Suppose you just put Statistics in the Subject line, mail filters might
>> > not always know how to do the filtering, whereas [Statistics]
>> > is easier to filter.
>>
>> > This process takes from 1-3 hours typically, depending on the
>> > number of emails, their complexity, etc.
>>
>> I did not realise exactly how big a task you face.
>
> Clearly if it weren't for the spam, it would be easier.
>
>> > So, the questions are, when during this process would categorisation
>> > take place? Who would do it?
>>
>> It would be best if contributors did such a categorisation for you, i.e.
>> at the time of posting.
>>
>> > What would it look like?
>>
>> Instead of [], another suggestion would be (mock Mathematica syntax
>> using /:), e.g.,
>>
>> Statistics /: Chi-square test
>>
>> This would also be harder to forge and should still be easy to filter.
>
> It might be possible if we can define say only 10 categories and
> then put the category either in a special header or within the
> text of a message. This could be done in a voluntary way by
> the person sending the post.
>
> If people want to send me a list of 10 categories, I can collect
> them and see if there really are 10 or maybe 100, which would
> be silly I think.
>
> Another idea would be for someone clever to write a script that
> could categorize a post. For example, all words in a post
> could be extracted to a list and then compared to a list of
> categories and those categories that fit could be chosen
> and put on say the top line of the post to help with filtering.
> Some posts might not be easy to treat in this way, but it might
> help.
>
> Paul, this is your suggestion and you are known to be very clever, want to
> write such a program?
>
> In truth, I don't think I want to do anything unless there is
> a significant vote from end users to do it and a nice way
> to handle it consistently.
>
>> > How would it affect mail and newsgroup readers?
>>
>> I imagine that it would have little effect, except the desired one of
>> allowing better filtering.
>>
>> > I think it would be a bad idea to put things like [Statistics] in
>> > the Subject line. Would newsgroup and mail readers be able to
>> > thread such Subject lines?
>>
>> Surely that is exactly what they are designed to do.
>>
>> And I could filter the messages into subfolders of my MathGroup folder
>> automatically.
>>
>> > It might be better to put it in something like an X-Category mail header,
>> > but I am not sure that all readers could handle this.
>>
>> This idea has merit and, again, it might be harder to forge, but I don't
>> know enough about these headers.
>>
>> > Personally, I think they would just make the Subject lines longer
>> > and harder to read.
>>
>> Nested Re: Re: Re: ... already does this, though Google handles this
>> very well, in its threading, dropping all Re at the top level, listing
>> only the subject, and then listing the contributor for each item in the
>> thread.
>
> Yes, the Re Re Re is a problem and I will try to fix that.
>
>> > Who is going to do the categorisation?
>>
>> The contributor.
>>
>> > I know a lot about
>> > Mathematica and mathematics, but certainly not enough to figure
>> > out what every message best fits into. If I make a poor selection
>> > and a message has gone out it is virtually impossible to re-do
>> > the categorization in the newsgroups, mailing list, google group
>> > listings, archives, etc.
>>
>> Sometimes categorizations have to change. You could have
>>
>> Numerics -> Graphics /: Accurate plotting
>>
>> when there is such a change.
>>
>> > Search therefore becomes inaccurate very quickly.
>>
>> I don't think that this is true.
>>
>> > What if someone disagrees with my selections?
>>
>> Not a big issue, I think. I think the group will come to consensus on a
>> categorization, or move on to a different categorization as required.
>>
>> > How much time will this add to moderation?
>>
>> I would hope that it would greatly _reduce_ your moderation time.
>
> I can assure you that adding more complexity to the posts will
> increase moderation time.
>
>> > If others select the categories to help me out, that will just
>> > delay moderation.
>>
>> I do not see why.
>>
>> > Maybe, we can urge the person who originally writes the message to select
>> > a category, but how does a new user know what category to pick?
>>
>> There should be a list in the rules section at
>>
>> http://smc.vnet.net/mathgroup.html
>>
>> > What if a user forgets to include a categorisation?
>>
>> You can add one.
>>
>> > Is someone going to go back and categorise the 51,000 messages that
>> > are already in the archive?
>>
>> Unlikely, I think. However, I expect that the archive has grown
>> exponentially and will continue to do so.
>>
>> > The simplest thing to do would be to have some group that is willing
>> > to categorise the posts once they get into the Wolfram Research
>> > archive only. Then search could be done fairly easily.
>> >
>> > This sort of categorisation may be done in other newsgroups, but
>> > I have not seen it.
>>
>> I expect that it is used on other newsgroups, but I have not seen it, or
>> there are subgroups.
>>
>> sci.math
>> sci.math.symbolic
>
> If you look at these groups you will find no real categorisation
> of any kind. I could not find any group that had any.
>
>> > I am open to suggestions and comments, but I frankly think this
>> > is going to be a very difficult process to do.
>>
>> It was intended as a suggestion to reduce your workload, to speed up the
>> rate of posting to MathGroup, and to improve the automatic filtering
>> (and threading) of messages.
>>
>> Cheers,
>> Paul
>>
>> > Hi all, and especially Steve Christensen:
>> >
>> > At the recent Wolfram Technology Conference in Champaign, Luc Barthelet
>> > <lucb at ea.com>, a regular user of MathGroup suggested that it would be
>> > good if all postings to MathGroup included a categorisation in their
>> > header, e.g.
>> >
>> > Newbies, Graphics, Functions, Programming, Statistics, Teaching,
>> > Integration, Numerics, Symbolic Algebra, Special Functions, ...
>> >
>> > so a Subject line might take the form
>> >
>> > [Statistics]: How to fit to an elliptical function?
>> >
>> > (not sure if the [ ] are required or useful). In this way, sorting by
>> > Subject would be easier. Of course, it's not always easy to do such a
>> > categorisation, and they may change with time (as a problem stated as a
>> > Numerics might end up being solved using Symbolic Algebra).
>> > Nevertheless, I think such a change would be very useful. It should also
>> > help when doing searches on MathGroup archives.
>> >
>> > Cheers,
>> > Paul
>>
>> --
>> Paul Abbott                              Phone: +61 8 6488 2734
>> School of Physics, M013                    Fax: +61 8 6488 1014
>> The University of Western Australia (CRICOS Provider No 00126G)
>> 35 Stirling Highway
>> Crawley WA 6009              mailto:paul at physics.uwa.edu.au
>> AUSTRALIA                    http://physics.uwa.edu.au/~paul

--
DrBob at bigfoot.com
www.eclecticdreams.net
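
The script idea raised in the thread (extract the words of a post, compare them with per-category word lists, and put the matching categories on the top line or in the Subject) could look roughly like the Python sketch below. This is only an illustration under stated assumptions: the keyword lists, the `Categories:` top-line label, and all function names are hypothetical and were not proposed by anyone in the thread. Only the category names and the `Category /: Subject` form come from the messages above.

```python
import re

# Hypothetical keyword lists. The thread names the categories but never
# says which words should map to them, so these are illustrative only.
CATEGORY_KEYWORDS = {
    "Statistics": {"statistics", "regression", "chi-square", "variance", "fit"},
    "Graphics": {"plot", "graphics", "plot3d", "contourplot"},
    "Numerics": {"ndsolve", "nintegrate", "precision", "numerical"},
    "Integration": {"integrate", "integral", "nintegrate"},
}

# A word is a run of letters/digits that may contain internal hyphens.
WORD_RE = re.compile(r"[a-z0-9][a-z0-9-]*")


def extract_words(text):
    """Return the set of lower-cased words found in a post."""
    return set(WORD_RE.findall(text.lower()))


def categorize(text, min_hits=1):
    """Return categories with at least min_hits keyword matches, best first."""
    words = extract_words(text)
    hits = []
    for category, keywords in CATEGORY_KEYWORDS.items():
        count = len(words & keywords)
        if count >= min_hits:
            hits.append((count, category))
    # Most matches first; ties are broken alphabetically for stable output.
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [category for _, category in hits]


def tag_subject(subject, categories):
    """Prefix the Subject with the best category in the 'Category /: Subject' form."""
    if not categories:
        return subject
    return f"{categories[0]} /: {subject}"


def add_category_line(body, categories):
    """Put the matching categories on the top line of the post body."""
    if not categories:
        return body
    return "Categories: " + ", ".join(categories) + "\n\n" + body


if __name__ == "__main__":
    post = "How do I plot the chi-square fit of a regression?"
    found = categorize(post)
    print(tag_subject("Chi-square test", found))
    print(add_category_line(post, found))
```

As noted in the thread, some posts will not be easy to treat this way: a plain keyword match says nothing about posts that use none of the listed words, and it can mislabel posts that mention a word only in passing, so a moderator or the contributor would still need to be able to override the result.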

**Follow-Ups**: **Re: Re: Re: MathGroup /: Descriptive headings** *From:* DrBob <drbob@bigfoot.com>

**References**: **Re: MathGroup /: Descriptive headings** *From:* "Steven M. Christensen" <steve@smc.vnet.net>