What Polymath Needs is Wasted Time
July 26, 2009. Posted by Phi. Isett in Essays, Uncategorized.
The following is an extensive reply to a post on Terence Tao’s blog.
My point of view is based in part on my experience as a moderator of a webforum of up to a dozen active members which devoted a few years to the collaborative production of a complex storyline with several deeply interwoven subplots. Time will tell how well large collaborations can produce mathematics – they are certainly an amazing tool for story-writing, and some comparisons can be made upon abstraction, so the experience may be relevant. The Google Groups format had tremendous advantages and shortcomings, but an inability to harness people’s free time ultimately led to our story’s stagnation.
At the moment, polymath seems to function in many ways analogously to various forms of entertainment and “time-wasting” (reading blogs and webcomics, participating in forums, watching movies, watching YouTube, etc.) – indeed, this “wasted time” is in some sense exactly the incredible resource which polymath must compete to harness, although within a more restricted audience and for the noble purpose of serving mathematics. I am sort of joking, but this is only my interpretation of Prof. Tao’s original request that the participation in the latest polymath problem solving experiment be casual.
Now, much work has already been done to procrastinate as efficiently as possible – we read a select few blogs/webcomics/twitters and have efficient means of getting to them (feeds, bookmarks); we look at Rotten Tomatoes to decide which movies to watch. On YouTube, we often look at the “N Views” and the “x/5.0 (y votes)” user-rating figure to decide which videos are best worth our wasted time (and many videos on YouTube are “replies” to others, which are linked). YouTube, for instance, has had to develop mechanisms to help ensure good posts do not get squashed, which is a problem polymath also faces. Regardless of their shortcomings and their obviously differing objectives from those of Polymath, these “procrastination” activities provide good examples of how to effectively organize a mass of information which is spread among many people so as to save time, and that is the problem I want to address here (as opposed to the closely related problem of optimal technology and formatting). In other words, I will not propose any particular technologies, but rather make a list of things which I believe a technology has to be capable of doing in order to harness massive collaboration and spare time in a satisfactory way – a task which is crucial to maximizing the long term impact of polymath.
I wanted to be able to reference people, but the blog format makes this very difficult (.. or at least I don’t know how to do it!..), so instead I must apologize to everyone whose ideas I reproduce here without giving credit. Doing so will obviously be a mistake on my part (and will thus help me raise the point that easy citation is important); many ideas and issues raised below were scattered within the first 43 comments on this entry of Prof. Tao’s blog. In fact, none of the ideas presented below are mine, essentially; they have all been taken from various other internet services which have attempted to solve similar problems.
Here is a table of contents of issues raised in that post that will be addressed:
- Sloppy writing — see *Favorites*, *Post Classification*, *View Counting*
- Evaluation/classification of ideas (currently under-explored?, flawed?, showing promise?, novel observation?, etc…) – see *Post Classification*
- Ease of linking to / citing other posts – see *Finally* and *Private messaging*
- Estimating time commitment – see following paragraph
- Keeping a sense of chronology but being able to edit – I think Google Wave and its playback feature should be able to help once it supports LaTeX… But see *Finally*
- Need for Leadership – See the entire section on personal accounts.
- “Visualizing the tree” – see *Finally*, but I have not considered the problem of visualizing logical dependences
- Private vs. public messaging – see *Private messaging*
- Personal notes/ journal – see *Personal blog*
Consider the following model of Polymath, which to a first approximation is a perturbation of YouTube combined with certain aspects of webforum functionality. A person involved in Polymath must have a Polymath account (blog activity / wiki activity, etc. within particular projects is all somehow contained inside a larger entity called “Polymath” – compare Google Groups), and although one need not necessarily go by his true identity in such an account, he must register for any individual project (and, for example, verify he is not a robot). Before deciding whether to participate in a project, he knows which of his “Friends” are involved (where “friendship” is mutual, akin to Facebook friendship – humans themselves are a natural and efficient means by which we already organize our time). He also has good ways to estimate the starting cost: all the posts are publicly readable, as are the results of simple survey data gathered from those involved, a description of prerequisite knowledge, statistics about the distance to a leaf in the comment tree, the age of the project, and so on. Only after registering, however, may he contribute to the project, classify posts, etc.; having done so, the project is added to his “My Projects” folder, from which he can access any project to which he subscribes (compare YouTube, Google Groups). From here on in, everything is contained within a particular polymath project.
The perfect means by which to organize any particular project – be it forum, wiki, blog, or what-have-you – is an open problem we hope will be solved by market forces, but I describe two features which I believe to be essential in this part (see Prof. Tao’s comment on his own entry for a very good list).
Firstly, I believe there must be personal profiles (similar to those of a webforum and YouTube). A personal profile is in part generated automatically, and is in part privately constructed. I hope that these personal profiles can help to provide the amount of leadership in a project that is clearly necessary. When viewed by other registered users of a project, a personal profile provides at the very least the following:
- A *personal record*: a record of what this person has done within the project which can be easily searched and allows for easy citation. For many reasons, it is important to record what particular individuals have done and have found important. For example, one or more people’s participation in a problem during a period of time can give an approximation to what one might call a “line of attack” or “chapter” in the greater problem – this fulfills a good part of the job of a summary and can facilitate writing one. I learned this lesson first-hand while building summaries of years’ worth of subplots of the story I mentioned earlier; I used people’s records to trace sub-stories. Again, human records provide great, natural organizational means – I suspect those of you who wrote summaries for the DHJ project may have used a feature to search for individuals, and I’ll come back to this point.
- A directory of *Favorites* — Here one bookmarks the posts he has found most valuable to his understanding of the problem. A person uses his “Favorites” directory to make them easily available for citing and referencing. They are public information so that people will have an easier time “getting on the same page” and producing summaries. One should be able to view the Favorites in the order in which they were placed in the Favorites list, rather than simply the order in which they came into being. Favorites have brief comments attached to them describing why they’re bookmarked, some of which may be publicly available, some of which are personal notes. Favorites also often have more extensive notes attached to them, which one can also opt to make public every now and then.
- A *personal blog* – The blog contains public and private entries (see the remarks of David Speyer). Here you get to express your own thoughts and evolving point of view. You can also tell people you’re leaving the project, ask for help regarding something (though other mechanisms should exist for this purpose), inform people of upcoming mini-collaborations you’re having, or link to breakthrough posts you’ve found – in the end, exactly what these blogs are used for will settle into an equilibrium, but they have much potential to help people keep up with the parts of the project on which you are concentrating. However, note that the personal blog is meant to be… *personal*, as opposed to the public blogs devoted to the project. There may be no need for comments or public discussion on these blogs, and in fact it may be better to disable such features so that discussions of general interest do not end up in the wrong place (perhaps their entries may, however, be referenced or bookmarked? Or not). Just as we have feeds for our favorite blogs/webcomics/twitters, we can have feeds for the most important blogs within the project, and an option to make the contents of our feed publicly available to help newcomers (just like last.fm allows other people to know what music you’re listening to), analogous in principle to the above Favorites idea for posts. Above all, we have seen already through the examples of Profs. Tao and Gowers the impact blogs can have for leadership purposes.
- A *private messaging* capability – There are reasons to be apprehensive about this, but in the end I am guessing we will find out it’s both useful and necessary, and that it must have all the features of any other form of posting. [A Student] suggested private messaging as a means to correct or ask questions politely. I suspect such conversations would sometimes lead to important discoveries, which must then be brought to the attention of the problem’s community, but only in an organized form. Therefore, if they exist, PMs must be much more robust! (This need for robustness truly makes me think of where Google Wave can come in once it can support LaTeX.) In the collaborative story-writing setting, the best plot twists were discussed in secret by a few people before being unveiled in some well-planned posts (which also tended to be written better) – I see no reason that analogous mini-collaborations should not play an important part in the Polymath process. Indeed, such private messages will exist inevitably regardless of whether or not we choose to welcome and incorporate them – we cannot get rid of them, no matter how hesitant we may be as to whether or not they align with the spirit of Polymath.
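To make the first two items a little more concrete, here is a minimal sketch in Python of a personal record and a Favorites directory. Everything in it (the names `Post`, `Favorite`, `Profile`, `personal_record`, and the integer timestamps) is my own assumption for illustration, not a description of any existing Polymath tool. The one point it tries to capture is that Favorites are ordered by when they were bookmarked, and that only the public note is shown to other users.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Post:
    post_id: int
    author: str
    created: int  # hypothetical timestamp: any increasing counter will do
    text: str


@dataclass
class Favorite:
    post_id: int
    added: int  # when the bookmark was made, not when the post was written
    public_note: str = ""
    private_note: str = ""


@dataclass
class Profile:
    user: str
    favorites: List[Favorite] = field(default_factory=list)

    def bookmark(self, post: Post, added: int,
                 public_note: str = "", private_note: str = "") -> None:
        """Add a post to this user's Favorites with optional notes."""
        self.favorites.append(
            Favorite(post.post_id, added, public_note, private_note))

    def public_view(self) -> List[Tuple[int, str]]:
        """What other registered users see: bookmark order, public notes only."""
        ordered = sorted(self.favorites, key=lambda f: f.added)
        return [(f.post_id, f.public_note) for f in ordered]


def personal_record(posts: List[Post], user: str) -> List[Post]:
    """Everything one person has posted, in chronological order."""
    return sorted((p for p in posts if p.author == user),
                  key=lambda p: p.created)


posts = [
    Post(1, "alice", 10, "first idea"),
    Post(2, "bob", 20, "a correction"),
    Post(3, "alice", 30, "a summary"),
]
carol = Profile("carol")
carol.bookmark(posts[2], added=100, public_note="good overview")
carol.bookmark(posts[0], added=200, private_note="reread this")

print(carol.public_view())
print([p.post_id for p in personal_record(posts, "alice")])
```

In this sketch the later post (3) appears first in the public view because it was bookmarked first, and the private note on post 1 never leaves the profile.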
Secondly, I believe that we must incorporate statistics to classify and measure the popular opinion regarding individual posts. In his comment, Prof. Tao put this idea under the “less mandatory” section, and I agree that it will not be mandatory in every setting. But in the situations where things get very, very big and post numbers become extremely large, I think they are basically necessary, just as Rotten Tomatoes and YouTube statistics are necessary to help people efficiently waste their time at the movies and online (and, like I said, wasted time is the resource for which polymath competes). Here are some examples of how this can be done in conjunction with my previous recommendations:
- *Post classification* – An anonymous poster made the excellent observation that many people probably waste time evaluating the accuracy / importance of (sometimes poorly / vaguely written) arguments. But if an argument is flawed, and the flaw has been pointed out, this information should be obvious upon viewing the post and a link to the corrected analysis should be present before a million people have to struggle with it. Or, likewise, if an argument is vague/imprecise, and a more precise version has been written, this classification and a link should be immediately available. The post may be vague not as a fault but because it is what one of my professors would call “the two-minute version” and there are “five minute” and “five hour” versions also available. In whatever case, corrections, elaborations (or, just as importantly, compressions), if they exist should be linked and labeled as such. The taxonomy here is incomplete – it would also be nice to know when an approach has been (by consensus) beaten to death and we have done all we can with it and understand its limits, or when the idea is “underexplored” or “showing promise” – similar to how Wikipedia allows us to label its articles’ flaws (e.g. “not enough references”, “too much jargon”, etc.). (Maybe this element of Wiki is the best solution for the classification problem?)
- These classifications can be combined with the Favorites that I proposed above, so that if something should happen to a Favorite post (e.g. a flaw is discovered and elaborated), those who are concerned about that post are informed in their Favorites page. I believe this kind of classification with linking is all we need to “punish” errors – when errors become common, we will put the common errors (or their corrections) in our Favorites list, so that we may quickly provide a link when they reappear. It is good for a Favorites list to have errors in it, as long as they remain well-classified.
- A *view counter* – Gives an idea of how widely viewed a post is. It is important for individuals to bookmark a not-well-known post which they feel is important so as to prevent the phenomenon of everyone always looking at the most popular posts. Many posts will become unpopular because they are badly written, so I propose using Favorites or Post Classification to highlight those worth the read despite their low numbers (or not worth reading despite high numbers). One should also keep track of how many people have set any given post as a favorite. With these statistics (along with the age of the post, tags and a subject header), one can approximately isolate what’s really worth reading when the amount of material out there vastly exceeds his available time.
- No numerical ratings — While classification of posts can be quite useful, numerical ratings or “thumbs up / thumbs down” ratings do not promote the overall good. Not only do they offer no explanation as to what may be good or bad about an entry, they tend to be assigned without any real, deep thought. It is very easy to crush a funny-sounding, good idea with just a few bad reviews.
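The interaction between these items can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class `Project`, the label names in `LABELS`, the `inbox` of notifications, and the `overlooked` query are all hypothetical. It shows a label carrying a link to the correcting post, everyone who favorited the post being told, and low-view posts that somebody nonetheless bookmarked being surfaced. There is deliberately no numerical rating anywhere.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical, incomplete taxonomy of labels.
LABELS = {"flawed", "vague", "underexplored", "showing promise", "exhausted"}


@dataclass
class Label:
    kind: str
    link: Optional[int] = None  # id of the correcting or elaborating post


@dataclass
class PostStats:
    post_id: int
    views: int = 0
    labels: List[Label] = field(default_factory=list)
    favorited_by: Set[str] = field(default_factory=set)


class Project:
    def __init__(self) -> None:
        self.stats: Dict[int, PostStats] = {}
        self.inbox: Dict[str, List[str]] = {}  # per-user Favorites notices

    def _get(self, post_id: int) -> PostStats:
        return self.stats.setdefault(post_id, PostStats(post_id))

    def view(self, post_id: int) -> None:
        self._get(post_id).views += 1

    def favorite(self, user: str, post_id: int) -> None:
        self._get(post_id).favorited_by.add(user)

    def classify(self, post_id: int, kind: str,
                 link: Optional[int] = None) -> None:
        """Attach a label (with optional link) and notify those who favorited."""
        if kind not in LABELS:
            raise ValueError(f"unknown label: {kind}")
        stats = self._get(post_id)
        stats.labels.append(Label(kind, link))
        notice = f"post {post_id} labeled '{kind}'"
        if link is not None:
            notice += f", see post {link}"
        for user in sorted(stats.favorited_by):
            self.inbox.setdefault(user, []).append(notice)

    def overlooked(self, max_views: int) -> List[int]:
        """Posts with few views that at least one person has favorited."""
        return sorted(pid for pid, s in self.stats.items()
                      if s.views <= max_views and s.favorited_by)


project = Project()
for _ in range(50):
    project.view(1)          # a popular post
project.view(2)              # a rarely viewed post
project.favorite("carol", 1)
project.favorite("carol", 2)
project.classify(1, "flawed", link=7)

print(project.inbox["carol"])
print(project.overlooked(max_views=5))
```

Here carol is told that post 1 has been labeled flawed, with a pointer to post 7, and post 2 shows up as worth a look despite its low view count.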
*Finally*, I want to discuss the importance of universal ease of reference / linking and its interaction with “visualizing the tree” and with summaries. For very large projects with very many collaborators, I agree with Kareem Carr, for example, that we do need ways of understanding and approximating the tree structure. I think we could use some kind of metric to measure, in a useful way, the distance between summaries. A summary pertains to a certain region of the tree, which can be approximated based on the humans referenced, and the individual posts referenced. For example, one may approximate what you might call a sub-problem by collecting a handful of the main people involved and the time in which it took place, looking for the correct region of the graph in which they interact most closely, and then taking a small neighborhood of those posts. In this way one can try to associate a region to a summary.
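As a rough sketch of what associating a region to a summary might look like, here is some Python under several assumptions of my own. Posts form a reply tree through a hypothetical `parent` field. The "region in which they interact most closely" is crudely approximated by the posts the chosen people wrote inside the time window. The "small neighborhood" is everything within `radius` reply steps of those posts. For the distance between summaries I use the Jaccard distance between their regions, which is just one candidate and not something I claim is the right metric.

```python
from collections import deque
from typing import Dict, List, NamedTuple, Optional, Set


class Post(NamedTuple):
    post_id: int
    parent: Optional[int]  # id of the post this one replies to, if any
    author: str
    time: int


def region_for_summary(posts: List[Post], people: Set[str],
                       start: int, end: int, radius: int = 1) -> Set[int]:
    """Seed posts by `people` in [start, end], grown by `radius` reply steps."""
    ids = {p.post_id for p in posts}
    neighbors: Dict[int, Set[int]] = {pid: set() for pid in ids}
    for p in posts:
        if p.parent is not None and p.parent in ids:
            neighbors[p.post_id].add(p.parent)
            neighbors[p.parent].add(p.post_id)

    seeds = {p.post_id for p in posts
             if p.author in people and start <= p.time <= end}

    # Multi-source breadth-first search out to the given radius.
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        if dist[current] == radius:
            continue
        for nxt in neighbors[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return set(dist)


def jaccard_distance(a: Set[int], b: Set[int]) -> float:
    """1 - |a & b| / |a | b|; taken to be 0.0 when both regions are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


posts = [
    Post(1, None, "alice", 1),
    Post(2, 1, "bob", 2),
    Post(3, 2, "carol", 3),
    Post(4, 3, "dave", 4),
    Post(5, 4, "erin", 5),
    Post(6, 1, "frank", 9),
]
region_a = region_for_summary(posts, {"bob", "carol"}, start=2, end=3)
region_b = region_for_summary(posts, {"dave", "erin"}, start=4, end=5)

print(sorted(region_a))  # [1, 2, 3, 4]
print(sorted(region_b))  # [3, 4, 5]
print(jaccard_distance(region_a, region_b))
```

Two summaries whose regions share most of their posts come out close together, and summaries of unrelated threads come out at distance 1. Whether that notion of distance is actually useful is exactly the open question.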
We’ve already noticed that much can be learned from the paradigm of open-source programming, but I hope to have convinced the reader that models from various other popular internet services may also inform the constitution of polymath in particular by helping us see how people themselves can be used to naturally organize massive material. There’s a lot of wasted time out there just waiting to be harnessed for the service of mathematics! Like.. all the time I just spent writing this essay…… ……. …………
If no product exists which is robust enough to do all of the above (while meeting the demands Dr. Tao listed)… I would suggest some people work to develop the product. On the internet you never know what will be the next Facebook, so you might as well start very robust. Last of all, did anyone ever notice that participation in Polymath projects may end up proving valuable for gauging graduate school applicants? Maybe this is an extremely tricky question. I also wonder what would happen if polymath tried to write a textbook…
Edit (28 July) : I should have stressed that the underlying assumption in all the above is that things be organized so well people actually want to use them. Like, the blogs should be good enough for taking actual notes, for example.