Monday, November 29, 2004

Streaming(?) Cell Processors and Graphics

This is a topic of direct research interest to me: Via the EE Times:
The eagerly anticipated Cell processor from IBM, Toshiba and Sony leverages a multicore 64-bit Power architecture with an embedded streaming processor, high-speed I/O, SRAM and dynamic multiplier in an effort, the partners hope, to revolutionize distributed computing architectures.
...

Each processing element comprises a Power-architecture 64-bit RISC CPU, a highly sophisticated direct-memory access controller and up to eight identical streaming processors. The Power CPU, DMA engine and streaming processors all reside on a very fast local bus. And each processing element is connected to its neighbors in the cell by high-speed "highways." Designed by Rambus Inc. with a team from Stanford University, these highways — or parallel bundles of serial I/O links — operate at 6.4 GHz per link. One of the ISSCC papers describes the link characteristics, as well as the difficulties of developing high-speed analog transceiver circuits in SOI technology.

The streaming processors, described in another paper, are self-contained SIMD units that operate autonomously once they are launched.

They include a 128-kbyte local pipe-lined SRAM that goes between the stream processor and the local bus, a bank of one hundred twenty-eight 128-bit registers and a bank of four floating-point and four integer execution units, which appear to operate in single-instruction, multiple-data mode from one instruction stream. Software controls data and instruction flow through the processor.
One theme of my research is demonstrating the power of modern GPUs for a wide variety of applications, and how their streaming nature represents a new model of computing akin to, but not exactly like the model used for traditional streaming algorithms. I wonder to what extent the new Cell processor fits in with this notion.

Clarity in writing

Via Critical Mass comes this screed criticizing obscure writing in literary theory circles. Much of the article is "inside baseball" for literary theorists, but there is at least one point that relates to what scientists do. It concerns one of the arguments literary theorists appear to make in defence of their alleged obscurity:
Scientists have their jargon—why can't theorists have theirs? Again, this is a valid question with a simple pragmatic answer. The public tolerates scientific jargon and not theory jargon because it believes that scientists need jargon to extend their researches and produce practical knowledge that benefits all. Only when scientists appear to abandon the common good does their language come under attack (for example, Swift's portrait of mathematicians in Book III of Gulliver's Travels, or contemporary ridicule of sociologese and psychobabble). Come the day when the theorists are able to demonstrate that their jargon enhances human life, and isn't just pretension and science-envy, public mistrust of them will end. Constantly claiming to foment social justice isn't sufficient.
I don't remember the Gulliver's Travels allusion, but the idea that only when scientists appear to abandon the common good does their language come under attack appears interesting, and in the light of the all the current evolution imbroglios, not quite right.

Alan Sokal's hilarious spoof was one of the first salvos fired at the obscurity of literary theorists (although his intent was quite different); it is interesting to see followups, even if my standpoint is of the mildly amused observer indulging in schadenfreude.

Friday, November 26, 2004

सस्ता और टिकाऊ

Via The Opti Mystic, this gem from the latest edition of the Indian blog carnival (or the Bharateeya Blog Mela):
Indian farmers have come up with what they think is the real thing to keep crops free of bugs. Instead of paying hefty fees to international chemical companies for patented pesticides, they are reportedly spraying their cotton and chilli fields with Coca-Cola.

(from the Guardian, via Jagadish)

In Hindi, there is a phrase that captures this: सस्ता और टिकाऊ
It literally translates as "Cheap, and solid", and is often used in advertisement tag lines.

An interesting problem:

emmous asks an interesting question:
Let N be a set of nonnegative integers (amen!). We say that N has the decomposition property if there are nonempty subsets N1 and N2, maybe overlapping or identical but both containing at least two elements, such that

N = { n1 + n2 | n1N1 and n2N2}.

Can we find such two subsets for a given N or it's NP-complete?
The source of this question is a paper by Mateesca, Salomaa and Yu.

I found this question interesting because we know that if N indeed had such a structure, and we were presented N1 and N2 instead, we could rank and select elements from N far more efficiently than if we had N itself.

Wednesday, November 24, 2004

Now this really makes me angry..

I had the (mis)fortune to be busy with papers around the time of the great November cataclysm, so I had no time to obsess over election results. In retrospect, I find though that there are some issues that rile me up a lot more than politics, and one of them is evolution.

It was disturbing enough to start seeing, a few years ago, that school districts were trying to teach (un)intelligent design along with evolution theory in classrooms. It has become even worse though, with districts in Georgia, Wisconsin, Maryland, Ohio and now Pennsylvania all choosing to teach such nonsense in classrooms (see the National Center for Science Education for more details)

The usual argument that has come up in discussion when I start spouting off is: "Well, the students who are really interested in science will learn the truth about evolution, so this doesn't really matter". This is of course correct: I can't imagine someone getting into a graduate program in biology without having been able to verify for themselves the basic facts of (micro)evolution under the microscope or understanding the fossil record. But scientists don't make their own policy or control their own funding; politicians do, and politicians are by and large people who got their science education in school and maybe a little in college, their background tending to be more in law/economics/business. It is crucial that we get this right, and get it right early on. If students are taught all kinds of gobbledygook in school, it messes up their fundamentals later on: we need only look at math education to see how that works.

I actually don't think everyone should be indoctrinated into the cult of evolution. Personally, I have been very confused myself about what statements regarding evolution are facts, and what is theory, and what is falsifiable: after all, a statement that we are descended from the apes is not inherently falsifiable. For more discussion, and a point-by-point elucidation of falsifiable tests of evolution, check out the fascinating talk.origins FAQ.

What upsets me more is the undermining of the basic scientific foundation of theory based on falsifiable premises[1]. (un)ID is a a big fat negation: humans did NOT develop via evolution, because evolution can't explain everything yet, and there are discrepancies. It is not a working theory of the world, and it can't even be falsified ! I mean, how do I prove that some intelligent being did NOT intervene with my development from Nim Chimpsky.

To place it on a par with evolution, as a 'competing theory of biological development', is to indulge in the kind of fake 'fairness' that science journalists have picked up as a bad habit from the political reporters, and wilfully ignores the volumes and volumes of evidence that support both the facts of evolution and many of the theoretical mechanisms.

There is a lot that is not understood about how evolution works, specifically exactly how genes traits get passed on, and how species evolve. And this is a genuine discussion worth having. But we don't teach children Newton's Laws with a big warning sticker, and we don't because we know that this provides a good first approximation to the laws of mechanics, and to understand the discrepancies predicted by relativity requires a lot more sophistication.

Should we point out that a lot of the "theory of evolution" is just a theory ? Absolutely. But we should separate out what is theory from what is not. Painting with such a broad brush is ludicrous, and cannot in good faith be interpreted as anything but a political power play to subvert the very basis of scientific enlightenment where it matters the most.

Science is not a political football. It is not a game. It is the foundation of the modern world. Deal with it.



[1] Before you start beating me over the head with Lakatos, Kuhn and Feyerabend, let me at least argue that falsificationism is a more accurate description of the day-to-day process of doing science.

Tuesday, November 23, 2004

The beleagured computational genomicists...

Much is being made of this article by John Derbyshire of NRO on the threat posed by computational genomicists to the "liberal elite". One point that has not been made though: much of computational genomics is based on reasoning about phylogenetic structure, inferring gene propogation cross species. This of course is based on the only-believed-by- 35%-of-the-US-population theory of evolution.

It is rather amusing that conservative commentators who spring to the defense of research that undermines the "liberal establishment" are not particularly concerned that this very research is based on the theory that people in their corner are trying their best to discredit.

wireless fun...

I forgot to mention that if you are in Phladelphia, a place that has both the FECSWATM and the TRCPWATM is Intermezzo, on 32nd and Walnut.

How to increase the number of hits on your blog...

This is a cottage industry in the blogosphere, with various authors willing to give you advice (free and paid) on how to increase traffic to your blog. In fact, some time back the Geomblog was cited as an example of how NOT to increase traffic to your blog, by exemplifying the kind of blog that "is for people who only talk to their own kind".

Well I have my own personal lesson on how to increase traffic to your blog, and I'll even give it out for free. The answer is: POLITICS. My evidence ?



This picture is a snapshot of my traffic over the past one month: visits in green, and page hits in purple. The prefix is my average traffic per day: not a lot, but then how many geometers are there to begin with ? The climb starts the day I posted my article about election cartograms. It lasted for a week, and I am still seeing the effects of it.

Anonymous submissions and ethical practice

Over in Lance-land, a post about chilling out after a deadline has mutated (in the comments section) into a discussion of anonymous submission to conferences. My question, which I posted there, is an ethical one.

Often I will go around giving talks about papers that are not yet published. My reason for doing so (and I have come recently to this way of thinking) is that as long as I am not fearful of being scooped, or preempted, giving a talk is how one disseminates work, and asssuming that the paper will get published eventually, this is OK, especially if the work is something I am excited about and want to talk about. Related activities might include submitting an e-print, or putting up a web page etc.

The ethical dilemma is then this: if I am submitting to a double blind review process (that many conferences adopt), is dissemination of the above kind an underhanded way of subverting the anonymity of the review process and should it be avoided at all costs ?

Obviously a way out is to just not talk about work under review. However, this doesn't tend to be common practice: many important results have spread long before they were submitted or accepted to a conference. I imagine that some nuance based on the "importance" of the paper might come into play. If I believe that the knowledge of this result will be of great interest to the world at large, then it is ok to talk about it. However, this gets us into a slippery area of reasoning: after all, one doesn't write a paper and submit it to a conference if one believes that the results are of no interest to anyone.

Comments ?
Posting has been slow, as I've been travelling, and doing some paper writing (you know, scholarly work, the kind that bloggers do in their pajamas). My internet connection has been down, and in the process of running around trying to make my deadline from various WiFi-enabled cafes, I have discovered the Four Essential Criteria for a Successful Wireless AdventureTM.
  1. A clear internet signal (preferably free)
  2. Good cell phone reception
  3. Accesssible power sockets
  4. Some reasonable espresso
You'd be surprised how difficult it is to satisfy all four of these constraints, and how essential they really are.

Along with these are the Two Recommended Criteria for A Pleasant Wireless AdventureTM:
  1. Comfy armchairs: a table and chair is NOT designed for typing on a laptop
  2. Pleasant music: it's not even Thanksgiving and I am already trying to free my mind of inane Christmas jingles !
anyway, I shall now kick back and take Lance's advice, at least until my next deadline...

Friday, November 19, 2004

Open problems.

This year's CG workshop promised to have a focus on open problems, and we had more than our share ! Check out the CG Blog for a list.

Cows not entirely deterred..

My report on the first invited talk at the CG Workshop. Read more about it here...

Thursday, November 18, 2004

I am blogging from inside the deranged cubist fantasy that is Stata Center, MIT's computer science building. The CG workshop is being held here in a lush auditorium, and the internets are flowing like wine :)



I hope to have updates as the day goes on: am looking forward to an intriguing talk by Godfreid Toussaint on Computational Geometric Aspects of Musical Rhythm.

Monday, November 15, 2004

So much for SIGGRAPH :)

I am happy to announce that this year's Fall CG workshop will have a fully-compliant 21st century web surfing experience: a website, a blog, and even a wiki !!

Hervé Brönnimann from PolyTech U has set up a comp geom space that (with the aid of volunteers like your humble blogger, Ken Clarkson, Jeff Erickson and Sariel Har-Peled) now has all kinds of goodies for the computational geometry community.

Firstly, we now have a blog for the Fall CG workshop. If you plan to attend the workshop, and would like to post entries, send mail to sureshv@gmail.com with the subject line "FWCG" to get an account, or for instructions on how to post anonymously. Please supply enough information to identify yourself :)

Secondly, we have a computational geometry wiki that runs off the excellent Mediawiki platform (the same one that runs wikipedia). The beauty of a wiki is that anyone can edit it (once you are registered), and there's lots of information on the site to help you out. We hope that through volunteer efforts, we can build this up into a valuable resource for the CG community.

Some cool features of the wiki:
  • You can write math in latex, and it will convert the text appropriately
  • You can use geombib reference for citations (thanks to Sariel)
  • Anyone can edit any page; although this seems like recipe for disaster, it actually works out reasonably well. So if you have some CG-related content you'd like to add in, or can help contribute to existing pages, dive right in.
Sariel has converted his core-sets survey into wiki format: check it out...


Thursday, November 11, 2004

A victory for robust statistics ?

In all the brouhaha over election predictions, it is worth observing that robust statistics scored a clear victory.

According to Wes Colley and J. Richard Gott, merely taking the median of all polls in the last month was good enough to predict the election outcome with almost 100% accuracy (they got Hawaii, with only 2 polls, wrong).

And what is a robust statistic ? One way of defining it is that it is a statistic on a set of numbers that requires at least a constant fraction of the input to go to infinity before the statistic itself goes to infinity (for the median, this number is 50%; however, only one number needs to go to infinity to pull the mean along with it). A robust statistic is often useful as a non-parametric measure of a data set that is sensitive to outliers; one of the best examples in two dimensions is Tukey's halfspace depth.

Aren't elections educational ?

Wednesday, November 10, 2004

USTCON is in L

Read all about it !! Lance has the annoncement, and David Molnar has additional reading.

Note that this paper could also have been titled in 4 letters:

SL = L

Tuesday, November 09, 2004

The fever catches on...

I promise (or at least try to promise) that this will be my last cartogram post. Looks like map fever has caught on a big way since my first cartogram. If you are not already sated by maps, maps and more maps, check out the extensive resources on election cartograms at UCSB provided by Sara Fabrikant.

Her maps have Alaska and Hawaii (which I and others shamelessly ignored). I must say that Alaska looks like an albatross in this map.

Monday, November 08, 2004

Making your own maps...

Since I first posted the cartograms, many people have emailed me asking if I could modify the cartograms to do various kinds of displays. Although changing color schemes is easy, changing the map itself is not (since I didn't generate the polygons for the map myself).

Anthony Robinson, a master's student at Penn State, works on various aspects of geographic visualization within the geoVISTA center, and has provided a toolkit developed by folks at Penn State that allows you to download Election 2004 results and play with them inside an exploratory analysis tool. He also provides the election data to work with !

Definitely check it out if you want to create your own maps; the folks at Penn State have been doing great work in scientific visualization for a while.

Sunday, November 07, 2004

Color schemes

One of the side discussions around the purple map was: what is a proper color scheme to color such a map, beyond just a red-blue-purple mode. Many ideas were suggested, and some of them are reflected on the page, but till a commenter mentioned it, I didn't realize that cartographers have studied this problem in great depth.

Cindy Brewer, a cartographer at Penn State, has designed a beautiful web site that helps with coloring choropleths (the technical term for maps where each region is colored with a single color). You decide what kind of data you have and how many categories of colors you needs, and the flash applet generates a color scheme, as well as color values in a variety of color maps (RGB/CMYK/etc).

Using a bivariate 10-value divergent color scheme (color spreads out from the middle of the data range), here is the Election 2004 cartogram:



For more maps, visit my new cartogram page.

More cartograms and some histograms

Gartner, Shalizi and Newman have another cartogram, made using a different ("diffusion") approach.



They also had some histograms of vote totals on this page, which they took down because of some anomalous values. I got the idea to do something similar, and here is the result (click for the big pic).



Friday, November 05, 2004

The 'Purple Haze', revisited.

[Update: Nov 8]: More color schemes (one based on colorbrewer.org) and a toolkit to make your own maps !

By now, Robert Vanderbei's purple map of the voting counts in Election 2004 has crisscrossed the internet several times. On his web page, he answers some questions about these maps:
Can you warp the counties so that each county's area is proportional to its total vote count? Such warped maps are called cartograms. There are already several of these at the state-by-state level on the web. I haven't seen any at the county-by-county level. A few years ago I collaborated briefly with David Dobkin and Stephen North on algorithms for producing cartograms. I can say that making a cartogram with so many individual elements (counties) would be very difficult.
Cartograms are indeed hard to compute: this is an interesting geometric problem, as Jeff Erickson also points out.

As it turns out, the place I work at does cartograms, and quite well at that ! Stephen North (the one mentioned above) and colleagues have developed methods for computing cartograms and I used their approach to create a cartogram of the election results. I used the data from a paper by Daniel Keim, Stephen North, and Christian Panse that uses the medial axis to construct a cartogram.

[UPDATE: the original maps had a problem in labelling Nevada because of a mismatch between county names in the two data sets: the new maps shown here reflect the correction. Thanks to the commenter who pointed this out.]

I used a variety of color schemes: for more information, and larger pics, see my new cartograms page. (note: for formatting purposes I used HTML to scale the images; if you download the image, it will be bigger)

  1. A Vanderbei-like color scheme, where the relative proportion of votes for Bush or Kerry turns the county red or blue respectively: purple regions are roughly even.



  2. The winner-take-all color scheme: a county that Bush won is marked in red, and one that Kerry one is marked in blue.


  3. Red-blue (suggested by a poster on the blog). Start with a baseline of white for equal vote-share. as the vote percentage for the winner increases, increase the strength of red or blue appropriately.



  4. Grayscale (suggested by Kathryn Myronuk). Start with a baseline of white for equal vote-share. As the vote percentage for the winner increases, make the color grayer.


  5. ROYGBIV (suggested by Kathryn Myronuk). Again, white is neutral, but go towards the R side or the V side of the color spectrum depending on who wins and how much (thresholds at .7, 0.6, .53)




It takes a while getting used to a cartogram, and having to ensure that counties still touch after being inflated can make the problem quite difficult to solve while still retaining an overall representative shape. However, it is interesting to see how large California and the North-east really are in context, and how the entire middle-west of the country shrinks.

As far as I know, the complexity of computing a cartogram (where the typical formulation would be to minimize some error metric on the discrepancy between the area of a region and weight associated with it) is unknown, and is probably NP-hard (I once thought I had a proof for NP-hardness for rectangular cartograms, but it foundered). However, much of the challenge in cartogram design comes from trying to balance accuracy and aesthetics; the map when distorted should still look like the original map !

For more on this, check out the AT&T Info Viz page on spatial data transformation.

Thursday, November 04, 2004

Unicode is cool

This is my name, typed in Hindi using the INSCRIPT keyboard.

सुरेश वेंकटसुब्रमणियन
and in Tamil:

சுரேஷ் வெங்கடசுப்ரமணியன்


In order to do this earlier, I needed all kinds of LaTeX goo...

Incidentally, this site has all kinds of useful information on entering and reading characters from different languages.

Aside: I just realized that Safari mangles the fonts: Hindi slightly, but the Tamil is completely unrecognizable. I can understand that the mapping table for Tamil is not loaded, but the Hindi rendering has the right map, but is wrongly rendered. Interesting...

Monday, November 01, 2004

Another e-publishing model

Another example of how ease of publishing on the internet is making new models of academic discourse possible: Brad DeLong points to the Encyclopedia of Economic and Business History, a new article collection that
is designed to provide students and laymen with high quality reference articles in the field. Articles for the Online Encyclopedia are written by experts, screened by a group of authorities, and carefully edited. A distinguished Advisory Board recommends entry topics, assists in the selection of authors, and defines the project's scope.
This places it somewhat below a survey journal in terms of scholarly status, but above other web resources for laymen like Wikipedia or Mathworld or even ScienceWorld in terms of reliability and academic rigor.

It's a laudible endeavour. I suspect it works well in this area because economics have a tradition of public writings in addition to their academic work (as evidenced by the number of economics blogs). Given the levels of economic illiteracy among the public, and the importance of economic issues in making policy, such a resource can be of immense benefit for those of us trying to follow the often arcane and impenetrable discussions that swirl around issues like outsourcing, the economy, social security and tax cuts.

Disqus for The Geomblog