Less research is needed (blogs.plos.org)
Last year a professor in my department (astronomy) suggested that he and I write a similar tongue-in-cheek paper to be published on April 1. The idea was to promote a moratorium on new astronomical data for one year. This would give observers time to reduce all the data they've already collected and theorists time to catch up to the observers.
It's facetious, of course, but there was a serious point behind it. There is a tendency in science for a researcher to perform the same study over and over, just using larger or slightly modified data sets, simply because that's what he knows how to do. Most of the time these Version 2.0 studies just shrink the error bars on the result without telling anyone anything new.
Now, of course, sometimes interesting results do come from such things. But much more often, interesting results come from studies that attack a radically different problem or use a radically different approach. Science is a manpower-limited endeavor, not a data-limited one. Scientists have a finite amount of time to devote to research, and they have to choose which projects to work on. There is still a great deal of low-hanging fruit: projects that require relatively little funding and manpower and have the potential to yield genuinely new results. There are, for example, some really excellent projects being done with a telescope that is basically a commercial camera lens on a telescope mount [1]. The difficulty with these sorts of projects is that they require creativity, and that is hard to come by. I'm not faulting anyone, though; I'm not an especially creative researcher myself!
Part of the problem is that grant agencies have a strong bias towards funding incremental science. They say they favor breakthrough science over incremental science, but the projects that actually get funded tell a different story. It's hard to blame them, because no one knows a good way to predict breakthrough results. The problem is especially difficult for theorists: to write a compelling theory proposal, you basically have to have solved the problem already!
I've heard a number of solutions to these problems, but they're all about as compelling to me as a year-long data moratorium (which, to be fair, would indeed force the community to become more creative). Hmm, maybe I'll actually write up that paper for April 1, 2015.
[1] http://www.astronomy.ohio-state.edu/~assassin/index.shtml
Another problem is that funding agencies have a very strong bias in favor of large projects. Most EU money gets poured into big funding pools available only to research teams spanning three or more countries. It's incredibly difficult to propose a really novel, creative, innovative (whatever adjective you like) project that has 14 senior researchers and a small army of postdocs and grad students collaborating on it.
Those really novel projects tend to come from individual researchers working on an idea, and there's basically no money available for that. For some fields this isn't feasible, of course: you can't do high-energy experimental physics in your office with a laptop. But quite a lot of research could be done, at least partially, without massive amounts of support, and the system is set up in a way that more or less prevents that from happening.
This is also true in astronomy. The time allocation committee for the Hubble Space Telescope, for example, judges proposals solely on their scientific merit, without any regard to the amount of observing time being asked for. Since asking for more time carries no penalty in review, this of course leads to a strong bias towards very large projects.
There are so many issues raised here that it's hard to know where to start.
1. One area we could cut is useless data-mined correlation studies that show statistical significance (so long as you ignore that data mining has occurred) between action X and outcome Y - the sort where a retrospective study of 500,000 nurses finds that eating candied peanuts reduces prostate cancer by 15%. The rule of thumb for these studies is that unless the effect is 300% or greater (smoking and lung cancer is 1500%), the result is almost certain to be garbage (see the sketch after this list).
2. We need less “novel” research and more replication of past results. The whole scientific system is set up to reward novelty over accuracy. It is so bad that unless I have seen two independent groups repeat a result, I doubt it is real, no matter how famous the group.
3. We need to reward being right over being first. Right now groups rush papers out so they don't get scooped, and so they don't check their results as well as they should. I would personally like to remove the date from all scientific papers to stop these silly games - after all, if something is true, does it become less true because it was published last year rather than last week?
4. We need to reward people who put in the effort to replicate work. A simple proposal would be to give publication rights in the same journal to every group that replicated (or could not replicate) a study. If a study is published in Nature and you go to the effort of replicating it, then you should get an automatic Nature publication.
5. Stop scientists from holding on to raw data. In theory scientists are supposed to share their data, but in practice this doesn't happen very often. It should be possible to report groups that don't share data to the funding bodies, and if they are found not to be sharing (or to be sharing only some of the data), the group is banned from getting any new funding. It would only take a few bans to stop this immoral data hoarding.
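To make point 1 concrete, here is a minimal sketch (Python; the cohort framing and exposure count are invented) of why data mining manufactures "significant" correlations: test enough exposures against one outcome and, on pure noise, about 5% will clear p < 0.05 by chance alone.

    import random

    random.seed(42)

    N_FOODS = 200  # hypothetical number of exposures tested against one outcome

    # Under the null hypothesis (no real effect), a p-value is uniformly
    # distributed on [0, 1], so we can simulate a null study directly.
    p_values = [random.random() for _ in range(N_FOODS)]
    hits = [p for p in p_values if p < 0.05]

    print(f"{len(hits)} of {N_FOODS} null exposures look 'significant' at p < 0.05")
    # Expect roughly 10 spurious "discoveries" from noise alone.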
I couldn't agree more with almost everything you just said (especially number 2). My only potential issue is with number 5. While I agree that raw data should be shared for the sake of reproducibility and progress, I can also partially sympathize with investigators who put enormous time and effort into coordinating and running large studies and clinical trials.
If investigators were forced to release their raw data immediately, armies of other investigators would swoop in and scoop the original team on follow-on studies from the data. While this would certainly be great for science, it partially punishes the investigators for actually conducting the large trials. I'm not sure it would be worth the effort to conduct a large clinical trial and then get only 1-2 papers out of it (even if they went into NEJM / JAMA / Lancet, etc.).
What are your thoughts?
The purpose of 5 is to improve science, not ‘reward scientists’ [1]. If we moved to a system where raw data was shared automatically, the number of “exclusives” any group could get from a study would decline, but the value of each paper would go up. As long as everyone was sharing, I don't think funding bodies would stop funding groups willing to go to the effort of doing large studies. It is the funding that determines what research gets done, not how many papers a group can milk out of a study. It should be quality over quantity.
[1] For those outside of science: what happens now is that groups hold back their data and then use access to it to establish “collaborations” - basically, they will give you access to the data as long as you put their names on any resulting papers. The people with the data often don't contribute anything to the new publication other than access and their names - my old boss was an expert at this.
The ENCODE project maybe had the right idea here. The data was made public, but researchers who did not contribute to the data production could not use it in their own publications for a period of six months to a year.
Well, as we're already dreaming, why not stipulate that the person who gets credit for the breakthrough should be the person who gathered the data, not the person who analyzed it.
> On my first day in (laboratory) research, I was told that if there is a genuine and important phenomenon to be detected, it will become evident after taking no more than six readings from the instrument.
This is the reverse of a rule of thumb I find useful: if you wish to measure something and get an approximate picture of your uncertainty, measure it 7-8 times.
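As a toy illustration of that rule (the readings below are invented), a handful of repeated measurements gives you both an estimate and a rough handle on its uncertainty via the standard error of the mean:

    import math

    # Eight invented readings of, say, g in m/s^2.
    readings = [9.81, 9.79, 9.83, 9.80, 9.78, 9.82, 9.80, 9.81]

    n = len(readings)
    mean = sum(readings) / n
    # Sample standard deviation (Bessel's correction, n - 1).
    std = math.sqrt(sum((x - mean) ** 2 for x in readings) / (n - 1))
    # Standard error of the mean: the uncertainty on the estimate itself.
    sem = std / math.sqrt(n)

    print(f"estimate = {mean:.3f} +/- {sem:.3f}")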
The author's rule of thumb hinges delicately upon the definition of "readings", in particular upon the reach and precision of a given reading. I can look in the sky on dark nights and see Mercury, but even if I watch it through binoculars for years, I'll never resolve the "Genuine and Important" precession of its orbit [1], the first solid evidence for General Relativity.
Some important phenomena are subtle and rare. You can watch a liter of pure water for ~1500 years before you can expect a single neutrino from the Sun to interact and make a tiny flash of light [2].
[1] http://en.wikipedia.org/wiki/Tests_of_general_relativity#Cla...
Perhaps this is one of those white lies we tell to justify doing the right thing. A dishonest means to an honest end.
Is this maybe how researchers publish negative results without having to admit failure? We often complain about the dearth of published negative results. We talk about pre-registering studies and so forth.
It seems better to me for researchers to recast a negative result as an inconclusive positive result "requiring more study" than not to publish it at all. Just because there is a call for further research doesn't mean we have to do it.
> Despite consistent and repeated evidence that electronic patient record systems can be expensive, resource-hungry, failure-prone and unfit for purpose, we need more studies to ‘prove’ what we know to be the case: that replacing paper with technology will inevitably save money, improve health outcomes, assure safety and empower staff and patients.
Paper-based systems are also failure-prone and unfit for purpose. They just fail in familiar ways that the old guard has accepted as part of the business.
All systems fail at some point. Paper-based systems have the advantage of failing locally. Electronic systems have a horrible tendency to fail globally.
As the saying goes, to err is human, but to really foul things up requires a computer.
A bad electronic record system is probably worse than a paper-based one. A good electronic record system might be better than a paper-based one. The trouble is that quality is all too often not what wins you a big contract with a big institution, so I would not be surprised if the common electronic systems were the bad ones.
Agreed, but we are also heavily biased towards this view as a community. I think we should try to steer the conversation away from this particular point.
But dear god... some people manage to make some awful software.
Perhaps less primary research is needed, and more secondary research, i.e. more reviews.
It strikes me that making the scientific literature machine-parsable and query-able may help a great deal.
Currently the literature is "scraped" to produce scientific metadata, which is stored in databases such as PubMed. Of course, that's back to front. Experimental data, findings, methods, workflows, and so on should be stored in databases of some sort, and the "literature" produced by querying the data.
A pipe-dream, of course. But some steps have been taken towards something approaching this.
https://sdm.lbl.gov/sdmcenter/
http://authors.library.caltech.edu/28168/
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5...
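As a hypothetical sketch of what "literature produced by querying the data" might look like (the schema, DOIs, and numbers below are all invented; the real efforts linked above are far more sophisticated), a review question becomes a database query:

    import sqlite3

    # Invented schema: one row per reported finding.
    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE findings (
        paper_id TEXT, intervention TEXT, outcome TEXT,
        effect_size REAL, p_value REAL, sample_size INTEGER)""")
    db.executemany(
        "INSERT INTO findings VALUES (?, ?, ?, ?, ?, ?)",
        [("doi:10.xxxx/a", "drug_x", "mortality", -0.10, 0.04, 1200),
         ("doi:10.xxxx/b", "drug_x", "mortality", -0.02, 0.60, 3400)])

    # A "review" is then a query: every reported effect of drug_x on
    # mortality, largest studies first.
    for row in db.execute(
            "SELECT paper_id, effect_size, p_value, sample_size "
            "FROM findings WHERE intervention = 'drug_x' "
            "AND outcome = 'mortality' ORDER BY sample_size DESC"):
        print(row)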
I really disliked that post: http://slatestarcodex.com/2014/07/11/links-for-july-2014/#co...
Maybe smarter research is needed? It seems to me that the problem is similar to the one data science is trying to solve: how do we make sense of all this data?
Of course more research is still needed in many areas anyway.