Friday, March 21, 2014

Proposed open access symbol

I have proposed a new Unicode symbol to denote true open access, applied for instance to scholarly literature, in much the same way that © and ® denote copyright and registered trademarks respectively. The proposed symbol is an encircled lower-case letter a, specifically in a font where the a has a 'tail', as in Arial or Times, for instance, (a), and not as in a font like Century Gothic (without the 'tail', as it were).

My proposal should appear on the Unicode discussion list (http://www.unicode.org/consortium/distlist-unicode.html), and I am soliciting support and input from technically minded as well as legally minded open access supporters.
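For context, Unicode already encodes a generic circled lower-case a at U+24D0, but its glyph shape (with or without the 'tail') depends entirely on the font, which is precisely why a dedicated codepoint with a defined reference glyph would be needed. A minimal Python sketch of the existing characters involved:

```python
import unicodedata

# The existing generic circled 'a'. Its rendering is font-dependent,
# so it cannot guarantee the 'tailed' (double-storey) form.
circled_a = "\u24D0"  # ⓐ

print(circled_a, hex(ord(circled_a)), unicodedata.name(circled_a))
# ⓐ 0x24d0 CIRCLED LATIN SMALL LETTER A

# The established legal symbols the proposal mimics:
for sym in "©®":
    print(sym, hex(ord(sym)), unicodedata.name(sym))
# © 0xa9 COPYRIGHT SIGN
# ® 0xae REGISTERED SIGN
```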

This is the symbol I have in mind:

[image: the proposed encircled 'a' symbol]
Jan Velterop

Wednesday, December 11, 2013

Lo-fun and hi-fun

I have recently been talking to some major (and minor) publishers about what they could do with regard to open access, given the increasing demand, even if converting to ‘gold’ open access models is, in their view, not realistic for them. I suggested that they make human-readable copies of articles freely accessible immediately upon publication. Access to human-readable articles would of course not satisfy everybody, but it would satisfy the ‘green’ OA crowd, if I take Stevan Harnad to be their prime spokesperson. He dismisses machine-readability and reuse as distractions from his strategy of ‘green’ open access, and he even supports embargoes, as long as articles are self-archived in institutional repositories, which is his primary goal. Human-readable final published versions available directly upon publication would be an improvement on that. It would also likely satisfy the occasional reader from the general public who wishes to access a few scientific articles.

How could those publishers possibly agree to this? Well, I told them, they could reconsider their view that there is a fundamental difference between the published version of an article and the final, peer-reviewed and accepted author manuscript (their justification for allowing the author manuscript to be self-archived). There may well be such a difference, of course, and there often is, but it is not likely to be a material one in the eyes of most readers. Instead of making much (more than there usually is) of any differences in content, they could distinguish between low-functionality and high-functionality versions of the final published article: the ‘lo-fun’ version just suitable for human reading (the print-on-paper analogue), and the ‘hi-fun’ version suitable for machine-reading and text- and data-mining, endowed with all the enrichment, semantic and otherwise, that the technology of today makes possible. The ‘lo-fun’ version could then be made freely available immediately upon publication, on the assumption that it would be unlikely to undermine subscriptions, and the ‘hi-fun’ version could be had on subscription. Librarians would of course not be satisfied with such a ‘solution’.

Although initially greeted with interest, the idea soon hit a stone wall. No one explicitly said that they would never do this, but the subsequent radio silence made me conclude that, among the publishers I talked with, a fear may have emerged: that a system with immediate open access to a ‘lo-fun’ version, accompanied by a ‘hi-fun’ version paid for by subscriptions, would expose how modest the publisher-added value is, both in people’s perceptions and in what they would be prepared to pay for it. That fear is probably justified, I have to grant them.

There is no doubt that formal publication adds value to scientific articles. The success of the ‘gold’ open access publishers, where authors or their funders pay good money for the service of formal publication, is testament to that. There must be a difference – of perception at the very least – between formally published material and articles ‘published’ by simply depositing them in an open repository. That added value largely consists of two elements: 1) publisher-mediated pre-publication peer review, and 2) technical ‘production’, i.e. making the article sufficiently standardised, correctly coded (e.g. no ß where a β is intended), ‘internet- and archive-proof’, rendered into several user formats, such as PDF, HTML and Mobile, aesthetically pleasing where possible, interoperable, search-engine optimised, and so forth. The first element is mostly performed by the scientific community, without payment, and although the publisher organises it, that doesn’t amount to a substantial publisher-added value in the common perception. The second element, on the other hand, is true value added by the publisher, is seen as such by reasonable people, and it is entirely justifiable for a publisher to expect to be paid for it. There are some authors who could do this ‘production’ themselves, but the vast majority make a dog’s dinner of it when they try.
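The ß/β example above is a real, checkable distinction: the two characters look alike in some fonts but are entirely different Unicode codepoints, which matters for search, text-mining and archiving. A small Python illustration:

```python
import unicodedata

sharp_s = "ß"  # U+00DF, German eszett
beta    = "β"  # U+03B2, Greek letter, as in 'β-carotene'

# Visually similar in some fonts, but distinct codepoints:
print(hex(ord(sharp_s)), unicodedata.name(sharp_s))
# 0xdf LATIN SMALL LETTER SHARP S
print(hex(ord(beta)), unicodedata.name(beta))
# 0x3b2 GREEK SMALL LETTER BETA

# A search or mining query for 'β-carotene' will miss a miscoded 'ß-carotene':
print("β-carotene" == "ß-carotene")  # False
```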

There is of course a third element in the equation: marketing. Marketing is responsible for brand and quality perception. Quality mainly comes from good authors choosing to submit to a journal. Getting those good authors to do that is in large part a function of marketing. The resulting brand identity, sometimes amounting to prestige, is also an added value that a self-published article, even if peer-reviewed, lacks. But alas, it is not commonly seen to be an important value-add that needs to be paid for.

Having 'lo-fun' and 'hi-fun' versions of articles makes the publishers’ real contribution explicit. That’s the rub, of course.

Back to ‘gold’, I’m afraid. Or rather, not so afraid, as ‘gold’ OA doesn’t have any of the drawbacks of ‘lo-fun’. Fortunately, ‘gold’ is increasingly proving to be a healthy, viable and sustainable business model for open access, at least as long as the scientific community sets so much store by publisher-mediated pre-publication peer review (see the previous post for my thoughts on that).

Jan Velterop

Tuesday, November 05, 2013

Essence of academic publishing

Let me start with a bit of context, all of which will be known, understood and widely discussed. The blame for the unaffordability of the ever-increasing amount of scholarly literature, be it because of high subscription prices or article processing fees for ‘gold’ open access, is often laid at the door of the publishers.

The blame, however, should be on the academic preoccupation with the imperative of publisher-mediated prepublication peer review (PPR).

Of course, publishers, subscription-based ones as well as open access outfits, have a business which depends to a very large degree on being the organisers of PPR, and few of them would like to see the imperative disappear. The ‘need’ – real or perceived – for publisher-mediated PPR in the academic ecosystem is the main raison d’être of most publishers. And it is responsible for most of their costs (personnel costs), even though the reviewing itself is actually carried out by academics, not publishers. The technical costs of publishing are but a fraction of that, at least for electronic publishing (print and its distribution are quite expensive, but should be seen as an optional service, not as part of the essence of academic publishing).

Despite being the imperative in academia, publisher-mediated PPR has flaws, to say the least. Among the causes for deep concern are its anonymity and general lack of transparency, its highly variable quality, and the unrealistic expectations of what peer review can possibly deliver in the first place. The increasing number of journal articles being submitted is not making the process of finding appropriate reviewers any easier, either.

Originally, PPR was a perfectly rational approach to ensuring that scarce resources were not spent on the expensive business of printing and distributing paper copies of articles that were not deemed worth that expense. Unfortunately, the rather subjective judgment needed for that approach led to unwelcome side effects, such as negative results not being published. In the era of electronic communication, with its very low marginal costs of dissemination, prepublication filtering seems anachronistic. Of course, initial technical costs of publishing each article remain, but the amounts involved are but a fraction of the costs per article of the traditional print-based system, and an even smaller fraction of the average revenues per article many publishers make.

Now, with the publishers’ argument of avoiding excessive costs of publishing largely gone, PPR is often presented as some sort of quality filter, protecting readers against unintentionally spending their valuable time and effort on unworthy literature. Researchers must be a naïve lot, given the protection they seem to need. The upshot of PPR seems to be that anything that is peer reviewed before publication, and does get through the gates, is to be regarded as proper, worthwhile, and relevant material. But is it? Can it be taken as read that everything in peer-reviewed publications is beyond doubt? Should a researcher be reassured by the fact that it has passed a number of filters that purport to keep scientific ‘rubbish’ out?

Of course they should. These filtering mechanisms are there for a reason. They diminish the need for critical thinking. Researchers should just believe what they read in ‘approved’ literature. They shouldn’t just question everything.

Or are these the wrong answers?

Isn’t it time that academics who rely on PPR ‘quality’ filters – and let us hope it’s a minority of them – stopped believing at face value what is presented in the ‘properly peer-reviewed and approved’ literature, and went back to the critical stance that is the hallmark of a true scientist: “why should I believe these results or these assertions?” The fact that an article is peer-reviewed in no way absolves researchers from applying professional skepticism to whatever they are reading. Further review, post-publication, remains necessary. It is part of the fundamentals of the scientific method.

So, what about this: a system in which authors discuss their manuscripts, in depth and critically, with a few people whom they can identify and accept as their peers, and then ask those people to put their names to the manuscript as ‘endorsers’. As long as some reasonable safeguards are in place to ensure that endorsers are genuine, serious and without undeclared conflicts of interest (e.g. they shouldn’t be recent colleagues at the same institution as the author, or be involved in the same collaborative project, or have been a co-author within, say, the last five years), the value of this kind of peer review – author-mediated PPR, if you wish – is unlikely to be any less than that of publisher-mediated PPR. In fact, it is likely to offer more value, if only due to its transparency and to the expected reduction in the cost of publishing. It doesn’t mean, of course, that the peer-endorsers should agree with all of the content of the articles they endorse. They merely endorse its publication. Steve Pettifer of the University of Manchester once presented a perfect example of this. He showed a quote from Alan Singleton about a peer reviewer’s report[1]:

"This is a remarkable result – in fact, I don’t believe it. However, I have examined the paper and can find no fault in the author’s methods and results. Thus I believe it should be published so that others may assess it and the conclusions and/or repeat the experiment to see whether the same results are achieved."

An author-mediated PPR-ed manuscript could subsequently be properly published, i.e. put in a few robust, preservation-proof formats, properly encoded with Unicode characters, uniquely identified and identifiable, time-stamped, citable in any reference format, suitable for human- and machine-reading, data extraction, reuse, deposit in open repositories, printing, and everything else that one might expect of a professionally produced publication, including a facility for post-publication commenting and review. That will cost, of course, but it will be a fraction of the current costs of publication, be they paid for via subscriptions, article processing charges, or subsidies. Good for the affordability of open access publishing for minimally funded authors, e.g. in the social sciences and humanities, and for the publication of negative results that, though very useful, hardly get a chance in the current system.
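What 'properly published' might mean in machine terms can be sketched as a minimal metadata record. The field names and values below are purely illustrative, not a formal schema:

```python
import json
from datetime import datetime, timezone

# Illustrative sketch of the machine-facing properties of a 'properly
# published' article; field names and the DOI are hypothetical.
record = {
    "identifier": "doi:10.1234/example.5678",   # unique, resolvable ID
    "timestamp": datetime(2013, 11, 5, tzinfo=timezone.utc).isoformat(),
    "formats": ["PDF", "HTML", "XML"],          # robust, preservation-proof renditions
    "encoding": "UTF-8",                        # correctly coded Unicode text
    "machine_readable": True,                   # suitable for mining and reuse
    "post_publication_reviews": [],             # facility for later commenting
}

# Being citable in any reference style presupposes exactly this kind of
# structured, unambiguous metadata:
print(json.dumps(record, indent=2))
```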

Comments welcome.

Jan Velterop


[1] Singleton, A. The pain of rejection. Learned Publishing, 24:162–163. doi:10.1087/20110301

Tuesday, February 05, 2013

Transitions, transitions


Although I am generally very skeptical of any form of exceptionalism, political, cultural, academic, or otherwise, I do think that scholarly publishing is quite different from professional and general non-fiction publishing. The difference is the relationship between authors and readers. That relationship is far more of a two-way affair for scholarly literature than for any other form of publishing.

Broad and open dissemination of research results, knowledge, and insights has always been the hallmark of science. When the Elseviers/Elzevirs (no relation to the current company of the same name, which was started by Mr. Robbers [his last name; I can’t help it] a century and a half after the Elsevier family stopped their business), among the first true ‘publishers’, started to publish scholarship, for example the writings of Erasmus, they used the technology of the day to spread knowledge as widely as was then possible.

In those days, publishing meant ‘to make public’. And ‘openness’ was primarily to do with escaping censorship. (Some members of the Elsevier family went as far as to establish a pseudonymous imprint, Pierre Marteau, in order to secure freedom from censorship). But openness in a wider sense — freedom from censorship as well as broad availability — has, together with peer-review, been a constituent part of what is understood by the notions of scholarship and science since the Enlightenment. Indeed, science can be seen as a process of continuous and open review, criticism, and revision, by people who understand the subject matter: ‘peers’.

The practicalities of dissemination in print dictated that funds be generated to defray the cost of publishing. And pre-publication peer review emerged as a way to limit the waste of precious paper and its distribution cost by weeding out what wasn’t up to the standards of scientific rigour and therefore not worth the expense needed to publish. The physical nature of books and journals, and of their transportation by stagecoach, train, ship, lorry, and the like, made it completely understandable and acceptable that scientific publications had to be paid for, usually by means of subscriptions. However, scientific information never really was a physical good. It only looked like one, because of the necessary physicality of the information carriers. The essence of science publishing was the service of making public. You paid for the service, though it felt like paying for something tangible.

The new technology of the internet, specifically the development of web browsers (remember Mosaic?), changed the publishing environment fundamentally. The need for carriers that had to be physically transported all but disappeared from the equation. The irresistible possibility of unrestrained openness emerged. But something else happened as well. With the disappearance of physical carriers of information, software, and the like, the perception of value changed. The psychology of paying for physical carriers, such as books, journals, CDs and DVDs, is very different from the psychology of paying for intangibles, such as binary strings downloaded from the web, with no other carrier than wire, optical cable, or even radio waves. The human expectation — need, even — for physical, tangible goods in exchange for payment is very strong, though not necessarily rational, especially where we have long been used to receiving physical goods in exchange for money. That is not to say that we aren’t prepared to value and pay for intangibles, like services. We do that all the time. But it has to be clear to us what exactly the value of a service is — something we reportedly find more difficult to judge than for physical goods.

This is a conundrum for science publishers. Carrying on with what they are used to, but then presented as a service and not ‘supported’ by physical goods any longer, can look very ‘thin’. Yet it is clear that the assistance publishers provide to the process of science communication is a service par excellence. Mainly to authors ('publish-or-perish') and less so to readers (‘read-or-rot’ isn’t a strong adage). Hence the author-side payment pioneered by open access publishers (Article Processing Charges, or APCs).

Although it would be desirable to make the transition to open access electronic publishing swiftly, the reality of inertia in the ‘system’ dictates that there be a transition period and method. This transition is sought in many different ways: new, born-OA journals that gradually attract more authors; hybrid journals that accept OA articles against author-side payment; ‘green’ mandates, which require authors to self-archive a copy of their published articles; unmediated, ‘informal’ publishing such as in arXiv; even publishing on blogs.

What may be an underestimated transition — and no doubt a controversial one — is a model (a kind of ‘freemium’ model?) that gradually changes from restrictive to ever more open, extending the ‘free’, ‘open’ element and reducing the features that have to be paid for by the user. I don’t think it is even recognized as a potential transition model at the moment, but that may mean missing opportunities. Let’s take a look at an example. If you don’t have a subscription, you can’t see the full text. However, where only a short time ago you saw only the title and the abstract, you now see those, plus keywords and the abbreviations used in the article, its outline in some detail, and all the figures with their captions (hint to authors: put as much of the essence of your paper in the captions). All useful information. It is not a great stretch to imagine that the references are added to what non-subscribers can see (indeed, some publishers already do that), and even the important single scientific assertions in an article, possibly in the form of ‘nanopublications’, on the way to eventual complete openness.
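A ‘nanopublication’ of the kind mentioned above is, at its core, a single subject-predicate-object assertion plus its provenance. The identifiers and values in this Python sketch are invented for illustration only:

```python
# Minimal sketch of a nanopublication-style assertion: one triple plus
# provenance. All URIs and values here are made-up examples.
assertion = ("ex:aspirin", "ex:inhibits", "ex:COX-2")

provenance = {
    "asserted_by": "ex:some-author",      # who made the claim
    "derived_from": "ex:some-article",    # the narrative it was mined from
    "evidence": "ex:some-assay",          # experimental support
}

nanopub = {"assertion": assertion, "provenance": provenance}

# Turtle-like rendering of the bare triple:
subject, predicate, obj = nanopub["assertion"]
print(f"{subject} {predicate} {obj} .")
```

The point of the provenance block is that a reader (or a machine) can always 'go back' from the bare assertion to who made it and on what evidence, which is what makes such assertions checkable outside their original narrative context.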

Of course, it is not the same as full, BOAI-compliant open access, but in areas where ‘ocular’ access is perhaps less important than the ability to use and recombine factual data found in the literature, it may provide important steps during what may otherwise be quite a protracted transition from toll-access to open access, from a model based on physical product analogies to one based on the provision of services that science needs.

Jan Velterop

Saturday, January 19, 2013

On knowledge sharing — #upgoerfive

This post was written with the #upgoerfive text editor, using only the 1,000 most common words in English.

At one time there was a man who some people thought was god. Other people thought he was sent to the world by god. This man had two water animals you could eat and five pieces of other food and he wanted the many people who were with him to have enough to eat. But two water animals and five other pieces of food were not enough for the people if they all had to eat. So the man who some people thought was god and others that he was sent by god, made the food last until all the people had had enough to eat. This was a wonder. The people saw this and did not know if they could believe what they saw. But when it seemed true that he had a power that no other men or women had, they believed the man was really god or sent by god, because he could do what other men could never do at all. This story became very well known. And many people believe it is about food.

But I think it is not about food. I think it is about food for thought. About what we know, not about what we eat. Because if we give food that we have to others, we do not have it anymore for us to eat. But if we tell others what we know, they know it, too, and we still know it as well. So we can not share our food and still have it all, but we can share what we know and still have it all. We should share what we know if it is good for us all. Especially people who work on knowing more and more every day, as their job. They are paid by us all to work in their jobs on knowing more and more, and they really should share what they come to know with us, and in such a way that we can understand it, too.

Jan Velterop

Tuesday, January 15, 2013

Imagine if funding bodies did this


There is apparently a widespread fear that, if a ‘gold’ (author-side paid) open access model for publishing scientific research is supported by funding bodies, the so-called article processing fees, paid by funders on behalf of authors, might see unbridled increases. This fear is not unwarranted if the issue is not addressed properly. If funders agree to pay whatever publishers charge, they undermine the potential for competition among publishers and give them an incentive to maximize their income, while at the same time removing any price sensitivity on the part of the publishing researcher. However, it is not very difficult to address this problem.

In order to avoid untrammeled article processing fee increases, funding bodies should foster competition amongst publishers, and create price sensitivity to article processing charges in researchers publishing their results.

Imagine if they did the following:
  • Require open access publishing of research results;
  • Include in any grants a fixed amount for publishing results in open access journals;
  • Allow researchers to spend either more or less than that amount on article processing charges, any surplus to be used for the research itself, or any shortfall to be paid from the research budget;
  • Require any excess paid over and above the fixed amount to be justified by the researcher to the funder;
  • Provide a fixed amount for more than one publication if the research project warrants that, but so that researchers have an incentive to limit the number of published articles instead of salami-slicing the results into as many articles as possible, again by giving them discretion over how the fixed amounts are spent. 
Jan Velterop

Sunday, September 09, 2012

'Pixels of information'

My friend Barend Mons wrote to me, and I think his letter is worth sharing on this blog. I checked with him, and he agreed.
Dear Jan,

I'm writing to you inspired by your remark that "OA is not a goal in itself but one means to an end: more effective knowledge discovery".

What we need for eScience is Open Information to support the knowledge discovery process. As eScience can be pictured as 'science that cannot be done without a computer', computer-reasonable information is the most important element to be 'open'.

You're right, Barend. That's why I think CC-BY is a necessary element of open access.
As we discussed many times before, computer reasoning and 'in silico' knowledge discovery leads essentially to 'hypotheses' not to final discoveries. There are two very important next steps. First, what I would call 'in cerebro' validation, mainly browsing the suggestions provided by computer algorithms mining the literature and 'validating' individual assertions (call them triples if you wish) in their original context. 'Who asserted it, where, based on what experimental evidence, assay...?' etc. In other words, why should I believe (in the context of my knowledge discovery process) this individual element of my 'hypothesis-graph' to be 'true' or 'valid'? Obviously in the end, the entire hypothesis put forward by a computer algorithm and 'pre'-validated by human reasoning based on 'what we collectively already know' needs to be experimentally proven (call it 'in origine' validation).

What I would like to discuss in a bit more depth is the 'in cerebro' part. For practical purposes I here define 'everything we collectively know', or at least what we have 'shared' as the 'explicitome' (I hope Jon Eisen doesn't include that in his 'bad -omes'), essentially a huge dynamic graph of 'nanopublications' or actually rather 'cardinal assertions' where identical, repetitive nanopublications have already been aggregated and assigned an 'evidence factor'.  Whenever a given assertion (connecting triple) is not a 'completely established fact' (the sort of assertion you repeat in a new narrative without the need to add a reference/citation) we will go to narrative text 'forever' to 'check the validity' in my opinion.

Major computer power is now exploited for various intelligent ways to infer the 'implicitome' of what we implicitly know (sorry, Jon, should you ever see this!), but triples captured in RDF are certainly no replacement for narrative in terms of reading a good reasoning, why conclusions are warranted, extensive description of materials and methods etc. So the 'validation' of triples outside their context will be a very important process in eScience for many decades to come. In fact your earlier metaphor of the 'minutes of science' fits perfectly in this model. 'Why would I believe this particular assertion'? ... Well, look in the minutes by whom, where and based on what evidence it was made'.

Now here is a very relevant part of the OA discussion: the time when some people thought that OA was a sort of charity model for scientific publishing is definitely over, with profitable OA publishers around us. The only real difference is: do we (the authors) pay up front, or do we refuse that (for whatever good reason, see below), so that the reader has to pay 'after the fact'? So let's first agree that there is no 'moral superiority', whatever that is, of OA over the traditional subscription model.

Not sure I agree, Barend. OK, let's leave morals out of it, but first of all, articles in subscription journals can also be made open access via the so-called 'green' route of depositing the accepted manuscript in an open repository; and secondly, OA at source, the so-called 'gold' route, is definitely, practically and transparently, the superior way to share scientific information with anyone who needs or wants it.

We have also seen the downsides of OA, for instance for researchers in developing countries, who may still have great difficulty finding the substantial fees to publish in the leading Open Access journals.

I believe, however, that we have a great paradigm shift right in front of us. Computer reasoning and ultralight 'RDF graphs' distributing the results to (inter alia) mobile devices will allow global open distribution of such 'pixels of information' at affordable cost, even in developing countries. An associated practice, obviously, will be to 'go and check' the validity of individual assertions in these graphs. That is exactly where the 'classical' narrative article will continue to have its great value. It is clear that reviewing, formatting, cross-linking and sustainably providing the 'minutes of science' is costly, and that the community will have to pay for this via various routes. I feel it is perfectly defensible that those articles for which the publishing costs have not been paid by the authors, and that are still being provided by classical publishing houses, should continue to 'have a price'. As long as all nanopublications (let's say the assertions representing the 'dry facts' contained in the narrative legacy, as well as data in databases) are exposed in Open (RDF) Spaces for people and computers to reason with, the knowledge discovery process will be enormously accelerated. Some people may still resent that they may have to pay (at least for some time to come) for narrative that was published following the 'don't pay now — subscribe later' adage. We obviously believe that the major players from the 'subscription age' have a responsibility, but also a very strong incentive, to develop new methods and business models that allow a smooth transition to eScience-supportive publication without becoming extinct before they can adapt.

Best,
Barend

Your views are certainly worth a serious and in-depth discussion, Barend. I invite readers of this blog to join in that discussion.

Jan Velterop