Beyond the article

This is a guest post from Adam Dinsmore, Evaluation Officer at the Wellcome Trust.

The penultimate session of 2:AM Amsterdam focused on efforts to develop means of tracking the use of research outputs other than the peer-reviewed article, be they products of the internet age like software and code, or antecedents to it like datasets and books.

The session began with Josh Borrow of Durham University, who presented a blueprint for assessing the societal impact achieved by public engagement with science developed in partnership with Pedro Russo of Leiden Observatory. Josh began by echoing a theme from Simon Singh’s keynote address – that those researchers who don’t wish to engage with the public shouldn’t have to – but followed by saying that those who do wish to engage should be appropriately supported and rewarded.

At present this doesn’t seem to be the case. When asked about factors which impede them from engaging with the public, researchers most commonly cited a lack of time, suggesting that time spent engaging with the public doesn’t contribute to career development in the same way that research or teaching does. While research has become progressively more accessible to the public through outreach activities and the Open Access movement, academia is yet to put proper incentives in place to reward those who make themselves and their work more discoverable.

As a partial solution, Josh suggested a move towards a system of public engagement mentorship within academia, analogous to the student-supervisor relationships already present in research. In Josh’s proposed four-point model (illustrated below) researchers and their mentors would keep detailed records of the planning and implementation of their public engagement activities, thereby facilitating improved post-hoc evaluation. Researchers would also be encouraged to build up public engagement portfolios which could be presented as evidence of public good to funding and tenure boards.

Figure taken from A Blueprint for Assessing Societal Impact Through Public Engagement (Borrow & Russo, 2015).

Josh ended his presentation with an appeal to the altmetric community to help public engagement practitioners develop useful metrics to help incentivise public engagement among researchers (and provided a link to an arXiv paper for those wanting to read more).

Next on stage was Robin Haunschild of the Max Planck Institute for Solid State Research, who presented the results of a detailed study of Mendeley bookmarking activity across 1.2m DOIs published in 2012. Robin presented his results as answers to three explicit research questions.

  1. Are there differences and similarities between disciplines in bookmarking papers?
  2. How do researchers in different career stages differ in terms of bookmarking papers?
  3. Are there patterns of similar readership between specific countries?

To probe these questions Robin applied the community-finding algorithm of Blondel, Guillaume, Lambiotte, and Lefebvre (2008) to his dataset, from which four distinct disciplinary communities arose. These were labelled by Robin as i) biology and geo-sciences, ii) social science and humanities, iii) bio-medical sciences, iv) natural science and engineering. In all four groups the category with the addition ‘miscellaneous’ was the most commonly bookmarked.
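For readers curious about the method, the Blondel et al. algorithm is better known as the Louvain method for community detection. The sketch below shows how such a co-readership network might be clustered in Python using networkx; the category names, edge weights, and parameters are entirely illustrative assumptions, not Robin’s actual data.

  # Illustrative sketch of Louvain community detection (Blondel et al., 2008).
  # The categories and co-readership weights are made up for demonstration;
  # Robin's analysis used real Mendeley readership data across 1.2m DOIs.
  import networkx as nx
  from networkx.algorithms.community import louvain_communities

  # Nodes are subject categories; edge weights count how often papers from both
  # categories are bookmarked by the same Mendeley user.
  edges = [
      ("Biology", "Earth Sciences", 120),
      ("Medicine", "Biochemistry", 150),
      ("Physics", "Engineering", 90),
      ("Sociology", "Psychology", 80),
      ("Biology", "Biochemistry", 40),
  ]
  G = nx.Graph()
  G.add_weighted_edges_from(edges)

  # Each detected community approximates a disciplinary cluster.
  for i, members in enumerate(louvain_communities(G, weight="weight", seed=42), start=1):
      print(f"Community {i}: {sorted(members)}")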

Next Robin showed that students and early-career researchers accounted for the majority of Mendeley readership activity in his sample, including masters students, PhDs, and postdocs. These were also the most commonly interconnected communities, though all career stage groups were connected to some extent.

Regarding patterns of Mendeley use in different countries, the Blondel algorithm appeared broadly to group countries according to development. Four communities of nations were identified. The first (in order of size) contained 53 nations, of which a majority were members of the OECD (including the US, UK, Germany, France, Japan, and Canada, as well as Russia and China). Next was a smaller group including Brazil, Mexico, and Norway, followed by a group of 10 countries of which the largest nodes were Nigeria and Niger, and a fourth group containing only two countries.

Robin was followed by Sunje Dallmeier-Tiessen of CERN, who presented case studies of two tools designed to leverage the potential of altmetrics to track “anything but the paper” in the ostensibly disparate fields of social science and high energy physics. Sunje’s work is an attempt to move beyond what she calls the article-data paradigm, citing CERN’s upcoming release of several thousand Monte Carlo simulations as an example of knowledge transfer with little existing infrastructure to track their use, and therefore to incentivise further openness.

The first case study detailed the Dataverse data repository currently in operation at Harvard University. Most of the items in the repository belong to the social sciences, though it is open to all researchers and efforts are underway to encourage use by other disciplines. The Dataverse is able to integrate with collaborative workspaces and analytical software such as ROpenSci.

The second tool described by Sunje was an analysis preservation tool currently being developed by CERN, which counts the use and re-use of non-traditional objects (NTOs; like the code underlying the Monte Carlo simulations mentioned above). The tool currently focuses mainly on citation of the NTOs, as there appears to be little demand for altmetrics (though Sunje wondered whether providing altmetrics might cultivate demand among researchers).

Martin Fenner of DataCite then took to the 2:AM stage, initially flanked by a slide set written in Comic Sans as the result of a losing bet made in an Amsterdamian drinkery the previous evening. Martin presented the early results of the ‘Making Data Count’ project, a joint undertaking of California Digital Library, PLOS, and DataOne.

Some initial scoping work surveyed research managers regarding the types of metric they would like access to, with citations and downloads rated highest. The team therefore worked to adapt Lagotto (PLOS’s open source metric-gathering software) to collect citation and download stats for NTOs without DOIs in the DataOne repository network. As datasets in particular are seldom cited in the reference lists of academic articles – often appearing as URLs in the main body of the article instead – the tool had to be configured to search whole articles.
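As a rough illustration of that kind of full-text matching (the regex and example identifiers below are my own assumptions, not Lagotto’s actual rules), a dataset mention buried in an article body might be picked up like this:

  # Sketch: scan an article's body (not just its reference list) for dataset DOIs.
  # The identifier pattern and example data are illustrative only.
  import re

  DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+', re.IGNORECASE)

  def find_dataset_mentions(article_text: str, known_dataset_dois: set[str]) -> set[str]:
      """Return the known dataset DOIs that appear anywhere in the article text."""
      found = {m.group(0).rstrip(".,;)") for m in DOI_PATTERN.finditer(article_text)}
      return {doi for doi in found if doi.lower() in known_dataset_dois}

  # Example: a dataset cited inline as a URL rather than as a formal reference.
  body = "Data were obtained from https://doi.org/10.5063/F1ABC123 (accessed 2015)."
  print(find_dataset_mentions(body, {"10.5063/f1abc123"}))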

The tool was written to collect two sets of usage stats: those which excluded ‘good’ machines like ruby and java clients in accordance with COUNTER regulations, and those which didn’t (‘bad’ bots were removed from both counts). Both sets of usage data were very similar, suggesting that COUNTER compliance likely isn’t necessary when preparing dataset usage statistics. Martin concluded by mentioning the group’s plans to integrate the tool with other client applications and turn it into a service after the research project is concluded.
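The two counts could be produced along the lines of the sketch below; the user-agent lists here are illustrative stand-ins for the COUNTER robots list, not the project’s real configuration.

  # Sketch of the two usage counts described above; user-agent lists are illustrative.
  BAD_BOTS = ("googlebot", "bingbot", "ahrefsbot")             # always excluded
  GOOD_MACHINES = ("ruby", "java", "python-requests", "curl")  # excluded only in the strict count

  def count_downloads(log_entries):
      """Return (strict_count, lenient_count) for a list of (user_agent, dataset_id) events."""
      strict = lenient = 0
      for user_agent, _dataset_id in log_entries:
          ua = user_agent.lower()
          if any(bot in ua for bot in BAD_BOTS):
              continue                      # 'bad' bots removed from both counts
          lenient += 1                      # count that keeps 'good' machine clients
          if not any(client in ua for client in GOOD_MACHINES):
              strict += 1                   # COUNTER-style count excluding machine clients
      return strict, lenient

  events = [("Mozilla/5.0", "doi:10.5063/x"), ("Ruby/2.2", "doi:10.5063/x"), ("Googlebot/2.1", "doi:10.5063/x")]
  print(count_downloads(events))  # (1, 2)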

The session literally and figuratively built to a crescendo as Martijn Roelandse of Springer spoke to the audience about BookMetrix, a free service developed by Springer with assistance from Altmetric, which provides data on the attention received by books. Books (including monographs and conference proceedings) have always been more important to certain parts of the social sciences and humanities than peer reviewed articles, but until now little has been done to help researchers provide evidence of the reach that these outputs have.

Martijn began by showing BookMetrix’s submission to the 2015 ALPSP Awards – a 2 minute video featuring short descriptions of BookMetrix’s various functions intercut with clips from La maquina de escribir (a piece of orchestral music featuring a flamboyant soloist rapping at a typewriter, presumably working on a new submission to Springer).

Still from La maquina de escribir by L. Anderson (soloist: Alfredo Anaya), LIO EN LOS GRANDES ALMACENES, featured in Martijn’s talk.

Some of the data coming out of the service has been surprising. Across the 230,000 books being tracked by BookMetrix the average number of citations is 20 and the citation half-life is somewhere between 20 and 30 years. ‘Old’ content (i.e. that published before 2000) still appears to accrue significant readership and citations. The service also allows users to benchmark disciplines against each other to compare rates of book citations, and permits analysis at the chapter level as well as the book level.

As scholars continue to find new and innovative ways to share the outputs of their research with users via the web, research managers must respond by adequately tracking this behaviour and providing worthy incentives for researchers wishing to make their work as discoverable as possible. The talks in this session gave an interesting glimpse into how this is already being done.


Altmetrics manifesto: 5 years on

This is a guest post contributed by Bianca Kramer, subject specialist in Life Sciences and Medicine at Utrecht University Library.

2010 – the year of the Deepwater Horizon oil spill, Wikileaks, and the Arab spring. The year social media really exploded. And the year the Altmetrics manifesto was published online.

On the occasion of the manifesto’s 5th anniversary there was a huge cake, and the additional treat of having all 4 founders of the manifesto (Jason Priem, Dario Taraborelli, Cameron Neylon and Paul Groth) on stage together for the closing session of the 2:AM Altmetrics Conference. With Jennifer Lin guiding the discussion, they talked about altmetrics’ past, the direction it has taken so far, and their hopes for its future.

Photo credit: Guus van den Brekel (@digicmb) http://ow.ly/i/dANiJ (re-used with permission)

After some bantering about the dirty secrets surrounding the Manifesto’s inception (Jason and Dario not being quite sure how to shape it: a movement, an organization, a petition, a hashtag? Cameron thinking of it as “something on the web, might be useful to add my name to, probably not going anywhere…”, and Paul making the grand suggestion to publish it on the Scholarly Kitchen), the discussion quickly turned to more serious matters.

Asked about the impact altmetrics has had, both for society and scholarship, all four emphasized the community that has evolved around the concept of altmetrics, and all that this community has done: the many grants awarded to develop the application of altmetrics, the companies that have been founded (and in some cases acquired by large players) to provide services, and the attention it has garnered from the field of bibliometrics, adding a tremendous amount of expertise and resulting in an explosion in the scientific literature on altmetrics. We are now at the point where people are embarrassed to admit they use the impact factor in assessment, and other forms of output measurement are being asked for, e.g. by the NIH. This is not solely to the credit of altmetrics, but clearly a lot has changed in these five years. The time might be right for a Kuhnian revolution in the field of scientometrics.

However, there was a strong feeling among the founders that altmetrics has not delivered on its promise if it is mainly used as a different way of counting. With people publishing more, and more different types of output, the real importance of altmetrics lies in its potential to filter, connect and tell stories. The data on how people interact with research outputs can be seen as a form of person-centred peer review and, rather than just used as another measurement, could be used for filtering and recommendation of information. This is a concept that was also explored during the hackday preceding the conference.

Altmetrics can also show us how information travels, how knowledge travels. A signal made up of a tweet, followed by a Mendeley read and then a citation is different from a Facebook like, a mention in mainstream media and finally, a policy change. Altmetrics can make these signals visible and enable us to look at the network properties of this flow of information, from a much wider perspective than only that of citations. One step in this direction is the Mendeley reader networks presented earlier at the conference.

Thus, rather than just changing the way we measure, altmetrics can be used to create knowledge, connect people and bring science forwards. Maybe that’s the real revolution we should be striving for.

As Jennifer concluded at the end of the session, the work of the altmetrics founders is not done. Our work as a community is not done.

See you in another 5 years at 7:AM!

[disclaimer: while I’ve tried to incorporate many things that were said during the panel, this blogpost reflects my own impression and interpretation of the session]


The 2:AM hack day

This is a guest post contributed by Shane Preece, Developer at Altmetric.

A hack day is when a group of people, who rarely or never get to work together, have some interest in common and can spend a day tackling a problem. On the best days, real developers get matched with real users, and then they build something together.

These quick prototypes are usually made open source, or at least publicly visible. From there the work can be built on if enough enthusiasm can be drummed up.

At the 2:AM hackday, we ended up with two projects. (A third was started, though unfortunately its team had to leave before it was completed.)

The first was a tool heavily based on the past work of Manos Tsagkias, who gave us the code to generate music recommendations based on a set of Twitter data. In our case though, we made some tweaks to work with academic outputs. Given all the tweets of the people you follow on Twitter, we took all the links to articles and decided which articles you’d be most interested in reading – anything that slipped through your social networking cracks. “Most interested” boiled down to “which articles were talked about the most”, but any number of interesting metrics could be added.

The promise of this tool is that it’ll help you find media you should be reading based on your own selection of experts, rather than anything pushed into the mainstream based solely on the Matthew effect.
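A back-of-the-envelope version of that ranking logic might look like the sketch below; the URL filter and tweet data are illustrative, and the actual prototype is in the 2AM-hack repository mentioned at the end of this post.

  # Sketch of the hack's idea: pull article links out of your followees' tweets
  # and rank them by how often they were shared. The data and filter are made up.
  import re
  from collections import Counter

  URL_PATTERN = re.compile(r"https?://\S+")

  def recommend_articles(tweets, top_n=5):
      """tweets: iterable of tweet texts from accounts you follow."""
      counts = Counter()
      for text in tweets:
          for url in URL_PATTERN.findall(text):
              if "doi.org" in url or "arxiv.org" in url:   # crude "is this an article?" filter
                  counts[url.rstrip(".,)")] += 1
      return counts.most_common(top_n)     # "most interested" == most talked about

  tweets = [
      "Great read: https://doi.org/10.1234/example.5678",
      "Everyone is discussing https://doi.org/10.1234/example.5678 today",
      "New preprint https://arxiv.org/abs/1502.05701",
  ]
  print(recommend_articles(tweets))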

The second tool showed off a recent initiative by Mozilla: open badges. This project was demonstrated by Todd Carter. These are badges, similar to the work of Amy Brand and Liz Allen, which show off the role of each “author” of a paper. Data scientists can be labeled as such, and the supervisor, key researcher, and finance manager are clearly identified too. People trying to make a name for themselves in a particular field will find it easier now that they can label themselves.

[Screenshot: an author profile showing contributor-role badges, from the 2:AM hack]

Both tools are prototypes in the strictest terms – a lot of the functionality is hardcoded. The Twitter project doesn’t actually call out to Twitter just yet – it’s just a hardcoded list of my followers! But both hack day tools are available for anyone to pick up and run with.

Meanwhile, there was an R workshop going on next door, for those who were interested in learning the basics of a programming language designed for statistics. This is what surprised me the most: many people with very little technical background were sitting down for a day of quite technical work. It goes to show that there are many more areas than just “computer science” where knowing how to write software is useful.

Everyone left the day with a new skill, hopefully. And they definitely left with a bunch of new friends.

You can find all of the outputs of the day on GitHub: 2AM-hack.

Altmetrics in the library session

This is a guest post contributed by Ian Mulvany.

Use of Altmetrics in US-based academic libraries
Stacy Konkiel – Altmetric, Sarah Sutton – Emporia State University & Michael Levine-Clark, University of Denver

Stacy presented the results of a wide ranging survey of librarians. They delved into questions such as how aware are librarians of ALMs in comparison to other metrics? How are they using ALMs? Are they being used to enhance library services? The core question they were interested in is are people actually using ALMs in the library at all? It’s easy to create a story around how you might use ALMs in the library, e.g. to understand usage, or for collection development, but are they being used?

They asked every librarian working in a specific class of research library in the US. They sent a survey to 1300 people, and got responses from 400 (which is an astonishingly high response rate for a survey).

The scholarly communication and support librarians had a higher level of awareness of these tools, but the baseline was that about 30% of libraries responded with familiarity with ALMs, where baseline familiarity with things like citations was at about 80–90%. (It was surprising to me that citation awareness was not close to 100%.)

Apart from awareness they also asked how often libraries were likely to discuss these kinds of things when providing reference services. ALMs were brought up rarely. (It would be great to do a longitudinal study on this to determine the rate of change and uptake in this community).

For collections development the usage by libraries is very low, about 5%; the most used metrics are usage statistics (40%). (Of course I’d argue that usage is kind of an ALM. This also raises the question of what the 60% of people who don’t even use usage statistics are using when doing collections development.)

This survey forms a good baseline, but as with any complex issue it begins to raise many interesting questions.

Stacy starts asking questions of the room. We find out that about 1/2 of the audience are librarians. No one in the audience thinks that ALMs should not be used for collection development. There might be a selection bias in the audience.

Q&A

Q: When working in the context of a library where there are deals, is there not a responsibility on the publisher to present some of this information, even at the journal level? A: Yes.

Understanding impact through alternative metrics, library-based assessment services
Kristi Holmes, Director, Galter Health Sciences Library at Northwestern University, Feinberg School of Medicine

There is a highly motivated audience within libraries to plug in to this space. The Northwestern library NUCATS has a focus on supporting translational medicine.

In 2000 it was taking 18 years to get from discovery to dissemination. By 2015 that had come down to 7 years. We are getting smarter about this, but there is still room for improvement.

The NIH provides the largest program for translational medicine. There is a lot of pressure to get these awards.

At NUCATS they are measuring a lot of things, for example the influence of a research output, e.g. time to publication, number of technology transfer products, ROI of pilot awards, the number of collaborations. They are trying to do this in a meaningful way.

They want to understand both productivity as well as impact, and they need great data to be able to do that. They want to know who is paying attention, what conversations are happening around a work.

Even when looking at data from a great source like Scopus, there is still a lot of missing data – the finding that led to the work, collaborations, who is citing the work. There is a lot of low hanging fruit here.

What is impact? There are a lot of definitions, what is critical is the context. It’s not just the paper or the tweets, but understanding what it leads to. If a paper leads to a new methodology, a new standard of care, if it gets used in the med school curriculum – these are very impactful. If you can bill for the work this is also meaningful. These can serve as indicators that there is a change in the way that we are delivering healthcare. It’s messy, but it can be really meaningful.

They are trying to build an ecosystem with the library as a partner. The library is a trusted neutral space, they have a tradition of service and support, the people in the library really work for the mission of the organisation.

They launched a metrics and impact core. They help people track their publications. They are helping their researchers put their NIH biosketch together. They have helped hundreds of researchers with this.

They call it a core because this is terminology that resonates with the people who they are serving.

They are using tools like Altmetric.com and Plum Analytics, along with bibliometric analysis and social network analysis. They are using a lot of surveys.

They found the case studies that were produced as part of the REF magnificent, and they want to make that approach work at their site.

They are creating dashboards to provide insight over a large number of facets of what is happening with research in the library.

They are mainly trying to shine a light on what is going on.

You need to know who cares, and who the people are whose perspectives you need to make sure are being heard. You need to know who is going to be the champion in your organisation. Find out what’s missing. Ask what you can do today!

They have a Google group – res-impact@googlegroups.com. There will be a MOSAIC meeting in Toronto with the MLA in May 2016.

Q&A

Q: can you say more about your stakeholders?

A: It really is about getting out and having conversations. The dean is generally the last person to know; they work a lot with their students and junior faculty. Junior faculty are highly motivated to explain why what they are doing is worthwhile.

Q: Can you show whether ALMs add anything to PubMed or not?

A: everyone cares about improved health, and it seems like there really are good data in this space, e.g. patient care organisations. Medical libraries are like their own little thing, people are very motivated and also under a lot of publication and impact pressure.

Altmetrics opportunities for librarians
Wouter Gerritsma, Deputy Librarian, Digital Services & Innovation at Vrije Universiteit Amsterdam

Librarians have a natural role to play in the altmetrics space.

Wouter discusses where you are likely to encounter ALMs today. Basically, all over the place: researchers often find ALMs on journal and article pages. If you are using a Primo discovery system ALM indicators are integrated. DSpace can also integrate this. Web of Science provides usage data. Scopus has also introduced a whole dashboard. Wouter finds that he can’t escape these in a library environment. He needs to understand these so that he can explain them to his users.

The first role of librarians is library outreach. His experience is to start small and gain experience. It’s important to know that ALMs are not only about social networks.

He also recommends that if you make presentations on this topic you share them with the wider community.

The field is in rapid development, and we should follow it and see what is happening in this space.

He wants libraries to be the managers of research information systems within the universities. There is a big transition happening in Dutch universities. The focus on RIM systems that exists in Europe is different to that in the US. There is an opportunity to use these platforms as a location for aggregating usage and ALM data, and there should also be connections to IRs.

He likes the idea of getting rid of the pure concept of the Institutional Repository, and think of something bigger.

He likes to talk about the Institutional Bibliography.

The challenge is how do you make use of that collection of information? Collecting the information without making wider use of it is a wasted opportunity. You have to make use of the collection that you have, and you can then start to do analytics on this data. There are opportunities to do recommendations, citation analysis, visualisations, ALMs, bibliometrics. These are all built on top of a comprehensive collection of data.

What is needed is for librarians to develop new skill sets. Get your CRIS in good shape; only when you have a good collection of data can you start to do the analysis.

What we have not mentioned in the room yet is that we collect ALMs at a single article level. We want to aggregate to a researcher. It’s a sensitive subject but people are already doing this based only on citations. You can also roll up to the department or university level.

We need to break beyond looking at only peer reviewed articles. These are of course the mainstay of the output, but don’t forget about books, conference proceedings, theses, and non-peer-reviewed output. Collecting ALMs on these items is an important challenge.

The way to do this is to use the DOI. ORCID is also critical. All Dutch universities have ISNIs, but they don’t know about it; they have VIAF, but they don’t know about it. They are being asked to fill in ORCIDs now.

Don’t collect ALMs for peer reviewed articles only.

Stay away from predictors of citations; it’s about allowing researchers to tell the story of their work. Those stories are important because we are increasingly looking for evidence that sits alongside citations, and not only citations.

Q&A

Q: in what department has story telling worked best?

A: In the Netherlands and in the REF they are looking for societal impact. ALMs can give information around things like where news articles have been written about research, and that helps in creating stories that describe societal impact. (In fact Wouter somewhat sidestepped the question, and it would be great to see some solid case studies presented at a future meeting.)

Q: What about things that are not even published, like software?

A: Github can give you usage.

Alternative metrics in Dutch university libraries
Alenka Princic, TU Delft Library, Netherlands

She likes to think of these as CoMetrics, or CoMet, or complementary metrics (not conmetrics).

The role of the library is changing from being a collection provider to a partner in science.

Research assessment is becoming a new service area for librarians. The librarians are the key group who are using these bibliometrics. In Delft they do not have a dedicated analyst for this.

At Delft researchers are publishing for impact.

In Delft they are doing some bibliometric analysis, but it’s somewhat ad-hoc. They have a research support portal, and there is information in this portal to help researchers understand how to improve the footprint and impact of their research. This is an initiative that is in development, and they aim to expand the services that they provide.

For ALMs it is currently bottom up. They are looking mostly at the major tools, and they are experimenting and creating pilot programs. They are keen to do this in collaboration with others.

Within the Dutch libraries – with 10 respondents from 13 Dutch libraries – looking at this qualitative data, what they can say is that:

  • About 1/3 have bibliometrics services rolled out for individuals
  • About 1/3 have these services for group assessment
  • About 1/3 don’t have these in place

For ALMs

  • none have this in place on the institutional level.

General awareness of the tools is quite good, but actual usage is low at this time; however, the low numbers in this survey are hard to interpret.

There are a lot of people who are experimenting and observing. Quite a few have deferred looking at this until they implement a new CRIS (current research information system).

In terms of future vision, about 1/3 have not had the time to create a future vision for ALMs. 1/3 have said that it needs maturity, but has potential.

The speed is considered a great benefit of ALMs. It might be able to quantify the success in achieving goals set by the researcher, specifically around reaching target groups or audiences.

It could be part of an open peer review system. Can evaluate non-traditional outputs. It could also be used to demonstrate exclusiveness of an institution, and could provide additional data for strategic purposes, e.g. how international an organisation is, how is the institution doing on open science indicators.

Q&A

Q: why are Dutch Libraries not doing workshops in engaging early stage researchers on these topics?

A: does not see the topic as so black and white, in that it’s not the case that they don’t do anything at all. There is a lot of engagement and advice on science engagement. Perhaps it’s not solely the role of science communication and solely the role of the librarian, but sits in the middle?

Q: Have we got past what we call this thing? Is it really just part of telling the story, and what you call it is about who you are speaking to.

A: The speaker agrees.

Altmetrics as indicators of economic and social impact

This is a guest contribution from Isabella Peters.

This session focused on the extent to which altmetrics are capable of indicating economic and social impact. Anup Kumar Das (Jawaharlal Nehru University, India) talked about “Altmetrics and the changing societal needs of research communications at R&D centres in an emerging country: A case study of India”. He presented the ways in which the best research university in India supports its researchers in building up altmetrics skills. Interestingly, until 2010 India had not used any citations or altmetrics for research evaluation and funding decisions. Also, there are no general strategies and no awareness of appropriate means for science communication in India. Still, universities do not work with communications officers and the like. Here Anup sees a chance for documentation officers, research officers, and information scientists to be engaged in taking that role. Right now Anup and colleagues train researchers how to set up social media accounts to promote their research, and they also use listservs and Facebook and Google groups to disseminate the outcomes of the university. In order to guide readers to the content, a blog (Ccp-jnu.blogspot.in), Twitter account (@indiasts), and audio archives (mixcloud.com/cssp_jnu) are in their repertoire as well. They often rely on papers and other research output stored in one of India’s 66 open access repositories (e.g., shodhganga.inflibnet.ac.in). Granting access to content via open access repositories and combining them with altmetrics information is the only way to overcome the lack of references to India’s scientific articles, thinks Anup.

Similar approaches have recently been started in Singapore. Theng Yin Leng (Nanyang Technological University, Singapore) introduced the project “Altmetrics: Rethinking and Exploring new ways of measuring research outputs” (SRIE Award No. NRF2014-NRF-SRIE001-019), in which she and colleagues plan to develop a dashboard helping researchers and institutions monitor the impact scientific publications have on the (social) web. Training researchers to effectively and responsibly communicate scholarly results to interested peer groups is also part of the project, as is research on algorithms to derive metrics from social media.

Next, Lauren Ashby (SAGE) and Mathias Astell (Nature Publishing Group) shed light on “The empty chair at the altmetrics table” and discussed the absence of educational impact metrics and a framework for their creation. They made the case for journals, books, and other types of research output and their educational impact on people, for example in teaching and learning in the field of nursing. What they have found is that there are certain types of publications that are used by a variety of audiences whereas there are others which address very specific needs of a small group of people. Journals, for example, are similarly often used by practitioners, students, teachers, and researchers whereas textbooks are of greater value to practitioners. However, scientists often do not get credit for publishing textbooks, no dedicated metrics are at hand, and, thus, incentives for developing these publications are low. Lauren and Mathias proposed to use counts from syllabi and reading lists as well as publisher usage statistics and university or public library holdings to overcome this lack of indicators. This would also raise global awareness of usage of scholarly outputs in education.

Who is actually consuming and disseminating scientific publications on the web has also been studied by Juan Pablo Alperin (Simon Fraser University, Canada) and was presented in “Evolving altmetrics to capture impact outside the academy”. He constructed an automated tool (a Twitter bot) to send tweets to users who had tweeted a link to publications indexed in SciELO (scielo.org). The tweet contained the question “Are you affiliated with a university?”, which was answered by 5% of those contacted. Apparently, 36% of those are not affiliated with a university, although one might have expected that people tweeting scientific articles are professional researchers (i.e., employed at a university). Instead it was, amongst others, podcasters, associations, patient groups, a restaurant owner, and unemployed people who read research articles for self-enhancement who tweeted the article URLs. Juan Pablo says that this finding explains the low correlations between citation numbers and tweets which are often found: more than a third of twitterers will never cite formally in scientific publications. Hence, we still need to better understand who produces social media metrics and what that means for indicator use.
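Schematically, such a bot boils down to asking one question per account that shared a SciELO link and then tallying the replies; the sketch below is my own illustration with a stubbed-out send function, not Juan Pablo’s actual implementation.

  # Schematic sketch of the survey bot described above; send_tweet() is a stub.
  QUESTION = "Are you affiliated with a university?"

  def send_tweet(handle: str, text: str) -> None:
      # Stub: a real bot would call the Twitter API here.
      print(f"@{handle} {text}")

  def ask_affiliation(tweets_with_links):
      """tweets_with_links: iterable of (handle, tweeted_url) pairs."""
      contacted = set()
      for handle, url in tweets_with_links:
          if "scielo" in url and handle not in contacted:   # one question per account
              send_tweet(handle, QUESTION)
              contacted.add(handle)
      return contacted

  def percent_unaffiliated(replies):
      """replies: dict of handle -> True/False for 'affiliated with a university?'."""
      if not replies:
          return 0.0
      return 100 * sum(1 for v in replies.values() if not v) / len(replies)

  ask_affiliation([("reader1", "http://www.scielo.br/article/xyz")])
  print(percent_unaffiliated({"reader1": False}))  # share of respondents with no affiliation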

Taking up this last point, and since there has been a vivid discussion on what altmetrics actually mean and reflect, Stefanie Haustein (Université de Montréal, Canada) and Rodrigo Costas (CWTS Leiden, the Netherlands) had a look at theoretical frameworks which might help make sense of altmetrics. In their talk on “Citation theories and their application to altmetrics” Stefanie and Rodrigo especially stressed that the heterogeneity of available data from social media platforms (e.g., recommendations, microblogging etc.) and the different concepts underlying their affordances make it difficult to have a clear definition of altmetrics. However, they believe that established frameworks borrowed from bibliometrics can guide altmetrics methodologies and the interpretation of results. Also, the theories highlight the heterogeneity of actions performed on the social media platforms. For example they found that the Normative Theory can be applied to services like F1000 and Mendeley but not to Twitter because of its brevity and diverse user groups. Moreover, theories can be classified according to the purpose of (alt)metrics use, i.e. research evaluation (e.g., Normative Theory or Social Constructivist Theory) or content analysis and mapping (e.g., Concept Symbols). More details on the elaborated thoughts on citation theories and their application to altmetrics can be found in the full paper available at arXiv (http://arxiv.org/abs/1502.05701).

The session showed that there is a diversity of altmetrics initiatives spread all over the world but that those projects differ in their state of maturity. In general the uptake of altmetrics indicators for several target groups and use cases has increased and interest in altmetrics research is high. All speakers were unified in their call for concerted efforts on better understanding what altmetrics are about and for pragmatic approaches to standardize altmetrics indicators. Unfortunately, there was no time for questions during the conference but the authors signalled that they can be reached via diverse social media channels as well as email after #2AMconf.

Science outreach – the good, the bad, and the pointless

This is a guest post contributed by James Hardcastle, Research Manager at Taylor & Francis.

Let me start this brief blog by saying I’m afraid it won’t do justice to Simon’s delivery, humour and anecdotes – an entertaining session all round. Although you unfortunately can’t catch up on his session on the conference YouTube channel due to licensing restrictions, I hope the below will at least give you an idea of what took place.

Simon’s major theme during his lecture was that not everything we do in terms of outreach works: some expensive projects have little impact and cheap projects have great returns. On the side of bad outreach were E=MC2 the ballet, funded by the Institute of Physics, and Lab in a Lorry – expensive projects that ultimately engage few people. On the good side of outreach, Sceptics in the Pub, Numberphile and The Training Partnership all came in for praise. The linking factors between these are that they are generally dirt cheap, generally profitable, and generally grass roots.

We are starting to spend big money on science outreach – how do we assess success and how do we allocate money? Give money to schools and let the free market decide? Use teachers as the unit of effectiveness: assuming one teacher costs €50,000, is the project better than hiring a teacher? Within science outreach there is a lack of criticism, so some foolish projects get funding. We are drawn to unproven, new, radical ideas, particularly those that link to the arts – could the money be better spent elsewhere? Ultimately, how much money should we spend on science outreach? In science communication we have poor ideas and too much money, the opposite of science, where we have lots of ideas and no money.

During the panel there were two key themes. Researchers should focus on research, and not all researchers want to be science communicators, nor should they be spending their time and resources on this. To be a good science communicator takes practice, time and effort; few academics will give an amazing lecture to 6th form students the first time they do it. Outreach shouldn’t be seen as a box-ticking exercise, something required to keep funders happy: either it won’t get done or it will get done badly. However, as research groups are growing there is more chance that one member of the research group will be keen on communication outside the lab.

Overall there was recognition that science communication is still important and is no longer limited to talking in schools, but now includes blogs and videos – though most scientists are better at science than science outreach.

2:AM Session 1: Standards in altmetrics

Blogging of this year’s conference has begun! This first post is contributed by Natalia Madjarevic.

The 2:AM conference kicked off with a session on Standards in altmetrics, and the first speaker, Geoff Bilder (Director of Strategic Initiatives at CrossRef), began by discussing the emergence of altmetrics to help track the evolving scholarly record and made the case for altmetrics data standards. He described receiving an early email from the team at eLife discussing altmetrics: “[How can we] agree to do things in similar ways, so that the data is more comparable?” And how could CrossRef help? Bilder called for alt-standards for altmetrics, and for the treatment of data surrounding the research process in the same way as research data: open, comparable, auditable and quotable. CrossRef’s DOI Event Tracker pilot will be available in 2016 [bit.do/DETinfo].

Next up, Zohreh Zahedi, of CWTS-Leiden University, shared the latest findings from the Thomson Reuters-supported project analysing altmetrics data across a number of providers: Mendeley, Altmetric.com and Lagotto (used by PLOS). The project looked at the altmetrics for the same set of 30,000 DOIs from 2013 publications (15k pulled from Web of Science and 15k from CrossRef), with the data extraction conducted at the same date and time across each provider. The study investigated specific source coverage across providers and found varying results, sharing possible reasons for altmetrics data inconsistencies. Zahedi called for increased transparency in altmetrics data collection processes and, furthermore, data transparency from altmetrics sources (e.g. Twitter, Facebook).

Martin Fenner (DataCite) provided an update on the NISO Alternative Assessment Metrics (Altmetrics) Initiative, currently in Phase II, with several groups working on the five topics identified in Phase I. The working group topics can be found here. All groups are currently finalising draft documents for public comment, to be made available in February 2016 and finalised in Spring 2016: some as standards, best practices, recommendations, definitions and codes of conduct – hopefully ready for discussion at 3:AM!

Finally, Gregg Gordon (SSRN) discussed the findings of a recent PLOS Biology paper: The Question of Data Integrity in Article-Level Metrics. The study conducted an in-depth analysis of article-level metrics and gaming across SSRN and PLOS outputs. In the case of SSRN, the study found gaming of PDF downloads by monitoring user IP ranges, but saw a 70% decrease in potentially fraudulent downloads after adding a pop-up warning message to the site. Gordon closed by highlighting the importance of clean and auditable altmetrics data to ensure emerging metrics are trusted and used by the academic community.
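As a purely illustrative heuristic in that spirit (the threshold and /24 grouping are assumptions of mine, not SSRN’s actual detection rules), flagging implausibly concentrated downloads might look like this:

  # Illustrative heuristic: flag IP ranges with implausibly many downloads of one paper.
  from collections import Counter

  def flag_suspicious_downloads(events, per_paper_threshold=50):
      """events: iterable of (ip_address, paper_id); returns flagged {(subnet, paper_id): count}."""
      counts = Counter()
      for ip, paper_id in events:
          subnet = ".".join(ip.split(".")[:3])      # group by /24 range
          counts[(subnet, paper_id)] += 1
      return {key: n for key, n in counts.items() if n > per_paper_threshold}

  events = [("192.0.2." + str(i % 5), "ssrn-123") for i in range(300)]
  print(flag_suspicious_downloads(events))  # {('192.0.2', 'ssrn-123'): 300}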

This session offered a pretty comprehensive overview of where we’re at with altmetrics in terms of establishing standards, in-depth data analysis, and the importance of auditable data in order to increase researcher confidence in altmetrics.

It’s conference week!

After months of preparation, we’re finally there: over the next few days the 2:AM hack day, conference, and altmetrics15 workshop will take place in Amsterdam. We’ve got some great presentations and workshops lined up – take a look at the full schedule to see what’s happening when.

There’ll be lots happening over the week – stay tuned to Twitter and follow #2amconf for all the latest updates.

We’ll also be live streaming via our YouTube channel (so you don’t need to miss out even if you couldn’t make it in person!) and our guest bloggers will be sharing their take on each session here.

Consistency challenges across altmetrics data providers/aggregators

This is a guest post from Zohreh Zahedi, PhD candidate at the Centre for Science and Technology Studies (CWTS) of Leiden University in the Netherlands.

At the 1:AM conference in London last year, a proposal put forward by myself, Martin Fenner and Rodrigo Costas on “studying consistency across altmetrics providers” received a 1:AM project grant, provided by Thomson Reuters. The main focus of the project is to explore consistency across altmetrics providers and aggregators for the same set of publications.

Altmetric.com, the open source solution Lagotto and Mendeley.com participated in the study, while other altmetrics aggregators (Plum Analytics and Impact Story) didn’t, due to difficulties such as agreeing on a random sample and its size and extracting the metrics at exactly the same date/time.

By consistency we mean having reasonably similar scores for the same DOI per source across different altmetrics providers/aggregators. For example, if Altmetric.com and Lagotto report the same number of readers as the source (Mendeley) for the same DOI, they are considered to be consistent. This is very critical for understanding any potential similarities or differences in metrics across different altmetric aggregators. This work is an extension of a 2014 study using a smaller sample of 1,000 DOIs, all coming from one publisher (PLOS). In that study we showed that altmetrics providers are inconsistent, in particular regarding Facebook counts and numbers of tweets (http://dx.doi.org/10.6084/m9.figshare.1041821).

Data & method:

For this purpose, we collected a random sample of 30,000 DOIs obtained from Crossref (15,000) and WoS (15,000), all with a 2013 publication date. We controlled the time by extracting the metrics for the data set at the same date/time, on July 23 2015 starting at 2 PM, using the Mendeley REST API, the Altmetric.com dump file and the Lagotto open source application used by PLOS. Common sources (Facebook, Twitter, Mendeley, CiteULike and Reddit) across the different providers/aggregators were analyzed and compared for the overlapping DOIs.
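Conceptually, the comparison reduces to checking whether every provider reports the same count for a given DOI and source; the sketch below uses made-up numbers to show the shape of that check, not the study’s real data.

  # Sketch of a per-DOI, per-source consistency check across providers/aggregators.
  def find_inconsistencies(records, tolerance=0):
      """records: dict of DOI -> {source: {provider: count}}; yields disagreements."""
      for doi, sources in records.items():
          for source, provider_counts in sources.items():
              values = list(provider_counts.values())
              if max(values) - min(values) > tolerance:
                  yield doi, source, provider_counts

  sample = {
      "10.1234/example.1": {
          "mendeley": {"Mendeley API": 42, "Altmetric.com": 42, "Lagotto": 42},   # consistent
          "twitter":  {"Altmetric.com": 17, "Lagotto": 4},                        # inconsistent
      },
  }
  for doi, source, counts in find_inconsistencies(sample):
      print(doi, source, counts)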

Preliminary results:

Several discrepancies/inconsistencies among these altmetrics data providers in reporting metrics for the same data sets have been found. In contrast to our previous study in 2014, Mendeley readership counts were very similar between the two aggregators, and to the data coming directly from Mendeley. One important reason is a major update of the Mendeley API between the two studies. On the other hand, for Facebook counts and tweets we found similar results to before: there are still huge differences between Altmetric.com and Lagotto in collecting and reporting these metrics.

Possible reasons for inconsistency:

We have summarized here some of the possible reasons we identified for inconsistencies across the different providers such as:

  • Differences in reporting metrics (aggregated vs. raw score/public vs. private posts)
  • Different methodologies in collecting and processing metrics (Twitter API)
  • Different updates: possible time lags in the data collection or updating issues
  • Using different identifiers (DOI, PMID, arXiv id) for tracking metrics
  • Difficulties in specifying the publication date (for example different publication dates between WoS and Crossref) influence data collection
  • Accessibility issues (problems resolving DOIs to URLs, cookie problems, access denials) differ across different publisher platforms

All in all, these problems emphasize the need to adhere to best practices in altmetric data collection, both by altmetric providers/aggregators and by publishers. For this we need to develop standards, guidelines and recommendations to introduce transparency and consistency across providers/aggregators.

Fortunately, the National Information Standards Organization (NISO) initiated a working group on altmetrics data quality in early 2015, which aims to develop clear guidelines for the collection, processing, dissemination and reuse of altmetric data, and which can benefit from a general discussion of the results of this project. Much work needs to be done!

2:AM Amsterdam: Setting the standard

This is a guest post from Adam Dinsmore, a member of the Wellcome Trust’s Strategy Division. Adam describes the importance of rigorous data standards and infrastructure to funders who wish to use altmetrics to better understand their portfolios, and looks ahead to the standards session at next month’s event.

It was a moment of some personal and professional pride last September when the Wellcome Trust played host to the inaugural altmetrics meeting (1:AM London). As a large funder of biomedical research the Trust is always keen to better understand the attention received by the outputs of the work that it supports, and over the two days delegates were given much cause to consider the potential of altmetrics to help us gather intelligence on the dissemination of scholarly works.

Among the biggest developments in the UK’s metrics debate since 1:AM was the publication of The Metric Tide [1], a three-volume report detailing the findings of the Higher Education Funding Council for England’s (HEFCE) Independent Review of the Role of Metrics in Research Assessment and Management (or IROTROMIRAAM for short). The review, commissioned by then Minister of State for Universities and Science David Willetts in Spring 2014, sought to bring together thinking on the use of metrics in higher education from across the UK’s researchscape. A call for evidence launched in June 2014 attracted 153 responses from funders, HEIs, metric providers, publishers, librarians, and individual academics.

Attendees at last year’s 1:AM event heard an update on the review’s progress from the report’s eventual lead author James Wilsdon (viewable on our YouTube channel), who described the group’s aims to consider whether metrics might support a research environment which encourages excellence, and crucially how their improper use might promote inefficient research practices and hierarchies.

The full report expounds further, crystallising more than a year of thoughtful consultation into an evidence base from which several important recommendations proceed. Among them is a call for greater interoperability between the systems used to document the progression of research – from funding application to scholarly inquiry to publication and re-use – and the development of appropriate identifiers, standards, and semantics to minimise any resulting friction. Fortunately for those with a vested interest in an efficient research ecosystem (i.e. everyone) some very clever people are working to make these systems a reality.

It’s important that the systems used to track the proliferation of scholarly work are able to interconnect and speak a common language. Image: 200 pair telephone cable model of corpus callosum by Brewbrooks (CC-BY-2.0).

In two weeks the second annual altmetrics meeting (2:AM Amsterdam) – which this year is being hosted at the Amsterdam Science Park – will open with a session on Standards in Altmetrics, featuring a presentation from Geoff Bilder on a newly announced CrossRef service potentially able to track activity surrounding research works from any web source. First piloted in Spring 2014, the DOI Event Tracker will capture online interactions with any scholarly work for which a DOI can be generated (articles, datasets, code) such as bookmarks, comments, social shares, and citations, and store these data in a centralised clearing house accessible to anyone. Critically, CrossRef have stated that all of the resultant data will be transparent and auditable, and made openly available for free via a CC-0 “no rights reserved” license. The service is currently slated for launch in 2016.

The session will also feature an update on the National Information Standards Organization’s (NISO) Alternative Assessment Metrics (Altmetrics) Initiative. Since 2013 NISO has been exploring ways to build trust in metrics by establishing precise, universal vocabularies around altmetrics to ensure that the data produced by them mean the same things to all who use them. In 2015 NISO convened three working groups tasked with the development of specific definitions of altmetrics, calculation methodologies for specific output types, and strategies to improve the quality of the data made available by altmetric providers.

The continuing work of these groups speaks to the challenges inherent in establishing consistent, transparent data provision across the altmetric landscape. Zohreh Zahedi of CWTS-Leiden University will present the findings of a study of data collection consistency among three altmetrics providers, namely Altmetric.com, Mendeley, and Lagotto. The study examined data provided by these vendors for a random sample of 30,000 academic articles, finding several discrepancies both in terms of coverage of sources like Twitter, CiteULike, and Reddit and the scores derived from them. These findings provide an important indication that the use of altmetric data remains laden with caveats regarding the context in which they were derived and exported.

It is heartening that real attention is being paid to the issues of interoperability and consistency often raised by funders, publishers, and HEIs, and drawn together by the Metric Tide report. The presentations from CrossRef, NISO, and the CWTS-Leiden group are bound to stimulate much thought and discussion, which will then be built upon in a standards-themed workshop session later in the day. These discussions portend a time when a rigorous data infrastructure allows altmetrics to approach their hitherto unrealised potential. I look forward to hearing about it in Amsterdam!

[1] Wilsdon, J., et al. (2015). The Metric Tide: Report of the Independent Review of the Role of Metrics in Research Assessment and Management. DOI: 10.13140/RG.2.1.4929.1363