For reasons I can't fathom, Internet Archive Scholar got attention today, a mass of it, painting it as a "new" service. Actually, it has been out there for about a year. BUT....

If beautifully structured access to academic citations by the millions is your bag or desperately needed tool, especially ones that are ONLY left in the Wayback Machine, you are in LUCK. And this will be your favorite day. Try it.

@textfiles How does it compare to Google Scholar? Similar name, so I was wondering if it was similar in intent and/or scope.

@textfiles Well of course it's better, that went without saying. No matter what it is, it *has* to be better. We are comparing Internet Archive to Google... There's almost never anything to worry about as to which of them is better. And honestly, I can't think of a situation where Google (as a company) might do a better job of anything the archive does.

@textfiles Can confirm, it is better, but UI/UX takes a bit of getting used to.

@JigmeDatse @textfiles we are a search engine over research publications, so pretty similar to Google Scholar in a lot of ways.

we have no current intention of building high-quality author profile pages, or computing "leaderboard" style citation metrics summaries, which are features of Google Scholar.

one difference is that all of our biblio metadata and code is openly available for reuse, and we have an open search API

@scholar All of those differences sound like positive differences. I did some looking very briefly, and other than being a little unfamiliar, and things working in ways that didn't "feel natural" it was only a few seconds of, "Oh, maybe this works this way" to get what I was wanting.

@textfiles Never been so excited to share something on work Teams :party:

@textfiles I have no idea how I missed this last year, especially since I subscribe to the IA's blog. Amended my post on Waxy.

@textfiles This is brilliant. I had no idea it existed. Thanks so much for this!


wow great to learn about this.

i tested it but didn't see some of my pubs so wondering about completeness

@BeTongLen @textfiles it isn't perfect, but getting better all the time!

by default we only display results that we have an accessible full text copy of. you can change the "availability" filter to "all records" to override this.

if you know of an open copy of a paper we don't have, you can click through to 'fatcat' and submit a "Save Paper Now" request for us to crawl and index it

@textfiles is it true you send material to be digitised to places like Philippines because the labour costs are lower?

@Loukas not all, but a majority percentage. Innodata in Cebu, a very legit company.

@Anarchy_How @textfiles it is possible to directly edit the catalog ("fatcat") by clicking through the green link below results.

if that is too much hassle, you can DM this account here on mastodon, chat in, or use the contact info on the about page


I read about this (that this exists) from Big Book of R the other night. The attention is news.

@textfiles Well, today I learned!! Very cool - do you know if there are any efforts underway to create something like the equivalent of the GS "profile" functionality where you can verify as yourself and then claim/de-duplicate/correct your own references? Maybe one that uses @ORCID_Org identifiers even...?

@melissaekline @textfiles @ORCID_Org we don't have any plans to take on author profile pages ourselves, though somebody could certainly build something like that on top of the catalog API. getting human names and de-duplication correct is hard and can be harmful if done wrong.

the fatcat catalog does have a concept of "creators" (eg, authors and editors), and can be edited by anybody to update author/paper linkages. it contains a lot of ORCID-based records, but is not very complete
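
For anyone wanting to build on the catalog API as described above, here is a minimal sketch of looking up a "creator" record by ORCID iD. It rests on assumptions not stated in this thread: that the API base URL is `https://api.fatcat.wiki/v0` and that the creator lookup endpoint accepts an `orcid` query parameter. The helper names `creator_lookup_url` and `fetch_creator` are made up for illustration. Check the fatcat API documentation before relying on any of it.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumption: public fatcat API base URL (not given in the thread).
FATCAT_API = "https://api.fatcat.wiki/v0"

# ORCID iDs are four groups of four characters; the last may be "X".
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def creator_lookup_url(orcid: str) -> str:
    """Build the (assumed) creator lookup URL for a well-formed ORCID iD."""
    if not ORCID_RE.match(orcid):
        raise ValueError(f"not a well-formed ORCID iD: {orcid!r}")
    query = urllib.parse.urlencode({"orcid": orcid})
    return f"{FATCAT_API}/creator/lookup?{query}"


def fetch_creator(orcid: str) -> dict:
    """Fetch the creator record as parsed JSON (requires network access)."""
    with urllib.request.urlopen(creator_lookup_url(orcid), timeout=30) as resp:
        return json.load(resp)
```

Since the catalog is "not very complete" for ORCID-based records, a lookup may simply fail with an HTTP error for an author who is not linked yet, so callers of `fetch_creator` should be prepared to handle `urllib.error.HTTPError`.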

@textfiles Maybe because Paul Ford mentioned it in this month's Wired in his piece "A Tweet Before Dying?"

@textfiles oops. Accidentally added two other replies I meant to be posts. Since deleted. If you saw them sorry for the confused look I’m sure they gave you.

@textfiles The Internet Archive Scholar is the best alternative to Google Scholar I have ever seen. Wow! Can I create an account and curate my publications?

@MarcusStensmyr @textfiles By the way @giorgiogilestro it looks like we are now co-authors! The author parsing software could use the assistance of manual curation, @textfiles (Giorgio F. Gilestro was the editor, not an author).

@albertcardona @MarcusStensmyr @textfiles @giorgiogilestro whoops! we have the "editor" metadata in the catalog:

but doesn't come through in search results. how would you expect this to display in search results? "(Ed.)" after the editor name? PubMed seems not to display; PLOS (publisher) shows in separate metadata box

@scholar @albertcardona @MarcusStensmyr @textfiles you don't need to show the editor's name at all. It's not important ( certainly not in this context )

@textfiles oh my god. this found, in one search, a paper on Soviet cybernetics from the 60s i could not find anywhere else. To be fair, their webpage says: "*This is a new service. Metadata is being improved and features have not been finalized.*" so it would be understandable if someone (like me) happened on the site randomly. we'd think it was a new service.

Internet Archive

A Mastodon Server for Internet Archive employees and Role Accounts (Announcements)