"One site, scholar.archive.org, has PDFs going back to the 18th century. It's empowering to look for this stuff instead of waiting for it to be socially discovered and jammed into my brain."
@ftrain
https://www.wired.com/story/tweet-dying-revolutionary-internet/
(h/t @jstogdill for the reference)
For reasons I can't fathom, Internet Archive Scholar got attention today, a mass of it, painting it as a "new" service. Actually, it has been out there for about a year. BUT....
If beautifully structured access to academic citations by the millions is your bag or desperately needed tool, especially ones that are ONLY left in the Wayback Machine, you are in LUCK. And this will be your favorite day. Try it.
Oh wow, my page and birdsite account get a shoutout in a recent academic article on English spelling reform. Internet Archive Scholar is a great resource to make available the fruits of academic research.
Pushed a fresh snapshot of fatcat metadata last week:
https://archive.org/download/fatcat_bulk_exports_2022-11-24
Hundreds of millions of paper, file, and journal records. More info about these dumps, and schema, at https://guide.fatcat.wiki/bulk_exports.html
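For anyone who wants to poke at one of these dumps locally, here is a minimal sketch of streaming a gzipped file record by record instead of loading it all into memory. It assumes the export holds one JSON object per line; the file name and the `release_year` field in the usage note are illustrative assumptions, so check the schema page above before relying on them.

```python
import gzip
import json
from collections import Counter


def iter_records(path):
    """Yield one decoded JSON object per non-empty line of a gzipped file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by_field(path, field):
    """Count how often each value of `field` appears; records lacking it count as None."""
    return Counter(record.get(field) for record in iter_records(path))
```

Something like `count_by_field("release_export.json.gz", "release_year")` (hypothetical file name) would then give a rough per-year tally without holding the dump in memory.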
Scholar is built on an open, editable bibliographic catalog: https://fatcat.wiki
Most of the records are automatically imported from our wonderful upstream sources, but any human can directly submit corrections and additions through the web interface or API. These submissions are then reviewed in the open before merging. The entire catalog is versioned and can be downloaded in bulk or synchronized using a "changelog" feed.
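To make the "changelog" idea concrete, here is a rough sketch of how a local mirror might catch up from such a feed. It is not the fatcat client: it assumes each changelog entry is a dict with an increasing integer `index`, and `apply_entry` is a hypothetical callback standing in for whatever the mirror does with an entry.

```python
def pending_entries(entries, last_seen_index):
    """Return changelog entries newer than last_seen_index, oldest first."""
    newer = [entry for entry in entries if entry["index"] > last_seen_index]
    return sorted(newer, key=lambda entry: entry["index"])


def sync(entries, last_seen_index, apply_entry):
    """Apply each pending entry in order and return the new high-water mark."""
    for entry in pending_entries(entries, last_seen_index):
        apply_entry(entry)
        last_seen_index = entry["index"]
    return last_seen_index
```

Storing the returned index between runs means each sync only touches entries that arrived since the previous one.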
You can learn more about editing at:
https://guide.fatcat.wiki/editing_quickstart.html
"his way" / bunch of dudes
Hemingwayesque: 104 hits https://scholar.archive.org/search?q=Hemingwayesque
Kiplingesque: 340 hits https://scholar.archive.org/search?q=Kiplingesque
Turneresque: 358 hits https://scholar.archive.org/search?q=Turneresque
Kafkaesque: 2,423 hits https://scholar.archive.org/search?q=Kafkaesque
via:
https://scholar.archive.org/work/iptlbacnkngpjgrcfeuxa7zvne
https://en.wiktionary.org/wiki/Category:English_words_suffixed_with_-esque
"his way" / bunch of dudes
Sinatraesque: 3 hits https://scholar.archive.org/search?q=Sinatraesque
Cocteauesque: 8 hits https://scholar.archive.org/search?q=Cocteauesque
Bowiesque: 7 hits https://scholar.archive.org/search?q=Bowiesque
Ramboesque: 20 hits https://scholar.archive.org/search?q=Ramboesque
Bergmanesque: 31 hits https://scholar.archive.org/search?q=Bergmanesque
Pynchonesque: 24 hits https://scholar.archive.org/search?q=Pynchonesque
Escheresque: 36 hits https://scholar.archive.org/search?q=Escheresque
Felliniesque: 48 hits https://scholar.archive.org/search?q=Felliniesque
McCarthyesque: 42 hits https://scholar.archive.org/search?q=McCarthyesque
Daliesque: 57 hits https://scholar.archive.org/search?q=Daliesque
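The links in these lists all follow the same pattern, so a short sketch can generate them for any batch of words. It only builds the URLs, using the `q` parameter visible in the links above; it does not fetch pages or hit counts, and the function names are made up for illustration.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://scholar.archive.org/search"


def search_url(term):
    """Return the search URL for one query string, percent-encoding it as needed."""
    return f"{SEARCH_URL}?{urlencode({'q': term})}"


def search_urls(terms):
    """Map each term to its search URL, preserving input order."""
    return {term: search_url(term) for term in terms}
```

For example, `search_url("Kafkaesque")` produces the same link as in the list above.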
python library
trafilatura (https://github.com/adbar/trafilatura) is a nice Python library that we use to extract article full text from HTML documents for indexing in Scholar. It has good accuracy and recall, works with "old" HTML (e.g., from web archives), and pulls out metadata like title, author, and date. There are lots of similar tools, mostly focused on news articles, and trafilatura is an improvement on them.
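For a feel of what this looks like in practice, here is a small sketch built around `trafilatura.extract`, which takes an HTML string and returns the main text (or None). This is not the Scholar indexing pipeline; the wrapper, its name, and the word count are illustrative assumptions, and the extractor is injectable so the helper can be tried without the library installed.

```python
def extract_article(html, extractor=None):
    """Return (text, word_count) for an HTML document, or (None, 0) if nothing is found.

    `extractor` is any callable taking an HTML string and returning text or None.
    When omitted, trafilatura.extract is used (requires `pip install trafilatura`).
    """
    if extractor is None:
        import trafilatura  # third-party; imported lazily so a stub can be used instead
        extractor = trafilatura.extract
    text = extractor(html)
    if not text:
        return None, 0
    return text, len(text.split())
```

Checking for an empty result matters with archived pages, since some captures contain no extractable article body.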
Thanks to Adrien Barbaresi for maintaining it!
Scholars continue to publish papers in Latin, well into the twenty-first century! Here is a snippet of Dennis Toscano's Master's thesis from the University of Kentucky (2016), contextualizing an anonymous poem, itself in Latin, from 1741:
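Opus cui titulus est "Carthago Indiarum obsessa sed non expugnata" est carmen divulgatum sine nomine auctoris saeculo duodevicesimo ad celebrandam victoriam quam Hispani a Britannis Carthagenae Indiarum anno...
(Roughly: The work titled "Carthago Indiarum obsessa sed non expugnata" is a poem, circulated without an author's name in the eighteenth century, celebrating the victory that the Spaniards [won] over the British at Cartagena de Indias in the year...)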
https://scholar.archive.org/work/wltkjjt7yjeyzpoghtdftrfuea
https://scholar.archive.org/search?q=lang%3Ala+year%3A%3E2000+%21journal%3Abohemica
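The search link above packs several filters into one percent-encoded `q` parameter. As a small sketch, the standard library can decode it back into readable terms; the function name is made up for illustration, and what each term means to the search engine is not something this code checks.

```python
from urllib.parse import parse_qs, urlsplit


def query_terms(url):
    """Return the whitespace-separated terms of the `q` parameter of a search URL."""
    params = parse_qs(urlsplit(url).query)
    return params.get("q", [""])[0].split()
```

Applied to the link above, this yields `lang:la`, `year:>2000`, and `!journal:bohemica`, which reads like a language filter, a year filter, and an exclusion of one journal.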
Quadratic Equation, in Braille. Via Visual impairment in MSOR by Emma Jane Rowlett and Peter James Rowlett (2010)
https://scholar.archive.org/work/lnt5w5xenjfnvbd33snrfrrxhq
Search engine for tens of millions of preserved research papers.
An @internetarchive project: free software, open metadata, open API, non-profit, ad-free, privacy-respecting.