Corpus full texts


(Juste Raimbault) #1

A gold mine: https://www.reddit.com/r/DataHoarder/comments/9ifnda/600_gb_corpus_of_all_paywalled_scholarly_sources/
How about integrating it into gargantext, perhaps in collaboration with multivac, which can handle such a massive dataset? This could, for example, provide a new search option over the full texts of this corpus, text mining on those full texts, etc.