Keyword: machine-learning
PleIAs – OCRonos-Vintage | The case for specialized pre-training: ultra-fast foundation models for dedicated tasks
« Pre-training foundation models is generally thought to be the exclusive domain of a handful of AI labs and big tech…22.03.2024
Releasing Common Corpus: the largest public domain dataset for training LLMs
« (…) Common Corpus is an international initiative coordinated by Pleias, involving researchers in LLM pretraining, AI ethics…22.03.2024
The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency and Usability in AI
« Generative AI (GAI) offers unprecedented possibilities but its commercialization has raised concerns about transparency, reproducibility, bias, and safety. Many « open-source »…04.01.2023
Defining artificial intelligence for librarians
« The aim of the paper is to define Artificial Intelligence (AI) for librarians by examining general definitions of AI, analysing…25.01.2022
A look back at the automatic analysis of corpora of humanities and social sciences (SHS) journals
« As part of the Revue 2.0 project and the experiments in phase 2 of the project, the HN Lab has…20.07.2021
scite: a smart citation index that displays the context of citations and classifies their intent using deep learning
« Citation indices are tools used by the academic community for research and research evaluation which aggregate scientific literature output and…27.05.2021
data.gouv.fr: Open data for machine learning
« After the month of April, dedicated to data quality, the month of May is dedicated to reuses of data…29.04.2021
3 new tools to try for Literature mapping — Connected Papers, Inciteful and Litmaps
« Tired of entering keywords and getting thousands of hits and not sure where to start your literature review? Or having…14.04.2021
Semantic maps and metrics for science using deep transformer encoders
« The growing deluge of scientific publications demands text analysis tools that can help scientists and policy-makers navigate, forecast and beneficially…31.03.2021
Streamer, a software platform for machine learning on data streams
« Multiplying data sources, proliferating connected objects, a growing number of sensors: now ubiquitous, streams of…30.03.2021
NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature
« (…) The NLM-Chem corpus consists of 150 full-text articles, doubly annotated by ten expert NLM indexers, with ~5000 unique chemical…12.03.2021
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
« Many specialized domains remain untouched by deep learning, as large labeled datasets require expensive expert annotators. We address this bottleneck…arxiv.org, Dan Hendrycks, Collin Burns, Anya Chen, Spencer Ball, 10 March 2021, arXiv:2103.06268v1
The battle for ethical AI at the world’s biggest machine-learning conference
« Bias and the prospect of societal harm increasingly plague artificial-intelligence research — but it’s not clear who should be on…11.12.2019
Responsible Operations: Data Science, Machine Learning, and AI in Libraries
« Responsible Operations is intended to help chart library community engagement with data science, machine learning, and artificial intelligence (AI) and…oclc.org, Thomas Padilla, 2019, Dublin, OH: OCLC Research. https://doi.org/10.25333/xk7z-9g97
Towards Learning from User Feedback for Ontology-based Information Extraction (.pdf)
« (…) To automate the evolution of ontologies, we developed ConTrOn (Continuously Trained Ontology) that automatically extracts information from data…04.12.2019
Systematic review of research on artificial intelligence applications in higher education – where are the educators?
« According to various international reports, Artificial Intelligence in Education (AIEd) is one of the currently emerging fields in educational technology.12.11.2019
Allen Institute’s Semantic Scholar now searches across 175 million academic papers
« Some studies suggest that the number of scientific papers published in English each year exceeds 3…07.11.2019
Why Every Python Developer Will Love Ray
« There are many reasons why Python has emerged as the number one language for data science. It’s easy to get…06.11.2019
Blog Google: Understanding searches better than ever before
« (…) With the latest advancements from our research team in the science of language understanding–made possible by machine learning–we’re making…22.08.2019
SEMANTiCS 2019 « The Power of AI and Knowledge Graphs », Sept. 09-12, 2019, Karlsruhe (Germany) [programme]
« SEMANTiCS conference is the leading European conference on Semantic Technologies and AI. Researchers, industry experts and business leaders can develop…13.08.2019
Application of Natural Language Processing Algorithms to the Task of Automatic Classification of Russian Scientific Texts
« This work is devoted to the study of applicability of modern methods of machine learning to the task of automatic…08.08.2019
Research Questions and a Proposal for the Future Governance of Translation Data (.pdf)
« This article seeks to assess the impact of data-driven methods of machine translation (MT), not just on translators, but more…jostrans.org/issue32, Joss Moorkens, Dave Lewis, July 2019
Making Neural Networks FAIR
« Research on neural networks has gained significant momentum over the past few years. A plethora of neural networks is currently…arxiv.org, Anna Nguyen, Tobias Weller, York Sure-Vetter, 26 July 2019, arXiv:1907.11569v1
Gerrish, Charlotte. « European Copyright Law and the Text and Data Mining Exceptions and Limitations » [thesis]
« We are in a digital age with Big Data at the heart of our global online environment. Exploiting Big Data…21.05.2019
Identifying Clinical Terms in Medical Text Using Ontology-Guided Machine Learning
« (…) We present a neural dictionary model that can be used to predict if a phrase is synonymous to a…07.05.2019
A machine learning tool puts books through the Tamis (sieve)
« The Tamis project, or Traitement Algorithmique des Métadonnées en Imagerie et Sémantique, is a programme having…19.03.2019
Introduction to Data Science Data Analysis and Prediction Algorithms with R
« (…) The link for the online version of the book is https://rafalab.github.io/dsbook/ The R markdown…28.02.2019
Introducing TensorFlow Datasets
« Public datasets fuel the machine learning research rocket (h/t Andrew Ng), but it’s still too difficult…20.04.2018
AISTATS 2018 : The 21st International Conference on Artificial Intelligence and Statistics, 9-11 April, Lanzarote, Canary Islands [book of proceedings]
« Since its inception in 1985, AISTATS has been an interdisciplinary gathering of researchers at the intersection of artificial intelligence, machine…proceedings.mlr.press/v84, Editors: Amos Storkey, Fernando Perez-Cruz, 2018
Enhancing Usability for Automatically Structuring Digitised Dictionaries
« The last decade has seen a rapid development of the number of NLP tools which have been made available to…05.03.2018
Introductory course on Machine Learning
« (…) An effective and practical presentation of Machine Learning by Google (…) »06.02.2018
TensorFlow version 1.5 is released
« We’re delighted to announce that TensorFlow 1.5 is now public! Install it now to get a bunch of new features…developers.googleblog.com, Laurence Moroney, 26 January 2018
Thibaut Thonet. « Topic models for the unsupervised discovery of viewpoints on the Web » [thesis]
« Online platforms such as blogs and social networks allow internet users to express themselves on topics…tel.archives-ouvertes.fr, Thibaut Thonet, Université Toulouse 3 – Paul Sabatier, 2017, tel-01655278
Programming Languages for Data Science and ML – With Source Code Illustrations
« This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning,…06.11.2017
The New Version Of Tensorflow 1.4.0 Released!
« TensorFlow – the robust, open-source machine learning Software library, is used for numerical computation using data flow graphs that consist…20.10.2017
NaCTeM involved in research partnership with the BBC to unlock the potential of data in the media
« BBC Research and Development has launched a five-year research partnership with eight UK Universities, including the University of Manchester, to…11.10.2017
The League of Nations archives in Geneva boosted by artificial intelligence
« The Universities of Geneva and Tsinghua, in China, are joining forces to digitise a delicate section of the three kilometres of written history…23.08.2017
The ‘time machine’ reconstructing ancient Venice’s social networks
« Machine-learning project will analyse 1,000 years of maps and manuscripts from the floating city’s golden age. (…) Although the…09.03.2017
Search Earth with AI eyes via powerful satellite image tool
« GeoVisual Search makes it possible to search satellite images of the entire world for matching objects. And it’s just the…16.12.2016
Evernote moves into machine learning, and reserves the right to read your notes
« The note-taking application is embarking on several projects, including the migration to the cloud but also to machine learning,…30.06.2016
Windie, the next-generation MOOC platform
« Will the MOOC ecosystem structure itself around a platform economy? Time will tell… One thing is…19.05.2016
Announcing SyntaxNet: The World’s Most Accurate Parser Goes Open Source
« At Google, we spend a lot of time thinking about how computer systems can…24.02.2016
NLP and Machine Learning for Multi-Lingual Markets, Chez Proxem
« Proxem is a Paris-based text-analytics company (“basically, we convert your text into data”) whose…breakthroughanalysis, François-Régis Chaumartin, Seth Grimes, 18 February 2016
Big data: more than a technology, a culture
« … « We discovered that models learned faster with a very large amount of imperfect, incomplete data,…02.09.2014
A Dating Site for Algorithms
« … A startup called Algorithmia has a new twist on online matchmaking. Its website is a place for businesses…24.06.2014
AnyStyle.io: convert plain-text bibliographic references (Word, etc.) into a format that can be imported into Zotero
« … a web service (open source code) called AnyStyle makes this possible. By copying and pasting…20.06.2014
INRIA: « everyone confuses Big Data and Machine Learning »
« Journalists' articles and exchanges on social networks show the confusion that exists between the…02.10.2013