Hunting for Hidden Treasures: Chemistry Text Mining in Patents and Other Documents

In patent analytics, text mining usually comes up in connection with organizing text-based data or discovering insights within it. In particular, it is associated with spatial concept mapping, or the identification of relationships between collections of text-based information.

But text mining isn’t just about clustering documents and looking for similarities between concepts; it also covers entity identification and extraction. In this process, raw text is scanned for specific terms, term fragments, and concepts which, once recognized, can be tagged, extracted, and further analyzed. This aspect of text mining, and how it can be applied to chemistry and chemical patents, was the subject of a symposium at the 248th American Chemical Society (ACS) Meeting, held recently in San Francisco.
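To make the idea of entity identification concrete, here is a minimal sketch in Python of the tag-and-extract step. It is not how any of the tools presented at the symposium work. The tiny lexicon, the fragment list, and the function name `tag_entities` are assumptions made up for illustration, and real systems rely on far larger dictionaries, grammars, or trained models.

```python
import re

# Hypothetical mini-lexicon, for illustration only.
TERMS = ["aspirin", "acetic acid", "ethanol"]   # whole terms to recognize
FRAGMENTS = ["methyl", "chloro", "benz"]        # term fragments hinting at a chemical name


def tag_entities(text):
    """Return sorted (start, end, matched_text, kind) tuples found in text."""
    hits = set()  # a set, so a token hit by several fragments is kept once

    # Whole-term lookup: match each lexicon entry on word boundaries.
    for term in TERMS:
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.add((m.start(), m.end(), m.group(0), "term"))

    # Fragment lookup: capture the whole token (letters, digits, hyphens)
    # that contains a known fragment.
    for frag in FRAGMENTS:
        pattern = r"\b[\w-]*" + re.escape(frag) + r"[\w-]*\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.add((m.start(), m.end(), m.group(0), "fragment"))

    return sorted(hits)


sample = ("The mixture of 2-chloro-4-methylphenol and ethanol "
          "was stirred with acetic acid.")
for start, end, matched, kind in tag_entities(sample):
    print(start, end, matched, kind)
```

Each hit carries its character offsets, so the recognized entity can be tagged in place, pulled out into a table, or passed on for further analysis, such as conversion of a name to a structure.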

This symposium, organized by David Deng of ChemAxon, looked at recent advancements in the field and detailed how the discipline has grown and expanded over the past few years. ChemAxon has graciously created a website where the abstracts and, in most cases, the slides from the presentations can be reviewed. Readers interested in chemical patent information are certain to find a number of interesting topics in this collection of talks. The title and authors of each talk are listed below. Clicking on a title takes the reader to a page on the ChemAxon website where they can read the abstract and, in most cases, download the presentation slides.

When your language is science: Abstracting, classifying, and indexing patents in the Derwent World Patents Index

Don Walter (Thomson Reuters)

Chemistry and reactions from non-US patents

Daniel M. Lowe (NextMove Software)

Document-to-Structure to be trilingual: Extract, display, and search chemical information within English, Chinese, and Japanese patents

David Deng (ChemAxon)

Chemically aware text mining platform

Andrew Hinton, David Milward (Linguamatics)

CHEMDNER task: Automatic recognition of chemical entities in text

Julen Oyarzabal (Spanish National Cancer Research Centre (CNIO), University of Navarra)

Structuring the unstructured: Creating knowledge through visual analytics and the use of Tibco Spotfire with Attivio for text analytics of scientific patents

Joshua A. Bishop, Philip J. Skinner (PerkinElmer Informatics)

Recent enhancements in the accuracy of CLiDE tool for extracting chemical structure data from patents and other documents

Anikó T. Valkó (Keymodule, University of Leeds)

Structure Clipper: An interactive tool for extracting chemical structures from patents

Christopher Kibbey (Pfizer)

Computer-assisted Markush structures curation from patent documents

David Deng (ChemAxon)

Use of reverse text-mining to establish whether indexing and classification of chemical patents is still necessary

Robert A Stembridge (Thomson Reuters)

Extraction of chemical reactions from full text documents: From n-tuples of value attribute pairs toward the automated construction of reaction databases

Lutz Weber (OntoChem)

ChemInfoCloud: Opensource based Cloud compatible chemical textmining tools for harvesting largescale medical literature

Muthukumarasamy Karthikeyan (CSIR-Indian National Chemical Laboratory)

Knowledge mining by structure search

Jinbo Lee (Scilligence)

Toward extracting analytical science metrics from the RSC archives

Stuart Chalk (Royal Society of Chemistry, University of North Florida)

SureChEMBL: An open system for exploration of patent chemistry space

Michal M. Nowotka (European Bioinformatics Institute)

Computer analysis of the scientific literature

Meena Nagarajan (IBM Research)

Using the BRAIN, biorelations and intelligence network, for knowledge discovery

Albert Mons (Euretos)

There are a number of very interesting-looking talks here. For those of us who are interested in this topic but were unable to make it out to San Francisco, having access to these pages is almost as good as attending in person.

Comments 2

    1. Hello AJ,

      Agreed, but I’m told that a number of the talks will end up on the ACS Presentations on Demand site, so we may be able to see some of them from there.

