Innovative technology
  Linguistic expertise
    Customized service

Your gateway to unique state-of-the-art solutions based on experience, knowledge and innovative technologies.

NLP's R&D Projects

Judicial text processing

Two recent collaborations between NLP Technologies and the RALI Lab, Université de Montréal, under the supervision of Dr. Guy Lapalme, deal with judicial texts:
  • Automatic summarization of Legal Text (ASLI)
  • Intelligent system for Semantic processing and Automatic summarization of Legal information (ISASLI)

Automatic summarization

Researchers at NLP Technologies and RALI, under the direction of Dr. Guy Lapalme, have been working on automatic summarization for many years:
  • Work started at the end of the 1990s with Horacio Saggion, who developed SumUM for summarizing scientific articles.
  • SumUM was then adapted for CATS to summarize clusters of documents.
  • LetSUM started a series of works on judicial documents.
  • Since 2002, RALI has consistently participated in the Document Understanding Conference (DUC) competitions and, more recently, in the Text Analysis Conference (TAC).
  • In 2007-2008, RALI, in collaboration with NLP Technologies, took part in the ASLI project, which dealt in large part with the automatic summarization of judgments.
  • In 2009, RALI and NLP Technologies continued their collaboration with the ISASLI project.
  • In 2010, Pierre-Étienne Genest started his work on summarization by abstraction.

SumUM system

SumUM was developed by Horacio Saggion for his Ph.D. thesis (1997-2000). Horacio is now a Ramon y Cajal Senior Research Fellow in the TALN Group at the Department of Information and Communication Technologies, Universitat Pompeu Fabra in Barcelona.

SumUM produces short summaries (10-15 lines) of long scientific documents (15-20 pages). It works in two steps: first it produces an indicative summary, which identifies the important subjects of a document; then it produces an informative summary, which elaborates on a few subjects selected by the user.

The input to the system is a scientific article in English containing the following structural elements: title, author and affiliation, introduction, main sections, conclusion, references and acknowledgments. The output of the system is a short indicative summary generated from information found in the original text. In evaluations, these summaries were found to be of comparable quality to summaries written by the authors.

For the DUC2002 summary evaluation competition, Atefeh Farzindar slightly modified SumUM without changing the algorithms or the syntactic patterns that had been manually developed from a corpus study. Although the system was applied to newspaper articles, a different type of text from the scientific articles it was developed for, SumUM's scores were among the best at DUC2002. For DUC2003, SumUM was modified again to deal with the summarization of multiple documents, and again it scored very well.


CATS system

CATS (Cats is an Automatic Text Summarizer) was developed by Atefeh Farzindar and Frédérik Rozon during the summer of 2005 to take part in the Document Understanding Conference 2005 (DUC2005) competition. The task was to summarize, in fewer than 250 words, clusters of about 20 newspaper-type articles dealing with similar events. The summary had to address a specific topic defined by a question written in about 20 lines. CATS' performance, described in this article, was excellent compared with the thirty other systems that participated in the competition.

LetSUM (Legal text Summarizer) system

In collaboration with LexUM, then a group at the Centre de recherche en droit public of the Law Faculty of Université de Montréal, Atefeh Farzindar worked on the summarization of legal judgments. The approach exploits the thematic structure of judicial decisions to build a table-style summary, which improves the coherence and readability of the summary. LetSUM allows jurists to rapidly consult the main ideas of a judgment in order to find appropriate jurisprudence. Atefeh defended her thesis in March 2005 and went on to found a company, NLP Technologies, to develop a legal document management system.
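As a rough illustration of what a table-style summary organized by theme could look like, the sketch below groups already-selected sentences under thematic headings and prints one row per theme. The theme names, the sample sentences and the functions `build_table_summary` and `render` are hypothetical and chosen only for this example; this is not LetSUM's actual implementation.

```python
from collections import OrderedDict

# Hypothetical theme labels, chosen only for illustration.
THEMES = ["Introduction", "Context", "Analysis", "Conclusion"]


def build_table_summary(labelled_sentences):
    """Group (theme, sentence) pairs by theme, keeping the fixed theme order."""
    table = OrderedDict((theme, []) for theme in THEMES)
    for theme, sentence in labelled_sentences:
        if theme in table:  # ignore sentences with an unknown theme
            table[theme].append(sentence)
    return table


def render(table):
    """Render the grouped sentences as plain-text rows: theme | sentences."""
    width = max(len(theme) for theme in table)
    lines = []
    for theme, sentences in table.items():
        if sentences:  # skip themes with no selected sentence
            lines.append(f"{theme.ljust(width)} | {' '.join(sentences)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sample sentences, already selected and labelled by theme.
    selected = [
        ("Introduction", "This is an application for judicial review."),
        ("Analysis", "The decision under review was reasonable."),
        ("Conclusion", "The application is dismissed."),
    ]
    print(render(build_table_summary(selected)))
```

Keeping the themes in a fixed order means every summary has the same layout, which is one way a reader could scan several judgments quickly.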

Automatic summarization of Legal Text (ASLI) Project

Between July 2007 and June 2008, NLP Technologies and RALI worked together on the ASLI (Automatic Summarization of Legal Information) research project, funded by the Precarn Alliance Program.

Collaboration with the RALI dealt with two main aspects:
  • Semantic rules of the legal field make it possible to segment a legal document, identify its topics, select the relevant sentences, identify the category of the judgment and determine the legal citations. The rules of the legal field are dynamic and diverse, so a general standard model must be developed. Moreover, the design of the information repository must take maintainability into account: the ease with which the system can be modified to correct faults and improve performance.
  • The Federal Court of Canada publishes all its judgements in the two official languages of Canada. However, it can take up to 9 months for the manual translations of judgements to come out. During this period, the Court would like to be able to use automatically translated judgements and summaries. Once the official translations become available, the Court would replace the machine translations with the official ones. The machine translation engine will translate the judgements, the summaries and the extracted information produced by the NLP Technologies analysis engine.
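
The replacement policy in the second point can be pictured with a minimal sketch: a machine translation is served until an official one exists, after which the official text takes precedence. The `Judgment` class, its field names and the function `translation_to_publish` are hypothetical and used only for illustration; they do not describe the system actually deployed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Judgment:
    """Hypothetical record holding a judgment and its translations."""
    doc_id: str
    source_text: str
    machine_translation: Optional[str] = None
    official_translation: Optional[str] = None


def translation_to_publish(judgment: Judgment) -> Tuple[Optional[str], str]:
    """Return (text, status), preferring the official translation when present."""
    if judgment.official_translation is not None:
        return judgment.official_translation, "official"
    if judgment.machine_translation is not None:
        return judgment.machine_translation, "machine"
    return None, "unavailable"


if __name__ == "__main__":
    j = Judgment("doc-1", "Texte du jugement.", machine_translation="Text of the judgement.")
    print(translation_to_publish(j)[1])  # machine translation is used for now
    j.official_translation = "Text of the judgment."
    print(translation_to_publish(j)[1])  # official translation now takes over
```

Returning a status alongside the text lets a caller mark machine-translated content as provisional, which seems a natural fit for the interim period described above.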

More information is available in:
  • Article (in French) in Forum, November 2007.
  • Emmanuel Chieze, Atefeh Farzindar and Guy Lapalme. Automatic Summarization and Information Extraction from Canadian Immigration Decisions. Proceedings of the Semantic Processing of Legal Texts Workshop, p. 51-57, LREC 2008, May 2008 [PDF]
  • Emmanuel Chieze, Atefeh Farzindar and Guy Lapalme. An Automatic System for Summarization and Information Extraction of Legal Information. Accepted to Semantic Processing of Legal Texts [PDF].
  • Fabrizio Gotti, Atefeh Farzindar, Guy Lapalme and Elliott Macklovitch. Automatic Translation of Court Judgments. AMTA'2008 The Eighth Conference of the Association for Machine Translation in the Americas, p. 1-10, Waikiki, Hawai'i, October 2008 [PDF]
  • Atefeh Farzindar and Guy Lapalme. Machine Translation of Legal Information and Its Evaluation. Canadian AI '09: Proceedings of the 22nd Canadian Conference on Artificial Intelligence, Lecture Notes in Artificial Intelligence series, p. 64-73, Kelowna, Canada, June 2009, Springer-Verlag. [PDF]
  • Atefeh Farzindar. Automatic Translation Management System for Legal Texts. MT Summit XII: Proceedings of the twelfth Machine Translation Summit, p. 417-424, Ottawa, Ontario, August 2009 [PDF]

Intelligent system for Semantic processing and Automatic Summarization of Legal Information (ISASLI) Project

Between January and December 2009, with the financial support of Precarn, RALI and NLP Technologies collaborated to improve the summary revision system and to explore new statistical methods to help the adaptation of the system to new domains. This project also involved the participation of Palomino System Innovations Inc.

Collaboration with the RALI dealt with two main aspects:
  • The development of RevSum, a web-based graphical interface that helps with the manual revision of automatic summaries by showing a combined view of the original judgement and its automatic summary. It provides a simple sentence-based interaction. This interface is now in production at NLP Technologies. It was later modified to create HexTAC, which was used to create extractive summaries for the TAC2009 competition.
  • Statistical summarization methods for judgements. Learning experiments were done on more than 4000 text-summary pairs.

More information is available in:
  • Atefeh Farzindar and Mehdi Yousfi-Monod. RevSum - le logiciel Web d'aide à la révision de résumés automatiques. RALI-OLST presentation, June 23, 2009. [Slides in French]
  • Pierre-Etienne Genest, Guy Lapalme and Mehdi Yousfi-Monod. HEXTAC: the Creation of a Manual Extractive Run. Proceedings of TAC2009. [PDF]
  • Mehdi Yousfi-Monod, Atefeh Farzindar and Guy Lapalme. Supervised Machine Learning for summarizing Legal Documents. Proceedings of Canadian AI 2010, May 2010, Ottawa.