RIC: Research Interest Comparator

Rodriguez, Carlos, Ph.D.
Affiliation: CNIO
Email: crodriguezp@cnio.es
Home Page: http://www.cnio.es
Results - FULL MEDLINE:

Coh-metrix: analysis of text on cohesion and language.
Arthur C Graesser ... Zhiqiang Cai
Behav Res Methods Instrum Comput 2004 May; 36(2)193-202.

Score: 0.472
Automatic processing of multilingual medical terminology: applications to thesaurus enrichment and cross-language information retrieval.
H Déjean ... F Sadat
Artif Intell Med 2005 Feb; 33(2)111-24.

Score: 0.466
Automatic disambiguation of morphosyntax in spoken language corpora.
C Parisse ... M T Le Normand
Behav Res Methods Instrum Comput 2000 Aug; 32(3)468-81.

Score: 0.442
Automatic extraction of linguistic knowledge from an international classification.
R Baud ... J R Scherrer
Medinfo 1998 ; 9 Pt 1()581-5.

Score: 0.430
Current trends with natural language processing.
A M Rassinoux ... R Baud
Medinfo 1995 ; 8 Pt 2()1657.

Score: 0.423
Automatic extraction of acronym-meaning pairs from MEDLINE databases.
J Pustejovsky ... M Morrell
Medinfo 2001 ; 10(Pt 1)371-5.

Score: 0.418
GALEN: a third generation terminology tool to support a multipurpose national coding system for surgical procedures.
B Trombert-Paviot ... H Idir
Int J Med Inform 2000 Sep; 58-59()71-85.

Score: 0.410
The nature of lexical knowledge.
A T McCray
Methods Inf Med 1998 Nov; 37(4-5)353-60.

Score: 0.399
Natural language processing of medical texts within the HELIOS environment.
A M Rassinoux ... J R Scherrer
Comput Methods Programs Biomed 1994 Dec; 45 Suppl()S79-96.

Score: 0.398
The effects of learning two languages on levels of metalinguistic awareness.
S J Galambos ... S Goldin-Meadow
Cognition 1990 Jan; 34(1)1-56.

Score: 0.394
Acquisition of lexical resources from SNOMED for medical language processing.
P Zweigenbaum ... P Courtois
Medinfo 1998 ; 9 Pt 1()586-90.

Score: 0.392
Communication in science.
H Deda ... H Yakupoglu
Acta Neurochir Suppl 2002 ; 83()17-23.

Score: 0.392
NLP techniques associated with the OpenGALEN ontology for semi-automatic textual extraction of medical knowledge: abstracting and mapping equivalent linguistic and logical constructs.
M B do Amaral ... A L Rector
Proc AMIA Symp 2000 ; ()76-80.

Score: 0.381
Models of natural language understanding.
M Bates
Proc Natl Acad Sci U S A 1995 Oct; 92(22)9977-82.

Score: 0.379
New trends in natural language processing: statistical natural language processing.
M Marcus
Proc Natl Acad Sci U S A 1995 Oct; 92(22)10052-9.

Score: 0.377
A Dutch medical language processor.
P Spyns ... G De Moor
Int J Biomed Comput 1996 Jun; 41(3)181-205.

Score: 0.368
Galen-In-Use: using artificial intelligence terminology tools to improve the linguistic coherence of a national coding system for surgical procedures.
J M Rodrigues ... F Meusnier-Carriot
Medinfo 1998 ; 9 Pt 1()623-7.

Score: 0.367
Recent advances in natural language processing for biomedical applications.
Nigel Collier ... Patrick Ruch
Int J Med Inform 2006 Jun; 75(6)413-7.

Score: 0.366
Implementing the Medical Desktop: tools for the integration of independent information resources.
J W Loonsk ... H Litt
Proc Annu Symp Comput Appl Med Care 1991 ; ()574-7.

Score: 0.362
Structuring medical information into a language-independent database.
M B do Amaral ... Y Satomura
Med Inform (Lond) 1994 Jul-Sep; 19(3)269-82.

Score: 0.360
Terminology-driven literature mining and knowledge acquisition in biomedicine.
Goran Nenadić ... Jun-ichi Tsujii
Int J Med Inform 2002 Dec; 67(1-3)33-48.

Score: 0.358
Information extraction from Korean radiology reports mingled two language.
Miyoung Kwak ... Jinwook Choi
AMIA Annu Symp Proc 2005 ; ()1014.

Score: 0.357
Knowledge representation and indexing using the unified medical language system.
K Baclawski ... B Indurkhya
Pac Symp Biocomput 2000 ; ()493-504.

Score: 0.357
A reduced ambiguity lexical system.
Paul Frenger
Biomed Sci Instrum 2004 ; 40()424-8.

Score: 0.356
Accessibility and language characteristics in Catalonia.
C Wessels ... J M Beck
Tijdschr Econ Soc Geogr 1994 ; 85(2)130-40.

Score: 0.356
The organization and use of information: contributions of information science, computational linguistics and artificial intelligence.
D E Walker
J Am Soc Inf Sci 1981 Sep; 32(5)347-63.

Score: 0.355
Natural language from artificial life.
Simon Kirby
Artif Life 2002 ; 8(2)185-215.

Score: 0.354
GNARE: automated system for high-throughput genome analysis with grid computational backend.
Dinanath Sulakhe ... Natalia Maltsev
J Clin Monit Comput 2005 Oct; 19(4-5)361-9.

Score: 0.354
Knowledge discovery in biology and biotechnology texts: a review of techniques, evaluation strategies, and applications.
J Natarajan ... W Dubitzky
Crit Rev Biotechnol 2005 Jan-Jun; 25(1-2)31-52.

Score: 0.354
Galen: a third generation terminology tool to support a multipurpose national coding system for surgical procedures.
B Trombert-Paviot ... H Idir
Stud Health Technol Inform 1999 ; 68()901-5.

Score: 0.350
Harnessing health information in the Third World.
S E Coghlan ... M S Khan
World Health Forum 1993 ; 14(3)301-4.

Score: 0.350
What is bioinformatics? A proposed definition and overview of the field.
N M Luscombe ... M Gerstein
Methods Inf Med 2001 ; 40(4)346-58.

Score: 0.350
A light knowledge model for linguistic applications.
R H Baud ... A M Rassinoux
Proc AMIA Symp 2001 ; ()37-41.

Score: 0.350
Getting to the (c)ore of knowledge: mining biomedical literature.
Berry de Bruijn ... Joel Martin
Int J Med Inform 2002 Dec; 67(1-3)7-18.

Score: 0.348
Towards a unified medical lexicon for French.
Pierre Zweigenbaum ... Stéfan Darmoni
Stud Health Technol Inform 2003 ; 95()415-20.

Score: 0.347
GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles.
C Friedman ... A Rzhetsky
Bioinformatics 2001 ; 17 Suppl 1()S74-82.

Score: 0.347
UMLS language and vocabulary tools.
Allen C Browne ... Alexa T McCray
AMIA Annu Symp Proc 2003 ; ()798.

Score: 0.346
Wnt pathway curation using automated natural language processing: combining statistical methods with partial and full parse for knowledge extraction.
Carlos Santos ... David J States
Bioinformatics 2005 Apr; 21(8)1653-8.

Score: 0.346
[Basic science and applied science]
R Pérez-Tamayo
Salud Publica Mex 2001 Jul-Aug; 43(4)368-72.

Score: 0.345
Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program.
A R Aronson
Proc AMIA Symp 2001 ; ()17-21.

Score: 0.345
The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text.
Thomas C Rindflesch ... Marcelo Fiszman
J Biomed Inform 2003 Dec; 36(6)462-77.

Score: 0.344
Full text multilingual automatic morphosemantems for stand-alone or Internet based applications.
C Lovis ... J R Scherrer
Medinfo 1998 ; 9 Pt 1()155.

Score: 0.343
UMLF: a unified medical lexicon for French.
Pierre Zweigenbaum ... Stéfan Darmoni
Int J Med Inform 2005 Mar; 74(2-4)119-24.

Score: 0.343
Multilingual natural language generation as part of a medical terminology server.
J C Wagner ... J R Scherrer
Medinfo 1995 ; 8 Pt 1()100-4.

Score: 0.343
Unsupervised learning of natural languages.
Zach Solan ... Shimon Edelman
Proc Natl Acad Sci U S A 2005 Aug; 102(33)11629-34.

Score: 0.343
Applying database technology to clinical and basic research bioinformatics projects.
Thomas P Nadeau ... Eva L Feldman
J Integr Neurosci 2003 Dec; 2(2)201-17.

Score: 0.342
Extending a natural language parser with UMLS knowledge.
A T McCray
Proc Annu Symp Comput Appl Med Care 1991 ; ()194-8.

Score: 0.342
The six languages of social work.
M Bloom ... A Chambon
Soc Work 1991 Nov; 36(6)530-4.

Score: 0.341
Natural language processing and semantical representation of medical texts.
R H Baud ... J R Scherrer
Methods Inf Med 1992 Jun; 31(2)117-25.

Score: 0.339
NLP-based information extraction for managing the molecular biology literature.
Bisharah Libbus ... Thomas C Rindflesch
Proc AMIA Symp 2002 ; ()445-9.

Score: 0.336
Telemedicine and terminology: different needs of context information.
J Ingenerf
IEEE Trans Inf Technol Biomed 1999 Jun; 3(2)92-100.

Score: 0.334
Decision support and disease management: a logic engineering approach.
J Fox ... R Thomson
IEEE Trans Inf Technol Biomed 1998 Dec; 2(4)217-28.

Score: 0.333
Evaluating the UMLS as a source of lexical knowledge for medical language processing.
C Friedman ... G Hripcsak
Proc AMIA Symp 2001 ; ()189-93.

Score: 0.332
Flexible information storage in MUDR(II) EHR.
Josef Spidlen ... Jana Zvárová
Int J Med Inform 2006 Mar-Apr; 75(3-4)201-8.

Score: 0.331
The foundations and development of metalinguistic knowledge.
Dean Sharpe ... Philip David Zelazo
J Child Lang 2002 May; 29(2)481-4; discussion 489-94.

Score: 0.330
The natural language processing of medical databases.
R R Grams ... Z M Jin
J Med Syst 1989 Apr; 13(2)79-87.

Score: 0.328
MEDSYNDIKATE--a natural language system for the extraction of medical information from findings reports.
Udo Hahn ... Stefan Schulz
Int J Med Inform 2002 Dec; 67(1-3)63-74.

Score: 0.328
A survey of current work in biomedical text mining.
Aaron M Cohen ... William R Hersh
Brief Bioinform 2005 Mar; 6(1)57-71.

Score: 0.328
Creating knowledge repositories from biomedical reports: the MEDSYNDIKATE text mining system.
Udo Hahn ... Stefan Schulz
Pac Symp Biocomput 2002 ; ()338-49.

Score: 0.328
Effective access to distributed heterogeneous medical text databases.
W B Croft ... D B Aronow
Medinfo 1995 ; 8 Pt 2()1719.

Score: 0.327
ModelDB: an environment for running and storing computational models and their results applied to neuroscience.
B E Peterson ... G M Shepherd
J Am Med Inform Assoc 1996 Nov-Dec; 3(6)389-98.

Score: 0.326
Automated access to a large medical dictionary: online assistance for research and application in natural language processing.
A T McCray ... S Srinivasan
Comput Biomed Res 1990 Apr; 23(2)179-98.

Score: 0.325
Electronic Patient Information -- Pioneers and MuchMore. A vision, lessons learned, and challenges.
W Giere
Methods Inf Med 2004 ; 43(5)543-52.

Score: 0.325
The structure of science information.
Zellig S Harris
J Biomed Inform 2002 Aug; 35(4)215-21.

Score: 0.324
Galen-In-Use: an EU Project applied to the development of a new national coding system for surgical procedures: NCAM.
J M Rodrigues ... F Meusnier
Stud Health Technol Inform 1997 ; 43 Pt B()897-901.

Score: 0.323
Information and knowledge extraction from medical texts.
J Kontos ... I Malagardi
Stud Health Technol Inform 2000 ; 57()260-9.

Score: 0.323
Distributed data mining on grids: services, tools, and applications.
Mario Cannataro ... Paolo Trunfio
IEEE Trans Syst Man Cybern B Cybern 2004 Dec; 34(6)2451-65.

Score: 0.320
Mining chemical structural information from the drug literature.
Debra L Banville
Drug Discov Today 2006 Jan; 11(1-2)35-42.

Score: 0.320
Theodor Bücher Lecture. Metabolomics, modelling and machine learning in systems biology - towards an understanding of the languages of cells. Delivered on 3 July 2005 at the 30th FEBS Congress and the 9th IUBMB conference in Budapest.
Douglas B Kell
FEBS J 2006 Mar; 273(5)873-94.

Score: 0.320
Collaborative health care information system development through sharable infrastructure, services, and paradigms.
R A Greenes ... S R Deibel
Medinfo 1995 ; 8 Pt 1()190-4.

Score: 0.320
Use of project ontologies and terminology servers to support software engineering.
M Bång ... T Timpka
Medinfo 1998 ; 9 Pt 1()639-43.

Score: 0.320
Natural language processing, lexicon and semantics.
E Wehrli ... R Clark
Methods Inf Med 1995 Mar; 34(1-2)68-74.

Score: 0.319
Two biomedical sublanguages: a description based on the theories of Zellig Harris.
Carol Friedman ... Andrey Rzhetsky
J Biomed Inform 2002 Aug; 35(4)222-35.

Score: 0.319
Memory-based language processing: psycholinguistic research in the 1990s.
G McKoon ... R Ratcliff
Annu Rev Psychol 1998 ; 49()25-42.

Score: 0.319
Two applications of information extraction to biological science journal articles: enzyme interactions and protein structures.
K Humphreys ... R Gaizauskas
Pac Symp Biocomput 2000 ; ()505-16.

Score: 0.319
The bilingual lexicon: implications for studies of language choice.
S Quay
J Child Lang 1995 Jun; 22(2)369-87.

Score: 0.319
The distinction between linguistic and conceptual semantics in medical terminology and its implication for NLP-based knowledge acquisition.
W Ceusters ... A Waagmeester
Methods Inf Med 1998 Nov; 37(4-5)327-33.

Score: 0.319
Structural semantic interconnections: a knowledge-based approach to word sense disambiguation.
Roberto Navigli ... Paola Velardi
IEEE Trans Pattern Anal Mach Intell 2005 Jul; 27(7)1075-86.

Score: 0.318
Computational tools for the modern andrologist.
C Niederberger
J Androl 1996 Sep-Oct; 17(5)462-6.

Score: 0.318
Towards linking patients and clinical information: detecting UMLS concepts in e-mail.
Patricia Flatley Brennan ... Alan R Aronson
J Biomed Inform 2003 Aug-Oct; 36(4-5)334-41.

Score: 0.318
Information revolution in nursing and health care: educating for tomorrow's challenge.
B M Kooker ... S S Richardson
Semin Nurse Manag 1994 Jun; 2(2)79-84.

Score: 0.317
Lexicons and linguistics.
B L Gordon
Med Rec News 1976 Apr; 47(2)6, 8-9.

Score: 0.317
Terminology-driven mining of biomedical literature.
Goran Nenadic ... Sophia Ananiadou
Bioinformatics 2003 May; 19(8)938-43.

Score: 0.317
Automatic pathway building in biological association networks.
Anton Yuryev ... Ilya Mazo
BMC Bioinformatics 2006 ; 7()171.

Score: 0.316
Mining terminological knowledge in large biomedical corpora.
Hongfang Liu ... Carol Friedman
Pac Symp Biocomput 2003 ; ()415-26.

Score: 0.316
Multimedia technologies in education.
Joseph Liaskos ... Marianna Diomidus
Stud Health Technol Inform 2002 ; 65()359-72.

Score: 0.316
Linguistic approaches to biological sequences.
D B Searls
Comput Appl Biosci 1997 Aug; 13(4)333-44.

Score: 0.316
A collaborative institutional model for integrating computer applications in the medical curriculum.
C P Friedman ... E L Juliano
Proc Annu Symp Comput Appl Med Care 1991 ; ()752-6.

Score: 0.315
An introduction to the Semantic Web for health sciences librarians.
Ioana Robu ... Benoit Thirion
J Med Libr Assoc 2006 Apr; 94(2)198-205.

Score: 0.315
Users conceptual views on medical information databases.
M Joubert ... A Tafazzoli
Int J Biomed Comput 1994 Oct; 37(2)93-104.

Score: 0.314
Trends in computational biology: a summary based on a RECOMB plenary lecture, 1999.
J C Wooley
J Comput Biol 1999 Fall-Winter; 6(3-4)459-74.

Score: 0.314
Biological nomenclatures: a source of lexical knowledge and ambiguity.
O Tuason ... C Friedman
Pac Symp Biocomput 2004 ; ()238-49.

Score: 0.314
Learning medical and dental sciences through interactive multi-media.
A Demirjian ... B David
Medinfo 1995 ; 8 Pt 2()1705.

Score: 0.313
Improving natural language and speech interfaces by the use of metalinguistic phenomena.
R G Leiser
Appl Ergon 1989 Sep; 20(3)168-73.

Score: 0.313
Facilitating research in pathology using natural language processing.
Hua Xu ... Carol Friedman
AMIA Annu Symp Proc 2003 ; ()1057.

Score: 0.313
BRIGEP--the BRIDGE-based genome-transcriptome-proteome browser.
A Goesmann ... F Meyer
Nucleic Acids Res 2005 Jul; 33(Web Server issue)W710-6.

Score: 0.312
The extraction of useful information from the biomedical literature.
R Kostoff
Acad Med 2001 Dec; 76(12)1265-70.

Score: 0.312
From French vocabulary to the Unified Medical Language System: a preliminary study.
O Bodenreider ... A T McCray
Medinfo 1998 ; 9 Pt 1()670-4.

Score: 0.311
Research priorities in language and learning.
P Fletcher
Ann Otol Rhinol Laryngol Suppl 1980 Sep-Oct; 89(5 Pt 2)182-4.

Score: 0.311
[Scientific productivity standards and the National Autonomous University of Mexico School of Medicine]
Federico Martínez ... Enrique Piña
Gac Med Mex 2004 Nov-Dec; 140(6)599-606.

Score: 0.311
Abstract:

http://www.ccg.unam.mx/Computational_Genomics/
Studies and academic information
Universitat Pompeu Fabra
Linguistics Ph.D.
(Language & Communication), Departament de Traducció i Filología
(2005, cum laude) Barcelona, Catalonia (Spain)
Universitat Pompeu Fabra
Applied Linguistics Master
(Computational Lexicography and Terminology), Institut Universitari de Lingüística Aplicada
(1999) Barcelona, Catalonia (Spain)
National Autonomous University
B.A. Philosophy
(1996) Mexico City, Mexico.
Academic interests:
Information Extraction/Retrieval, Computational Semantics, Pragmatics, Knowledge Engineering, Bioinformatics, Dialogue modeling, Philosophical issues of linguistic phenomena, Minority and Endangered Languages, Open Source Technology Development.
Scholarships awarded
Awarded two scholarships (1998-1999) at the Applied Linguistics Institute. Pompeu Fabra University.
Awarded a grant and financial assistance by Mexico's National Council on Science and Technology (CONACYT) for 1998-1999 and 2003-2004.
National Researcher System (SNI) level I award, since 2006
Ph.D. Research
(Jointly tutored by Toni Badia and by Enric Vallduví, Pompeu Fabra University).
Development of an Information Extraction application designed to process technical texts as enormous dictionaries by parsing metalinguistic segments, in which information is offered about the rules or the units of a technical sublanguage. This extraction generates a Metalinguistic Information Database of computationally tractable data that can be useful not only for lexicography and terminology, but also for AI systems that need unorthodox information not readily available in existing semantic networks, lexicons or traditional ontologies. http://www.tdx.cesca.es/TDX-0228105-114717/
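As a rough illustration of the idea of parsing metalinguistic segments, the sketch below spots sentences that talk about a term and stores them as simple records. It is not the dissertation's system: the marker patterns, the `MetalinguisticRecord` type and the function names are all hypothetical, chosen only to show the general shape of the task.

```python
import re
from typing import List, NamedTuple, Optional


class MetalinguisticRecord(NamedTuple):
    """One entry of a toy 'metalinguistic information database'."""
    term: str         # the lexical unit being talked about
    marker: str       # the lexical cue that signalled the metalinguistic segment
    information: str  # what the sentence says about the term


# Hypothetical, hand-written marker patterns (an assumption for illustration);
# a real system would rely on a much richer, corpus-derived inventory.
PATTERNS = [
    # 'The term "operon" refers to a cluster of genes'
    re.compile(r'\bthe term\s+"?(?P<term>[^",]+?)"?\s+'
               r'(?P<marker>refers to|denotes|means)\s+(?P<info>.+)', re.I),
    # 'A repressor is defined as a protein that ...'
    re.compile(r'^(?P<term>.+?)\s+(?P<marker>is defined as|are defined as)\s+'
               r'(?P<info>.+)', re.I),
    # 'This process is called transcription'
    re.compile(r'^(?P<info>.+?)\s+'
               r'(?P<marker>is called|is known as|are called|are known as)\s+'
               r'(?P<term>.+)', re.I),
]


def extract_record(sentence: str) -> Optional[MetalinguisticRecord]:
    """Return a record if the sentence matches one of the marker patterns."""
    text = sentence.strip().rstrip(".")
    for pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return MetalinguisticRecord(
                term=match.group("term").strip(' "'),
                marker=match.group("marker").lower(),
                information=match.group("info").strip(),
            )
    return None  # no metalinguistic marker found


def build_database(sentences: List[str]) -> List[MetalinguisticRecord]:
    """Keep only the sentences that carry metalinguistic information."""
    records = (extract_record(s) for s in sentences)
    return [r for r in records if r is not None]
```

Ordinary sentences with no marker are simply skipped, so the resulting list contains only the definition-like material.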


Recent R&D projects:
System for automatic language ID for Mexico's indigenous languages
DIME project for modeling task-oriented dialogues (IIMAS, UNAM)
Development of Natural Language Toolkit (nltk.sf.net) interface for CLIC-TALP and 3LB corpora
Eslema, an electronic corpus for Asturian (Oviedo University)
Extraction of Bacterial Regulatory Systems from Text for RegulonDB (Center for Genomic Sciences, UNAM)
Publications
(in preparation) Recreating manual curation of bacterial regulatory systems using Natural Language Processing. BMC Bioinformatics.
(under review) Metalinguistic Information Extraction from Corpora. Language Resources and Evaluations. Kluwer/Springer.
(Forthcoming, January 2007) Researching research using metalinguistic information extraction: A common ground for knowledge through knowledge of language. Studies in Pragmatics. Elsevier, Toronto.
2006 Balancing Transactions in Practical Dialogues. CICLING-2006, Series: Lecture Notes in Computer Science. Vol. 3878, pp. 331-342. Springer-Verlag.
2005 Explotación computacional del metalenguaje en corpus especializados para la generación de lexicones no convencionales. Journal of the Spanish Society for Natural Language Processing, Num. 35, September, 2005.
2005 Metalinguistic Information Extraction from specialized texts to enrich computational lexicons. Ph.D. dissertation. Departament de Traducció i Filología. Universitat Pompeu Fabra, Barcelona
2004 Procesamiento automático del metalenguaje en Terminología: Nuevas herramientas para la enseñanza y la investigación. Red Iberoamericana de Terminología (RITERM) 2004, Barcelona
2004 Mining metalinguistic activity in corpora to create lexical resources using Information Extraction techniques: the MOP system. ACL 2004, Barcelona.
2004 Metalinguistic Information Extraction for Terminology, 3rd International Workshop on Computational Terminology (CompuTerm 2004), Coling 2004, Geneva.
2003 Researching research using metalinguistic Information Extraction. Eighth International Pragmatics Conference of the International Pragmatics Association (IPrA). Panel on Lexical Markers of Common Ground. Toronto, 2003.
2003 Applying Information Extraction techniques to metalinguistic discourse, in Topics in Computational Linguistics and Intelligent Text Processing; Lecture Notes in Computer Science. Springer-Verlag. (forthcoming)
2002 Cambridge Klett Compact SP-EN Dictionary (collaborator), Cambridge University Press, Stuttgart.
2002 Automatic Extraction of Non-standard Lexical Data for a Metalinguistic Information Database, in CICLing 2002, Mexico City, Mexico. Series: Lecture Notes in Computer Science. Vol. 2276. Springer-Verlag.
2001 "Parsing Metalinguistic Knowledge from Texts" In: A. Gelbukh (ed.) Selected papers from CICLING-2000 Collection in Computer Science (CCC); co-published by Mexican National Polytechnic Institute (IPN)
2001 "Las características del conocimiento especializado y la relación con el conocimiento en general" & "Principios metodológicos de la propuesta teórica (II)", in La Terminología Científico-Técnica: Reconocimiento, análisis y extracción de información formal y semántica. Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra. Barcelona, 2001.
2000 "Extraction of knowledge about terms from metalinguistic activity in texts" In: A. Gelbukh (ed.) Proceedings of the Conference on Intelligent text processing and Computational Linguistics, CICLING-2000. Instituto Politécnico Nacional, Mexico City, Mexico.
1999 "Explicit Metalinguistic Operations in specialized discourse: The construction of lexical meaning in theoretic science". In: P. Sandrini (ed.) Terminology and Knowledge Engineering TKE'99, Innsbruck, Austria.
1999 "Operaciones Metalingüísticas Explícitas en textos de especialidad". (Explicit Metalinguistic Operations in Special Domain Texts) Master's Dissertation. Institut Universitari de Lingüística Aplicada. Universitat Pompeu Fabra. Barcelona.
1996 "La concepción de Filosofía en el Wittgenstein tardío" (The meaning of "Philosophy" in the later Wittgenstein) Philosophy Degree Dissertation. Philosophy department. National Autonomous University, Mexico City.
Professional Experience
Starting October 2006:
Postdoctoral Researcher at the Structural Computational Biology Group, under Dr. Alfonso Valencia, developing Text Mining for Bioinformatics. (National Center for Oncological Research, Madrid, Spain)
October 2005 to September 2006:
Postdoctoral Researcher at the Computational Genomics Group, Center for Genomic Sciences, developing Bio-NLP and Information Extraction applications for the RegulonDB database and other resources. Taught the Text Mining module in the Genomic Sciences degree. (National Autonomous University, Cuernavaca)
January 2005 to September 2006:
Researcher at the Computer Science Department, Applied Mathematics and Systems Research Institute, teaching a course on Natural Language Processing and collaborating on various Computational Linguistics research projects, mainly lexical acquisition and the DIME dialogue system. (National Autonomous University, Mexico City)
May 2003 to December 2004:
Researcher and Professor at the Engineering Institute, teaching a course on Natural Language Processing and collaborating on various research projects with the Language Engineering Group, including tools and applications for teaching Computational Linguistics. (National Autonomous University, Mexico City)
November 2001 to May 2003:
Editor, Software Developer and Project Manager with leading developers for educational, ESL & SSL programs from major U.S. publishers (Prentice Hall, Houghton Mifflin, etc.; Massachusetts, USA)
February 2001 to May 2002:
Linguist Team Leader and TM specialist at Lionbridge's (Harvard Translations) publishing and production department. Lionbridge is a world leader in software localization, as well as technical translation and QA for highly technical documents from the Life Sciences and Financial industries, among others. (Boston, Massachusetts.)
February-September 2000:
Coordinator for content structuring and ontology buildup for an Internet directory QuieroInternet.com Ocxon-Digital. (Barcelona, Catalonia (Spain).)
May 1999-October 2000:
Worked for Klett-Verlag on the new 2002 Cambridge Klett Compact bilingual dictionary (Barcelona, Catalonia (Spain).)
1997-1998:
Pompeu Fabra University's Applied Linguistics Institute: worked on multilingual corpora for term extraction and analysis, and coordinated academic seminars and the biannual Terminology Summer School (Barcelona, Catalonia (Spain)).
Additional Information
Highly competent user of all kinds of IT applications and of the Mac, Windows and Linux operating systems. Fast learner of new environments. Programs in Python and Perl.
Team-work oriented; capable of coordinating projects and people. Experience in all kinds of editorial activities, including planning, design, translation, proofing, printing, publishing, etc. Experience with project management software and methodologies, TM tools, and Desktop Publishing applications.
Languages: Native speaker of Spanish, highly fluent in English (lifelong bilingual education), highly fluent in Catalan (speaking and reading), competent reader of French and Portuguese.
Personal interests: technology, linguistics, poetry, education, cognitive science, philosophy, literature, multiculturalism, etc.

Research Interests:


My activities and interests in research are attested by my academic publications, as well as by projects I have been involved with, both past and present. I also have teaching experience in the fields of Computational Linguistics and Language Engineering.
My graduate research has focused lately on lexical acquisition and on the development of a novel metalinguistic information extraction system useful for terminology and the study of scientific discourse. This system was based on an extensive, marked-up corpus of definition-like sentences, as well as other technical corpora from various sources, and used collocations, machine learning and Information Extraction techniques. You can see a demo at www.iling.unam.mx/crodri/cdrom/. I am interested in expanding on this work and developing other IE techniques, for example for tracking elected officials and government programs mentioned in the daily press.
I am also seeking funding for a project (in collaboration with Carnegie Mellon University and Brandeis University as U.S.-based partners) to implement automatic indigenous language identification technologies for assisting health and immigration authorities in processing monolingual speakers in the U.S. We plan to create the technology infrastructure for an Indigenous Language Spoken Corpora Repository to train the identifying algorithms.
I am interested in developing an Information Extraction module for teaching the field's techniques using the Natural Language ToolKit. One of my enduring interests is developing Open Source tools for teaching Computational Linguistics in a manner accessible both to Linguistics and to Computer Science students. You can find a small demo of a tool to teach statistical language models at www.iling.unam.mx/crodri/gens/.
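For readers unfamiliar with statistical language models, the following is a minimal teaching sketch of a bigram model with maximum-likelihood estimates and random text generation. It is an assumption-laden illustration, not the code behind the demo mentioned above; the function names and the toy corpus are made up for the example.

```python
import random
from collections import Counter
from typing import Dict, List

START, END = "<s>", "</s>"  # sentence boundary symbols (a common convention)


def train_bigram_model(sentences: List[str]) -> Dict[str, Counter]:
    """Count how often each word follows each other word."""
    counts: Dict[str, Counter] = {}
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for prev, curr in zip(tokens, tokens[1:]):
            counts.setdefault(prev, Counter())[curr] += 1
    return counts


def bigram_probability(counts: Dict[str, Counter], prev: str, curr: str) -> float:
    """Maximum-likelihood estimate P(curr | prev) = count(prev, curr) / count(prev)."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[curr] / total if total else 0.0


def generate(counts: Dict[str, Counter], rng: random.Random, max_len: int = 20) -> str:
    """Sample a sentence word by word, following the bigram distribution."""
    word, output = START, []
    for _ in range(max_len):
        followers = counts.get(word)
        if not followers:
            break
        # Pick the next word with probability proportional to its bigram count.
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == END:
            break
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    toy_corpus = ["the cat sleeps", "the dog sleeps", "the cat eats"]
    model = train_bigram_model(toy_corpus)
    print(bigram_probability(model, "the", "cat"))
    print(generate(model, random.Random(0)))
```

Unsmoothed estimates like these assign zero probability to unseen word pairs, which is usually the first limitation students are asked to notice.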
As you can see, an important part of my interests is to exploit theoretical and technological advancements to create usable, socially relevant applications that help citizens improve their quality of life and empower them in some aspect of their lives. I work mainly with Python, Perl, HTML and XML, among other languages and standards, to develop Information Extraction and Lexicography tools. I have experience with Web technologies, Translation Memories, software localization, terminology management, Knowledge Engineering and Desktop Publishing applications. I am very technically minded, and an avid follower of the state of the art in technology.
Keywords extracted from the abstract:
Count Word
6.000 1996
6.000 1997-1998
12.198 1998-1999
12.198 1999
5.677 1999-october
6.000 2000
12.198 2001
12.198 2002
12.198 2003
6.000 2003-2004
16.352 2004
12.198 2005
12.198 2006
6.000 331-342
5.337 3lb
2.280 3rd
5.960 academic
2.713 accessible
2.948 acl
4.664 acquisition
2.994 activities
1.702 activity
1.615 additional
3.747 advancements
1.673 ai
3.566 alfonso
2.297 algorithms
3.489 am
0.666 analysis
4.700 análisis
10.744 aplicada
1.614 application
8.317 applications
5.244 applied
2.540 applying
2.500 art
2.441 aspect
2.576 assistance
3.416 assisting
1.438 association
5.506 asturian
4.282 attested
2.075 austria
3.055 authorities
6.831 automatic
5.063 automático
10.643 autonomous
1.577 available
3.972 avid
3.307 award
10.416 awarded
2.334 bacterial
4.314 badia
3.352 balancing
12.343 barcelona
2.692 based
3.263 basically
4.494 biannual
7.850 bilingual
6.000 bio-nlp
8.456 bioinformatics
1.455 biology
2.946 bmc
1.941 boston
3.574 brandeis
3.766 buildup
6.168 cambridge
2.052 capable
5.026 características
3.572 carnegie
4.146 catalan
13.462 catalonia
3.664 ccc
3.991 ccg
5.830 cdrom
3.084 center
5.540 cesca
6.000 cicling
12.198 cicling-2000
6.000 cicling-2006
6.000 científico-técnica
3.851 citizen
7.394 city
6.000 clic-talp
6.000 co-published
1.901 cognitive
6.000 colaborator
5.646 coling
2.774 collaboration
9.559 collaborator
1.993 collection
5.646 collocations
3.158 com
2.797 common
1.850 communication
5.823 compact
5.793 competent
10.301 computacional
14.209 computational
6.000 computationally-tractable
7.445 computer
6.000 computerm
2.215 con
6.000 conacyt
4.314 concepción
5.052 conference
9.460 conocimiento
2.319 construction
1.512 content
5.577 convencionales
2.888 coordinate
3.281 coordinating
3.791 coordinator
13.170 corpora
4.545 corpus
2.521 council
3.263 course
7.019 create
12.198 crodri
3.879 cuernavaca
4.339 cum
4.801 curation
1.618 daily
1.723 data
6.087 database
4.666 de
2.376 december
6.000 definition-like
3.320 degree
5.232 del
10.094 demo
6.710 departament
1.958 department
1.430 design
1.856 designed
8.400 desktop
1.851 develop
4.263 developer
3.971 developers
6.073 developing
3.247 development
6.803 dialogue
8.914 dialogues
4.217 dictionaries
7.948 dictionary
10.120 dime
3.165 directory
9.154 discourse
10.999 dissertation
2.973 documents
1.563 domain
1.900 dr
5.559 ed
3.387 editor
1.774 editorial
2.647 education
2.068 educational
4.196 el
3.646 elected
2.398 electronic
2.706 elsevier
4.033 empower
6.405 en
3.598 endangered
3.619 enduring
7.621 engineering
1.071 english
3.179 enormous
4.566 enric
3.751 enrich
5.677 enseñanza
2.558 environments
1.906 es
4.498 esl
6.000 eslema
4.863 especialidad
5.169 especializado
5.830 especializados
7.107 etc
2.561 evaluations
2.114 example
2.281 existing
2.873 expanding
6.062 experience
5.942 explicit
3.420 exploit
6.000 explotación
6.000 explícitas
1.942 extensive
6.000 extracción
12.218 extraction
19.529 fabra
2.033 fast
2.726 february
6.000 february-september
1.656 field
2.081 fields
12.003 filología
4.447 filosofía
4.742 financial
7.896 fluent
2.337 focused
4.433 follower
2.486 formal
7.676 forthcoming
2.513 french
2.881 funding
12.198 gelbukh
6.000 generación
1.326 general
2.829 generates
4.289 geneve
5.401 genomic
5.389 genomics
4.534 gens
2.009 government
2.053 graduate
2.793 grant
5.183 ground
2.332 group
2.400 hall
2.244 harvard
0.783 health
1.945 help
2.143 her
5.747 herramientas
4.908 highly
3.157 him
1.930 his
3.624 houghton
4.447 html
6.419 http
5.496 iberoamericana
2.508 id
1.665 identification
2.284 identifying
1.729 ie
1.106 ii
5.994 iimas
12.198 iling
2.755 immigration
2.963 implement
1.237 important
1.811 improve
2.698 including
8.159 indigenous
3.088 industries
5.577 información
8.910 information
3.338 infrastructure
3.011 innsbruck
5.172 institut
4.198 institute
2.380 instituto
7.305 intelligent
6.141 interested
12.159 interests
2.240 interface
5.049 international
2.568 internet
4.370 investigación
1.371 involved
3.750 ipn
5.485 ipra
1.930 issues
4.613 january
3.433 jointly
0.466 journal
5.153 kinds
9.424 klett
6.000 klett-verlag
5.169 kluwer
7.346 knowledge
5.698 la
12.571 language
12.303 languages
2.701 las
3.916 lately
1.741 later
4.397 laude
5.719 leader
1.912 leading
3.973 learner
8.402 lecture
1.171 level
13.019 lexical
16.352 lexicography
6.000 lexicones
9.911 lexicons
2.796 life
3.372 lifelong
4.801 linguist
3.217 linguistic
18.861 linguistics
16.352 lingüística
4.814 linux
12.198 lionbridge
1.583 literature
2.278 live
3.680 localization
2.795 mac
5.075 machine-learning
2.399 madrid
1.799 mainly
1.299 major
1.362 management
6.346 manager
1.816 manner
2.498 manual
6.000 marked-up
1.492 markers
3.991 massachusetts
6.259 master
2.222 mathematics
5.626 meaning
3.719 mellon
3.232 memories
2.683 mentioned
12.198 metalenguaje
29.042 metalinguistic
6.000 metalingüísticas
3.015 methodologies
6.000 metodológicos
2.976 mexican
11.128 mexico
4.858 mifflin
8.095 mining
2.614 minority
4.828 modeling
1.205 models
5.942 module
4.496 monolingual
4.069 mop
4.848 multiculturalism
4.573 multilingual
9.052 mx
9.406 my
2.647 nacional
6.647 national
1.969 native
6.602 natural
1.656 need
2.246 net
2.296 networks
6.000 nltk
0.849 no
4.351 non-standard
7.846 notes
1.543 novel
2.661 november
4.275 nuevas
4.033 num
5.207 october
6.000 ocxon-digital
2.454 offered
3.430 officials
3.258 oncological
0.981 only
4.568 ontologies
4.014 ontology
3.556 open
5.418 operaciones
2.249 operating
4.422 operations
2.703 oriented
3.008 other
2.000 others
3.458 oviedo
2.378 panel
2.775 papers
5.773 para
8.756 parsing
1.478 part
2.488 partners
1.811 people
7.967 perl
2.054 personal
3.961 ph
2.458 phenomena
3.127 philosophical
9.451 philosophy
2.373 plan
1.704 planning
3.738 poetry
4.166 politécnico
3.200 polytechnic
20.075 pompeu
3.595 portuguese
8.529 postdoctoral
2.358 pp
2.183 practical
14.991 pragmatics
3.752 prentice
1.828 preparation
4.786 press
4.762 principios
3.593 printing
2.502 proceedings
5.239 procesamiento
1.363 process
7.285 processing
1.322 production
1.839 professional
2.936 professor
3.013 programming
3.643 programs
6.881 project
8.634 projects
5.216 proofing
4.816 propuesta
5.789 publications
3.501 publishers
9.416 publishing
8.818 python
3.260 qa
1.421 quality
6.000 quierointernet
3.194 reader
2.272 readily
2.044 reading
1.555 recent
5.104 reconocimiento
4.873 recreating
1.820 red
3.656 regulatory
10.946 regulondb
5.087 relación
3.827 repository
6.000 resaercher
11.021 researcher
8.303 researching
5.807 resources
2.619 retrieval
0.858 review
6.000 riterm
2.651 rules
4.585 sandrini
7.580 scholarships
1.073 school
7.249 science
4.503 sciences
2.170 scientific
4.837 see
2.755 seeking
2.008 segments
1.688 selected
2.838 semantic
3.081 semantics
3.722 seminars
6.000 semántica
3.256 sentences
6.962 september
3.344 series
2.393 sf
1.271 small
4.667 sni
6.000 socially-relevant
2.081 society
7.247 software
3.560 source
2.004 sources
6.000 sp-en
6.593 spain
5.811 spanish
3.659 speaker
3.312 speakers
2.949 speaking
1.883 special
2.696 specialist
5.201 specialized
3.505 spoken
3.427 springer
13.458 springer-verlag
4.113 ssl
1.379 standards
2.284 starting
1.284 state
1.806 statistical
1.545 structural
3.674 structuring
1.816 students
1.660 studies
3.185 stuttgart
5.496 sublanguage
2.293 summer
3.825 system
4.934 systems
5.071 tardío
4.540 task-oriented
4.013 tdx
6.000 tdx-0228105-114717
3.145 teach
4.271 teaches
7.976 teaching
2.180 team
4.975 team-work
7.977 technical
6.000 technically-minded
4.923 techniques
2.954 technological
5.026 technologies
6.217 technology
1.952 term
11.370 terminology
14.231 terminología
1.846 terms
9.702 text
5.631 textos
12.983 texts
5.295 teórica
3.831 theoretic
1.972 theoretical
5.026 tke
4.452 tm
3.857 toni
2.019 tool
9.083 toolkit
7.659 tools
2.809 topics
4.514 toronto
2.825 tracking
2.057 traditional
12.198 traducció
3.002 train
3.847 transactions
5.653 translation
3.629 translations
5.149 tutored
13.906 unam
1.754 units
9.861 universitari
11.044 universitat
3.840 university
4.398 unorthodox
0.963 usa
3.558 usable
0.946 used
3.207 useful
2.768 user
3.247 using
2.957 valencia
6.000 vallduví
3.662 various
6.194 vol
2.790 web
3.294 windows
10.174 wittgenstein
4.184 work
2.866 works
2.846 workshop
1.881 world
4.124 xml
RIC Statistics:
Extraction Method: Keyword Count with Lexical Variants Added
Eliminated words list: MedlinePlus List
Similarity Method: Weighted keyword count
Weighting Method: Term Frequency * Inverse Document Frequency
Database: Medline abstracts (1967 - Present)
Publication Type: All
Score Calculation Method: Cosine Similarity Method
Sort by: Score
Submission date and time: 12-28-2006, 7:28:41
Computation time: 00:00:07
Last updated: Thursday, 28-Dec-2006 07:28:48 CST
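
Illustrative sketch: the settings above name TF*IDF weighting and a cosine similarity score. The Python below shows one common way those two steps can be combined to compare two keyword profiles. It is not RIC's actual code. The function names, the add-one smoothing in the IDF term, and the toy document frequencies are all assumptions made for this example, and the lexical-variant expansion and the MedlinePlus eliminated-words filter are not modelled.

```python
import math
from collections import Counter


def tfidf_weights(tokens, doc_freq, n_docs):
    """Weight each keyword by term frequency * inverse document frequency.

    tokens   : list of keywords extracted from one text
    doc_freq : hypothetical map of keyword -> number of documents containing it
    n_docs   : hypothetical total number of documents in the collection
    """
    counts = Counter(tokens)
    weights = {}
    for word, tf in counts.items():
        df = doc_freq.get(word, 0)
        # Assumed add-one smoothing so an unseen keyword cannot divide by zero.
        idf = math.log((n_docs + 1) / (df + 1))
        weights[word] = tf * idf
    return weights


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse keyword-weight vectors."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty or all-zero profile matches nothing
    return dot / (norm_a * norm_b)


# Toy collection statistics, invented for illustration only.
doc_freq = {"language": 9, "lexical": 4, "journal": 99}
n_docs = 99

profile = tfidf_weights(["language", "language", "lexical", "journal"], doc_freq, n_docs)
abstract = tfidf_weights(["lexical", "language", "journal"], doc_freq, n_docs)
score = cosine_similarity(profile, abstract)
```

Under this weighting a keyword that occurs in nearly every document, such as "journal" in the toy data, gets a weight near zero, while rarer keywords dominate the score. That is in line with the low weights shown for very common words in the table above.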