Publications
This page lists the project's publications along with supplementary materials from talks.
A more extensive bibliography on "computer-assisted transcription" is available here.
Project publications
| 2010 / in preparation | |||||
|
Schmidt, T. (2010) Another extension of the stylesheet metaphor – Visualising multi-layer annotations as musical scores. In: Witt, A. & Metzing, D. (ed.): Linguistic modelling of information and Markup Languages, 23-44. Dordrecht: Springer. [BibTeX] |
|||||
BibTeX:
@incollection{Schmidt2008a,
author = {Thomas Schmidt},
title = {Another extension of the stylesheet metaphor – Visualising multi-layer annotations as musical scores},
booktitle = {Linguistic modelling of information and Markup Languages},
address = {Dordrecht},
publisher = {Springer},
year = {2010},
pages = {23-44},
note = {EN}
}
|
|||||
|
Schmidt, T. (2010) EXMARaLDA : un système pour la constitution et l'exploitation de corpus oraux. In: Actes du COLLOQUE INTERNATIONAL: POUR UNE EPISTEMOLOGIE DE LA SOCIOLINGUISTIQUE, Montpellier 2009. [BibTeX] [URL] |
|||||
BibTeX:
@inproceedings{ActesMontpellier,
author = {Thomas Schmidt},
title = {EXMARaLDA : un système pour la constitution et l'exploitation de corpus oraux},
booktitle = {Actes du COLLOQUE INTERNATIONAL: POUR UNE EPISTEMOLOGIE DE LA SOCIOLINGUISTIQUE, Montpellier 2009},
year = {2010},
url = {http://www.exmaralda.org/files/Montpellier.pdf}
}
|
|||||
|
Schmidt, T. (in preparation) Grundzüge von EXMARaLDA – einem System zur computergestützten Erstellung und Auswertung von Korpora gesprochener Sprache. In: Rehbein, J. & Kameyama, S. (ed.): Bausteine diskursanalytischen Wissens, Berlin: de Gruyter. [BibTeX] [URL] |
|||||
BibTeX:
@incollection{Schmidt2008b,
author = {Thomas Schmidt},
title = {Grundzüge von EXMARaLDA – einem System zur computergestützten Erstellung und Auswertung von Korpora gesprochener Sprache},
booktitle = {Bausteine diskursanalytischen Wissens},
address = {Berlin},
publisher = {de Gruyter},
year = {forthcoming},
note = {DE}
}
|
|||||
| 2009 | |||||
|
Schmidt, T. (2009) Creating and Working with Spoken Language Corpora in EXMARaLDA. In: Lyding, V. (ed.): LULCL II: Lesser Used Languages & Computer Linguistics II, 151-164. [BibTeX] [URL] |
|||||
BibTeX:
@inproceedings{Schmidt2009Bozen,
author = {Thomas Schmidt},
title = {Creating and Working with Spoken Language Corpora in EXMARaLDA},
booktitle = {LULCL II: Lesser Used Languages \& Computer Linguistics II},
year = {2009},
pages = {151-164},
url = {http://www.eurac.edu/Org/LanguageLaw/Multilingualism/Projects/LULCL_II_proceedings.htm}
}
|
|||||
|
Merkel, S. & Schmidt, T. (2009) Korpora gesprochener Sprache im Netz - eine Umschau. In: Gesprächsforschung (10) 70-93. [BibTeX] [URL] |
|||||
BibTeX:
@article{MerkelSchmidt2009,
author = {Merkel, Silke and Schmidt, Thomas},
title = {Korpora gesprochener Sprache im Netz - eine Umschau},
journal = {Gesprächsforschung},
year = {2009},
volume = {10},
pages = {70-93},
url = {http://www.gespraechsforschung-ozs.de/heft2009/px-merkel.pdf}
}
|
|||||
|
Schmidt, T.; Duncan, S.; Ehmer, O.; Hoyt, J.; Kipp, M.; Magnusson, M.; Rose, T. & Sloetjes, H. (2009) An Exchange Format for Multimodal Annotations. In: Kipp, M.; Martin, J.-C.; Paggio, P. & Heylen, D. (ed.): Multimodal Corpora, Lecture Notes in Computer Science, 207-221. Springer. [BibTeX] [URL] |
|||||
BibTeX:
@incollection{MultiModalSpringer,
author = {Schmidt, Thomas and Duncan, Susan and Ehmer, Oliver and Hoyt, Jeffrey and Kipp, Michael and Magnusson, Magnus and Rose, Travis and Sloetjes, Han},
title = {An Exchange Format for Multimodal Annotations},
booktitle = {Multimodal Corpora},
publisher = {Springer},
year = {2009},
series = {Lecture Notes in Computer Science},
pages = {207-221},
url = {http://www.springer.com/computer/computer+imaging/book/978-3-642-04792-3}
}
|
|||||
|
Schmidt, T. & Wörner, K. (2009) EXMARaLDA – Creating, analysing and sharing spoken language corpora for pragmatic research. In: Pragmatics (19), Allwood, J. (ed.): Corpus-based pragmatics. [BibTeX] |
|||||
BibTeX:
@article{SchmidtWoerner2008,
author = {Thomas Schmidt and Kai Wörner},
title = {EXMARaLDA – Creating, analysing and sharing spoken language corpora for pragmatic research},
booktitle = {Corpus-based pragmatics},
journal = {Pragmatics},
year = {2009},
volume = {19}
}
|
|||||
| 2008 | |||||
|
Lehmberg, T. & Wörner, K. (2008) Annotation Standards. In: Lüdeling, A. & Kytö, M. (ed.): Corpus Linguistics - An international handbook, vol. 1, 484-501. Walter de Gruyter. [BibTeX] |
|||||
BibTeX:
@incollection{LehmbergWoerner2008,
author = {Timm Lehmberg and Kai Wörner},
title = {Annotation Standards},
booktitle = {Corpus Linguistics - An international handbook},
publisher = {Walter de Gruyter},
year = {2008},
volume = {1},
pages = {484-501}
}
|
|||||
|
Schmidt, T. & Bennöhr, J. (2008) Rescuing Legacy Data. In: Language Documentation and Conservation 2(1) 109-129. [BibTeX] [URL] |
|||||
BibTeX:
@article{SchmidtBennoehr2008,
author = {Schmidt, Thomas and Bennöhr, Jasmine},
title = {Rescuing Legacy Data},
journal = {Language Documentation and Conservation},
year = {2008},
volume = {2},
number = {1},
pages = {109-129},
url = {http://hdl.handle.net/10125/1803}
}
|
|||||
|
Schmidt, T.; Duncan, S.; Ehmer, O.; Hoyt, J.; Kipp, M.; Magnusson, M.; Rose, T. & Sloetjes, H. (2008) An exchange format for multimodal annotations. In: Proceedings of the Language Resources and Evaluation Conference 2008. [BibTeX] |
|||||
BibTeX:
@inproceedings{Schmidtetal2008,
author = {Schmidt, Thomas and Duncan, Susan and Ehmer, Oliver and Hoyt, Jeffrey and Kipp, Michael and Magnusson, Magnus and Rose, Travis and Sloetjes, Han},
title = {An exchange format for multimodal annotations},
booktitle = {Proceedings of the Language Resources and Evaluation Conference 2008},
year = {2008}
}
|
|||||
|
Schmidt, T. (2008) GAT: Aspekte der computertechnischen Umsetzbarkeit. Ms., Universität Hamburg / IDS Mannheim. [BibTeX] [URL] |
|||||
BibTeX:
@techreport{GAT2008,
author = {Thomas Schmidt},
title = {GAT: Aspekte der computertechnischen Umsetzbarkeit},
year = {2008},
institution = {Universität Hamburg / IDS Mannheim},
url = {http://www.exmaralda.org/files/GAT_Analyse2.pdf}
}
|
|||||
| 2007 | |||||
|
Schmidt, T. (2007) Transkriptionskonventionen für die computergestützte gesprächsanalytische Transkription. In: Gesprächsforschung (8) 229-241. [BibTeX] [URL] |
|||||
BibTeX:
@article{Schmidt2007,
author = {Thomas Schmidt},
title = {Transkriptionskonventionen für die computergestützte gesprächsanalytische Transkription},
journal = {Gesprächsforschung},
year = {2007},
volume = {8},
pages = {229-241},
url = {http://www.gespraechsforschung-ozs.de/heft2007/heft2007.htm}
}
|
|||||
|
Baumgarten, N.; Herkenrath, A.; Schmidt, T.; Wörner, K. & Zeevaert, L. (2007) Studying Connectivity with the Help of Computer-Readable Corpora: Some Exemplary Analyses from Modern and Historical, Written and Spoken Corpora. In: Rehbein, J.; Hohenstein, C. & Pietsch, L. (ed.): Connectivity in Grammar and Discourse, Hamburg Studies in Multilingualism 5. Amsterdam: Benjamins. [Abstract] [BibTeX] |
|||||
| Abstract: This paper discusses methodological aspects of the use of electronic language corpora for the study of connectivity. We demonstrate how a corpus-based approach was used to investigate functional characteristics of coordinating elements in sentence- or utterance-initial position across different languages (English, German, Old Swedish and Turkish), across different modalities (written and spoken) and across the diachronic dimension (historic and modern languages). Our focus is on the difficulties we encountered in this study when attempting to transfer corpus-based methods developed for the analysis of corpora of modern, written language to the analysis of corpora of historic or spoken language. We suggest an abstract corpus-linguistic workflow and discuss where and how this workflow differs according to the corpus type, and how well its individual steps are supported by current corpus technology. | |||||
BibTeX:
@incollection{Baumgarten2007,
author = {Baumgarten, N. and Herkenrath, A. and Schmidt, T. and Wörner, K. and Zeevaert, L.},
title = {Studying Connectivity with the Help of Computer-Readable Corpora: Some Exemplary Analyses from Modern and Historical, Written and Spoken Corpora},
booktitle = {Connectivity in Grammar and Discourse},
address = {Amsterdam},
publisher = {Benjamins},
year = {2007},
series = {Hamburg Studies in Multilingualism},
volume = {5},
note = {EN}
}
|
|||||
|
Höder, S.; Wörner, K. & Zeevaert, L. (2007) Corpus-based investigations on word order change: The case of Old Nordic. In: Arbeiten zur Mehrsprachigkeit, Folge B (81) 1 ff. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: This paper presents results from an interdisciplinary cooperation within the Collaborative Research Centre on Multilingualism. First results of this cooperation were published in an earlier paper (BAUMGARTEN et al. 2007) concentrating on an investigation of functional characteristics of coordinating elements in English, German, Old Swedish and Turkish corpora. The aim of the second part of the cooperation was to develop corpus linguistic methods in order to be able to examine word order change in subordinate clauses in older Swedish and Danish texts in comparison to Old West Norse. The starting point for the investigation was the observation that the word order in Swedish main clauses is rather stable from the earliest written sources up to contemporary Swedish, whereas in subordinate clauses, from a diachronic perspective, far-reaching changes can be observed. Starting from the hypothesis that language contact triggered this change, a comparison of an Old Swedish, an Old Danish and an Old West Norse version of the Story of Charlemagne was performed. The West Norse version almost exclusively shows verb second order and no examples of verb late order. In the Danish and the Swedish versions, verb second is also the main option, but more examples of the finite verb in a later position can be found in both texts. In our opinion it seems to be reasonable to suggest that the development of new text types based on Latin models triggered the change that can be observed in the East Norse texts. | |||||
BibTeX:
@article{Hoeder2007,
author = {Steffen Höder and Kai Wörner and Ludger Zeevaert},
title = {Corpus-based investigations on word order change: The case of Old Nordic},
journal = {Arbeiten zur Mehrsprachigkeit, Folge B},
year = {2007},
volume = {81},
pages = {1 ff},
url = {http://www.exmaralda.org/files/azm81.pdf}
}
|
|||||
| 2006 | |||||
|
Rohlfing, K.; Loehr, D.; Duncan, S.; Brown, A.; Franklin, A.; Kimbara, I.; Milde, J.; Parrill, F.; Rose, T.; Schmidt, T.; Sloetjes, H.; Thies, A. & Wellinghoff, S. (2006) Comparison of multimodal annotation tools – workshop report. In: Gesprächsforschung (7) 99-123. [BibTeX] [URL] |
|||||
BibTeX:
@article{Rohlfing2006,
author = {Rohlfing, Katharina and Loehr, Daniel and Duncan, Susan and Brown, Amanda and Franklin, Amy and Kimbara, Irene and Milde, Jan-Torsten and Parrill, Fey and Rose, Travis and Schmidt, Thomas and Sloetjes, Han and Thies, Alexandra and Wellinghoff, Sandra},
title = {Comparison of multimodal annotation tools – workshop report},
journal = {Gesprächsforschung},
year = {2006},
volume = {7},
pages = {99-123},
note = {EN},
url = {http://www.gespraechsforschung-ozs.de/heft2006/tb-rohlfing.pdf}
}
|
|||||
|
Schmidt, T.; Chiarcos, C.; Lehmberg, T.; Rehm, G.; Witt, A. & Hinrichs, E. (2006) Avoiding Data Graveyards: From Heterogeneous Data Collected in Multiple Research Projects to Sustainable Linguistic Resources. In: Proceedings of the E-MELD 2006 Workshop on Digital Language Documentation: Tools and Standards: The State of the Art, Lansing, Michigan. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: This paper describes a new research initiative addressing the issue of sustainability of linguistic resources. The initiative is a cooperation between three collaborative research centres in Germany – the SFB 441 “Linguistic Data Structures” in Tübingen, the SFB 538 “Multilingualism” in Hamburg, and the SFB 632 “Information Structure” in Potsdam/Berlin. The aim of the project is to develop methods for sustainable archiving of the diverse bodies of linguistic data used at the three sites. In the first half of the paper, the data handling solutions developed so far at the three centres are briefly introduced. This is followed by an assessment of their commonalities and differences and of what these entail for the work of the new joint initiative. The second part then sketches seven areas of open questions with respect to sustainable data handling and gives a more detailed account of two of them – integration of linguistic terminologies and development of best practice guidelines. | |||||
BibTeX:
@inproceedings{Schmidt2006,
author = {Schmidt, Thomas and Chiarcos, Christian and Lehmberg, Timm and Rehm, Georg and Witt, Andreas and Hinrichs, Erhard},
title = {Avoiding Data Graveyards: From Heterogeneous Data Collected in Multiple Research Projects to Sustainable Linguistic Resources},
booktitle = {Proceedings of the E-MELD 2006 Workshop on Digital Language Documentation: Tools and Standards: The State of the Art},
address = {Lansing, Michigan},
year = {2006},
note = {EN},
url = {http://www.exmaralda.org/files/EMELD_final.pdf}
}
|
|||||
|
Wörner, K.; Witt, A.; Rehm, G. & Dipper, S. (2006) Modelling Linguistic Data Structures. In: Proceedings of the Extreme Markup Languages 2006, Montréal, Canada. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: Linguistic corpora have been annotated by means of SGML-based markup languages for almost 20 years. We can, very roughly, differentiate between three distinct evolutionary stages of markup technologies. (1) Originally, single SGML tree-based document instances were deemed sufficient for the representation of linguistic structures. (2) Linguists began to realize that alternatives and extensions to the traditional model are needed. Formalisms such as, for example, NITE were proposed: the NITE Object Model (NOM) consists of multi-rooted trees. (3) We are now on the threshold of the third evolutionary stage: even NITE's very flexible approach is not suited for all linguistic purposes. As some structures, such as these, cannot be modeled by multi-rooted trees, an even more flexible approach is needed in order to provide a generic annotation format that is able to represent genuinely arbitrary linguistic data structures. | |||||
BibTeX:
@inproceedings{Woerner2006,
author = {Wörner, Kai and Witt, Andreas and Rehm, Georg and Dipper, Stefanie},
title = {Modelling Linguistic Data Structures},
booktitle = {Proceedings of the Extreme Markup Languages 2006},
address = {Montréal, Canada},
year = {2006},
note = {EN},
url = {http://www.idealliance.org/papers/extreme/proceedings/html/2006/Witt01/EML2006Witt01.html}
}
|
|||||
| 2005 | |||||
|
Schmidt, T. (2005) Computergestützte Transkription - Modellierung und Visualisierung gesprochener Sprache mit texttechnologischen Mitteln. Frankfurt a. M.: Peter Lang. [BibTeX] [URL] |
|||||
BibTeX:
@book{Schmidt2005c,
author = {Schmidt, Thomas},
title = {Computergestützte Transkription - Modellierung und Visualisierung gesprochener Sprache mit texttechnologischen Mitteln},
address = {Frankfurt a. M.},
publisher = {Peter Lang},
year = {2005},
series = {Sprache, Sprechen und Computer - Computer Studies in Language and Speech},
volume = {7},
note = {DE},
url = {http://www.exmaralda.org/files/Diss_INHALT.pdf}
}
|
|||||
|
Schmidt, T. (2005) Datenarchive für die Gesprächsforschung. Perspektiven, Probleme und Lösungsansätze. In: Gesprächsforschung (6) 103-126. [BibTeX] [URL] |
|||||
BibTeX:
@article{Schmidt2005a,
author = {Schmidt, Thomas},
title = {Datenarchive für die Gesprächsforschung. Perspektiven, Probleme und Lösungsansätze},
journal = {Gesprächsforschung},
year = {2005},
volume = {6},
pages = {103-126},
note = {DE},
url = {http://www.gespraechsforschung-ozs.de/heft2005/px-schmidt.pdf}
}
|
|||||
|
Schmidt, T. (2005) EXMARaLDA und die Datenbank "Mehrsprachigkeit" - Konzepte und praktische Erfahrungen. In: Dipper, S. & Stede, M. (ed.): Heterogeneity in Focus: Creating and Using Linguistic Databases, Interdisciplinary Studies on Information Structure (ISIS) 2, 21-42. Potsdam: Universitätsverlag Potsdam. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: This paper presents some concepts and principles used in the development of a database of multilingual spoken discourse at the University of Hamburg. The emphasis of the first part is on general considerations for the handling of heterogeneous data sets: After showing that diversity in transcription data is partly conceptually and partly technologically motivated, it is argued that the processing of transcription corpora should be approached via a three-level architecture which separates form (application) and content (data) on the one hand, and logical and physical data structures on the other hand. Such an architecture does not only pave the way for modern text-technological approaches to linguistic data processing, it can also help to decide where and how a standardization in the work with heterogeneous data is possible and desirable and where it would run counter to the needs of the research community. It is further argued that, in order to ensure user acceptance, new solutions developed in this approach must take care not to abandon established concepts too quickly. The focus of the second part is on some practical experiences with users and technologies gained in the four years’ project work. Concerning the practical development work, the value of open standards like XML and Unicode is emphasized and some limitations of the “platform-independent” JAVA technology are indicated. With respect to users of the EXMARaLDA system, a predominantly conservative attitude towards technological innovations in transcription corpus work can be stated: individual users tend to stick to known functionalities and are reluctant to adopt themselves to the new possibilities. Furthermore, an active commitment to cooperative corpus work still seems to be the exception rather than the rule.
It is concluded that technological innovations can contribute their share to a progress in the work with heterogeneous linguistic data, but that they will have to be supplemented, in the long run, with an adequate methodological reflection and the creation of an appropriate infrastructure. | |||||
BibTeX:
@incollection{Schmidt2005d,
author = {Schmidt, Thomas},
title = {EXMARaLDA und die Datenbank "Mehrsprachigkeit" - Konzepte und praktische Erfahrungen},
booktitle = {Heterogeneity in Focus: Creating and Using Linguistic Databases},
address = {Potsdam},
publisher = {Universitätsverlag Potsdam},
year = {2005},
series = {Interdisciplinary Studies on Information Structure (ISIS)},
volume = {2},
pages = {21-42},
note = {DE},
url = {http://www.exmaralda.org/files/Paper_Potsdam.pdf}
}
|
|||||
|
Schmidt, T. (2005) Modellbildung und Modellierungsparadigmen in der computergestützten Korpusanalyse. In: Fisseni, B.; Schmitz, H.; Schröder, B. & Wagner, P. (ed.): Sprachtechnologie, mobile Kommunikation und linguistische Ressourcen. Beiträge zur GLDV-Tagung 2005 in Bonn, Sprache, Sprechen und Computer - Computer Studies in Language and Speech 8. Frankfurt a. M. [BibTeX] |
|||||
BibTeX:
@inproceedings{Schmidt2005b,
author = {Schmidt, Thomas},
title = {Modellbildung und Modellierungsparadigmen in der computergestützten Korpusanalyse},
booktitle = {Sprachtechnologie, mobile Kommunikation und linguistische Ressourcen. Beiträge zur GLDV-Tagung 2005 in Bonn},
address = {Frankfurt a. M.},
year = {2005},
series = {Sprache, Sprechen und Computer - Computer Studies in Language and Speech},
volume = {8},
note = {DE}
}
|
|||||
|
Schmidt, T. (2005) Time-based data models and the Text Encoding Initiative's guidelines for transcription of speech. In: Arbeiten zur Mehrsprachigkeit, Folge B (62) 1 ff. [BibTeX] [URL] |
|||||
BibTeX:
@article{Schmidt2005e,
author = {Schmidt, Thomas},
title = {Time-based data models and the Text Encoding Initiative's guidelines for transcription of speech},
journal = {Arbeiten zur Mehrsprachigkeit, Folge B},
year = {2005},
volume = {62},
pages = {1 ff},
note = {EN},
url = {http://www.exmaralda.org/files/SFB_AzM62.pdf}
}
|
|||||
|
Schmidt, T. & Wörner, K. (2005) Erstellen und Analysieren von Gesprächskorpora mit EXMARaLDA. In: Gesprächsforschung (6) 171-195. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: This article gives an overview of EXMARaLDA, a system consisting of a data model, data formats and software tools for the computer-assisted creation and analysis of corpora of spoken language. The presentation focuses on the use of the various software tools (a Partitur-Editor for creating transcriptions, a Corpus-Manager for creating and managing corpora, and a search tool for analysing such corpora) for conversation-analytic purposes. | |||||
BibTeX:
@article{Schmidt2005,
author = {Schmidt, Thomas and Wörner, Kai},
title = {Erstellen und Analysieren von Gesprächskorpora mit EXMARaLDA},
journal = {Gesprächsforschung},
year = {2005},
volume = {6},
pages = {171-195},
note = {DE},
url = {http://www.gespraechsforschung-ozs.de/heft2005/px-woerner.pdf}
}
|
|||||
| 2004 | |||||
|
Rehbein, J.; Schmidt, T.; Meyer, B.; Watzke, F. & Herkenrath, A. (2004) Handbuch für das computergestützte Transkribieren nach HIAT. In: Arbeiten zur Mehrsprachigkeit, Folge B (56) 1 ff. [BibTeX] [URL] |
|||||
BibTeX:
@article{Rehbein2004,
author = {Rehbein, Jochen and Schmidt, Thomas and Meyer, Bernd and Watzke, Franziska and Herkenrath, Annette},
title = {Handbuch für das computergestützte Transkribieren nach HIAT},
journal = {Arbeiten zur Mehrsprachigkeit, Folge B},
year = {2004},
volume = {56},
pages = {1 ff},
note = {DE},
url = {http://www.exmaralda.org/files/azm_56.pdf}
}
|
|||||
|
Schmidt, T. (2004) EXMARaLDA - ein Modellierungs- und Visualisierungsverfahren für die computergestützte Transkription gesprochener Sprache. In: Buchberger, E. (ed.): Proceedings of Konvens 2004, Schriftenreihe der Österreichischen Gesellschaft für Artificial Intelligence 5. Wien. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: This paper attempts a new look at computer assisted transcription as it is commonly practised within the fields of discourse analysis and language acquisition studies. The first part proposes a bridge between discourse analytical methodology and text technological methods with the concept of modelling as its central idea. The second part demonstrates the EXMARaLDA system, a set of formats and tools for computer assisted transcription that builds on the ideas developed in the first part and implements them in a way that can lead to significant improvement in current research practice. | |||||
BibTeX:
@inproceedings{Schmidt2004,
author = {Schmidt, Thomas},
title = {EXMARaLDA - ein Modellierungs- und Visualisierungsverfahren für die computergestützte Transkription gesprochener Sprache},
booktitle = {Proceedings of Konvens 2004},
address = {Wien},
year = {2004},
series = {Schriftenreihe der Österreichischen Gesellschaft für Artificial Intelligence},
volume = {5},
note = {DE},
url = {http://www.exmaralda.org/files/Konvens_Paper.pdf}
}
|
|||||
|
Schmidt, T. (2004) EXMARaLDA - ein System zur computergestützten Diskurstranskription. In: Mehler, A. & Lobin, H. (ed.): Automatische Textanalyse. Systeme und Methoden zur Annotation und Analyse natürlichsprachlicher Texte, 203-218. Wiesbaden: Verlag für Sozialwissenschaften. [BibTeX] |
|||||
BibTeX:
@incollection{Schmidt2004b,
author = {Schmidt, Thomas},
title = {EXMARaLDA - ein System zur computergestützten Diskurstranskription},
booktitle = {Automatische Textanalyse. Systeme und Methoden zur Annotation und Analyse natürlichsprachlicher Texte},
address = {Wiesbaden},
publisher = {Verlag für Sozialwissenschaften},
year = {2004},
pages = {203-218},
note = {DE}
}
|
|||||
|
Schmidt, T. (2004) Transcribing and annotating spoken language with EXMARaLDA. In: Proceedings of the LREC-Workshop on XML based richly annotated corpora, Lisbon 2004, Paris: ELRA. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: This paper describes EXMARaLDA, an XML-based framework for the construction, dissemination and analysis of corpora of spoken language transcriptions. Departing from a prototypical example of a “partitur” (musical score) transcription, the EXMARaLDA “single timeline, multiple tiers” data model and format is presented alongside with the EXMARaLDA Partitur-Editor, a tool for inputting and visualizing such data. This is followed by a discussion of the interaction of EXMARaLDA with other frameworks and tools that work with similar data models. Finally, this paper presents an extension of the “single timeline, multiple tiers” data model and describes its application within the EXMARaLDA system. | |||||
BibTeX:
@inproceedings{Schmidt2004a,
author = {Schmidt, Thomas},
title = {Transcribing and annotating spoken language with EXMARaLDA},
booktitle = {Proceedings of the LREC-Workshop on XML based richly annotated corpora, Lisbon 2004},
address = {Paris},
publisher = {ELRA},
year = {2004},
note = {EN},
url = {http://www.exmaralda.org/files/Paper_LREC.pdf}
}
|
|||||
|
Schmidt, T.; MacWhinney, B.; Martell, C.; Wagner, J.; Wittenburg, P. & Hoffer, E. (2004) Collaborative Commentary: Opening Up Spoken Language Databases. In: Proceedings of the Language Resources and Evaluation Conference 2004, Lisbon, Paris: ELRA. [BibTeX] |
|||||
BibTeX:
@inproceedings{Schmidt2004c,
author = {Thomas Schmidt and B. MacWhinney and C. Martell and J. Wagner and P. Wittenburg and E. Hoffer},
title = {Collaborative Commentary: Opening Up Spoken Language Databases},
booktitle = {Proceedings of the Language Resources and Evaluation Conference 2004, Lisbon},
address = {Paris},
publisher = {ELRA},
year = {2004},
note = {EN}
}
|
|||||
| 2003 | |||||
|
Schmidt, T. (2003) Korpus "Skandinavische Semikommunikation" - ein mehrsprachiges Diskurskorpus auf XML-Basis. In: Sprachtechnologie für die multilinguale Kommunikation - Textproduktion, Recherche, Übersetzung, Lokalisierung. Beiträge der GLDV-Frühjahrstagung 2003 an der Hochschule Anhalt (FH) in Köthen, 421-427. [BibTeX] [URL] |
|||||
BibTeX:
@inproceedings{Schmidt2003,
author = {Schmidt, Thomas},
title = {Korpus "Skandinavische Semikommunikation" - ein mehrsprachiges Diskurskorpus auf XML-Basis},
booktitle = {Sprachtechnologie für die multilinguale Kommunikation - Textproduktion, Recherche, Übersetzung, Lokalisierung. Beiträge der GLDV-Frühjahrstagung 2003 an der Hochschule Anhalt (FH) in Köthen},
year = {2003},
pages = {421-427},
note = {DE},
url = {http://www.ldv-forum.org/2003_Doppelheft/421-427_Schmidt.pdf}
}
|
|||||
|
Schmidt, T. (2003) Visualising Linguistic Annotation as Interlinear Text. In: Arbeiten zur Mehrsprachigkeit, Folge B (46) 1 ff. [BibTeX] [URL] |
|||||
BibTeX:
@article{Schmidt2003a,
author = {Schmidt, Thomas},
title = {Visualising Linguistic Annotation as Interlinear Text},
journal = {Arbeiten zur Mehrsprachigkeit, Folge B},
year = {2003},
volume = {46},
pages = {1 ff.},
note = {EN},
url = {http://www.exmaralda.org/files/Visualising-final.pdf}
}
|
|||||
| 2002 | |||||
|
Schmidt, T. (2002) EXMARaLDA - ein System zur Diskurstranskription auf dem Computer. In: Arbeiten zur Mehrsprachigkeit, Folge B (34) 1 ff. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: EXMARaLDA is a system for computer transcription of spoken discourse that is being developed at the SFB “Mehrsprachigkeit” as a basis of a multilingual discourse database into which the transcriptions in use at the SFB will be integrated at a later point in time. The present paper describes the theoretical background of the development – a formal model of discourse transcription based on the annotation graph formalism (Bird/Liberman (2001)) – and its practical realisation in the form of an XML-based data format and several tools for input, output and manipulation of the data. | |||||
BibTeX:
@article{Schmidt2002b,
author = {Schmidt, Thomas},
title = {EXMARaLDA - ein System zur Diskurstranskription auf dem Computer},
journal = {Arbeiten zur Mehrsprachigkeit, Folge B},
year = {2002},
volume = {34},
pages = {1 ff.},
note = {DE},
url = {http://www.exmaralda.org/files/AZM.pdf}
}
|
|||||
|
Schmidt, T. (2002) EXMARaLDA - un système de transcription computationelle comme base d'un corpus de la langue parlée multilingue. In: Journée d’Étude de l’ATALA, Paris. [BibTeX] [URL] |
|||||
BibTeX:
@inproceedings{Schmidt2002c,
author = {Thomas Schmidt},
title = {EXMARaLDA - un système de transcription computationelle comme base d'un corpus de la langue parlée multilingue},
booktitle = {Journée d’Étude de l’ATALA},
address = {Paris},
year = {2002},
note = {FR},
url = {http://www.up.univ-mrs.fr/veronis/Atala/jecorpus/Schmidt.html}
}
|
|||||
|
Schmidt, T. (2002) Gesprächstranskription auf dem Computer: das System EXMARaLDA. In: Gesprächsforschung (3) 1-23. [BibTeX] [URL] |
|||||
BibTeX:
@article{Schmidt2002a,
author = {Schmidt, Thomas},
title = {Gesprächstranskription auf dem Computer: das System EXMARaLDA},
journal = {Gesprächsforschung},
year = {2002},
volume = {3},
pages = {1-23},
note = {DE},
url = {http://www.gespraechsforschung-ozs.de/heft2002/px-schmidt.pdf}
}
|
|||||
|
Schmidt, T. (2002) Stellungnahme zu Wolfgang Schneiders Artikel "Annotate in Transkriptionen aus DV-technischer Sicht". In: Gesprächsforschung (3) 237-249. [BibTeX] [URL] |
|||||
BibTeX:
@article{Schmidt2002,
author = {Schmidt, Thomas},
title = {Stellungnahme zu Wolfgang Schneiders Artikel "Annotate in Transkriptionen aus DV-technischer Sicht"},
journal = {Gesprächsforschung},
year = {2002},
volume = {3},
pages = {237-249},
note = {DE},
url = {http://www.gespraechsforschung-ozs.de/heft2002/px-schmidt-2.pdf}
}
|
|||||
| 2001 | |||||
|
Schmidt, T. (2001) The transcription system EXMARaLDA: An application of the annotation graph formalism as the Basis of a Database of Multilingual Spoken Discourse. In: Bird, S.; Buneman, P. & Liberman, M. (ed.): Proceedings of the IRCS Workshop On Linguistic Databases, 11-13 December 2001, 219-227. Philadelphia: Institute for Research in Cognitive Science, University of Pennsylvania. [Abstract] [BibTeX] [URL] |
|||||
| Abstract: This paper describes EXMARaLDA, a system for computer transcription of spoken discourse developed and used by the SFB "Mehrsprachigkeit" at the university of Hamburg. EXMARaLDA consists of several DTDs for XML coding of transcription data and some input and output tools for these formats. Apart from being a transcription system in its own right, EXMARaLDA also plays the role of a mediator between older existing data formats at the SFB and between these formats and a planned database of multilingual spoken discourse. | |||||
BibTeX:
@inproceedings{Schmidt2001,
author = {Schmidt, Thomas},
title = {The transcription system EXMARaLDA: An application of the annotation graph formalism as the Basis of a Database of Multilingual Spoken Discourse},
booktitle = {Proceedings of the IRCS Workshop On Linguistic Databases, 11-13 December 2001},
address = {Philadelphia},
publisher = {Institute for Research in Cognitive Science, University of Pennsylvania},
year = {2001},
pages = {219-227},
note = {EN},
url = {http://www.exmaralda.org/files/IRCS_Paper.pdf}
}
|
|||||
Talks - Materials
All slides and handouts for the talks are available in PDF format.
FiSS Herbstschule
Universität Hamburg, November 2009
Slides from a talk on standards for linguistic corpora
Slides from a talk on EXMARaLDA and FOLKER
EXMARaLDA
IDS Fachmesse Korpustechnologie, Mannheim, March 2009
Poster
Transcription Tools, Transcription Conventions, and the TEI Guidelines for Transcriptions of Speech
TEI Members Meeting, London, November 2008
Slides
Processing Pipelines für Korpora gesprochener Sprache
Workshop 'Processing Pipelines', Darmstadt, July 2008
Slides
Möglichkeiten der computergestützten Erstellung und Analyse von Korpora gesprochener Sprache
Conference of the Gesellschaft für Angewandte Linguistik, Hildesheim, Sep 2007
Slides
EXMARaLDA – Creating, analysing and sharing spoken language corpora for pragmatic research
10th International Pragmatics Conference (IPrA), Göteborg, Jul 2007
Slides
Creating and Analysing Bilingual Language Corpora With EXMARaLDA
6th International Symposium on Bilingualism (ISB6), Hamburg, May 2007
Poster
Perspektiven und Probleme der Computergestützten Verarbeitung von Gesprächskorpora
Arbeitstagung zur Gesprächsforschung, Mannheim, Mar 2007
Slides
From Project Data To Sustainable Archiving of Linguistic Corpora
TEI Members Meeting, Victoria, BC, Oct 2006
Poster
Computergestützte Erstellung und Auswertung von Korpora gesprochener Sprache mit EXMARaLDA
Plenary talk, Universität Bielefeld, May 2005
Slides
Database "Multilingualism" - Perspectives for collaborative corpus construction and collaborative commentary
LREC Conference, Lisbon, May 2004
Slides
Multilingual Data
Workshop, SFB "Mehrsprachigkeit", Universität Hamburg, Jul 2003
Slides: Introduction
Handout: Introduction
Slides: Talk
Handout: Talk
EXMARaLDA
Journée d'Etude de l'ATALA - Constitution et exploitation de corpus du français parlé, Paris, May 2002
Poster
IRCS Workshop on Linguistic Databases
Workshop, University of Pennsylvania, Philadelphia, Dec 2001
Slides
Handout



