
ICAR resources

The Complex Corpus Cell (CCC) is a transversal structure of the ICAR laboratory. Its activities revolve around building and exploiting multimedia corpora described as "complex", i.e. combining video, audio, text, image and other content. These activities require specific theoretical and/or methodological reflection, which is characteristic of the CCC's expertise. In addition, the CCC contributes to the pooling and development of practices and know-how within the laboratory through scientific exchanges, training and thematic seminars.

The team

The office, the CCC's entry point, is made up of the permanent staff attached to the CCC: Justine Lascar, Matthieu Quignard and Daniel Valero (manager). The CCC also calls upon contract staff to cover various needs such as software development, transcription, corpus building and audio/video editing. In particular, Laurie Boyer contributes to audio-visual data collection, transcription and data analysis.

The questions raised by the production and processing of corpora are not limited to methodological problems; they also involve reflection on how the work of data collection relates to the requirements of analysis. In the field of linguistic analysis of interaction, this translates in particular into attention to the linguistic and multimodal details that participants produce, mobilise and interpret, and that are made available through adequate recording, transcription and analysis techniques.
In other words, the requirement for continuous accessibility of the relevant details of the interaction governs all stages of corpus building and analysis: from the field collection to the "building" phase, which includes audio-visual editing, transcription, alignment, annotation, up to the actual analysis phase.

Multimodal corpus collection

The first step in the work of interaction analysis is the collection of situational data. Far from being a preliminary, secondary and marginal step that could be conceived independently of analytical objectives, data collection is an integral part of the overall analysis process. How well it is carried out determines not only the quality of the corpora that will be built from the primary data and the quality of the analyses that can be made, but also the possibilities for disseminating both.
Collecting data is therefore not a one-off and purely technical step. It is an undertaking that involves knowledge of the field, the collectors' relations with the various actors involved, the practical and technical dimensions of recording, and various ethical and legal concerns.
Corpus recording is a material and technical operation that must be designed and carried out according to the objectives and objects of analysis. This operation aims at capturing audio/video data so that the linguistic, multimodal and situational details relevant to the recorded interaction (gaze, gestures, movements, actions, objects, physical setting) become available, and therefore analyzable.
The CCC supports researchers in language sciences, education sciences, etc. throughout this recording process.

Transfer, montage - support and training

The montage phase is equally important for making the data available and intelligible. The CCC, a transversal research support structure at the ICAR laboratory, brings its expertise in post-production data processing (synchronization of different sources, anonymization, audio and video editing, etc.), multimedia format management, and data archiving and distribution. It is also in charge of training laboratory members and doctoral students in these methodological and technical aspects.

Exploitation of corpora

After the audiovisual data have been prepared, the interactions must be sequenced and the oral phenomena finely transcribed. Producing this basic material for analysis is a long and meticulous operation (about one hour of work for one minute of signal). Transcriptions are often made with specific software that aligns the audio-video signal with the written text (e.g. ELAN) and, in some cases, allows queries on the lexicon (CAQDAS).
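As a rough illustration of what time-aligned transcription makes possible, the Python sketch below stores each transcribed segment with its start and end times, runs a simple word query, and applies the one-hour-per-minute rule of thumb given above. The `Segment` structure, the function names and the sample utterances are invented for this example. They do not reflect ELAN's file format or any tool used by the CCC.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One time-aligned transcription unit (hypothetical structure)."""
    start_s: float  # start time in the recording, in seconds
    end_s: float    # end time in the recording, in seconds
    speaker: str
    text: str


def find_word(segments, word):
    """Return the segments whose text contains `word` as a whole token."""
    target = word.lower()
    return [s for s in segments if target in s.text.lower().split()]


def transcription_hours(signal_minutes, hours_per_minute=1.0):
    """Estimate transcription effort (default: one hour per minute of signal)."""
    return signal_minutes * hours_per_minute


# Invented sample data, for illustration only.
segments = [
    Segment(0.0, 1.8, "A", "so where do we start"),
    Segment(1.9, 3.2, "B", "we start with the map"),
    Segment(3.4, 4.1, "A", "okay"),
]

hits = find_word(segments, "start")
print([(s.start_s, s.speaker) for s in hits])  # time codes point back to the signal
print(transcription_hours(30))                 # effort estimate for 30 minutes
```

Because every hit keeps its time codes, a query result can always be traced back to the corresponding moment in the recording.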


- Wireless (HF) microphones
- Recorders (Edirol, Marantz, Zoom, Tascam, etc.)
- 4K cameras
- Action cameras (GoPro, Sony)
- 360° cameras
- Corpus room
- Computers

ICAR Audio Visual Reservation Site (RAVI) Access


Transcription and analysis software
- Praat
- Transcriber
- TransICOR
- Transana
- NVivo

Audio-visual montage

- Audacity
- HandBrake
- QuickTime Pro
- iMovie
- Final Cut