10 `emuR` - package functions

This chapter gives an overview of the essential functions and central objects provided by the emuR package. It is not meant as a comprehensive list of every function and object provided by emuR, but rather tries to group the essential functions into meaningful categories for easier navigation. The categories presented in this chapter are:

Import and conversion routines (Section 10.1),
emuDB interaction and configuration routines (Section 10.2),
EMU-webApp configuration routines (Section 10.3),
Data extraction routines (Section 10.4),
Central objects in emuR (Section 10.5), and
Export routines (Section 10.6).

If a comprehensive list of every function and object provided by the emuR package is required, R’s help() function (see R code snippet below) can be used.

help(package="emuR")

10.1 Import and conversion routines

As most people who are starting to use the EMU-SDMS will probably already have some form of annotated data, we will first show how to convert existing data to the emuDB format. For a guide to creating an emuDB from scratch and for information about this format see Chapter 5.

10.1.1 Legacy EMU databases

For people transitioning to emuR from the legacy EMU system, emuR provides a function for converting existing legacy EMU databases to the new emuDB format. The R code snippet below shows how to convert a legacy database that is part of the demo data provided by the emuR package.

# load the package
library(emuR)

# create demo data in directory provided by the tempdir() function
create_emuRdemoData(dir = tempdir())

# get the path to a .tpl file of
# a legacy EMU database that is part of the demo data
tplPath = file.path(tempdir(),
                    "emuR_demoData",
                    "legacy_ae",
                    "ae.tpl")

# convert this legacy EMU database to the emuDB format
convert_legacyEmuDB(emuTplPath = tplPath, 
                    targetDir = tempdir())

This will create a new emuDB in a temporary directory, provided by R’s tempdir() function, containing all the information specified in the .tpl file. The name of the new emuDB is the same as the basename of the .tpl file from which it was generated. In other words, if the template file of the legacy EMU database has path A and the directory to which the converted database is to be written has path B, then convert_legacyEmuDB(emuTplPath = "A", targetDir = "B") will create an emuDB directory in B from the information stored in A.
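
As a minimal sketch of what can be done next, the converted database can be loaded and inspected. The snippet assumes that the converted emuDB is written to a directory named after the .tpl basename with the suffix _emuDB (i.e., ae_emuDB inside tempdir()); the variable names demoDir, convertedPath and legacyAe are only illustrative.

```r
library(emuR)

# create the demo data only if it is not present yet
demoDir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demoDir)) create_emuRdemoData(dir = tempdir())

# assumed location of the converted emuDB:
# <targetDir>/<basename of the .tpl file>_emuDB
convertedPath = file.path(tempdir(), "ae_emuDB")

# run the conversion only if it has not been done already
if (!dir.exists(convertedPath)) {
  convert_legacyEmuDB(emuTplPath = file.path(demoDir, "legacy_ae", "ae.tpl"),
                      targetDir = tempdir(),
                      verbose = FALSE)
}

# load the converted emuDB and print an overview of its content
legacyAe = load_emuDB(convertedPath, verbose = FALSE)
summary(legacyAe)
```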

10.1.2 TextGrid collections

A further function provided is the convert_TextGridCollection() function. This function converts an existing .TextGrid and .wav file collection to the emuDB format. In order to pair the correct files together the .TextGrid files and the .wav files must have the same name (i.e., file name without extension). A further restriction is that the tiers contained within all the .TextGrid files have to be equal in name and type (equal subsets can be chosen using the tierNames argument of the function). For example, if all .TextGrid files contain the tiers Syl: IntervalTier, Phonetic: IntervalTier and Tone: TextTier the conversion will work. However, if a single .TextGrid of the collection has the additional tier Word: IntervalTier the conversion will fail, although it can be made to work by specifying the equal tier subset equalSubset = c('Syl', 'Phonetic', 'Tone') and passing it into the function argument convert_TextGridCollection(..., tierNames = equalSubset, ...). The R code snippet below shows how to convert a TextGrid collection to the emuDB format.

# get the path to a directory containing
# .wav & .TextGrid files that is part of the demo data
path2directory = file.path(tempdir(),
                           "emuR_demoData",
                           "TextGrid_collection")

# convert this TextGridCollection to the emuDB format
convert_TextGridCollection(path2directory, 
                           dbName = "myTGcolDB",
                           targetDir = tempdir())

The above R code snippet creates a new emuDB called myTGcolDB in the directory tempdir(). The emuDB contains all the tier information from the .TextGrid files but no hierarchical information, as .TextGrid files do not contain any linking information. It is, however, possible to semi-automatically generate links between time-bearing levels using the autobuild_linkFromTimes() function. An example of this was given in Chapter 3.
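
The following sketch illustrates how such links might be generated for the converted TextGrid collection. The level names Phoneme and Phonetic are assumptions about the tiers contained in the demo collection, which is why the snippet checks that both levels exist before calling autobuild_linkFromTimes(); the _emuDB directory suffix and the variable names are likewise assumptions made for illustration.

```r
library(emuR)

# create the demo data only if it is not present yet
demoDir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demoDir)) create_emuRdemoData(dir = tempdir())

# convert the TextGrid collection only if this has not been done already
dbPath = file.path(tempdir(), "myTGcolDB_emuDB")
if (!dir.exists(dbPath)) {
  convert_TextGridCollection(file.path(demoDir, "TextGrid_collection"),
                             dbName = "myTGcolDB",
                             targetDir = tempdir(),
                             verbose = FALSE)
}
myTGcolDB = load_emuDB(dbPath, verbose = FALSE)

# assumed level names (adapt them to the tiers of your own collection)
superLevel = "Phoneme"
subLevel = "Phonetic"

# only build links if both levels are present in the converted emuDB
if (all(c(superLevel, subLevel) %in% list_levelDefinitions(myTGcolDB)$name)) {
  autobuild_linkFromTimes(myTGcolDB,
                          superlevelName = superLevel,
                          sublevelName = subLevel,
                          convertSuperlevel = TRUE,
                          newLinkDefType = "ONE_TO_MANY",
                          verbose = FALSE)
}

# show the link definitions that are now part of the emuDB
list_linkDefinitions(myTGcolDB)
```

Note that this sketch modifies the annotations of the converted emuDB and is not meant to be run twice on the same database.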

10.1.3 BPF collections

Similar to the convert_TextGridCollection() function, the emuR package also provides a function for converting file collections consisting of BPF and .wav files to the emuDB format. The R code snippet below shows how this can be achieved.

# get the path to a directory containing
# .wav & .par files that is part of the demo data
path2directory = file.path(tempdir(),
                           "emuR_demoData",
                           "BPF_collection")

# convert this BPFCollection to the emuDB format
# (verbose = FALSE is only set to avoid additional output in manual)
convert_BPFCollection(path2directory, 
                      dbName = 'myBPF-DB',
                      targetDir = tempdir(), 
                      verbose = FALSE)

As the BPF format also permits annotation items to be linked to one another, this conversion function can optionally preserve this hierarchical information by specifying the refLevel argument.

10.1.4 txt collections

A further conversion routine provided by the emuR package is the convert_txtCollection() function. As with other file collection conversion functions, it converts file pair collections but this time consisting of plain text .txt and .wav files to the emuDB format. Compared to other conversion routines it behaves slightly differently, as unformatted plain text files do not contain any time information. It therefore places all the annotations of a single .txt file into a single timeless annotation item on a level of type ITEM called bundle.

# get the path to a directory containing .wav & .txt
# files that is part of the demo data
path2directory = file.path(tempdir(),
                           "emuR_demoData",
                           "txt_collection")

# convert this txtCollection to the emuDB format
# (verbose = FALSE is only set to avoid additional output in manual)
convert_txtCollection(sourceDir = path2directory,
                      dbName = "txtCol",
                      targetDir = tempdir(),
                      attributeDefinitionName = "transcription",
                      verbose = FALSE)

Using this conversion routine creates a bare-bones, single root node emuDB which can either be annotated further manually or be annotated hierarchically in an automatic fashion using the runBASwebservice_*²¹ functions of emuR.
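
To make the resulting structure more tangible, the sketch below loads the converted txt collection and lists its level and attribute definitions. It assumes that the database is stored in a directory called txtCol_emuDB inside tempdir(); the variable names are only illustrative.

```r
library(emuR)

# create the demo data only if it is not present yet
demoDir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demoDir)) create_emuRdemoData(dir = tempdir())

# convert the txt collection only if this has not been done already
txtColPath = file.path(tempdir(), "txtCol_emuDB")
if (!dir.exists(txtColPath)) {
  convert_txtCollection(sourceDir = file.path(demoDir, "txt_collection"),
                        dbName = "txtCol",
                        targetDir = tempdir(),
                        attributeDefinitionName = "transcription",
                        verbose = FALSE)
}
txtCol = load_emuDB(txtColPath, verbose = FALSE)

# the emuDB is expected to contain a single level of type ITEM called bundle
list_levelDefinitions(txtCol)

# list the attribute definitions of that level
list_attributeDefinitions(txtCol, levelName = "bundle")
```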

10.2 `emuDB` interaction and configuration routines

This section provides a tabular overview of all the emuDB interaction routines provided by the emuR package and also provides a short description of each function or group of functions.

Table 10.1: Overview of the `emuDB` interaction routines provided by `emuR`.
Functions	Description
`add/list/remove_attrDefLabelGroup()`	Add / list / remove label group to / of / from `attributeDefinition` of `emuDB`
`add/list/remove_labelGroup()`	Add / list / remove global label group to / of / from `emuDB`
`add/list/remove_levelDefinition()`	Add / list / remove level definition to / of / from `emuDB`
`add/list/remove_linkDefinition()`	Add / list / remove link definition to / of / from `emuDB`
`add/list/remove_ssffTrackDefinition()`	Add / list / remove SSFF track definition to / of / from `emuDB`
`add/list/rename/remove_attributeDefinition()`	Add / list / rename / remove attribute definition to / of / from `emuDB`
`add_files()`	Add files to `emuDB`
`autobuild_linkFromTimes()`	Autobuild links between two levels of `emuDB` using their time information
`create_emuDB()`	Create empty `emuDB`
`duplicate_level()`	Duplicate level
`import_mediaFiles()`	Import media files to `emuDB`
`list_bundles()`	List bundles of `emuDB`
`list_files()`	List files of `emuDB`
`list_sessions()`	List sessions of `emuDB`
`load_emuDB()`	Load `emuDB`
`replace_itemLabels()`	Replace item labels
`set/get/remove_legalLabels()`	Set / get / remove legal labels of attribute definition of `emuDB`
`rename_emuDB()`	Rename `emuDB`
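
As a brief, non-exhaustive illustration of the routines listed in Table 10.1, the sketch below loads the ae demo emuDB and calls a few of the list_*() functions. The variable names demoDir and ae are only illustrative.

```r
library(emuR)

# create the demo data only if it is not present yet
demoDir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demoDir)) create_emuRdemoData(dir = tempdir())

# load the "ae" demo emuDB
ae = load_emuDB(file.path(demoDir, "ae_emuDB"), verbose = FALSE)

# list the sessions and bundles of the emuDB
list_sessions(ae)
list_bundles(ae)

# list the level, link and SSFF track definitions of the emuDB
list_levelDefinitions(ae)
list_linkDefinitions(ae)
list_ssffTrackDefinitions(ae)
```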

10.3 `EMU-webApp` configuration routines

This section provides a tabular overview of all the EMU-webApp configuration routines provided by the emuR package and also provides a short description of each function or group of functions. See Chapter 9 for examples of how to use these functions.

Table 10.2: Overview of the `EMU-webApp` configuration functions provided by `emuR`.
Functions	Description
`add/list/remove_perspective()`	Add / list / remove perspective to / of / from `emuDB`
`set/get_levelCanvasesOrder()`	Set / get level canvases order for `EMU-webApp` of `emuDB`
`set/get_signalCanvasesOrder()`	Set / get signal canvases order for `EMU-webApp` of `emuDB`

It is worth noting that the legal labels defined in the configuration of an emuDB also affect how the EMU-webApp behaves, as it will not permit any labels to be entered other than those defined as legal labels.
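
The following sketch shows how the current EMU-webApp configuration of the ae demo emuDB might be inspected. To avoid assuming a specific perspective name, it uses the first perspective returned by list_perspectives(); the use of the Phonetic level for the legal labels lookup is an illustrative choice.

```r
library(emuR)

# create the demo data only if it is not present yet
demoDir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demoDir)) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(demoDir, "ae_emuDB"), verbose = FALSE)

# list the perspectives and pick the name of the first one
perspectives = list_perspectives(ae)
perspName = perspectives$name[1]

# get the order of the level and signal canvases of that perspective
get_levelCanvasesOrder(ae, perspectiveName = perspName)
get_signalCanvasesOrder(ae, perspectiveName = perspName)

# get the legal labels (if any are defined) of the "Phonetic"
# attribute definition of the "Phonetic" level
get_legalLabels(ae,
                levelName = "Phonetic",
                attributeDefinitionName = "Phonetic")
```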

10.4 Data extraction routines

This section provides a tabular overview of all the data extraction routines provided by the emuR package and also provides a short description of each function or group of functions. See Chapter 6 and Chapter 7 for multiple examples of how the various data extraction routines can be used.

Table 10.3: Overview of the data extraction functions provided by `emuR`.
Functions	Description
`query()`	Query `emuDB`
`requery_hier()`	Requery hierarchical context of a segment list in an `emuDB`
`requery_seq()`	Requery sequential context of segment list in an `emuDB`
`get_trackdata()`	Get trackdata from loaded `emuDB`
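
A minimal sketch of how the functions in Table 10.3 can be combined is given below. The query string, the Syllable level and the fm track name are illustrative choices based on the ae demo emuDB, and the variable names are hypothetical.

```r
library(emuR)

# create the demo data only if it is not present yet
demoDir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demoDir)) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(demoDir, "ae_emuDB"), verbose = FALSE)

# query all items labeled "n" on the "Phonetic" level
sl = query(ae, "Phonetic == n")

# requery the hierarchical context: the syllables dominating these items
syls = requery_hier(ae, sl, level = "Syllable")

# requery the sequential context: the item preceding each "n"
# (ignoreOutOfBounds = TRUE avoids an error at the start of a bundle)
prev = requery_seq(ae, sl, offset = -1, ignoreOutOfBounds = TRUE)

# extract the formant values ("fm" track) belonging to the queried segments
fm = get_trackdata(ae,
                   seglist = sl,
                   ssffTrackName = "fm",
                   verbose = FALSE)
```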

10.5 Central objects

This section provides a tabular overview of the central objects provided by the emuR package and also provides a short description of each object. See Chapter 6 and Chapter 7 for examples of functions returning these objects and how they can be used.

Table 10.4: Overview of the central objects of the `emuR` package.
Object	Description
`emuRsegs`	An `emuR` segment list is a list of segment descriptions. Each segment description describes a sequence of annotation items. The list is usually the result of an `emuDB` query using the `query()` function.
`trackdata`	A track data object is the result of `get_trackdata()` and usually contains the extracted signal data tracks belonging to segments of a segment list.
`emuRtrackdata`	An `emuR` track data object is the result of `get_trackdata()` if the `resultType` parameter is set to `emuRtrackdata` or the result of an explicit call to `create_emuRtrackdata()`. In contrast to the `trackdata` object it is a sub-class of a `data.table`/`data.frame`, which is meant to ease integration with other packages for further processing. It can be viewed as an amalgamation of an `emuRsegs` and a `trackdata` object as it contains the information stored in both objects (see also `?create_emuRtrackdata`).
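
The sketch below shows one way of obtaining two of these objects explicitly and checking their classes. It assumes that the installed emuR version accepts the resultType values "emuRsegs" (for query()) and "emuRtrackdata" (for get_trackdata()); default result types may differ between package versions.

```r
library(emuR)

# create the demo data only if it is not present yet
demoDir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demoDir)) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(demoDir, "ae_emuDB"), verbose = FALSE)

# explicitly request a segment list of class emuRsegs
sl = query(ae, "Phonetic == n", resultType = "emuRsegs")
class(sl)

# explicitly request an emuRtrackdata object
td = get_trackdata(ae,
                   seglist = sl,
                   ssffTrackName = "fm",
                   resultType = "emuRtrackdata",
                   verbose = FALSE)
class(td)
```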

10.6 Export routines

The emuR package provides an export routine to the common TextGrid collection format called export_TextGridCollection(), although exporting is associated with data loss. While exporting is sometimes unavoidable, it is essential that users are aware that exporting to other formats which do not support or only partially support hierarchical annotation structures will lead to the loss of the explicit linking information. Although the autobuild_linkFromTimes() function can partially recreate some of the hierarchical structure, it is advised that the export routine be used with extreme caution. The R code snippet below shows how export_TextGridCollection() can be used to export the levels Text, Syllable and Phonetic of the ae demo emuDB to a TextGrid collection. Figure 10.1 shows the content of the created msajc003.TextGrid file as displayed by Praat’s "Draw visible sound and TextGrid..." procedure.

# get the path to "ae" emuDB
path2ae = file.path(tempdir(), "emuR_demoData", "ae_emuDB")

# load "ae" emuDB
ae = load_emuDB(path2ae)

# export the levels "Text", "Syllable"
# and "Phonetic" to a TextGrid collection
export_TextGridCollection(ae,
                          targetDir = tempdir(),
                          attributeDefinitionNames = c("Text",
                                                       "Syllable",
                                                       "Phonetic"))

Figure 10.1: TextGrid annotation generated by the export_TextGridCollection() function containing the tiers (from top to bottom): Text, Syllable, Phonetic.

Depending on user requirements, additional export routines might be added to emuR in the future.

10.7 Conclusion

This chapter provided an overview of the essential functions and central objects provided by the emuR package, grouped into meaningful categories. It is meant as a quick reference that helps users find the functions they are interested in.

²¹ Functions contributed by Nina Pörner.