10 emuR - package functions
This chapter gives an overview of the essential functions and central objects provided by the emuR package. It is not meant as a comprehensive list of every function and object provided by emuR, but rather groups the essential functions into meaningful categories for easier navigation. The categories presented in this chapter are:
- Import and conversion routines (Section 10.1),
- emuDB interaction and configuration routines (Section 10.2),
- EMU-webApp configuration routines (Section 10.3),
- Data extraction routines (Section 10.4),
- Central objects in emuR (Section 10.5), and
- Export routines (Section 10.6).
If a comprehensive list of every function and object provided by the emuR package is required, R’s help() function (see R code snippet below) can be used.
help(package="emuR")
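When the full help index is more than is needed, the exported function names can also be filtered by prefix. The following is a minimal sketch that assumes emuR is installed; the variable names exported and converters are purely illustrative.

```r
library(emuR)

# names of all objects exported by the attached emuR package
exported = ls("package:emuR")

# keep only the names starting with "convert_"
# (the import and conversion routines of Section 10.1)
converters = grep("^convert_", exported, value = TRUE)
print(converters)
```

The same pattern can be used with other prefixes such as "list_" or "add_" to browse the routines of Section 10.2.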
10.1 Import and conversion routines
As most people who are starting to use the EMU-SDMS will probably already have some form of annotated data, we will first show how to convert existing data to the emuDB format. For a guide to creating an emuDB from scratch and for information about this format see Chapter 5.
10.1.1 Legacy EMU databases
For people transitioning to emuR from the legacy EMU system, emuR provides a function for converting existing legacy EMU databases to the new emuDB format. The R code snippet below shows how to convert a legacy database that is part of the demo data provided by the emuR package.
# load the package
library(emuR)
# create demo data in directory provided by the tempdir() function
create_emuRdemoData(dir = tempdir())
# get the path to a .tpl file of
# a legacy EMU database that is part of the demo data
tplPath = file.path(tempdir(),
                    "emuR_demoData",
                    "legacy_ae",
                    "ae.tpl")
# convert this legacy EMU database to the emuDB format
convert_legacyEmuDB(emuTplPath = tplPath,
targetDir = tempdir())
This will create a new emuDB in a temporary directory, provided by R’s tempdir() function, containing all the information specified in the .tpl file. The name of the new emuDB is the same as the basename of the .tpl file from which it was generated. In other words, if the template file of the legacy EMU database has path A and the directory to which the converted database is to be written has path B, then convert_legacyEmuDB(emuTplPath = "A", targetDir = "B") will create an emuDB directory in B from the information stored in A.
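To check the result of such a conversion, the new emuDB can be loaded and its level definitions listed. This is a hedged sketch: it assumes the converted database directory is called ae_emuDB (the basename of the .tpl file plus the _emuDB suffix), and the variable names convertedPath, aeConverted and levelDefs are illustrative.

```r
library(emuR)

# create the demo data only if it is not already present
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) {
  create_emuRdemoData(dir = tempdir())
}

# assumption: the converted emuDB is written to <targetDir>/ae_emuDB
convertedPath = file.path(tempdir(), "ae_emuDB")

# convert only once, so the sketch can be re-run in the same R session
if (!dir.exists(convertedPath)) {
  tplPath = file.path(tempdir(), "emuR_demoData", "legacy_ae", "ae.tpl")
  convert_legacyEmuDB(emuTplPath = tplPath,
                      targetDir = tempdir(),
                      verbose = FALSE)
}

# load the converted emuDB and list the levels taken over from the .tpl file
aeConverted = load_emuDB(convertedPath, verbose = FALSE)
levelDefs = list_levelDefinitions(aeConverted)
print(levelDefs)
```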
10.1.2 TextGrid collections
A further function provided is the convert_TextGridCollection() function. This function converts an existing .TextGrid and .wav file collection to the emuDB format. In order to pair the correct files together, the .TextGrid files and the .wav files must have the same name (i.e., file name without extension). A further restriction is that the tiers contained within all the .TextGrid files have to be equal in name and type (equal subsets can be chosen using the tierNames argument of the function). For example, if all .TextGrid files contain the tiers Syl: IntervalTier, Phonetic: IntervalTier and Tone: TextTier the conversion will work. However, if a single .TextGrid of the collection has the additional tier Word: IntervalTier the conversion will fail, although it can be made to work by specifying the equal tier subset equalSubset = c('Syl', 'Phonetic', 'Tone') and passing it into the function argument convert_TextGridCollection(..., tierNames = equalSubset, ...). The R code snippet below shows how to convert a TextGrid collection to the emuDB format.
# get the path to a directory containing
# .wav & .TextGrid files that is part of the demo data
path2directory = file.path(tempdir(),
                           "emuR_demoData",
                           "TextGrid_collection")
# convert this TextGridCollection to the emuDB format
convert_TextGridCollection(path2directory,
dbName = "myTGcolDB",
targetDir = tempdir())
The above R code snippet will create a new emuDB in the directory tempdir() called myTGcolDB. The emuDB will contain all the tier information from the .TextGrid files but will not contain hierarchical information, as .TextGrid files do not contain any linking information. It is worth noting that it is possible to semi-automatically generate links between time-bearing levels using the autobuild_linkFromTimes() function. An example of this was given in Chapter 3.
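The tierNames restriction and the semi-automatic link generation can be combined as sketched below. This is an illustration under assumptions rather than a fixed recipe: it assumes that every .TextGrid file of the demo collection contains interval tiers named Syllable and Phonetic, and the names dbName and tgDB are hypothetical.

```r
library(emuR)

# create the demo data only if it is not already present
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) {
  create_emuRdemoData(dir = tempdir())
}

path2directory = file.path(tempdir(), "emuR_demoData", "TextGrid_collection")

# unique database name so the sketch can be re-run in the same R session
dbName = basename(tempfile("tgSubset"))

# assumption: all .TextGrid files contain the tiers "Syllable" and "Phonetic"
convert_TextGridCollection(path2directory,
                           dbName = dbName,
                           targetDir = tempdir(),
                           tierNames = c("Syllable", "Phonetic"),
                           verbose = FALSE)

tgDB = load_emuDB(file.path(tempdir(), paste0(dbName, "_emuDB")),
                  verbose = FALSE)

# build links from each Syllable to the Phonetic segments it spans in time;
# convertSuperlevel = TRUE converts the Syllable level to a timeless ITEM level
autobuild_linkFromTimes(tgDB,
                        superlevelName = "Syllable",
                        sublevelName = "Phonetic",
                        convertSuperlevel = TRUE,
                        newLinkDefType = "ONE_TO_MANY")

print(list_linkDefinitions(tgDB))
```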
10.1.3 BPF collections
Similar to the convert_TextGridCollection() function, the emuR package also provides a function for converting file collections consisting of BPF and .wav files to the emuDB format. The R code snippet below shows how this can be achieved.
# get the path to a directory containing
# .wav & .par files that is part of the demo data
path2directory = file.path(tempdir(),
                           "emuR_demoData",
                           "BPF_collection")
# convert this BPFCollection to the emuDB format
# (verbose = F is only set to avoid additional output in manual)
convert_BPFCollection(path2directory,
dbName = 'myBPF-DB',
targetDir = tempdir(),
verbose = F)
As the BPF format also permits annotation items to be linked to one another, this conversion function can optionally preserve this hierarchical information by specifying the refLevel argument.
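Which value to pass as refLevel depends on the tiers present in the BPF files. The sketch below first reads the three-letter keys from one .par file and only converts with a reference level if the key is actually present. The choice of ORT as reference level is an assumption about the demo files, not something stated in this chapter, and the names parFile, keys and dbName are illustrative.

```r
library(emuR)

# create the demo data only if it is not already present
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) {
  create_emuRdemoData(dir = tempdir())
}

path2directory = file.path(tempdir(), "emuR_demoData", "BPF_collection")

# collect the header and tier keys (three characters followed by a colon)
# found in the first .par file of the collection
parFile = list.files(path2directory, pattern = "\\.par$", full.names = TRUE)[1]
keyLines = grep("^[A-Z0-9]{3}:", readLines(parFile), value = TRUE)
keys = unique(sub(":.*$", "", keyLines))
print(keys)

# unique database name so the sketch can be re-run in the same R session
dbName = basename(tempfile("bpfSketch"))

# assumption: "ORT" is a sensible reference level if the files contain it
if ("ORT" %in% keys) {
  convert_BPFCollection(sourceDir = path2directory,
                        dbName = dbName,
                        targetDir = tempdir(),
                        refLevel = "ORT",
                        verbose = FALSE)
}
```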
10.1.4 txt collections
A further conversion routine provided by the emuR package is the convert_txtCollection() function. As with other file collection conversion functions, it converts file pair collections but this time consisting of plain text .txt and .wav files to the emuDB format. Compared to other conversion routines it behaves slightly differently, as unformatted plain text files do not contain any time information. It therefore places all the annotations of a single .txt file into a single timeless annotation item on a level of type ITEM called bundle.
# get the path to a directory containing .wav & .txt
# files that is part of the demo data
path2directory = file.path(tempdir(),
                           "emuR_demoData",
                           "txt_collection")
# convert this txtCollection to the emuDB format
# (verbose = F is only set to avoid additional output in manual)
convert_txtCollection(sourceDir = path2directory,
dbName = "txtCol",
targetDir = tempdir(),
attributeDefinitionName = "transcription",
verbose = F)
Using this conversion routine creates a bare-bones, single root node emuDB which can either be annotated further by hand or be hierarchically annotated automatically using the runBASwebservice_* functions[21] of emuR.
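The structure produced by this routine can be inspected before any further annotation takes place. The following sketch converts the demo txt collection under a fresh, hypothetical database name and lists the level and attribute definitions of the result; the names dbName and txtDB are illustrative.

```r
library(emuR)

# create the demo data only if it is not already present
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) {
  create_emuRdemoData(dir = tempdir())
}

path2directory = file.path(tempdir(), "emuR_demoData", "txt_collection")

# unique database name so the sketch can be re-run in the same R session
dbName = basename(tempfile("txtSketch"))

convert_txtCollection(sourceDir = path2directory,
                      dbName = dbName,
                      targetDir = tempdir(),
                      attributeDefinitionName = "transcription",
                      verbose = FALSE)

txtDB = load_emuDB(file.path(tempdir(), paste0(dbName, "_emuDB")),
                   verbose = FALSE)

# the single level of type ITEM called "bundle" and its attribute definitions
print(list_levelDefinitions(txtDB))
print(list_attributeDefinitions(txtDB, levelName = "bundle"))
```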
10.2 emuDB interaction and configuration routines
This section provides a tabular overview of all the emuDB interaction routines provided by the emuR package and also provides a short description of each function or group of functions.
Functions | Description |
---|---|
add/list/remove_attrDefLabelGroup() | Add / list / remove label group to / of / from attributeDefinition of emuDB |
add/list/remove_labelGroup() | Add / list / remove global label group to / of / from emuDB |
add/list/remove_levelDefinition() | Add / list / remove level definition to / of / from emuDB |
add/list/remove_linkDefinition() | Add / list / remove link definition to / of / from emuDB |
add/list/remove_ssffTrackDefinition() | Add / list / remove SSFF track definition to / of / from emuDB |
add/list/rename/remove_attributeDefinition() | Add / list / rename / remove attribute definition to / of / from emuDB |
add_files() | Add files to emuDB |
autobuild_linkFromTimes() | Autobuild links between two levels of emuDB using their time information |
create_emuDB() | Create empty emuDB |
duplicate_level() | Duplicate level |
import_mediaFiles() | Import media files to emuDB |
list_bundles() | List bundles of emuDB |
list_files() | List files of emuDB |
list_sessions() | List sessions of emuDB |
load_emuDB() | Load emuDB |
replace_itemLabels() | Replace item labels |
set/get/remove_legalLabels() | Set / get / remove legal labels of attribute definition of emuDB |
rename_emuDB() | Rename emuDB |
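As an illustration of how several of these routines fit together, the sketch below creates an empty emuDB and configures two levels, a link definition and a set of legal labels. The database name, the level names Word and Phonetic and the label set are assumptions chosen for this example only.

```r
library(emuR)

# unique database name so the sketch can be re-run in the same R session
dbName = basename(tempfile("sketch"))

# create an empty emuDB in tempdir() and load it
create_emuDB(name = dbName, targetDir = tempdir())
db = load_emuDB(file.path(tempdir(), paste0(dbName, "_emuDB")),
                verbose = FALSE)

# add a timeless parent level and a time-bearing child level
add_levelDefinition(db, name = "Word", type = "ITEM")
add_levelDefinition(db, name = "Phonetic", type = "SEGMENT")

# permit each Word item to dominate several Phonetic segments
add_linkDefinition(db,
                   type = "ONE_TO_MANY",
                   superlevelName = "Word",
                   sublevelName = "Phonetic")

# restrict the labels of the Phonetic level to a small illustrative set
set_legalLabels(db,
                levelName = "Phonetic",
                attributeDefinitionName = "Phonetic",
                legalLabels = c("a", "i", "u"))

print(list_levelDefinitions(db))
print(list_linkDefinitions(db))
print(get_legalLabels(db,
                      levelName = "Phonetic",
                      attributeDefinitionName = "Phonetic"))
```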
10.3 EMU-webApp configuration routines
This section provides a tabular overview of all the EMU-webApp configuration routines provided by the emuR package and also provides a short description of each function or group of functions. See Chapter 9 for examples of how to use these functions.
Functions | Description |
---|---|
add/list/remove_perspective() | Add / list / remove perspective to / of / from emuDB |
set/get_levelCanvasesOrder() | Set / get level canvases order for EMU-webApp of emuDB |
set/get_signalCanvasesOrder() | Set / get signal canvases order for EMU-webApp of emuDB |
It is worth noting that the legal labels defined in the emuDB configuration also affect how the EMU-webApp behaves, as it will not permit any labels to be entered other than those defined as legal labels.
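A short sketch of the get/set pattern shared by these routines is given below. It assumes that the ae demo emuDB has a perspective called default that displays at least two levels; the variable names origOrder and changedOrder are illustrative. The original order is restored at the end so the demo database is left unchanged.

```r
library(emuR)

# create the demo data only if it is not already present
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) {
  create_emuRdemoData(dir = tempdir())
}

ae = load_emuDB(file.path(tempdir(), "emuR_demoData", "ae_emuDB"),
                verbose = FALSE)

# perspectives available in the database
print(list_perspectives(ae))

# assumption: a perspective called "default" exists
origOrder = get_levelCanvasesOrder(ae, perspectiveName = "default")

# display the same levels in reversed order
set_levelCanvasesOrder(ae,
                       perspectiveName = "default",
                       order = rev(origOrder))
changedOrder = get_levelCanvasesOrder(ae, perspectiveName = "default")
print(changedOrder)

# restore the original order
set_levelCanvasesOrder(ae,
                       perspectiveName = "default",
                       order = origOrder)
```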
10.4 Data extraction routines
This section provides a tabular overview of all the data extraction routines provided by the emuR package and also provides a short description of each function or group of functions. See Chapter 6 and Chapter 7 for multiple examples of how the various data extraction routines can be used.
Functions | Description |
---|---|
query() | Query emuDB |
requery_hier() | Requery hierarchical context of a segment list in an emuDB |
requery_seq() | Requery sequential context of segment list in an emuDB |
get_trackdata() | Get trackdata from loaded emuDB |
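The four routines are typically chained: a query produces a segment list, which can then be requeried or used to extract signal data. The sketch below uses the ae demo emuDB; the label n, the level Word and the SSFF track name fm are assumptions about that database, and the variable names are illustrative.

```r
library(emuR)

# create the demo data only if it is not already present
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) {
  create_emuRdemoData(dir = tempdir())
}

ae = load_emuDB(file.path(tempdir(), "emuR_demoData", "ae_emuDB"),
                verbose = FALSE)

# all items on the Phonetic level that are labeled "n"
sl = query(ae, "Phonetic == n")

# the Word items that dominate these segments
slWords = requery_hier(ae, sl, level = "Word")

# the segments directly following each "n"; out-of-bounds cases
# (an "n" at the end of a bundle) are ignored instead of raising an error
slNext = requery_seq(ae, sl, offset = 1, ignoreOutOfBounds = TRUE)

# assumption: an SSFF track called "fm" is defined in the database
td = get_trackdata(ae, sl, ssffTrackName = "fm", verbose = FALSE)

print(head(sl))
print(head(slWords))
print(head(td))
```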
10.5 Central objects
This section provides a tabular overview of the central objects provided by the emuR package and also provides a short description of each object. See Chapter 6 and Chapter 7 for examples of functions returning these objects and how they can be used.
Object | Description |
---|---|
emuRsegs | An emuR segment list is a list of segment descriptions. Each segment description describes a sequence of annotation items. The list is usually the result of an emuDB query using the query() function. |
trackdata | A track data object is the result of get_trackdata() and usually contains the extracted signal data tracks belonging to segments of a segment list. |
emuRtrackdata | An emuR track data object is the result of get_trackdata() if the resultType parameter is set to emuRtrackdata or the result of an explicit call to create_emuRtrackdata. Compared to the trackdata object it is a sub-class of a data.table / data.frame which is meant to ease integration with other packages for further processing. It can be viewed as an amalgamation of an emuRsegs and a trackdata object as it contains the information stored in both objects (see also ?create_emuRtrackdata()). |
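Which of these objects a function returns can be selected with its resultType argument. The sketch below requests each of the three object types explicitly and prints their classes. It assumes an emuR version in which query() and get_trackdata() accept the resultType values used here, and that the ae demo emuDB defines an SSFF track called fm.

```r
library(emuR)

# create the demo data only if it is not already present
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) {
  create_emuRdemoData(dir = tempdir())
}

ae = load_emuDB(file.path(tempdir(), "emuR_demoData", "ae_emuDB"),
                verbose = FALSE)

# segment list as an emuRsegs object
segs = query(ae, "Phonetic == n", resultType = "emuRsegs")

# signal data as a trackdata object
td = get_trackdata(ae, segs,
                   ssffTrackName = "fm",
                   resultType = "trackdata",
                   verbose = FALSE)

# signal data plus segment information as an emuRtrackdata object
etd = get_trackdata(ae, segs,
                    ssffTrackName = "fm",
                    resultType = "emuRtrackdata",
                    verbose = FALSE)

print(class(segs))
print(class(td))
print(class(etd))
```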
10.6 Export routines
The emuR package provides an export routine to the common TextGrid collection format called export_TextGridCollection(), although its use is associated with data loss. While exporting is sometimes unavoidable, it is essential that users are aware that exporting to formats which do not support, or only partially support, hierarchical annotation structures will lead to the loss of the explicit linking information. Although the autobuild_linkFromTimes() function can partially recreate some of the hierarchical structure, it is advised that the export routine be used with extreme caution. The R code snippet below shows how export_TextGridCollection() can be used to export the levels Text, Syllable and Phonetic of the ae demo emuDB to a TextGrid collection. Figure 10.1 shows the content of the created msajc003.TextGrid file as displayed by Praat’s "Draw visible sound and Textgrid..." procedure.
# get the path to "ae" emuDB
path2ae = file.path(tempdir(), "emuR_demoData", "ae_emuDB")
# load "ae" emuDB
ae = load_emuDB(path2ae)
# export the levels "Text", "Syllable"
# and "Phonetic" to a TextGrid collection
export_TextGridCollection(ae,
targetDir = tempdir(),
attributeDefinitionNames = c("Text",
"Syllable",
"Phonetic"))
Depending on user requirements, additional export routines might be added to the emuR package in the future.
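Exporting into a dedicated directory makes it easier to see which files the routine has written. The sketch below is a variation on the export shown in this section: the directory name exportDir is hypothetical, and the assumption is that export_TextGridCollection() accepts an existing, empty target directory in the same way it accepts tempdir().

```r
library(emuR)

# create the demo data only if it is not already present
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) {
  create_emuRdemoData(dir = tempdir())
}

ae = load_emuDB(file.path(tempdir(), "emuR_demoData", "ae_emuDB"),
                verbose = FALSE)

# fresh, empty export directory so earlier exports cannot interfere
exportDir = tempfile("ae_TextGrids")
dir.create(exportDir)

export_TextGridCollection(ae,
                          targetDir = exportDir,
                          attributeDefinitionNames = c("Text",
                                                       "Syllable",
                                                       "Phonetic"))

# list the exported .TextGrid files (searched recursively, as the
# files may be placed in sub-directories)
tgFiles = list.files(exportDir,
                     pattern = "\\.TextGrid$",
                     recursive = TRUE)
print(tgFiles)
```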
10.7 Conclusion
This chapter provided an overview of the essential functions and central objects provided by the emuR package, grouped into meaningful categories. It is meant as a quick reference that lets users find the functions they are interested in.
[21] Functions contributed by Nina Pörner.