.. _package-rst-audition:
======================
Package rst.audition
======================
Audio signal processing, sometimes referred to as audio processing,
is the intentional alteration of auditory signals, or sound.
This package contains data type definitions related to audio
processing.
.. seealso::
Wikipedia article containing the definition above
http://en.wikipedia.org/wiki/Audio_signal_processing
Messages
========
.. container:: mess4ge-multi
.. container:: mess4ge-graph
.. digraph:: message_graph
fontname="Arial";
fontsize=11;
stylesheet="../_static/graphs.css";
node [fontsize=11,fontname="Arial"]
edge [fontsize=11,fontname="Arial"]
"5" [label=<
| Utterance |
|
PhonemeCollection | phonemes |
SoundChunk | audio |
ASCII-STRING | textual_representation |
>,shape=box,style=filled,fillcolor="white"];
"6" [label=< | PhonemeCollection |
|
Phoneme | element |
>,shape=box,style=filled,fillcolor="white"];
"7" [label=< | Phoneme |
|
ASCII-STRING | symbol |
UINT32 | duration |
>,shape=box,style=filled,fillcolor="white"];
"1" [label=< | SoundChunkCollection |
|
SoundChunk | element |
>,shape=box,style=filled,fillcolor="white"];
"2" [label=< | SoundChunk |
|
OCTET-VECTOR | data |
UINT32 | sample_count |
UINT32 | channels |
UINT32 | rate |
SampleType | sample_type |
EndianNess | endianness |
>,shape=box,style=filled,fillcolor="white"];
"4" [label=< | EndianNess |
|
ENDIAN_LITTLE | 0 |
ENDIAN_BIG | 1 |
>,shape=box,style=filled,fillcolor="white"];
"3" [label=< | SampleType |
|
SAMPLE_S8 | 0 |
SAMPLE_U8 | 1 |
SAMPLE_S16 | 2 |
SAMPLE_U16 | 4 |
SAMPLE_S24 | 8 |
SAMPLE_U24 | 16 |
>,shape=box,style=filled,fillcolor="white"];
"5":audio -> "2"[];
"5":phonemes -> "6"[];
"6":element -> "7"[];
"1":element -> "2"[];
"2" -> "4"[dir=both,arrowtail=odiamond];
"2" -> "3"[dir=both,arrowtail=odiamond];
"2":endianness -> "4"[];
"2":sample_type -> "3"[];
.. container:: mess4ge-list
.. container:: messages
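* :ref:`SoundChunkCollection <message-rst-audition-soundchunkcollection>`
* :ref:`Utterance <message-rst-audition-utterance>`
* :ref:`SoundChunk <message-rst-audition-soundchunk>`
* :ref:`PhonemeCollection <message-rst-audition-phonemecollection>`
* :ref:`Phoneme <message-rst-audition-phoneme>`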
.. container:: clearer
clearer: should be made invisible via css
.. _message-rst-audition-soundchunkcollection:
Message SoundChunkCollection
----------------------------
.. container:: message-rst-audition-soundchunkcollection-multi
.. container:: message-rst-audition-soundchunkcollection-documentation
.. py:class:: rst.audition.SoundChunkCollection
Collection of :py:class:`SoundChunk <rst.audition.SoundChunk>` instances.
Auto-generated.
.. py:attribute:: element
:type: array of :py:class:`rst.audition.SoundChunk`
The individual elements of the collection.
Constraints regarding the empty collection, sorting, duplicated
entries etc. are use case specific.
.. container:: message-rst-audition-soundchunkcollection-source
:download:`Download this file </home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunkCollection.proto>`
.. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunkCollection.proto
:lines: 14-24
:language: protobuf
:emphasize-lines: 9-9
.. _message-rst-audition-utterance:
Message Utterance
-----------------
.. container:: message-rst-audition-utterance-multi
.. container:: message-rst-audition-utterance-documentation
.. py:class:: rst.audition.Utterance
Objects of this type represent a single utterance of speech.
The data describes a single utterance in three different forms:
* :py:attr:`phonemes` describes the utterance as a list of phone symbols
and durations (useful e.g. for lip animation).
* :py:attr:`audio` is a :py:class:`SoundChunk <rst.audition.SoundChunk>` that can be played back on audio
devices, containing the realization (e.g. by a TTS system)
of the included phoneme list.
* :py:attr:`textual_representation` is a textual description of the utterance for
debugging purposes.
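For lip animation, a consumer typically needs the onset of each phoneme rather than only its duration. The following is a minimal sketch, assuming the phonemes are played back to back in list order with no gaps. It uses plain ``(symbol, duration)`` tuples as a hypothetical stand-in for the generated message classes, and ``phoneme_onsets`` is an illustrative helper, not part of the package.

```python
from itertools import accumulate

# Hypothetical stand-in for a PhonemeCollection: (symbol, duration in ms).
phonemes = [("h", 60), ("aI", 180), ("_", 100)]


def phoneme_onsets(phonemes):
    """Return (symbol, onset_ms, duration_ms) for each phoneme in order."""
    durations = [duration for _, duration in phonemes]
    # The onset of each phoneme is the sum of all durations before it.
    onsets = [0] + list(accumulate(durations))[:-1]
    return [
        (symbol, onset, duration)
        for (symbol, duration), onset in zip(phonemes, onsets)
    ]


timeline = phoneme_onsets(phonemes)
```

With real messages, the tuples would be built from the ``symbol`` and ``duration`` fields of each element.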
.. codeauthor:: Simon Schulz
.. py:attribute:: phonemes
:type: :py:class:`rst.audition.PhonemeCollection`
A collection of phonemes. They will be played back in the order
in which the :py:class:`Phoneme <rst.audition.Phoneme>` elements are given.
.. py:attribute:: audio
:type: :py:class:`rst.audition.SoundChunk`
A chunk of audio data that can be played back, containing the
realization (e.g. by a TTS system) of the included phoneme list.
.. py:attribute:: textual_representation
:type: :py:class:`ASCII-STRING`
Textual representation of the utterance.
.. container:: message-rst-audition-utterance-source
:download:`Download this file </home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/Utterance.proto>`
.. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/Utterance.proto
:lines: 27-46
:language: protobuf
:emphasize-lines: 7-7,13-13,18-18
.. _message-rst-audition-soundchunk:
Message SoundChunk
------------------
.. container:: message-rst-audition-soundchunk-multi
.. container:: message-rst-audition-soundchunk-documentation
.. py:class:: rst.audition.SoundChunk
**Constraint**: ``len(.data) == 8 * .channels * .sample_count * TODO(.sample_type)``
Objects of this type represent a chunk of an audio stream.
The audio information for one or more :py:attr:`channels` is stored in
:py:attr:`data` as a sequence of :py:attr:`sample_count` encoded samples, the
encoding of which is described by :py:attr:`endianness` and :py:attr:`sample_type`.
Depending on the sample rate (:py:attr:`rate`), such a chunk of audio
corresponds to a certain amount of time during which its samples
have been recorded.
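The relation between sample count, rate and covered time can be sketched as follows. This is an illustration only: it assumes that ``sample_count`` counts samples per channel and that ``rate`` is given in samples per second (Hz). ``chunk_duration_seconds`` is a hypothetical helper, not part of the package.

```python
def chunk_duration_seconds(sample_count, rate):
    """Time span covered by a chunk, in seconds.

    Assumes sample_count is counted per channel and rate is in Hz.
    """
    if rate <= 0:
        raise ValueError("rate must be a positive number of samples per second")
    return sample_count / rate


# Example: 24000 samples recorded at 48000 Hz cover half a second.
duration = chunk_duration_seconds(24000, 48000)
```

Together with the create timestamp of the first sample, this gives the time interval that a chunk covers.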
Interpretation of RSB timestamps:
create:
Capture time of the audio buffer. More precisely, the
timestamp should correspond to the first sample contained
in the buffer.
.. codeauthor:: David Klotz
@create_collection
.. py:attribute:: data
:type: :py:class:`OCTET-VECTOR`
The sequence of bytes representing the samples of this sound
chunk.
The value of this field must be interpreted according to the
values of the :py:attr:`sample_count`, :py:attr:`channels`, :py:attr:`sample_type` and :py:attr:`endianness` fields.
.. py:attribute:: sample_count
:type: :py:class:`UINT32`
**Unit**: number
The number of samples contained in :py:attr:`data`.
.. py:attribute:: channels
:type: :py:class:`UINT32`
**Unit**: number
The number of channels for which samples are stored in :py:attr:`data`.
.. py:attribute:: rate
:type: :py:class:`UINT32`
**Unit**: Hz
The rate with which the samples stored in :py:attr:`data` have been
recorded or should be played.
.. py:attribute:: sample_type
:type: :py:class:`rst.audition.SoundChunk.SampleType`
The data type used for the representation of samples in :py:attr:`data`.
.. py:attribute:: endianness
:type: :py:class:`rst.audition.SoundChunk.EndianNess`
The endianness used for the representation of samples in :py:attr:`data`.
.. container:: message-rst-audition-soundchunk-source
:download:`Download this file </home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto>`
.. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto
:lines: 30-129
:language: protobuf
:emphasize-lines: 64-64,70-70,77-77,84-84,90-90,96-96
.. _message-rst-audition-soundchunk-sampletype:
Message SampleType
------------------
.. container:: message-rst-audition-soundchunk-sampletype-multi
.. container:: message-rst-audition-soundchunk-sampletype-documentation
.. py:class:: rst.audition.SoundChunk.SampleType
The possible data types for representing individual samples.
.. py:attribute:: SAMPLE_S8
= 0
Signed 8-bit samples.
.. py:attribute:: SAMPLE_U8
= 1
Unsigned 8-bit samples.
.. py:attribute:: SAMPLE_S16
= 2
Signed 16-bit samples.
.. py:attribute:: SAMPLE_U16
= 4
Unsigned 16-bit samples.
.. py:attribute:: SAMPLE_S24
= 8
Signed 24-bit samples.
.. py:attribute:: SAMPLE_U24
= 16
Unsigned 24-bit samples.
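When decoding, the enum value usually has to be mapped to a sample width and a signedness. The sketch below mirrors the numeric values listed above in a plain Python ``IntEnum``; it is not the generated protobuf class. It also assumes that 24-bit samples are stored packed in three bytes, which this documentation does not state. ``BYTES_PER_SAMPLE`` and ``is_signed`` are illustrative names.

```python
from enum import IntEnum


class SampleType(IntEnum):
    """Plain-Python mirror of the enum values documented above."""
    SAMPLE_S8 = 0
    SAMPLE_U8 = 1
    SAMPLE_S16 = 2
    SAMPLE_U16 = 4
    SAMPLE_S24 = 8
    SAMPLE_U24 = 16


# Width of one encoded sample in bytes.
# Assumption: 24-bit samples are packed into 3 bytes without padding.
BYTES_PER_SAMPLE = {
    SampleType.SAMPLE_S8: 1,
    SampleType.SAMPLE_U8: 1,
    SampleType.SAMPLE_S16: 2,
    SampleType.SAMPLE_U16: 2,
    SampleType.SAMPLE_S24: 3,
    SampleType.SAMPLE_U24: 3,
}


def is_signed(sample_type):
    """True for the SAMPLE_S* variants, False for the SAMPLE_U* variants."""
    return sample_type.name.startswith("SAMPLE_S")
```

Note that the enum values are not consecutive, so they should be looked up by name or through a table like this rather than used as an index.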
.. container:: message-rst-audition-soundchunk-sampletype-source
:download:`Download this file </home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto>`
.. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto
:lines: 35-67
:language: protobuf
:emphasize-lines: 6-6,11-11,16-16,21-21,26-26,31-31
.. _message-rst-audition-soundchunk-endianness:
Message EndianNess
------------------
.. container:: message-rst-audition-soundchunk-endianness-multi
.. container:: message-rst-audition-soundchunk-endianness-documentation
.. py:class:: rst.audition.SoundChunk.EndianNess
The possible byte-orders for representing samples.
.. py:attribute:: ENDIAN_LITTLE
= 0
Samples are represented with little-endian byte order.
.. py:attribute:: ENDIAN_BIG
= 1
Samples are represented with big-endian byte order.
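The effect of the byte order on decoding can be illustrated with Python's built-in ``int.from_bytes``. This is a minimal sketch, assuming the samples are stored back to back with a fixed width in bytes. ``decode_samples`` and its parameters are hypothetical names introduced for this example.

```python
def decode_samples(data, width, signed, little_endian):
    """Split raw bytes into integer samples of `width` bytes each."""
    if width <= 0:
        raise ValueError("width must be positive")
    if len(data) % width != 0:
        raise ValueError("data length is not a multiple of the sample width")
    byteorder = "little" if little_endian else "big"
    return [
        int.from_bytes(data[i:i + width], byteorder, signed=signed)
        for i in range(0, len(data), width)
    ]


raw = b"\x01\x00\xff\xff"
# The same four bytes read as two signed 16-bit samples in both byte orders.
as_little = decode_samples(raw, 2, signed=True, little_endian=True)
as_big = decode_samples(raw, 2, signed=True, little_endian=False)
```

Here ``as_little`` is ``[1, -1]`` while ``as_big`` is ``[256, -1]``, which shows why the byte order must be taken from the message rather than assumed.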
.. container:: message-rst-audition-soundchunk-endianness-source
:download:`Download this file </home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto>`
.. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto
:lines: 72-83
:language: protobuf
:emphasize-lines: 6-6,11-11
.. _message-rst-audition-phonemecollection:
Message PhonemeCollection
-------------------------
.. container:: message-rst-audition-phonemecollection-multi
.. container:: message-rst-audition-phonemecollection-documentation
.. py:class:: rst.audition.PhonemeCollection
Collection of :py:class:`Phoneme <rst.audition.Phoneme>` instances.
Auto-generated.
.. py:attribute:: element
:type: array of :py:class:`rst.audition.Phoneme`
The individual elements of the collection.
Constraints regarding the empty collection, sorting, duplicated
entries etc. are use case specific.
.. container:: message-rst-audition-phonemecollection-source
:download:`Download this file </home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/PhonemeCollection.proto>`
.. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/PhonemeCollection.proto
:lines: 14-24
:language: protobuf
:emphasize-lines: 9-9
.. _message-rst-audition-phoneme:
Message Phoneme
---------------
.. container:: message-rst-audition-phoneme-multi
.. container:: message-rst-audition-phoneme-documentation
.. py:class:: rst.audition.Phoneme
Objects of this type represent a single phoneme-duration pair.
A list of elements of this type can be used to describe words or
whole sentences of speech.
.. codeauthor:: Simon Schulz
@create_collection
.. py:attribute:: symbol
:type: :py:class:`ASCII-STRING`
A single phone symbol (such as aI, E, C, R, _, ...).
For examples, see https://en.wikipedia.org/wiki/Phoneme
or http://www.phon.ucl.ac.uk/home/sampa/german.htm (German).
.. py:attribute:: duration
:type: :py:class:`UINT32`
**Unit**: millisecond
The duration of this symbol.
.. container:: message-rst-audition-phoneme-source
:download:`Download this file </home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/Phoneme.proto>`
.. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/Phoneme.proto
:lines: 16-33
:language: protobuf
:emphasize-lines: 10-10,16-16