.. _package-rst-audition:

======================
 Package rst.audition
======================

Audio signal processing, sometimes referred to as audio processing, is
the intentional alteration of auditory signals, or sound.

This package contains data type definitions related to audio processing.

.. seealso::

   Wikipedia article containing the definition above:
   http://en.wikipedia.org/wiki/Audio_signal_processing

Messages
========

.. container:: mess4ge-multi

   .. container:: mess4ge-graph

      .. digraph:: message_graph

         fontname="Arial";
         fontsize=11;
         stylesheet="../_static/graphs.css";
         node [fontsize=11,fontname="Arial"]
         edge [fontsize=11,fontname="Arial"]

         "5" [label=<

         Utterance
         PhonemeCollection phonemes
         SoundChunk audio
         ASCII-STRING textual_representation
         >,shape=box,style=filled,fillcolor="white"];

         "6" [label=<
         PhonemeCollection
         Phoneme element
         >,shape=box,style=filled,fillcolor="white"];

         "7" [label=<
         Phoneme
         ASCII-STRING symbol
         UINT32 duration
         >,shape=box,style=filled,fillcolor="white"];

         "1" [label=<
         SoundChunkCollection
         SoundChunk element
         >,shape=box,style=filled,fillcolor="white"];

         "2" [label=<
         SoundChunk
         OCTET-VECTOR data
         UINT32 sample_count
         UINT32 channels
         UINT32 rate
         SampleType sample_type
         EndianNess endianness
         >,shape=box,style=filled,fillcolor="white"];

         "4" [label=<
         EndianNess
         ENDIAN_LITTLE
         ENDIAN_BIG
         >,shape=box,style=filled,fillcolor="white"];

         "3" [label=<
         SampleType
         SAMPLE_S8
         SAMPLE_U8
         SAMPLE_S16
         SAMPLE_U16
         SAMPLE_S24
         SAMPLE_U24

         >,shape=box,style=filled,fillcolor="white"];

         "5":audio -> "2"[];
         "5":phonemes -> "6"[];
         "6":element -> "7"[];
         "1":element -> "2"[];
         "2" -> "4"[dir=both,arrowtail=odiamond];
         "2" -> "3"[dir=both,arrowtail=odiamond];
         "2":endianness -> "4"[];
         "2":sample_type -> "3"[];

   .. container:: mess4ge-list

      .. container:: messages

         * :ref:`SoundChunkCollection <message-rst-audition-soundchunkcollection>`
         * :ref:`Utterance <message-rst-audition-utterance>`
         * :ref:`SoundChunk <message-rst-audition-soundchunk>`
         * :ref:`PhonemeCollection <message-rst-audition-phonemecollection>`
         * :ref:`Phoneme <message-rst-audition-phoneme>`

   .. container:: clearer

      clearer: should be made invisible via css

.. _message-rst-audition-soundchunkcollection:

Message SoundChunkCollection
----------------------------

.. container:: message-rst-audition-soundchunkcollection-multi

   .. container:: message-rst-audition-soundchunkcollection-documentation

      .. py:class:: rst.audition.SoundChunkCollection

         Collection of :py:class:`SoundChunk <rst.audition.SoundChunk>` instances.

         Auto-generated.

         .. py:attribute:: element

            :type: array of :py:class:`rst.audition.SoundChunk`

            The individual elements of the collection.

            Constraints regarding the empty collection, sorting,
            duplicated entries etc. are use case specific.

   .. container:: message-rst-audition-soundchunkcollection-source

      :download:`Download this file `

      .. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunkCollection.proto
         :lines: 14-24
         :language: protobuf
         :emphasize-lines: 9-9

.. _message-rst-audition-utterance:

Message Utterance
-----------------

.. container:: message-rst-audition-utterance-multi

   .. container:: message-rst-audition-utterance-documentation

      .. py:class:: rst.audition.Utterance

         Objects of this type represent a single utterance of speech.

         The data describes a single utterance in three different forms:

         * :py:attr:`phonemes <rst.audition.Utterance.phonemes>` describes
           the utterance as a list of phone symbols and durations (useful
           e.g. for lip animation).
         * :py:attr:`audio <rst.audition.Utterance.audio>` is a
           :py:class:`SoundChunk <rst.audition.SoundChunk>` that can be
           played back on audio devices and contains the realization
           (e.g. by a TTS system) of the included phoneme list.
         * :py:attr:`textual_representation <rst.audition.Utterance.textual_representation>`
           is a textual description of the utterance for debugging purposes.

         .. codeauthor:: Simon Schulz

         .. py:attribute:: phonemes

            :type: :py:class:`rst.audition.PhonemeCollection`

            A collection of phonemes. They will be played back in the
            order in which the :py:class:`Phoneme <rst.audition.Phoneme>`
            elements are given.

         .. py:attribute:: audio

            :type: :py:class:`rst.audition.SoundChunk`

            A chunk of audio data that can be played back and contains
            the realization (e.g. by a TTS system) of the included
            phoneme list.

         .. py:attribute:: textual_representation

            :type: :py:class:`ASCII-STRING`

            Textual representation of the utterance.

   .. container:: message-rst-audition-utterance-source

      :download:`Download this file `

      .. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/Utterance.proto
         :lines: 27-46
         :language: protobuf
         :emphasize-lines: 7-7,13-13,18-18

.. _message-rst-audition-soundchunk:

Message SoundChunk
------------------

.. container:: message-rst-audition-soundchunk-multi

   .. container:: message-rst-audition-soundchunk-documentation

      .. py:class:: rst.audition.SoundChunk

         **Constraint**: ``len(.data) == 8 * .channels * .sample_count * TODO(.sample_type)``

         Objects of this type represent a chunk of an audio stream.

         The audio information for one or more
         :py:attr:`channels <rst.audition.SoundChunk.channels>` is stored in
         :py:attr:`data <rst.audition.SoundChunk.data>` as a sequence of
         :py:attr:`sample_count <rst.audition.SoundChunk.sample_count>`
         encoded samples, the encoding of which is described by
         :py:attr:`endianness <rst.audition.SoundChunk.endianness>` and
         :py:attr:`sample_type <rst.audition.SoundChunk.sample_type>`.

         Depending on the sample rate
         (:py:attr:`rate <rst.audition.SoundChunk.rate>`), such a chunk of
         audio corresponds to a certain amount of time during which its
         samples have been recorded.

         Interpretation of RSB timestamps:

         create:
           Capture time of the audio buffer. More precisely, the
           timestamp should correspond to the first sample contained in
           the buffer.

         .. codeauthor:: David Klotz

         @create_collection

         .. py:attribute:: data

            :type: :py:class:`OCTET-VECTOR`

            The sequence of bytes representing the samples of this sound
            chunk.

            The value of this field must be interpreted according to the
            values of the
            :py:attr:`sample_count <rst.audition.SoundChunk.sample_count>`,
            :py:attr:`channels <rst.audition.SoundChunk.channels>`,
            :py:attr:`sample_type <rst.audition.SoundChunk.sample_type>` and
            :py:attr:`endianness <rst.audition.SoundChunk.endianness>` fields.

         .. py:attribute:: sample_count

            :type: :py:class:`UINT32`

            **Unit**: number

            The number of samples contained in
            :py:attr:`data <rst.audition.SoundChunk.data>`.

         .. py:attribute:: channels

            :type: :py:class:`UINT32`

            **Unit**: number

            The number of channels for which samples are stored in
            :py:attr:`data <rst.audition.SoundChunk.data>`.

         .. py:attribute:: rate

            :type: :py:class:`UINT32`

            **Unit**: hz

            The rate at which the samples stored in
            :py:attr:`data <rst.audition.SoundChunk.data>` have been
            recorded or should be played.

         .. py:attribute:: sample_type

            :type: :py:class:`rst.audition.SoundChunk.SampleType`

            The data type used for the representation of samples in
            :py:attr:`data <rst.audition.SoundChunk.data>`.

         .. py:attribute:: endianness

            :type: :py:class:`rst.audition.SoundChunk.EndianNess`

            The endianness used for the representation of samples in
            :py:attr:`data <rst.audition.SoundChunk.data>`.

   .. container:: message-rst-audition-soundchunk-source

      :download:`Download this file `

      .. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto
         :lines: 30-129
         :language: protobuf
         :emphasize-lines: 64-64,70-70,77-77,84-84,90-90,96-96

.. _message-rst-audition-soundchunk-sampletype:

Message SampleType
------------------

.. container:: message-rst-audition-soundchunk-sampletype-multi

   .. container:: message-rst-audition-soundchunk-sampletype-documentation

      .. py:class:: rst.audition.SoundChunk.SampleType

         The possible data types for representing individual samples.

         .. py:attribute:: SAMPLE_S8 = 0

            Signed 8-bit samples.

         .. py:attribute:: SAMPLE_U8 = 1

            Unsigned 8-bit samples.

         .. py:attribute:: SAMPLE_S16 = 2

            Signed 16-bit samples.

         .. py:attribute:: SAMPLE_U16 = 4

            Unsigned 16-bit samples.

         .. py:attribute:: SAMPLE_S24 = 8

            Signed 24-bit samples.

         .. py:attribute:: SAMPLE_U24 = 16

            Unsigned 24-bit samples.

   .. container:: message-rst-audition-soundchunk-sampletype-source

      :download:`Download this file `

      .. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto
         :lines: 35-67
         :language: protobuf
         :emphasize-lines: 6-6,11-11,16-16,21-21,26-26,31-31

.. _message-rst-audition-soundchunk-endianness:

Message EndianNess
------------------

.. container:: message-rst-audition-soundchunk-endianness-multi

   .. container:: message-rst-audition-soundchunk-endianness-documentation

      .. py:class:: rst.audition.SoundChunk.EndianNess

         The possible byte orders for representing samples.

         .. py:attribute:: ENDIAN_LITTLE = 0

            Samples are represented with little-endian byte order.

         .. py:attribute:: ENDIAN_BIG = 1

            Samples are represented with big-endian byte order.

   .. container:: message-rst-audition-soundchunk-endianness-source

      :download:`Download this file `

      .. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/SoundChunk.proto
         :lines: 72-83
         :language: protobuf
         :emphasize-lines: 6-6,11-11

.. _message-rst-audition-phonemecollection:

Message PhonemeCollection
-------------------------

.. container:: message-rst-audition-phonemecollection-multi

   .. container:: message-rst-audition-phonemecollection-documentation

      .. py:class:: rst.audition.PhonemeCollection

         Collection of :py:class:`Phoneme <rst.audition.Phoneme>` instances.

         Auto-generated.

         .. py:attribute:: element

            :type: array of :py:class:`rst.audition.Phoneme`

            The individual elements of the collection.

            Constraints regarding the empty collection, sorting,
            duplicated entries etc. are use case specific.

   .. container:: message-rst-audition-phonemecollection-source

      :download:`Download this file `

      .. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/PhonemeCollection.proto
         :lines: 14-24
         :language: protobuf
         :emphasize-lines: 9-9

.. _message-rst-audition-phoneme:

Message Phoneme
---------------

.. container:: message-rst-audition-phoneme-multi

   .. container:: message-rst-audition-phoneme-documentation

      .. py:class:: rst.audition.Phoneme

         Objects of this type represent a single phoneme-duration pair.

         A list of elements of this type can be used to describe words
         or whole sentences of speech.

         .. codeauthor:: Simon Schulz

         @create_collection

         .. py:attribute:: symbol

            :type: :py:class:`ASCII-STRING`

            A single phone symbol (such as aI, E, C, R, _, ...).

            For examples, see https://en.wikipedia.org/wiki/Phoneme or
            http://www.phon.ucl.ac.uk/home/sampa/german.htm (German).

         .. py:attribute:: duration

            :type: :py:class:`UINT32`

            **Unit**: millisecond

            The duration of this symbol.

   .. container:: message-rst-audition-phoneme-source

      :download:`Download this file `

      .. literalinclude:: //home/jenkins/workspace/rst-manual-trunk/upstream/RST-0.19.0-Linux/share/rst0.19/proto/stable/rst/audition/Phoneme.proto
         :lines: 16-33
         :language: protobuf
         :emphasize-lines: 10-10,16-16
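
Because each phoneme carries a symbol and a duration in milliseconds, a list of phonemes can be turned into a simple timeline, for example to drive lip animation. The following is a minimal sketch only. The ``Phoneme`` dataclass is a hypothetical stand-in for the generated message class, not the actual RST binding. The helper names ``phoneme_timeline`` and ``total_duration_ms`` and the sample symbols and durations are illustrative assumptions. The sketch also assumes that phonemes are played back to back, in list order, without gaps.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Phoneme:
    """Hypothetical stand-in for rst.audition.Phoneme (not the generated class)."""
    symbol: str    # a single phone symbol, e.g. "aI"
    duration: int  # duration of the symbol in milliseconds


def phoneme_timeline(phonemes: List[Phoneme]) -> List[Tuple[str, int, int]]:
    """Return (symbol, start_ms, end_ms) per phoneme, in list order.

    Assumption: phonemes follow each other directly, without pauses.
    """
    timeline = []
    start = 0
    for phoneme in phonemes:
        end = start + phoneme.duration
        timeline.append((phoneme.symbol, start, end))
        start = end
    return timeline


def total_duration_ms(phonemes: List[Phoneme]) -> int:
    """Sum of all phoneme durations in milliseconds."""
    return sum(phoneme.duration for phoneme in phonemes)


# Illustrative values only; they are not taken from any real utterance.
word = [Phoneme("h", 60), Phoneme("aI", 180), Phoneme("_", 100)]
timeline = phoneme_timeline(word)
total = total_duration_ms(word)
print(timeline)  # [('h', 0, 60), ('aI', 60, 240), ('_', 240, 340)]
print(total)     # 340
```

A consumer holding a whole utterance could compare such a total against the playback time of the accompanying audio chunk, but how the two are expected to relate is use case specific and not defined by this package.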