baudline
open file
The open file window is used to load data sample files for spectral analysis and visualization.  Baudline supports many different audio file formats, data types, and codecs (see complete list and this FAQ entry on installing the input decoder helpers).  Files can be loaded from the open file window shown below or from the command line by typing "baudline file_name.wav".  Note that files can only be loaded from the open file window while in the Pause mode.



Directories
The current directory can be changed by double clicking on the names in the Directories box.  The dot-dot (..) entry takes you up a directory while the dot (.) does nothing.

Files
Double clicking on a name in the Files box opens it, which is the same as single clicking on the file name and hitting the OK button.

File Format
This region contains the popup option menu shown on the right. The default value is auto magic, which means that the format type is determined automatically from a combination of the file name suffix and magic bytes (see man magic for more information).  Files that are compressed with zip, compress, gzip, bzip2, or flac are also automatically decompressed prior to opening.  The raw file format option is useful for reading headerless audio files that contain binary or ASCII data.  Note that this option opens the raw parameters window, which is described below.  The GSM 6.10 and MPEG options are useful if the auto magic method fails to determine the correct format type.  Think of them as manual override controls.

Selection
This box shows the current file or directory.  File names and directories can be changed from the keyboard by typing into this free form box.  Hitting the Enter key or pressing the OK button then opens the file.

Since baudline loads the entire file into RAM, attempting to load a file that is larger than your total memory would result in a lot of swapping.  So in order to prevent this, baudline clamps the maximum file size to be equal to the amount of physical RAM; it also pops up a message warning that this is happening.

raw parameters
The raw parameters window is a method for importing non-standard or raw headerless data files into baudline.  It is a collection of parameter fields that can be manually set to suit the specific data's sample rate, channel layout, and format type.  This window appears when a file is opened from the open file window with the raw file format option. 


Directory, File Name, File Size
These fields are all self-explanatory.

Magic Hints
The classification type determined by the Unix file command is displayed here.  This is for informational purposes only, but it may aid in manually setting the following raw parameters:

Decompression
The default auto magic case is similar to the File Format option in the open file window.  If a file is losslessly compressed with an algorithm such as gzip, bzip2, or flac, it is automatically decompressed prior to opening.  It is possible that random headerless binary data could have the magic bytes at the start of the file that would incorrectly identify the file as being wrapped in some lossless compression scheme.  Or it could be the desire of the user to analyze the raw data while it is in its compressed state. In either case the decompression option can be manually set to OFF.
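
The false-positive case described above can be illustrated with a minimal sketch of magic byte sniffing. The byte signatures are the well-known ones for each wrapper; the function name `sniff_compression` is hypothetical, and this is not baudline's actual detection code.

```python
# Well-known leading magic bytes for the compression wrappers named above.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"\x1f\x9d", "compress"),
    (b"BZh", "bzip2"),
    (b"PK\x03\x04", "zip"),
    (b"fLaC", "flac"),
]

def sniff_compression(head: bytes) -> str:
    """Return the wrapper suggested by the first bytes of a file, or 'none'."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "none"

# Headerless 16 bit samples that happen to start with 0x1f 0x8b look like gzip.
raw_samples = b"\x1f\x8b\x12\x34\x56\x78"
print(sniff_compression(raw_samples))
```

Because only a few leading bytes are inspected, raw data can match by coincidence, which is when setting the decompression option to OFF is the right choice.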

Initial byte offset
Byte alignment is important for decode format types that are 16 or more bits long.  An incorrect initial byte offset will cause valid data to be loaded as garbage.  A useful technique is to try offsets incrementally, hitting the Apply button each time, until the data comes "into focus."  Starting with an initial byte offset of zero and working up to one less than the number of bytes in the chosen decode format is all that is necessary, since any further offsets wrap around to the zero alignment condition (example: with the 32 bit float format, try offsets 0 through 3).
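
The offset search can be sketched outside of baudline as follows. The test data (one stray byte followed by little endian 32 bit floats) and the helper name `decode_floats` are assumptions made for illustration only.

```python
import struct

def decode_floats(raw: bytes, offset: int, count: int = 4):
    """Decode `count` little endian 32 bit floats starting at byte `offset`."""
    return struct.unpack_from("<%df" % count, raw, offset)

# Hypothetical raw file: one stray leading byte, then eight float samples.
samples = (0.25, -0.5, 0.75, 1.0, 0.5, -0.25, 0.125, -1.0)
raw = b"\x00" + struct.pack("<8f", *samples)

# Offsets 0 ... 3 cover every possible alignment of a 4 byte format.
for offset in range(4):
    print(offset, decode_floats(raw, offset))
```

Only offset 1 prints the original sample values; the other three alignments print numbers that are obviously not audio.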

Sample Rate
This option does not modify the data in any way.  It only affects the time (ms) and frequency (Hz) rulers, and the measurement displays.  If the option "custom" is chosen then an integer or floating point custom sample rate can be entered in the adjacent text box.

Channels
The multi-channel raw data samples can be interleaved just like is common with standard audio formats such as .aiff, .au, and .wav files.  The selectable range of this option is from one (mono) up to nine distinct channels.  An interesting and educational thing to try is to open mono audio data and set the channels option to a value other than one.  This has the effect of decimating the data without proper anti-aliasing.
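
A minimal sketch of de-interleaving shows why the experiment above decimates the data. The function name `deinterleave` is hypothetical.

```python
def deinterleave(samples, channels):
    """Split interleaved samples into one list per channel."""
    return [samples[c::channels] for c in range(channels)]

# Mono data read as if it were two channel data.
mono = list(range(8))
left, right = deinterleave(mono, 2)
print(left)   # every second sample, starting at index 0
print(right)  # every second sample, starting at index 1
```

Each resulting channel keeps only every second sample of the mono signal, with no low-pass filter applied first, so frequencies above the new Nyquist limit alias.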


Decode Format
Raw data is translated and read into baudline using conversion routines set by this option. Most of the decode formats are self-explanatory. 

The ASCII decimal format reads space, comma, tab, or return delimited numeric text files containing integer, floating point, or scientific notation values.  The parsing routine is fairly robust and most numeric text files can be opened without any problems.  A two pass algorithm is used since ASCII decimal data, by its nature, has a variable width.  This means an ASCII decimal data file is read from the storage media twice.  So when loading very large data files there will be an extra delay before loading commences, and the load duration will be longer than it would be with another Decode Format.  OpenOffice Calc data files that have been saved in the text .csv format can be read into baudline with this setting.
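
The two pass idea can be sketched as below: the first pass counts the values so that storage can be sized, and the second pass converts them. This is an illustrative assumption about how such a reader might work, not baudline's parser, and the names are hypothetical.

```python
import re
from array import array

# Space, comma, tab, and return/newline delimiters, as listed above.
_DELIMS = re.compile(r"[ ,\t\r\n]+")

def _tokens(text):
    """Yield the non-empty numeric tokens in the text."""
    return (t for t in _DELIMS.split(text) if t)

def parse_ascii_decimal(text):
    count = sum(1 for _ in _tokens(text))      # pass 1: count the values
    out = array("d", [0.0]) * count            # allocate exactly once
    for i, tok in enumerate(_tokens(text)):    # pass 2: convert the values
        out[i] = float(tok)
    return out

print(list(parse_ascii_decimal("1, 2.5\t-3e2\n4")))
```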

The nucleotide mono and RGB formats read DNA basepair text data like CACCGCTGAGAGACCCATACA into baudline.  The mono version reads the data in as a single channel.  The RGB version maps the basepairs into RGB color space in order to convey color coding in the spectrogram view.  Note that the nucleotide RGB version requires three times the RAM and CPU of the mono version, so the mono version is good for minimizing resource usage.  File reading is fairly robust and ">" comment lines are allowed.
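
A rough sketch of reading such a file as a single channel follows. The numeric codes assigned to each base are purely hypothetical, since the mapping baudline uses is not stated here; only the skipping of ">" comment lines comes from the description above.

```python
# Hypothetical numeric codes; baudline's actual mapping is not documented here.
BASE_VALUES = {"A": 0, "C": 1, "G": 2, "T": 3}

def read_nucleotides(text):
    """Return one numeric sample per base, skipping '>' comment lines."""
    values = []
    for line in text.splitlines():
        if line.startswith(">"):
            continue                      # comment / header line
        for ch in line.upper():
            if ch in BASE_VALUES:         # ignore whitespace and other symbols
                values.append(BASE_VALUES[ch])
    return values

print(read_nucleotides(">example sequence\nCACCGCTG\nagag\n"))
```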

The A-law and u-law companding formats are logarithmic mappings between 8 bit space and 13 or 14 bit space as described in the CCITT G.711 encoding and compression recommendation. 
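
As an illustration of the u-law mapping, here is a sketch of the widely circulated reference-style G.711 u-law expansion, which returns the decoded value scaled to a 16 bit range. It is shown as an example of the companding idea and is not claimed to be the routine baudline uses.

```python
def ulaw_to_linear(code: int) -> int:
    """Expand one 8 bit u-law code into a signed linear sample."""
    u = ~code & 0xFF                      # u-law bytes are stored inverted
    t = ((u & 0x0F) << 3) + 0x84          # 4 bit mantissa plus bias
    t <<= (u & 0x70) >> 4                 # 3 bit segment (exponent)
    return (0x84 - t) if (u & 0x80) else (t - 0x84)

for code in (0x00, 0x7F, 0x80, 0xFF):
    print(hex(code), ulaw_to_linear(code))
```

The step size doubles with each segment, which is what makes the mapping logarithmic.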

The 1 bit binary format can be either most (MSB) or least (LSB) significant bit first.  With (lsb) an 8 bit byte is assumed.
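
The difference between the two bit orders can be sketched as follows, assuming 8 bit bytes. The function name `unpack_bits` is hypothetical.

```python
def unpack_bits(data: bytes, msb_first: bool = True):
    """Expand each byte into eight 1 bit samples in the chosen bit order."""
    order = range(7, -1, -1) if msb_first else range(8)
    return [(byte >> i) & 1 for byte in data for i in order]

print(unpack_bits(b"\x80"))                   # MSB first
print(unpack_bits(b"\x80", msb_first=False))  # LSB first
```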

The 8 bit linear format can be either of the signed or unsigned flavor.  This corresponds to the "char" and "unsigned char" C language data types.

The 16, 24, and 32 bit linear formats are all signed integers.

The 32 bit float and 64 bit double formats are standard IEEE 754 floating point numbers.


Decode formats of 16 bits and larger can have byte orders that are little endian (x86, flipped) or big endian (most other microprocessors, the proper orientation).  Note that .wav files are little endian while .au and .aiff are typically big endian.  For raw files the endian byte order will be the same as the native byte order of  the machine that created the file.
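
The effect of byte order on signed linear data can be sketched with Python's built-in integer conversion. The helper name `decode_linear` is hypothetical.

```python
def decode_linear(raw: bytes, bits: int, byteorder: str):
    """Decode signed linear samples of 16, 24, or 32 bits.

    byteorder is "little" or "big".
    """
    width = bits // 8
    return [
        int.from_bytes(raw[i:i + width], byteorder, signed=True)
        for i in range(0, len(raw) - width + 1, width)
    ]

print(decode_linear(b"\x01\x02", 16, "little"))  # 0x0201
print(decode_linear(b"\x01\x02", 16, "big"))     # 0x0102
```

The same two bytes decode to different values depending on the chosen byte order, which is why a wrong endian setting makes otherwise valid data look like noise.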


Normalization
Baudline currently stores all internal sample data in 16 bit format.  This means that ASCII decimal, 24 and 32 bit linear, and 32 and 64 bit float data all need to be scaled (normalized) so that they fit into 16 bits without overflow.  The auto measure option finds the largest-magnitude value (minimum or maximum) in the file and scales that to be the 0 dB value.  Power of two numbers in the range from 1 to about 2 billion (2^31) can be manually chosen as the maximum 0 dB value; any values larger than this get clamped.  Manually setting the normalization value is sometimes useful when auto measure mistakenly uses a very large or very negative value that represents some type of header information such as sample rate or file size.

Note that auto measure utilizes a two pass algorithm.  So for large data files, manually setting the Normalization factor to a fixed value will result in faster load times.
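
The difference between auto measure and a fixed normalization value can be sketched as below. The exact scaling and rounding baudline applies are not documented here, so mapping the 0 dB value to 32767 is an assumption, and the function name is hypothetical.

```python
def normalize_to_16bit(samples, full_scale=None):
    """Scale samples so that `full_scale` maps to the 16 bit maximum."""
    if full_scale is None:
        # Auto measure: an extra pass over the data to find the peak magnitude.
        full_scale = max((abs(s) for s in samples), default=0)
    if full_scale == 0:
        return [0] * len(samples)
    out = []
    for s in samples:
        v = int(round(s / full_scale * 32767))
        out.append(max(-32768, min(32767, v)))   # clamp anything above 0 dB
    return out

print(normalize_to_16bit([0.0, 4.0, -4.0]))              # auto measure
print(normalize_to_16bit([4.0, -4.0], full_scale=2.0))   # fixed value, clamps
```

Supplying `full_scale` skips the peak-finding pass, which mirrors why a fixed Normalization value loads large files faster.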

OK
This button opens the file with the above raw conversion parameters and closes this window.

Apply
This button opens the file with the above raw conversion parameters and unlike the OK button it keeps this window open.  This is useful for reducing mouse clicks and it lets the user quickly experiment with different raw parameter settings.

Bit View
This button opens the bit view window with the current file.  Note that changing the decompression option automatically updates the bit view display.

Cancel
This button closes the raw parameters window.  This could be useful if the user changes his/her mind or as a way of closing after hitting the Apply button.


bit view
The bit view window is a diagnostic view into the bit raster, hex dump, histogram, and poincare byte structure of a data file.  The purpose of this display is to be a visual aid which can help in choosing the correct settings in the raw parameters window.  This window appears when the bit view button is pressed in the raw parameters window.




There are four sections, described below, which display useful information about the bit structure of the selected raw file. 

raster
This is a bit raster view of the file.  Bits from the raw data file are drawn most significant bit first (MSB),  starting on the left hand side and wrapping with a carriage return when the rightmost edge is reached.  The scroll bar on the right side navigates through the entire file up to a maximum of 64 megabytes.  The thick selector bar tracks the mouse and the scroll bar; it is linked to the hex dump view described below.  The bit width of the raster window can be changed by pressing the Left/Right arrow keys or Shift+scroll_wheel.
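
The raster layout can be approximated in text form with a short sketch: bits are emitted MSB first and wrapped at a chosen width. The function name `bit_raster` is hypothetical.

```python
def bit_raster(data: bytes, width: int):
    """Return rows of '0'/'1' characters, MSB first, wrapped at `width` bits."""
    bits = "".join(format(byte, "08b") for byte in data)
    return [bits[i:i + width] for i in range(0, len(bits), width)]

for row in bit_raster(b"\xf0\x0f\xf0\x0f", 16):
    print(row)
```

When the width matches the sample size (or a multiple of it), repeating structure lines up into vertical columns, which is the visual cue the raster view relies on.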

hex dump
This is a hexadecimal and ASCII view of the data currently pointed to by the selector bar in the raster view.

bit helper bars
Pressing and holding any of the 3 mouse buttons in the raster or hex dump windows will cause the bit helper bars to appear.  The first button has a width of 8 bits, the second is 16 bits wide, and the third button xor toggles at a 32 bit width. The bars better show the relationship between the raster and the hex dump windows, and they are also helpful for determining the bit width of the raw data file.  The example on the right is a 24 bit linear mono audio file.  Notice how the bars do not line up and the data syncing seems to alternate. This is what happens when you try overlaying 16 bit bars on 24 bit data.

histogram
Similar to the main histogram display, this view shows the 8 bit amplitude distribution of the current file.  The X axis holds the histogram bins, one for each byte value from 0 to 255.  The Y axis is a dynamically scaled representation of the number of hits (probability) of each 8 bit sample value.
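
The counts behind such a histogram can be sketched in a few lines. The function name `byte_histogram` is hypothetical.

```python
def byte_histogram(data: bytes):
    """Count how many times each byte value 0..255 occurs."""
    counts = [0] * 256
    for byte in data:
        counts[byte] += 1
    return counts

hist = byte_histogram(b"\x00\x00\x10\xff")
print(hist[0x00], hist[0x10], hist[0xFF])
```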

poincare
Think of this view as a two dimensional histogram that shows the relationship between neighboring bytes.  It is an X(n) vs X(n-1) plot.  The X axis of the poincare view is the same as the X axis of the histogram view, and because of this, the two views are linked together.  The poincare view is also known as a phase plot in chaotic dynamics theory.  The patterns drawn here can give great insights into the bit structure of the data.
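
A minimal sketch of the counts behind an X(n) vs X(n-1) plot follows. Indexing the grid as `grid[previous][current]` is an assumption made for illustration; which byte the view places on which axis is not specified beyond the description above.

```python
def poincare_counts(data: bytes):
    """Count occurrences of each (X(n-1), X(n)) neighboring byte pair."""
    grid = [[0] * 256 for _ in range(256)]
    for prev, cur in zip(data, data[1:]):
        grid[prev][cur] += 1
    return grid

grid = poincare_counts(b"\x01\x02\x01\x02")
print(grid[1][2], grid[2][1])
```

Strongly structured data fills only a few cells of the grid, while well-compressed data spreads hits across nearly all 65536 cells.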

measurement selector
Pressing and holding the first mouse button in either the histogram or the poincare windows will pop up the measurement selector line seen in the image on the right.  This selector line tracks the mouse motion and it shows the relationship between the histogram and the X(n) vs X(n-1) nature of the poincare window.  An "0x" hex number will appear showing the current bin that is being pointed to, and below it will be a "#" of hits (count) for that particular bin. 

entropy note
The plot in the above picture was created from a G.721 ADPCM 4 bit audio file.  Both ADPCM and MP3 are lossy compression methods, but unlike MP3, ADPCM is not optimal in the entropy sense: it doesn't make good use of all the bits (see all the empty space).  The output of a good entropy encoder like gzip or bzip2 will have a fairly flat histogram and a mostly black (full) poincare plot.  The point of entropy encoding is to maximize the use of the available bandwidth (bit width and bit randomness in this case).
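
The "flatness" of a byte histogram can be put into a single number with the Shannon entropy, measured in bits per byte. This is a standard measure offered as an illustration; baudline itself is not described as computing it, and the function name is hypothetical.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    n = len(data)
    if n == 0:
        return 0.0
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

print(byte_entropy(bytes(range(256))))   # perfectly flat histogram
print(byte_entropy(b"\x00" * 100))       # a single spike
```

A perfectly flat histogram gives the maximum of 8 bits per byte, and values well below that indicate unused bit patterns of the kind visible as empty space in the plots.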


file info
The file info window displays many self-explanatory inner details of the currently loaded file. The Codec field shows the type of compression, companding, or linear PCM data type used.  The Duration field is in the HH:MM:SS.mmm format.  The Comment section is a multiple line free-form field that displays the very non-standard user comment, tag, or info data.
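
As a small illustration of the HH:MM:SS.mmm layout, the sketch below formats a duration from a sample count and sample rate. Rounding to the nearest millisecond is an assumption, and the function name is hypothetical.

```python
def format_duration(num_samples: int, sample_rate: float) -> str:
    """Format a duration as HH:MM:SS.mmm."""
    total_ms = round(num_samples * 1000 / sample_rate)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"

print(format_duration(44100 * 90, 44100))
```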


Copyright © 2007 SigBlips.com